Transfer Learning for Speech and Language Processing
Transfer learning is a vital technique that generalizes models trained for
one setting or task to other settings or tasks. For example, in speech
recognition, an acoustic model trained for one language can be used to
recognize speech in another language, with little or no re-training data.
Transfer learning is closely related to multi-task learning (cross-lingual vs.
multilingual), and is traditionally studied under the name of 'model adaptation'.
Recent advances in deep learning show that transfer learning becomes much
easier and more effective with high-level abstract features learned by deep
models, and the 'transfer' can be conducted not only between data distributions
and data types, but also between model structures (e.g., shallow nets and deep
nets) or even model types (e.g., Bayesian models and neural models). This
review paper summarizes some recent prominent research in this direction,
particularly for speech and language processing. We also report some results
from our group and highlight the potential of this very interesting research
field.
Comment: 13 pages, APSIPA 201
Multi-task Learning by Leveraging the Semantic Information
One crucial objective of multi-task learning is to align distributions across
tasks so that the information between them can be transferred and shared.
However, existing approaches focus only on matching the marginal feature
distribution while ignoring the semantic information, which may hinder the
learning performance. To address this issue, we propose to leverage the label
information in multi-task learning by exploring the semantic conditional
relations among tasks. We first theoretically analyze the generalization bound
of multi-task learning based on the notion of Jensen-Shannon divergence, which
provides new insights into the value of label information in multi-task
learning. Our analysis also leads to a concrete algorithm that jointly matches
the semantic distribution and controls label distribution divergence. To
confirm the effectiveness of the proposed method, we first compare the
algorithm with several baselines on some benchmarks and then test the
algorithms under label space shift conditions. Empirical results demonstrate
that the proposed method could outperform most baselines and achieve
state-of-the-art performance, particularly showing the benefits under the label
shift conditions.
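The abstract names the Jensen-Shannon divergence but gives neither the bound nor the algorithm. As a minimal sketch of the divergence only, assuming discrete distributions given as plain lists, it might be computed as below. The function names and the example label distributions are hypothetical and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence of p and q to their midpoint."""
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)


# Hypothetical label distributions of two tasks over three classes.
task_a_labels = [0.7, 0.2, 0.1]
task_b_labels = [0.1, 0.2, 0.7]

print(js_divergence(task_a_labels, task_a_labels))  # identical distributions
print(js_divergence(task_a_labels, task_b_labels))  # shifted label distribution
```

Unlike the KL divergence, this quantity is symmetric and bounded above by log 2 (in nats), which makes it convenient for comparing label distributions that do not share full support.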
CoNet: Collaborative Cross Networks for Cross-Domain Recommendation
The cross-domain recommendation technique is an effective way of alleviating
the data sparsity issue in recommender systems by leveraging the knowledge from
relevant domains. Transfer learning is a class of algorithms underlying these
techniques. In this paper, we propose a novel transfer learning approach for
cross-domain recommendation by using neural networks as the base model. In
contrast to the matrix factorization based cross-domain techniques, our method
is deep transfer learning, which can learn complex user-item interaction
relationships. We assume that hidden layers in two base networks are connected
by cross mappings, leading to the collaborative cross networks (CoNet). CoNet
enables dual knowledge transfer across domains by introducing cross connections
from one base network to another and vice versa. CoNet is achieved in
multi-layer feedforward networks by adding dual connections and joint loss
functions, which can be trained efficiently by back-propagation. The proposed
model is thoroughly evaluated on two large real-world datasets. It outperforms
baselines by a relative improvement of 7.84% in NDCG. We demonstrate the
necessity of adaptively selecting representations to transfer. Our model can
reduce the number of training examples by tens of thousands compared with
non-transfer methods while remaining competitive with them.
Comment: Deep transfer learning for recommender system
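The dual cross connections described in this abstract can be illustrated with a small sketch. It assumes that each network's next hidden layer is a ReLU of its own linear map plus a linear map of the other network's current hidden layer. The abstract does not give the exact parameterization, so the function names, the separate cross matrices, and the toy weights are all hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vector]


def cross_layer(h_a, h_b, w_a, w_b, cross_b_to_a, cross_a_to_b):
    """Advance two base networks by one layer with dual cross connections.

    Each network combines its own transformed activations with the
    transformed activations of the other network (assumed form).
    """
    own_a, from_b = matvec(w_a, h_a), matvec(cross_b_to_a, h_b)
    own_b, from_a = matvec(w_b, h_b), matvec(cross_a_to_b, h_a)
    next_a = relu([s + c for s, c in zip(own_a, from_b)])
    next_b = relu([s + c for s, c in zip(own_b, from_a)])
    return next_a, next_b


# Toy example: identity weights inside each network, transfer only from A to B.
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
half = [[0.5, 0.0], [0.0, 0.5]]

next_a, next_b = cross_layer(
    h_a=[1.0, 2.0], h_b=[3.0, -1.0],
    w_a=identity, w_b=identity,
    cross_b_to_a=zeros, cross_a_to_b=half,
)
print(next_a, next_b)
```

With the cross matrices set to zero the two networks are independent, so the cross terms are the only channel through which knowledge moves between domains in this sketch.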
Joint learning from multiple information sources for biological problems
Thanks to technological advancements, more and more biological data have been generated in recent years. Data availability offers unprecedented opportunities to look at the same problem from multiple aspects. It also unveils a more global view of the problem that takes into account the intricate interplay between the involved molecules/entities. Nevertheless, biological datasets are biased, limited in quantity, and contain many false-positive samples. Such challenges often drastically degrade the performance of a predictive model on unseen data and, thus, limit its applicability in real biological studies.
Human learning is a multi-stage process in which we usually start with simple things. Through knowledge accumulated over time, our cognitive ability extends to more complex concepts. Children learn to speak simple words before being able to formulate sentences. Similarly, being able to speak correct sentences supports our learning to speak correct and meaningful paragraphs, etc. Generally, knowledge acquired from related learning tasks helps boost our learning capability in the current task. Motivated by this phenomenon, in this thesis, we study supervised machine learning models for bioinformatics problems that can improve their performance by exploiting multiple related knowledge sources. More specifically, we are concerned with ways to enrich the supervised models’ knowledge base with publicly available related data to enhance the computational models’ prediction performance.
Our work shares commonality with existing works in multimodal learning, multi-task learning, and transfer learning. Nevertheless, there are certain differences in some cases. Besides the proposed architectures, we present large-scale experiment setups with consensus evaluation metrics along with the creation and release of large datasets to showcase our approaches’ superiority. Moreover, we add case studies with detailed analyses in which we make no simplifying assumptions to demonstrate the systems’ utility in realistic application scenarios. Finally, we develop and make available an easy-to-use website for non-expert users to query the model’s generated prediction results to facilitate field experts’ assessments and adaptation. We believe that our work serves as one of the first steps in bridging the gap between “Computer Science” and “Biology” that will open a new era of fruitful collaboration between computer scientists and biological field experts.