45,617 research outputs found
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks
An important goal of computer vision is to build systems that learn visual
representations over time that can be applied to many tasks. In this paper, we
investigate a vision-language embedding as a core representation and show that
it leads to better cross-task transfer than standard multi-task learning. In
particular, the task of visual recognition is aligned to the task of visual
question answering by forcing each to use the same word-region embeddings. We
show this leads to greater inductive transfer from recognition to VQA than
standard multitask learning. Visual recognition also improves, especially for
categories that have relatively few recognition training labels but appear
often in the VQA setting. Thus, our paper takes a small step towards creating
more general vision systems by showing the benefit of interpretable, flexible,
and trainable core representations.Comment: Accepted in ICCV 2017. The arxiv version has an extra analysis on
correlation with human attentio
Non-native children speech recognition through transfer learning
This work deals with non-native children's speech and investigates both
multi-task and transfer learning approaches to adapt a multi-language Deep
Neural Network (DNN) to speakers, specifically children, learning a foreign
language. The application scenario is characterized by young students learning
English and German and reading sentences in these second-languages, as well as
in their mother language. The paper analyzes and discusses techniques for
training effective DNN-based acoustic models starting from children native
speech and performing adaptation with limited non-native audio material. A
multi-lingual model is adopted as baseline, where a common phonetic lexicon,
defined in terms of the units of the International Phonetic Alphabet (IPA), is
shared across the three languages at hand (Italian, German and English); DNN
adaptation methods based on transfer learning are evaluated on significant
non-native evaluation sets. Results show that the resulting non-native models
allow a significant improvement with respect to a mono-lingual system adapted
to speakers of the target language
Listening between the Lines: Learning Personal Attributes from Conversations
Open-domain dialogue agents must be able to converse about many topics while
incorporating knowledge about the user into the conversation. In this work we
address the acquisition of such knowledge, for personalization in downstream
Web applications, by extracting personal attributes from conversations. This
problem is more challenging than the established task of information extraction
from scientific publications or Wikipedia articles, because dialogues often
give merely implicit cues about the speaker. We propose methods for inferring
personal attributes, such as profession, age or family status, from
conversations using deep learning. Specifically, we propose several Hidden
Attribute Models, which are neural networks leveraging attention mechanisms and
embeddings. Our methods are trained on a per-predicate basis to output rankings
of object values for a given subject-predicate combination (e.g., ranking the
doctor and nurse professions high when speakers talk about patients, emergency
rooms, etc). Experiments with various conversational texts including Reddit
discussions, movie scripts and a collection of crowdsourced personal dialogues
demonstrate the viability of our methods and their superior performance
compared to state-of-the-art baselines.Comment: published in WWW'1
Unsupervised Controllable Text Formalization
We propose a novel framework for controllable natural language
transformation. Realizing that the requirement of parallel corpus is
practically unsustainable for controllable generation tasks, an unsupervised
training scheme is introduced. The crux of the framework is a deep neural
encoder-decoder that is reinforced with text-transformation knowledge through
auxiliary modules (called scorers). The scorers, based on off-the-shelf
language processing tools, decide the learning scheme of the encoder-decoder
based on its actions. We apply this framework for the text-transformation task
of formalizing an input text by improving its readability grade; the degree of
required formalization can be controlled by the user at run-time. Experiments
on public datasets demonstrate the efficacy of our model towards: (a)
transforming a given text to a more formal style, and (b) introducing
appropriate amount of formalness in the output text pertaining to the input
control. Our code and datasets are released for academic use.Comment: AAA
- …