Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition
Aerial scene recognition is a fundamental task in remote sensing and has
recently received increased interest. While visual information from overhead
images, combined with powerful models and efficient algorithms, yields
considerable performance on scene recognition, it still suffers from variation
in ground objects, lighting conditions, etc. Inspired by the multi-channel
perception theory in cognitive science, in this paper we explore a novel
audiovisual aerial scene recognition task that uses both images and sounds as
input to improve recognition performance. Based on the observation that
certain sound events are more likely to be heard at a given geographic
location, we propose to exploit knowledge from sound events to improve
performance on aerial scene recognition. For this purpose, we have constructed
a new dataset named AuDio Visual Aerial sceNe reCognition datasEt (ADVANCE).
With the help of this dataset, we evaluate three proposed approaches for
transferring sound event knowledge to the aerial scene recognition task in a
multimodal learning framework, and show the benefit of exploiting audio
information for aerial scene recognition. The source code is publicly
available for
reproducibility purposes.
Comment: ECCV 2020
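A minimal sketch (our assumption, not the authors' exact architecture) of the kind of multimodal setup the abstract describes: separate image and audio encoders whose embeddings are fused for scene classification, where the audio branch would be initialized from a sound-event recognition model so its event knowledge transfers. All layer sizes and the class count below are illustrative.

```python
import torch
import torch.nn as nn

NUM_SCENES = 13  # number of scene classes; value assumed for illustration

class AudioVisualSceneNet(nn.Module):
    def __init__(self, img_encoder: nn.Module, snd_encoder: nn.Module,
                 img_dim: int, snd_dim: int):
        super().__init__()
        self.img_encoder = img_encoder  # e.g., a CNN over aerial images
        self.snd_encoder = snd_encoder  # e.g., pretrained on sound-event labels
        self.classifier = nn.Linear(img_dim + snd_dim, NUM_SCENES)

    def forward(self, image: torch.Tensor, audio: torch.Tensor):
        # Late fusion: concatenate the two modality embeddings, then classify.
        z = torch.cat([self.img_encoder(image), self.snd_encoder(audio)], dim=-1)
        return self.classifier(z)

# Example with toy encoders standing in for real image/audio networks:
img_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))
snd_enc = nn.Sequential(nn.Flatten(), nn.Linear(128, 64))
net = AudioVisualSceneNet(img_enc, snd_enc, img_dim=64, snd_dim=64)
logits = net(torch.randn(2, 3, 32, 32), torch.randn(2, 128))  # (2, NUM_SCENES)
```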
Transfer Learning for Speech and Language Processing
Transfer learning is a vital technique that generalizes models trained for
one setting or task to other settings or tasks. For example, in speech
recognition an acoustic model trained for one language can be used to
recognize speech in another language, with little or no re-training data.
Transfer learning is closely related to multi-task learning (cross-lingual vs.
multilingual), and has traditionally been studied under the name of 'model
adaptation'. Recent advances in deep learning show that transfer learning
becomes much easier and more effective with high-level abstract features
learned by deep models, and that the 'transfer' can be conducted not only
between data distributions and data types, but also between model structures
(e.g., shallow nets and deep nets) or even model types (e.g., Bayesian models
and neural models). This review paper summarizes some recent prominent
research in this direction, particularly for speech and language processing.
We also report some results from our group and highlight the potential of
this very interesting research
field.
Comment: 13 pages, APSIPA 2015
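As a concrete illustration of the cross-lingual adaptation idea mentioned above, here is a minimal sketch (our assumption, not the paper's recipe): keep the lower layers of an acoustic model trained on a source language, since they tend to capture language-independent features, and re-train only the output layer for the target language's phone set. All dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

FEAT_DIM, HIDDEN, SRC_PHONES, TGT_PHONES = 40, 512, 120, 95  # assumed sizes

source_model = nn.Sequential(
    nn.Linear(FEAT_DIM, HIDDEN), nn.ReLU(),
    nn.Linear(HIDDEN, HIDDEN), nn.ReLU(),
    nn.Linear(HIDDEN, SRC_PHONES),  # source-language phone classifier
)
# ... source_model would be trained on the source language here ...

# Transfer: reuse the feature-extracting layers, replace the classifier.
adapted = nn.Sequential(*list(source_model.children())[:-1],
                        nn.Linear(HIDDEN, TGT_PHONES))
for p in adapted[:-1].parameters():
    p.requires_grad = False  # freeze the transferred layers

# Only the new output layer is updated with the (small) target-language data.
optimizer = torch.optim.Adam(adapted[-1].parameters(), lr=1e-3)
```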
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
Pretrained contextual representation models (Peters et al., 2018; Devlin et
al., 2018) have pushed forward the state of the art on many NLP tasks. A new
release of BERT (Devlin, 2018) includes a model simultaneously pretrained on
104 languages with impressive performance for zero-shot cross-lingual transfer
on a natural language inference task. This paper explores the broader
cross-lingual potential of multilingual BERT (mBERT) as a zero-shot language
transfer model on 5 NLP tasks covering a total of 39 languages from various
language families: NLI, document classification, NER, POS tagging, and
dependency parsing. We compare mBERT with the best published methods for
zero-shot cross-lingual transfer and find mBERT competitive on each task.
Additionally, we investigate the most effective strategy for utilizing mBERT
in this manner, determine to what extent mBERT generalizes away from
language-specific features, and measure factors that influence cross-lingual
transfer.
Comment: EMNLP 2019 Camera Ready
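A minimal sketch of the zero-shot transfer recipe the paper evaluates, using the Hugging Face Transformers API (our choice of tooling, not necessarily the authors'): fine-tune mBERT on English task data only, then apply the unchanged model to other languages. The NLI label count and example sentences below are illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=3)  # e.g., 3-way NLI labels

def predict(premise: str, hypothesis: str) -> int:
    """Classify a premise/hypothesis pair. Because mBERT is pretrained on
    104 languages, the same call works on non-English input even when the
    fine-tuning data (not shown) was English only."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt",
                       truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))

# After English-only fine-tuning, calling on a Spanish pair is the
# "zero-shot" cross-lingual transfer step:
print(predict("El gato duerme.", "El gato está despierto."))
```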
