Cross-language Learning with Adversarial Neural Networks: Application to Community Question Answering
We address the problem of cross-language adaptation for question-question
similarity reranking in community question answering, with the objective of
porting a system trained on one input language to another input language given
labeled training data for the first language and only unlabeled data for the
second language. In particular, we propose to use adversarial training of
neural networks to learn high-level features that are discriminative for the
main learning task, and at the same time are invariant across the input
languages. The evaluation results show sizable improvements for our
cross-language adversarial neural network (CLANN) model over a strong
non-adversarial system.
Comment: CoNLL-2017: The SIGNLL Conference on Computational Natural Language Learning; cross-language adversarial neural network (CLANN) model; adversarial training; cross-language adaptation; community question answering; question-question similarity.
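To make the adversarial setup concrete, below is a minimal PyTorch sketch of a shared encoder trained through a gradient reversal layer, the standard mechanism for learning features that serve the main task while remaining language-invariant. All module names, dimensions, and the lambda weight are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses (and scales) gradients on the
    backward pass, so the encoder is trained to fool the discriminator."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

class AdversarialNet(nn.Module):
    """A shared encoder feeds (a) a task head for question-question similarity
    and (b) a language discriminator through gradient reversal, pushing the
    encoder toward task-discriminative but language-invariant features."""
    def __init__(self, input_dim=300, hidden_dim=128, lamb=0.1):
        super().__init__()
        self.lamb = lamb
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.task_head = nn.Linear(hidden_dim, 2)  # similar / not similar
        self.lang_head = nn.Linear(hidden_dim, 2)  # language 1 / language 2

    def forward(self, x):
        h = self.encoder(x)
        return self.task_head(h), self.lang_head(GradReverse.apply(h, self.lamb))
```

Labeled pairs in the first language supervise the task head, while examples from both languages (unlabeled for the second) supervise the language head; the reversed gradient penalizes the encoder whenever the two languages remain distinguishable.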
A Call for More Rigor in Unsupervised Cross-lingual Learning
We review motivations, definition, approaches, and methodology for
unsupervised cross-lingual learning and call for a more rigorous position in
each of them. An existing rationale for such research is based on the lack of
parallel data for many of the world's languages. However, we argue that a
scenario without any parallel data and abundant monolingual data is unrealistic
in practice. We also discuss different training signals that have been used in
previous work, which depart from the pure unsupervised setting. We then
describe common methodological issues in tuning and evaluation of unsupervised
cross-lingual models and present best practices. Finally, we provide a unified
outlook for different types of research in this area (i.e., cross-lingual word
embeddings, deep multilingual pretraining, and unsupervised machine
translation) and argue for comparable evaluation of these models.
Comment: ACL 2020
Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension
Multilingual pre-trained models can leverage training data from a resource-rich
source language (such as English) to improve performance on low-resource
languages. However, the transfer quality for multilingual Machine Reading
Comprehension (MRC) is significantly worse than sentence classification tasks
mainly due to the requirement of MRC to detect the word level answer boundary.
In this paper, we propose two auxiliary tasks in the fine-tuning stage to
create additional phrase boundary supervision: (1) A mixed MRC task, which
translates the question or passage to other languages and builds cross-lingual
question-passage pairs; (2) A language-agnostic knowledge masking task that
leverages knowledge phrases mined from the web. Extensive experiments on
two cross-lingual MRC datasets show the effectiveness of our proposed approach.
Comment: Accepted to ACL 2020
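As an illustration of the second auxiliary task, the sketch below masks mined knowledge phrases in a token sequence to build MLM-style supervision. The function name, phrase list, and masking rate are hypothetical, not taken from the paper.

```python
import random

def mask_knowledge_phrases(tokens, knowledge_phrases, mask_token="[MASK]", p=0.5):
    """Replace occurrences of mined knowledge phrases with mask tokens,
    yielding (masked_tokens, labels) for an auxiliary MLM-style objective.
    Labels hold the original tokens at masked positions, None elsewhere."""
    masked, labels = list(tokens), [None] * len(tokens)
    for phrase in knowledge_phrases:
        ph = phrase.split()
        for i in range(len(tokens) - len(ph) + 1):
            if tokens[i:i + len(ph)] == ph and random.random() < p:
                for j in range(i, i + len(ph)):
                    labels[j] = masked[j]
                    masked[j] = mask_token
    return masked, labels

tokens = "the great wall of china is located in northern china".split()
masked, labels = mask_knowledge_phrases(tokens, ["great wall of china"])
```

Because a knowledge phrase denotes the same entity in every language, predicting its tokens back gives a supervision signal that does not depend on the input language.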
Comparative Study of Machine Learning Models and BERT on SQuAD
This study provides a comparative analysis of the performance of several
popular machine learning models and the BERT model on the Stanford Question
Answering Dataset (SQuAD). The analysis shows that the BERT model, which was
once state-of-the-art on SQuAD, achieves higher accuracy than the other
models. However, BERT requires a greater execution time, even when only 100
samples are used, indicating that the gain in accuracy comes at the cost of
additional training time. The classical machine learning models, in contrast,
train on the full dataset in less time but with lower accuracy.
Neural Machine Translation for Query Construction and Composition
Research on question answering over knowledge bases has recently seen an
increasing use of deep architectures. In this extended abstract, we study the
application of the neural machine translation paradigm to question parsing. We
employ a sequence-to-sequence model to learn graph patterns in the SPARQL graph
query language and their compositions. Instead of inducing the programs through
question-answer pairs, we adopt a semi-supervised approach in which alignments
between questions and queries are built through templates. We argue that the
coverage of language utterances can be expanded using recent notable work in
natural language generation.
Comment: ICML workshop on Neural Abstract Machines & Program Induction v2 (NAMPI), extended abstract
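A minimal sketch of how template-based alignments between questions and SPARQL queries might be constructed; the templates, predicates, and entities shown are illustrative assumptions, not the paper's actual inventory.

```python
# Hypothetical templates pairing question patterns with SPARQL graph patterns.
TEMPLATES = [
    ("who wrote {work}",
     "SELECT ?author WHERE {{ <{work}> dbo:author ?author }}"),
    ("where was {person} born",
     "SELECT ?place WHERE {{ <{person}> dbo:birthPlace ?place }}"),
]

def generate_pairs(entities_by_slot):
    """Instantiate each template with entities of the matching type, producing
    aligned (question, query) pairs to train a sequence-to-sequence parser."""
    pairs = []
    for q_tpl, s_tpl in TEMPLATES:
        slot = "work" if "{work}" in q_tpl else "person"
        for ent in entities_by_slot[slot]:
            pairs.append((q_tpl.format(**{slot: ent}),
                          s_tpl.format(**{slot: ent})))
    return pairs

pairs = generate_pairs({"work": ["dbr:Dracula"], "person": ["dbr:Ada_Lovelace"]})
```

The point of the templates is to supply question-query alignments without requiring question-answer pairs, which is what makes the setting semi-supervised.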
Multilingual Extractive Reading Comprehension by Runtime Machine Translation
Despite recent work in Reading Comprehension (RC), progress has been mostly
limited to English due to the lack of large-scale datasets in other languages.
In this work, we introduce the first RC system for languages without RC
training data. Given a target language without RC training data and a pivot
language with RC training data (e.g. English), our method leverages existing RC
resources in the pivot language by combining a competitive RC model in the
pivot language with an attentive Neural Machine Translation (NMT) model. We
first translate the data from the target to the pivot language, and then obtain
an answer using the RC model in the pivot language. Finally, we recover the
corresponding answer in the original language using soft-alignment attention
scores from the NMT model. We create evaluation sets of RC data in two
non-English languages, namely Japanese and French, to evaluate our method.
Experimental results on these datasets show that our method significantly
outperforms a back-translation baseline built on a state-of-the-art,
product-level machine translation system.
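The span-recovery step can be sketched as follows: the attention mass that the pivot-language answer tokens place on the original-language tokens is aggregated, and a contiguous window with the highest length-normalized mass is selected. The heuristic and scoring below are illustrative, not the paper's exact procedure.

```python
import numpy as np

def recover_answer_span(attention, pivot_start, pivot_end):
    """Map an answer span found in the pivot-language translation back to the
    original sentence via NMT soft-alignment scores.

    attention: (pivot_len, original_len) matrix; row i gives the attention
    that pivot token i places on each original-language token."""
    # Total attention mass the pivot answer tokens put on each original token.
    mass = attention[pivot_start:pivot_end + 1].sum(axis=0)
    # Pick the contiguous window with the highest length-normalized mass.
    best, span = -1.0, (0, 0)
    for i in range(len(mass)):
        for j in range(i, len(mass)):
            score = mass[i:j + 1].sum() / (j - i + 1) ** 0.5
            if score > best:
                best, span = score, (i, j)
    return span

att = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.8, 0.1]])
print(recover_answer_span(att, 0, 1))  # -> (0, 1)
```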
Vision as an Interlingua: Learning Multilingual Semantic Embeddings of Untranscribed Speech
In this paper, we explore the learning of neural network embeddings for
natural images and speech waveforms describing the content of those images.
These embeddings are learned directly from the waveforms without the use of
linguistic transcriptions or conventional speech recognition technology. While
prior work has investigated this setting in the monolingual case using English
speech data, this work represents the first effort to apply these techniques to
languages beyond English. Using spoken captions collected in English and Hindi,
we show that the same model architecture can be successfully applied to both
languages. Further, we demonstrate that training a multilingual model
simultaneously on both languages offers improved performance over the
monolingual models. Finally, we show that these models are capable of
performing semantic cross-lingual speech-to-speech retrieval.
Comment: to appear at ICASSP 2018
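A common way to train such a shared semantic space is a margin-based ranking loss over matched and mismatched image/speech pairs. The sketch below assumes precomputed, L2-normalized embeddings from the two encoders; it is one standard objective for this setting, not necessarily the paper's exact one.

```python
import torch
import torch.nn.functional as F

def retrieval_loss(img_emb, spch_emb, margin=1.0):
    """Margin ranking loss over a batch: a matched image/speech pair should
    score higher (dot product) than every mismatched pair, in both retrieval
    directions. img_emb, spch_emb: (batch, dim); row i of each is matched."""
    scores = img_emb @ spch_emb.t()              # (batch, batch) similarities
    pos = scores.diag().unsqueeze(1)             # matched-pair scores
    off = ~torch.eye(scores.size(0), dtype=torch.bool)  # mismatched entries
    cost_speech = F.relu(margin + scores - pos)[off].sum()     # image -> speech
    cost_image = F.relu(margin + scores - pos.t())[off].sum()  # speech -> image
    return (cost_speech + cost_image) / scores.size(0)

img = F.normalize(torch.randn(4, 256), dim=1)    # image encoder outputs
spch = F.normalize(torch.randn(4, 256), dim=1)   # speech encoder outputs
loss = retrieval_loss(img, spch)
```

Because the loss never references a transcription, English and Hindi captions can share the same image-anchored space, which is what enables cross-lingual speech-to-speech retrieval.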
A Joint Model for Question Answering and Question Generation
We propose a generative machine comprehension model that learns jointly to
ask and answer questions based on documents. The proposed model uses a
sequence-to-sequence framework that encodes the document and generates a
question (answer) given an answer (question). Significant improvement in model
performance is observed empirically on the SQuAD corpus, confirming our
hypothesis that the model benefits from jointly learning to perform both tasks.
We believe the joint model's novelty offers a new perspective on machine
comprehension beyond architectural engineering, and serves as a first step
towards autonomous information seeking.
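A minimal sketch of a fully shared sequence-to-sequence model that switches between answering and asking via a mode embedding; the architecture, sizes, and mode convention are illustrative assumptions, and target shifting for teacher forcing is omitted for brevity.

```python
import torch
import torch.nn as nn

class JointQAQG(nn.Module):
    """One encoder-decoder with a mode embedding: mode 0 conditions on
    (document, question) to decode an answer; mode 1 conditions on
    (document, answer) to decode a question. All parameters are shared."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.mode = nn.Embedding(2, dim)   # 0 = answer, 1 = ask
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, src, tgt, mode):
        x = self.embed(src) + self.mode(mode).unsqueeze(1)  # inject the mode
        _, h = self.encoder(x)
        y, _ = self.decoder(self.embed(tgt), h)
        return self.out(y)

model, vocab = JointQAQG(), 1000
src = torch.randint(0, vocab, (2, 10))   # document plus question (or answer)
tgt = torch.randint(0, vocab, (2, 5))    # answer (or question) to decode
logits = model(src, tgt, torch.tensor([0, 0]))   # answering mode
# Training sums cross-entropy over both modes (target shifting omitted).
loss = nn.CrossEntropyLoss()(logits.reshape(-1, vocab), tgt.reshape(-1))
```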
Learning to Represent Words in Context with Multilingual Supervision
We present a neural network architecture based on bidirectional LSTMs to
compute representations of words in their sentential contexts. These
context-sensitive word representations are suitable for, e.g., distinguishing
different word senses and other context-modulated variations in meaning. To
learn the parameters of our model, we use cross-lingual supervision,
hypothesizing that a good representation of a word in context will be one that
is sufficient for selecting the correct translation into a second language. We
evaluate the quality of our representations as features in three downstream
tasks: prediction of semantic supersenses (which assign nouns and verbs into a
few dozen semantic classes), low-resource machine translation, and a lexical
substitution task, and obtain state-of-the-art results on all of these tasks.
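A sketch of the underlying architecture: a bidirectional LSTM yields one context-sensitive vector per token, which is trained to score the word's correct translation above its competitors. Vocabulary sizes, dimensions, and the dot-product scoring are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Bidirectional LSTM producing one vector per token that reflects its
    sentential context, trained to select the word's correct translation."""
    def __init__(self, src_vocab=1000, tgt_vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(src_vocab, dim)
        self.bilstm = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.trans_embed = nn.Embedding(tgt_vocab, 2 * dim)  # candidate translations

    def forward(self, sent, position, candidates):
        h, _ = self.bilstm(self.embed(sent))   # (batch, len, 2*dim)
        word_in_context = h[:, position]       # contextual vector of the word
        # Dot-product scores over candidate translations; cross-lingual
        # supervision trains the correct translation to score highest.
        return word_in_context @ self.trans_embed(candidates).t()

enc = ContextEncoder()
sent = torch.randint(0, 1000, (1, 8))          # a source sentence
scores = enc(sent, position=3, candidates=torch.arange(5))
```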
Deep Learning for Sentiment Analysis: A Survey
Deep learning has emerged as a powerful machine learning technique that
learns multiple layers of representations or features of the data and produces
state-of-the-art prediction results. Along with the success of deep learning in
many other application domains, deep learning has also been widely applied to
sentiment analysis in recent years. This paper first gives an overview of deep
learning and then provides a comprehensive survey of its current applications
in sentiment analysis.
Comment: 34 pages, 9 figures, 2 tables