Exploring Multilingual Syntactic Sentence Representations
We study methods for learning sentence embeddings that capture syntactic
structure, focusing on approaches that use a multilingual parallel corpus
augmented with Universal Part-of-Speech tags. We
evaluate the quality of the learned embeddings by examining sentence-level
nearest neighbours and functional dissimilarity in the embedding space. We also
evaluate the ability of the method to learn syntactic sentence embeddings for
low-resource languages and demonstrate strong evidence for transfer learning.
Our results show that syntactic sentence embeddings can be learned with less
training data and fewer model parameters, while achieving better evaluation
metrics than state-of-the-art language models.
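To make the setup concrete, here is a minimal sketch of one way such embeddings could be learned from UPOS tag sequences over a parallel corpus; the GRU encoder, margin-based contrastive loss, and all dimensions below are illustrative assumptions, not the paper's actual architecture:

```python
# Minimal sketch: embed a sentence from its UPOS tag sequence and pull
# embeddings of parallel (translation) pairs together. All choices here
# (GRU, margin loss, dimensions) are assumptions for illustration.
import torch
import torch.nn as nn

UPOS = ["ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
        "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X"]

class SyntacticEncoder(nn.Module):
    """Sentence embedding computed from UPOS tags alone."""
    def __init__(self, dim=128):
        super().__init__()
        self.emb = nn.Embedding(len(UPOS), dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)

    def forward(self, tag_ids):              # (batch, seq_len)
        _, h = self.gru(self.emb(tag_ids))   # final hidden state
        return h.squeeze(0)                  # (batch, dim)

def contrastive_loss(src, tgt, margin=0.5):
    """Parallel pairs should be closer than in-batch mismatched pairs."""
    pos = (src - tgt).pow(2).sum(1)
    neg = (src - tgt.roll(1, dims=0)).pow(2).sum(1)
    return torch.relu(margin + pos - neg).mean()

enc = SyntacticEncoder()
src_tags = torch.randint(0, len(UPOS), (8, 12))  # stand-in UPOS ids
tgt_tags = torch.randint(0, len(UPOS), (8, 12))
contrastive_loss(enc(src_tags), enc(tgt_tags)).backward()
```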
Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages
We train neural machine translation (NMT) models from English to six target
languages, using NMT encoder representations to predict ancestor constituent
labels of source language words. We find that NMT encoders learn similar source
syntax regardless of NMT target language, relying on explicit morphosyntactic
cues to extract syntactic features from source sentences. Furthermore, the NMT
encoders outperform RNNs trained directly on several of the constituent label
prediction tasks, suggesting that NMT encoder representations can be used
effectively for natural language tasks involving syntax. However, both the NMT
encoders and the directly-trained RNNs learn substantially different syntactic
information from a probabilistic context-free grammar (PCFG) parser. Despite
lower overall accuracy scores, the PCFG often performs well on sentences for
which the RNN-based models perform poorly, suggesting that RNN architectures
are constrained in the types of syntax they can learn.
Comment: To appear at the 5th Workshop on Representation Learning for NLP
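The probing setup behind these results can be illustrated with a short sketch: a classifier is trained on frozen encoder states to predict each word's ancestor constituent label. The linear probe, label set, and hidden size below are assumptions, not the paper's exact classifier:

```python
# Sketch of a constituent-label probe over frozen encoder states.
# The probe form, label set, and hidden size are assumptions.
import torch
import torch.nn as nn

LABELS = ["NP", "VP", "PP", "S", "SBAR"]  # example ancestor labels

class ConstituentProbe(nn.Module):
    """Linear probe: ancestor constituent label from a token's
    (frozen) NMT encoder representation."""
    def __init__(self, hidden_dim=512):
        super().__init__()
        self.cls = nn.Linear(hidden_dim, len(LABELS))

    def forward(self, states):  # (n_tokens, hidden_dim)
        return self.cls(states)

probe = ConstituentProbe()
states = torch.randn(10, 512)               # stand-in encoder states
gold = torch.randint(0, len(LABELS), (10,))
loss = nn.functional.cross_entropy(probe(states), gold)
loss.backward()                             # only probe weights are trained
```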
LINSPECTOR: Multilingual Probing Tasks for Word Representations
Despite an ever-growing number of word representation models introduced for a
large number of languages, there is a lack of a standardized technique for
providing insight into what these models capture. Such insights would
help the community to estimate downstream task performance, as
well as to design more informed neural architectures, while avoiding extensive
experimentation that requires substantial computational resources not all
researchers have access to. A recent development in NLP is to use simple
classification tasks, also called probing tasks, that test for a single
linguistic feature such as part-of-speech. Existing studies mostly focus on
exploring the linguistic information encoded by the continuous representations
of English text. However, from a typological perspective, morphologically
poor English is rather an outlier: information that English encodes through
word order and function words is often stored at the morphological level in
other languages. To address this, we introduce 15 type-level probing tasks such as
case marking, possession, word length, morphological tag count and pseudoword
identification for 24 languages. We present a reusable methodology for creation
and evaluation of such tests in a multilingual setting. We then present
experiments on several diverse multilingual word embedding models, in which we
relate the probing task performance for a diverse set of languages to a range
of five classic NLP tasks: POS-tagging, dependency parsing, semantic role
labeling, named entity recognition and natural language inference. We find that
a number of probing tests correlate significantly and positively with the
downstream tasks, especially for morphologically rich languages. We show that
our tests can be used to explore word embeddings or black-box neural models for
linguistic cues in a multilingual setting.
Comment: Demo is available from:
https://linspector.ukp.informatik.tu-darmstadt.de
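A type-level probing task of this kind reduces to a classifier over single word-type vectors, and its accuracy can then be correlated with downstream scores across languages. In the sketch below, the logistic-regression probe, the case values, and all numbers are hypothetical stand-ins, and Spearman correlation is one plausible choice of measure:

```python
# Sketch of a type-level probe (e.g., case marking) plus a correlation
# check against a downstream task. All data here are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from scipy.stats import spearmanr

CASES = ["Nom", "Acc", "Dat", "Gen"]        # assumed feature values

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 300))            # stand-in word-type vectors
y = rng.integers(0, len(CASES), size=1000)  # stand-in gold case labels

probe = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
print(f"case-marking probe accuracy: {probe.score(X[800:], y[800:]):.3f}")

# Relate per-language probe accuracy to a downstream score:
probe_acc = [0.91, 0.74, 0.88, 0.62]        # hypothetical values
parsing_las = [78.2, 65.1, 74.9, 58.3]      # hypothetical values
rho, p = spearmanr(probe_acc, parsing_las)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```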
Multilingual Chart-based Constituency Parse Extraction from Pre-trained Language Models
Since it has been shown that pre-trained language models (PLMs) are to some
extent capable of recognizing syntactic concepts in natural language, much
effort has been made to develop methods for extracting complete (binary)
parses from PLMs without training separate parsers. We improve upon this
paradigm by proposing a novel chart-based method and an effective top-K
ensemble technique. Moreover, we demonstrate that we can broaden the scope of
application of the approach into multilingual settings. Specifically, we show
that by applying our method on multilingual PLMs, it becomes possible to induce
non-trivial parses for sentences from nine languages in an integrated and
language-agnostic manner, attaining performance superior or comparable to that
of unsupervised PCFGs. We also verify that our approach is robust to
cross-lingual transfer. Finally, we provide analyses on the inner workings of
our method. For instance, we discover universal attention heads which are
consistently sensitive to syntactic information irrespective of the input
language.
Comment: preprint
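The chart-based idea can be illustrated with a toy CKY search over span scores derived from PLM hidden states. The span scorer below (average pairwise cosine similarity of the token states a span covers) is an assumption for illustration, not the paper's scoring function or ensemble:

```python
# Toy sketch: pick the binary tree whose internal spans maximize a
# PLM-derived span score. The scorer is an illustrative assumption.
import numpy as np
from functools import lru_cache

def extract_tree(span_score, n):
    """CKY search for the binary tree over [0, n) maximizing the
    total score of its internal spans."""
    @lru_cache(maxsize=None)
    def best(i, j):
        if j - i == 1:
            return 0.0, (i, j)                 # leaf
        cands = []
        for k in range(i + 1, j):              # try every split point
            ls, lt = best(i, k)
            rs, rt = best(k, j)
            cands.append((ls + rs, lt, rt))
        s, lt, rt = max(cands, key=lambda c: c[0])
        return s + span_score(i, j), ((i, j), lt, rt)
    return best(0, n)[1]

H = np.random.default_rng(0).normal(size=(6, 768))  # stand-in PLM states
def span_score(i, j):
    v = H[i:j] / np.linalg.norm(H[i:j], axis=1, keepdims=True)
    return float((v @ v.T).mean())             # mean pairwise cosine

print(extract_tree(span_score, n=6))
```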
Dependency-based Hybrid Trees for Semantic Parsing
We propose a novel dependency-based hybrid tree model for semantic parsing,
which converts natural language utterances into machine-interpretable meaning
representations. Unlike previous state-of-the-art models, ours interprets the
semantic information as latent dependencies between the natural language words
in a joint representation. Such dependency information can
capture the interactions between the semantics and natural language words. We
integrate a neural component into our model and propose an efficient
dynamic-programming algorithm to perform tractable inference. Through extensive
experiments on the standard multilingual GeoQuery dataset with eight languages,
we demonstrate that our proposed approach is able to achieve state-of-the-art
performance across several languages. Analysis also justifies the effectiveness
of using our new dependency-based representation.
Comment: Accepted by EMNLP 2018
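To convey why inference over latent alignments can stay tractable, here is a much simpler dynamic program than the paper's: it aligns words to semantic units via contiguous segmentation. It illustrates only the flavor of the approach; the hybrid-tree model itself is more expressive than this:

```python
# Illustrative only: DP over latent word-to-unit alignments, where each
# of k semantic units receives one contiguous, non-empty word span.
import numpy as np

def best_segmentation(score, n, k):
    """dp[u][j]: best score aligning words[:j] to the first u units."""
    NEG = float("-inf")
    dp = np.full((k + 1, n + 1), NEG)
    back = np.zeros((k + 1, n + 1), dtype=int)
    dp[0][0] = 0.0
    for u in range(1, k + 1):
        for j in range(u, n + 1):
            for i in range(u - 1, j):          # span (i, j) goes to unit u-1
                cand = dp[u - 1][i] + score(i, j, u - 1)
                if cand > dp[u][j]:
                    dp[u][j], back[u][j] = cand, i
    spans, j = [], n                           # recover the spans
    for u in range(k, 0, -1):
        i = back[u][j]
        spans.append((i, j))
        j = i
    return dp[k][n], spans[::-1]

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 3))                    # stand-in word/unit affinities
score = lambda i, j, u: float(A[i:j, u].sum())
print(best_segmentation(score, n=8, k=3))
```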
NLG vs. Templates
One of the most important questions in applied NLG is what benefits (or
`value-added', in business-speak) NLG technology offers over template-based
approaches. Despite the importance of this question to the applied NLG
community, however, it has not been discussed much in the research NLG
community, which I think is a pity. In this paper, I try to summarize the
issues involved and recap current thinking on this topic. My goal is not to
answer this question (I don't think we know enough to be able to do so), but
rather to increase the visibility of this issue in the research community, in
the hope of getting some input and ideas on this very important question. I
conclude with a list of specific research areas I would like to see more work
in, because I think they would increase the `value-added' of NLG over
templates.
Comment: Uuencoded compressed tar file, containing LaTeX source and a style
file. This paper will appear in the 1995 European NL Generation Workshop
A Cross-Architecture Instruction Embedding Model for Natural Language Processing-Inspired Binary Code Analysis
Given a closed-source program, such as most proprietary software and
viruses, binary code analysis is indispensable for many tasks, such as code
plagiarism detection and malware analysis. Today, source code is very often
compiled for various architectures, making cross-architecture binary code
analysis increasingly important. A binary, after being disassembled, is
expressed in an assembly language. Thus, recent work has started exploring
Natural Language Processing (NLP)-inspired binary code analysis. In NLP, words are
usually represented in high-dimensional vectors (i.e., embeddings) to
facilitate further processing, which is one of the most common and critical
steps in many NLP tasks. We regard instructions as words in NLP-inspired binary
code analysis, and aim to represent instructions as embeddings as well.
To facilitate cross-architecture binary code analysis, our goal is that
similar instructions, regardless of their architectures, have embeddings close
to each other. To this end, we propose a joint learning approach to generating
instruction embeddings that capture not only the semantics of instructions
within an architecture, but also their semantic relationships across
architectures. To the best of our knowledge, this is the first work on building
a cross-architecture instruction embedding model. As a showcase, we apply the
model to one of the most fundamental problems in binary code similarity
comparison, semantics-based basic block comparison, and our solution
outperforms the code-statistics-based approach. This demonstrates that it is
promising to apply the model to other cross-architecture binary code analysis
tasks.
Comment: 8 pages, 5 figures
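One plausible shape for such a joint objective is sketched below: a skip-gram negative-sampling term within each architecture plus an alignment term that pulls known equivalent instruction pairs across architectures together. The shared vocabulary, loss weighting, and pair supervision are assumptions, not the paper's exact training procedure:

```python
# Sketch of a joint instruction-embedding objective. The shared vocab,
# loss combination, and weighting are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InstrEmbedding(nn.Module):
    def __init__(self, vocab_size, dim=100):
        super().__init__()
        self.center = nn.Embedding(vocab_size, dim)
        self.context = nn.Embedding(vocab_size, dim)

    def skipgram_loss(self, center, context, negatives):
        """Negative-sampling loss within one architecture."""
        c = self.center(center)                       # (B, d)
        pos = (c * self.context(context)).sum(-1)     # (B,)
        neg = torch.bmm(self.context(negatives),      # (B, K, d)
                        c.unsqueeze(-1)).squeeze(-1)  # (B, K)
        return -(F.logsigmoid(pos) + F.logsigmoid(-neg).sum(-1)).mean()

    def align_loss(self, ids_a, ids_b):
        """Pull equivalent instruction pairs across ISAs together."""
        return (self.center(ids_a) - self.center(ids_b)).pow(2).sum(-1).mean()

model = InstrEmbedding(vocab_size=5000)
B, K = 32, 5
r = lambda *shape: torch.randint(0, 5000, shape)    # stand-in instruction ids
loss = model.skipgram_loss(r(B), r(B), r(B, K)) + 0.5 * model.align_loss(r(B), r(B))
loss.backward()
```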
Beneath the Tip of the Iceberg: Current Challenges and New Directions in Sentiment Analysis Research
Sentiment analysis as a field has come a long way since it was first
introduced as a task nearly 20 years ago. It has widespread commercial
applications in various domains like marketing, risk management, market
research, and politics, to name a few. Given its saturation in specific
subtasks -- such as sentiment polarity classification -- and datasets, there is
an underlying perception that this field has reached its maturity. In this
article, we discuss this perception by pointing out the shortcomings and
under-explored, yet key aspects of this field that are necessary to attain true
sentiment understanding. We analyze the significant leaps responsible for its
current relevance. Further, we attempt to chart a possible course for this
field that covers many overlooked and unanswered questions.
Comment: Published in the IEEE Transactions on Affective Computing (TAFFC)
Deep Learning for Sentiment Analysis : A Survey
Deep learning has emerged as a powerful machine learning technique that
learns multiple layers of representations or features of the data and produces
state-of-the-art prediction results. Following its success in many other
application domains, deep learning has also become widely used in sentiment
analysis in recent years. This paper first gives an overview of deep learning
and then provides a comprehensive survey of its current applications in
sentiment analysis.
Comment: 34 pages, 9 figures, 2 tables
A Brief Survey of Multilingual Neural Machine Translation
We present a survey on multilingual neural machine translation (MNMT), which
has gained a lot of traction in recent years. MNMT has been useful in
improving translation quality as a result of knowledge transfer. MNMT is more
promising and interesting than its statistical machine translation counterpart
because end-to-end modeling and distributed representations open new avenues.
Many approaches have been proposed in order to exploit multilingual parallel
corpora for improving translation quality. However, the lack of a comprehensive
survey makes it difficult to determine which approaches are promising and hence
deserve further exploration. In this paper, we present an in-depth survey of
existing literature on MNMT. We categorize various approaches based on the
resource scenarios as well as underlying modeling principles. We hope this
paper will serve as a starting point for researchers and engineers interested
in MNMT.
Comment: We have substantially expanded this paper for a journal submission to
Computing Surveys [arXiv:2001.01115].