Incremental generation of plural descriptions : similarity and partitioning
Approaches to plural reference generation
emphasise descriptive brevity, but often lack
empirical backing. This paper describes
a corpus-based study of plural descriptions,
and proposes a psycholinguistically motivated
algorithm for plural reference
generation. The descriptive strategy is based
on partitioning and incorporates corpus-derived
heuristics. An exhaustive evaluation
shows that the output closely matches human
data.
peer-reviewed
Morphological Priors for Probabilistic Neural Word Embeddings
Word embeddings allow natural language processing systems to share
statistical information across related words. These embeddings are typically
based on distributional statistics, making it difficult for them to generalize
to rare or unseen words. We propose to improve word embeddings by incorporating
morphological information, capturing shared sub-word features. Unlike previous
work that constructs word embeddings directly from morphemes, we combine
morphological and distributional information in a unified probabilistic
framework, in which the word embedding is a latent variable. The morphological
information provides a prior distribution on the latent word embeddings, which
in turn condition a likelihood function over an observed corpus. This approach
yields improvements on intrinsic word similarity evaluations, and also in the
downstream task of part-of-speech tagging.
Comment: Appeared at the Conference on Empirical Methods in Natural Language
Processing (EMNLP 2016, Austin)
LangPro: Natural Language Theorem Prover
LangPro is an automated theorem prover for natural language
(https://github.com/kovvalsky/LangPro). Given a set of premises and a
hypothesis, it is able to prove semantic relations between them. The prover is
based on a version of the analytic tableau method specially designed for natural
logic. The proof procedure operates on logical forms that preserve linguistic
expressions to a large extent. This property makes the logical forms easily
obtainable from syntactic trees, in particular, Combinatory Categorial
Grammar derivation trees. The nature of proofs is deductive and transparent. On
the FraCaS and SICK textual entailment datasets, the prover achieves high
results comparable to the state of the art.
Comment: 6 pages, 8 figures, Conference on Empirical Methods in Natural
Language Processing (EMNLP) 201
A Framework for Comparing Groups of Documents
We present a general framework for comparing multiple groups of documents. A
bipartite graph model is proposed where document groups are represented as one
node set and the comparison criteria are represented as the other node set.
Using this model, we present basic algorithms to extract insights into
similarities and differences among the document groups. Finally, we demonstrate
the versatility of our framework through an analysis of NSF funding programs
for basic research.
Comment: 6 pages; 2015 Conference on Empirical Methods in Natural Language
Processing (EMNLP '15)
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
We propose a method for embedding two-dimensional locations in a continuous
vector space using a neural network-based model incorporating mixtures of
Gaussian distributions, presenting two model variants for text-based
geolocation and lexical dialectology. Evaluated over Twitter data, the proposed
model outperforms conventional regression-based geolocation and provides a
better estimate of uncertainty. We also show the effectiveness of the
representation for predicting words from location in lexical dialectology, and
evaluate it using the DARE dataset.
Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP
2017) September 2017, Copenhagen, Denmark
Deep Joint Entity Disambiguation with Local Neural Attention
We propose a novel deep learning model for joint document-level entity
disambiguation, which leverages learned neural representations. Key components
are entity embeddings, a neural attention mechanism over local context windows,
and a differentiable joint inference stage for disambiguation. Our approach
thereby combines benefits of deep learning with more traditional approaches
such as graphical models and probabilistic mention-entity maps. Extensive
experiments show that we are able to obtain competitive or state-of-the-art
accuracy at moderate computational costs.
Comment: Conference on Empirical Methods in Natural Language Processing
(EMNLP) 2017 long paper