4 research outputs found
Contextual compositionality detection with external knowledge bases and word embeddings
When the meaning of a phrase cannot be inferred from the individual meanings of its words (e.g., hot dog), that phrase is said to be non-compositional. Automatic compositionality detection in multi-word phrases is critical in any application of semantic processing, such as search engines [9]; failing to detect non-compositional phrases can notably hurt system effectiveness. Existing research treats phrases as either compositional or non-compositional in a deterministic manner. In this paper, we operationalize the viewpoint that compositionality is contextual rather than deterministic, i.e., that whether a phrase is compositional or non-compositional depends on its context. For example, the phrase "green card" is compositional when referring to a green colored card, whereas it is non-compositional when meaning permanent residence authorization. We address the challenge of detecting this type of contextual compositionality as follows: given a multi-word phrase, we enrich the word embedding representing its semantics with evidence about its global context (terms it often collocates with) as well as its local context (narratives where that phrase is used, which we call usage scenarios). We further extend this representation with information extracted from external knowledge bases. The resulting representation incorporates both localized context and more general usage of the phrase, and allows detecting its compositionality in a non-deterministic and contextual way. Empirical evaluation of our model on a dataset of phrase compositionality, manually collected by crowdsourcing contextual compositionality assessments, shows that our model notably outperforms state-of-the-art baselines on detecting phrase compositionality.
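The enrichment-and-scoring idea described in the abstract can be illustrated with a minimal sketch. All function names and the concatenation/cosine-similarity choices below are assumptions for illustration, not the paper's actual model: the sketch combines a phrase vector with averaged global-context, local-context, and knowledge-base vectors, and scores compositionality as the similarity between the phrase vector and its component words' vectors.

```python
import numpy as np

def enrich_phrase_embedding(phrase_vec, global_ctx_vecs, local_ctx_vecs, kb_vec):
    """Hypothetical enrichment: concatenate the phrase embedding with the
    averaged global-context vectors (collocating terms), averaged
    local-context vectors (usage-scenario terms), and a knowledge-base
    vector. All inputs are dense vectors of the same dimensionality."""
    global_avg = np.mean(global_ctx_vecs, axis=0)
    local_avg = np.mean(local_ctx_vecs, axis=0)
    return np.concatenate([phrase_vec, global_avg, local_avg, kb_vec])

def compositionality_score(phrase_vec, component_vecs):
    """Cosine similarity between the phrase vector and the average of its
    component-word vectors: a low score suggests the phrase is used
    non-compositionally in this context."""
    comp = np.mean(component_vecs, axis=0)
    denom = np.linalg.norm(phrase_vec) * np.linalg.norm(comp) + 1e-9
    return float(np.dot(phrase_vec, comp) / denom)
```

For instance, the contextual vector for "green card" in an immigration narrative would score low against the averaged vectors of "green" and "card", flagging non-compositional usage.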
To Phrase or Not to Phrase – Impact of User versus System Term Dependence upon Retrieval
When submitting queries to information retrieval (IR) systems, users often
have the option of specifying which, if any, of the query terms are heavily
dependent on each other and should be treated as a fixed phrase, for instance
by placing them between quotes. In addition to such cases where users specify
term dependence, automatic ways also exist for IR systems to detect dependent
terms in queries. Most IR systems use both user and algorithmic approaches. It
is not however clear whether and to what extent user-defined term dependence
agrees with algorithmic estimates of term dependence, nor which of the two may
fetch higher performance gains. Simply put, is it better to trust users or the
system to detect term dependence in queries? To answer this question, we
experiment with 101 crowdsourced search engine users and 334 queries (52 train
and 282 test TREC queries) and we record 10 assessments per query. We find that
(i) user assessments of term dependence differ significantly from algorithmic
assessments of term dependence (their overlap is approximately 30%); (ii) there
is little agreement among users about term dependence in queries, and this
disagreement increases as queries become longer; (iii) the potential retrieval
gain that can be fetched by treating term dependence (both user- and
system-defined) over a bag of words baseline is reserved to a small subset
(approximately 8%) of the queries, and is much higher for low-depth than deep
precision measures. Points (ii) and (iii) constitute novel insights into term
dependence.
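The roughly 30% overlap reported in finding (i) can be computed between the sets of dependent-term phrases marked by users and by the system. The function below is an illustrative sketch, not the paper's actual measure; it assumes phrases are represented as tuples of terms and uses Jaccard overlap.

```python
def dependence_overlap(user_phrases, system_phrases):
    """Jaccard overlap between user-marked and system-detected
    term-dependence phrases (each phrase a tuple of terms)."""
    u, s = set(user_phrases), set(system_phrases)
    if not u and not s:
        return 1.0  # both agree there is no term dependence
    return len(u & s) / len(u | s)
```

For example, if a user quotes only "hot dog" while the system additionally detects "new york", the overlap is 1/2.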
Semantic Representation and Inference for NLP
Semantic representation and inference is essential for Natural Language
Processing (NLP). The state of the art for semantic representation and
inference is deep learning, and particularly Recurrent Neural Networks (RNNs),
Convolutional Neural Networks (CNNs), and Transformer self-attention models.
This thesis investigates the use of deep learning for novel semantic
representation and inference, and makes contributions in the following three
areas: creating training data, improving semantic representations and extending
inference learning. In terms of creating training data, we contribute the
largest publicly available dataset of real-life factual claims for the purpose
of automatic claim verification (MultiFC), and we present a novel inference
model composed of multi-scale CNNs with different kernel sizes that learn from
external sources to infer fact checking labels. In terms of improving semantic
representations, we contribute a novel model that captures non-compositional
semantic indicators. By definition, the meaning of a non-compositional phrase
cannot be inferred from the individual meanings of its composing words (e.g.,
hot dog). Motivated by this, we operationalize the compositionality of a phrase
contextually by enriching the phrase representation with external word
embeddings and knowledge graphs. Finally, in terms of inference learning, we
propose a series of novel deep learning architectures that improve inference by
using syntactic dependencies, by ensembling role guided attention heads,
incorporating gating layers, and concatenating multiple heads in novel and
effective ways. This thesis consists of seven publications (five published and
two under review).
Comment: PhD thesis, the University of Copenhagen.
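The multi-scale CNN component mentioned in the thesis abstract (convolutions with different kernel sizes over token embeddings, learning features at several n-gram granularities) can be sketched as follows. This is a minimal illustration with hypothetical helper names, not the thesis's actual architecture: each kernel size produces a feature map over the sequence, which is max-pooled, and the pooled values are concatenated.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Valid 1D convolution of token embeddings x (seq_len, dim) with a
    single kernel (k, dim); returns a feature map of length seq_len - k + 1."""
    k = kernel.shape[0]
    return np.array([np.sum(x[i:i + k] * kernel)
                     for i in range(x.shape[0] - k + 1)])

def multi_scale_features(x, kernels):
    """Apply kernels of different sizes (e.g., capturing bigram and trigram
    patterns), max-pool each feature map, and concatenate the pooled values
    into one fixed-size vector: one value per kernel."""
    return np.array([conv1d_valid(x, kern).max() for kern in kernels])
```

In practice each kernel size would have many filters and the pooled vector would feed a classifier over fact-checking labels; the sketch keeps one filter per size to show the mechanism.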