Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks
Many state-of-the-art neural models for NLP are heavily parameterized and
thus memory inefficient. This paper proposes a series of lightweight and memory
efficient neural architectures for a potpourri of natural language processing
(NLP) tasks. To this end, our models exploit computation using Quaternion
algebra and hypercomplex spaces, enabling not only expressive inter-component
interactions but also significantly (75%) reduced parameter size due to
lesser degrees of freedom in the Hamilton product. We propose Quaternion
variants of models, giving rise to new architectures such as the Quaternion
Attention Model and Quaternion Transformer. Extensive experiments on a battery
of NLP tasks demonstrate the utility of the proposed Quaternion-inspired models,
enabling up to a 75% reduction in parameter size without significant loss in
performance. Comment: ACL 2019
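
To make the parameter saving concrete: a quaternion linear layer shares weights across the four components of each Hamilton product, so a map between 4n- and 4m-dimensional spaces stores 4mn weights instead of the 16mn of a real-valued layer, which is the 75% reduction cited above. The NumPy sketch below is a minimal illustration of this counting argument, not the authors' implementation; all names are chosen for exposition.

    import numpy as np

    def hamilton_product(q, p):
        # Hamilton product of quaternions given as length-4 arrays (r, x, y, z).
        r1, x1, y1, z1 = q
        r2, x2, y2, z2 = p
        return np.array([
            r1*r2 - x1*x2 - y1*y2 - z1*z2,   # real part
            r1*x2 + x1*r2 + y1*z2 - z1*y2,   # i component
            r1*y2 - x1*z2 + y1*r2 + z1*x2,   # j component
            r1*z2 + x1*y2 - y1*x2 + z1*r2,   # k component
        ])

    def quaternion_linear(x, W):
        # x: (n, 4) input read as n quaternions; W: (m, n, 4) quaternion weights.
        # Output: (m, 4). The layer stores 4*m*n parameters, versus 16*m*n for a
        # real linear map between the same 4n- and 4m-dim spaces: a 75% saving.
        m, n, _ = W.shape
        out = np.zeros((m, 4))
        for i in range(m):
            for j in range(n):
                out[i] += hamilton_product(W[i, j], x[j])
        return out

    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 4))      # 32-dim input viewed as 8 quaternions
    W = rng.normal(size=(16, 8, 4))  # 64-dim output viewed as 16 quaternions
    y = quaternion_linear(x, W)
    print(y.shape, W.size, "vs", 32 * 64)  # (16, 4) 512 vs 2048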
Attention in Natural Language Processing
Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.
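
Two of the taxonomy's dimensions can be pictured with a small sketch: a generic attention step applies a compatibility function to score a query against each key of the input representation, then a distribution function (here softmax) to turn scores into weights over the values. This is an illustrative sketch of the unified view, not code from the article; the scaled dot-product and additive compatibility functions shown are two standard instances.

    import numpy as np

    def softmax(z):
        # Distribution function: turns compatibility scores into weights.
        e = np.exp(z - z.max())
        return e / e.sum()

    def dot_compatibility(q, K):
        # Scaled dot-product compatibility between a query and each key.
        return K @ q / np.sqrt(q.shape[0])

    def additive_compatibility(q, K, W1, W2, v):
        # Additive (Bahdanau-style) compatibility, another standard choice.
        return np.tanh(K @ W1.T + q @ W2.T) @ v

    def attend(q, K, V, compatibility):
        # Generic attention: score, normalize, then weight the values.
        weights = softmax(compatibility(q, K))
        return weights @ V

    rng = np.random.default_rng(0)
    d, n, h = 16, 5, 8
    q = rng.normal(size=d)             # query
    K = rng.normal(size=(n, d))        # keys (input representation)
    V = rng.normal(size=(n, d))        # values
    W1, W2 = rng.normal(size=(h, d)), rng.normal(size=(h, d))
    v = rng.normal(size=h)
    c1 = attend(q, K, V, dot_compatibility)
    c2 = attend(q, K, V, lambda q, K: additive_compatibility(q, K, W1, W2, v))
    print(c1.shape, c2.shape)          # (16,) (16,)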
Entity-Assisted Language Models for Identifying Check-worthy Sentences
We propose a new uniform framework for text classification and ranking that
can automate the process of identifying check-worthy sentences in political
debates and speech transcripts. Our framework combines the semantic analysis of
the sentences with additional entity embeddings obtained for the
identified entities within the sentences. In particular, we analyse the
semantic meaning of each sentence using state-of-the-art neural language models
such as BERT, ALBERT, and RoBERTa, while embeddings for entities are obtained
from knowledge graph (KG) embedding models. Specifically, we instantiate our
framework using five different language models, entity embeddings obtained from
six different KG embedding models, as well as two combination methods leading
to several Entity-Assisted neural language models. We extensively evaluate the
effectiveness of our framework using two publicly available datasets from the
CLEF 2019 & 2020 CheckThat! Labs. Our results show that the neural language
models significantly outperform traditional TF.IDF and LSTM methods. In
addition, we show that the ALBERT model is consistently the most effective
model among all the tested neural language models. Our entity embeddings
significantly outperform other existing approaches from the literature that are
based on similarity and relatedness scores between the entities in a sentence,
when used alongside a KG embedding. Comment: 22 pages, 15 tables, 3 figures
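
One way to picture the framework's combination step: embed the sentence with a language model, embed its linked entities with a KG embedding model, and merge the two representations before classification. The sketch below uses placeholder embedding functions and a mean-pool-then-concatenate merge; these are assumptions for exposition, not the paper's exact combination methods or dimensions.

    import numpy as np

    rng = np.random.default_rng(0)

    def sentence_embedding(sentence):
        # Placeholder: in the framework this comes from a pretrained language
        # model such as BERT, ALBERT, or RoBERTa (768-dim here by assumption).
        return rng.normal(size=768)

    def entity_embeddings(entities):
        # Placeholder: in the framework these come from a KG embedding model
        # for the entities identified in the sentence (100-dim by assumption).
        return [rng.normal(size=100) for _ in entities]

    def combine(sent_vec, ent_vecs):
        # Illustrative combination: mean-pool entity vectors, then concatenate
        # with the sentence vector. The paper evaluates two combination
        # methods; this one is an assumed choice, not theirs verbatim.
        pooled = np.mean(ent_vecs, axis=0) if ent_vecs else np.zeros(100)
        return np.concatenate([sent_vec, pooled])

    sentence = "The national debt doubled over the last decade."
    entities = ["national_debt"]   # hypothetical linked entity
    features = combine(sentence_embedding(sentence), entity_embeddings(entities))
    print(features.shape)          # (868,), input to a check-worthiness classifier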