Neural Speed Reading with Structural-Jump-LSTM
Recurrent neural networks (RNNs) can model natural language by sequentially
'reading' input tokens and outputting a distributed representation of each
token. Due to the sequential nature of RNNs, inference time is linearly
dependent on the input length, and all inputs are read regardless of their
importance. Efforts to speed up this inference, known as 'neural speed
reading', either ignore or skim over part of the input. We present
Structural-Jump-LSTM: the first neural speed reading model to both skip and
jump text during inference. The model consists of a standard LSTM and two
agents: one capable of skipping single words when reading, and one capable of
exploiting punctuation structure (sub-sentence separators (,:), sentence end
symbols (.!?), or end of text markers) to jump ahead after reading a word. A
comprehensive experimental evaluation of our model against all five
state-of-the-art neural reading models shows that Structural-Jump-LSTM achieves
the best overall floating point operations (FLOP) reduction (hence is faster),
while keeping the same accuracy or even improving it compared to a vanilla LSTM
that reads the whole text.
Comment: 10 pages
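The skip/jump reading scheme above can be sketched as a toy control loop. The `skip_agent` and `jump_agent` policies below are hypothetical stand-ins; in the paper both agents are learned jointly with the LSTM rather than hand-written:

```python
def speed_read(tokens, skip_agent, jump_agent):
    """Return indices of the tokens the LSTM actually reads.

    skip_agent(token) -> bool: agent 1, skips a single word.
    jump_agent(token) -> str | None: agent 2, returns a set of boundary
    symbols (e.g. ".!?" or ",:") to jump past, or None to keep reading.
    """
    read, i = [], 0
    while i < len(tokens):
        if skip_agent(tokens[i]):          # agent 1: skip this single word
            i += 1
            continue
        read.append(i)                     # the LSTM 'reads' this token
        boundary = jump_agent(tokens[i])   # agent 2: decide after reading
        if boundary:
            j = i + 1
            while j < len(tokens) and tokens[j] not in boundary:
                j += 1
            i = j + 1                      # resume just past the boundary symbol
        else:
            i += 1
    return read
```

Because skipped and jumped-over tokens never enter the LSTM, the FLOP count scales with the number of indices returned rather than the full input length.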
Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
Deploying a single universal neural network for speech enhancement, with the
aim of improving noise robustness across diverse speech processing tasks, is
challenging because static speech enhancement frameworks are unaware of the
speech expected by downstream modules. This limitation prevents static speech
enhancement approaches from achieving optimal performance across a range of
speech processing tasks, challenging the notion of universal applicability.
The fundamental issue in achieving universal speech enhancement lies in
effectively informing the speech enhancement module about the features of
downstream modules. In this study, we present a novel weighting prediction
approach, which explicitly learns the task relationships from downstream
training information to address the core challenge of universal speech
enhancement. We find that the decision of whether to employ data augmentation
techniques constitutes crucial downstream training information. This decision
significantly impacts the expected speech and the performance of the speech
enhancement module. Moreover, we introduce a novel speech enhancement network,
the Plugin Speech Enhancement (Plugin-SE). The Plugin-SE is a dynamic neural
network that includes the speech enhancement module, gate module, and weight
prediction module. Experimental results demonstrate that the proposed Plugin-SE
approach is competitive or superior to other joint training methods across
various downstream tasks.
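The interaction of the three modules can be illustrated with a minimal sketch. All shapes and functions below are hypothetical placeholders; the real enhancement, gate, and weight prediction modules are trained neural networks, and the gate here is reduced to a convex blend:

```python
import numpy as np

def plugin_se(noisy, enhance, predict_weight, task_info):
    """Blend enhanced and raw speech with a task-dependent gate weight.

    enhance(noisy)           -> speech enhancement module (placeholder).
    predict_weight(task_info)-> weight prediction module, a scalar in [0, 1],
                                conditioned on downstream training information
                                (e.g. whether data augmentation was used).
    """
    enhanced = enhance(noisy)                # speech enhancement module
    w = predict_weight(task_info)            # weight prediction module
    return w * enhanced + (1.0 - w) * noisy  # gate module: convex blend
```

A downstream task trained with heavy augmentation might tolerate more residual noise, so its predicted weight would lean toward the raw signal; the sketch only shows where that decision enters the computation.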
Contextual compositionality detection with external knowledge bases and word embeddings
When the meaning of a phrase cannot be inferred from the individual meanings of its words (e.g., hot dog), that phrase is said to be non-compositional. Automatic compositionality detection in multiword phrases is critical in any application of semantic processing, such as search engines [9]; failing to detect non-compositional phrases can hurt system effectiveness notably. Existing research treats phrases as either compositional or non-compositional in a deterministic manner. In this paper, we operationalize the viewpoint that compositionality is contextual rather than deterministic, i.e., that whether a phrase is compositional or non-compositional depends on its context. For example, the phrase "green card" is compositional when referring to a green colored card, whereas it is non-compositional when meaning permanent residence authorization. We address the challenge of detecting this type of contextual compositionality as follows: given a multiword phrase, we enrich the word embedding representing its semantics with evidence about its global context (terms it often collocates with) as well as its local context (narratives where that phrase is used, which we call usage scenarios). We further extend this representation with information extracted from external knowledge bases. The resulting representation incorporates both localized context and more general usage of the phrase and allows us to detect its compositionality in a non-deterministic and contextual way. Empirical evaluation of our model on a dataset of phrase compositionality, manually collected by crowdsourcing contextual compositionality assessments, shows that our model outperforms state-of-the-art baselines notably on detecting phrase compositionality.
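The enriched representation described above can be sketched as a simple concatenation followed by a toy scorer. The dimensions, the pre-computed context/KB vectors, and the logistic scorer are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def contextual_repr(phrase_emb, global_ctx_emb, usage_scenario_emb, kb_emb):
    """Enriched phrase representation: phrase embedding concatenated with
    global context (collocates), local context (usage scenarios), and
    external knowledge-base evidence."""
    return np.concatenate([phrase_emb, global_ctx_emb, usage_scenario_emb, kb_emb])

def compositionality_score(repr_vec, w, b):
    """Toy logistic scorer over the enriched representation: values near 1
    indicate a compositional reading in this context, near 0 a
    non-compositional one."""
    return 1.0 / (1.0 + np.exp(-(w @ repr_vec + b)))
```

Because the context vectors change with each occurrence of the phrase, the same phrase ("green card") can receive different scores in different contexts, which is the non-deterministic behaviour the abstract argues for.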
Interactive Machine Comprehension with Information Seeking Agents
Existing machine reading comprehension (MRC) models do not scale effectively
to real-world applications like web-level information retrieval and question
answering (QA). We argue that this stems from the nature of MRC datasets: most
of these are static environments wherein the supporting documents and all
necessary information are fully observed. In this paper, we propose a simple
method that reframes existing MRC datasets as interactive, partially observable
environments. Specifically, we "occlude" the majority of a document's text and
add context-sensitive commands that reveal "glimpses" of the hidden text to a
model. We repurpose SQuAD and NewsQA as an initial case study, and then show
how the interactive corpora can be used to train a model that seeks relevant
information through sequential decision making. We believe that this setting
can contribute to scaling models to web-level QA scenarios.
Comment: ACL 2020
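The occlusion setup can be sketched as a toy environment. The `next`/`previous` command set below is a hypothetical simplification of the paper's context-sensitive commands, and the sentence list stands in for a repurposed SQuAD or NewsQA document:

```python
class GlimpseEnv:
    """Partially observable document: all text is occluded except the
    single sentence the agent has navigated to."""

    def __init__(self, sentences):
        self.sentences = sentences
        self.pos = 0                        # start at the first glimpse

    def step(self, command):
        """Apply a navigation command and return the newly revealed glimpse."""
        if command == "next":
            self.pos = min(self.pos + 1, len(self.sentences) - 1)
        elif command == "previous":
            self.pos = max(self.pos - 1, 0)
        return self.sentences[self.pos]     # only this glimpse is observable
```

A model trained in this environment must decide, step by step, which glimpse to request next, which is the sequential information-seeking behaviour the abstract describes.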
Semantic Representation and Inference for NLP
Semantic representation and inference is essential for Natural Language
Processing (NLP). The state of the art for semantic representation and
inference is deep learning, and particularly Recurrent Neural Networks (RNNs),
Convolutional Neural Networks (CNNs), and transformer Self-Attention models.
This thesis investigates the use of deep learning for novel semantic
representation and inference, and makes contributions in the following three
areas: creating training data, improving semantic representations and extending
inference learning. In terms of creating training data, we contribute the
largest publicly available dataset of real-life factual claims for the purpose
of automatic claim verification (MultiFC), and we present a novel inference
model composed of multi-scale CNNs with different kernel sizes that learn from
external sources to infer fact checking labels. In terms of improving semantic
representations, we contribute a novel model that captures non-compositional
semantic indicators. By definition, the meaning of a non-compositional phrase
cannot be inferred from the individual meanings of its composing words (e.g.,
hot dog). Motivated by this, we operationalize the compositionality of a phrase
contextually by enriching the phrase representation with external word
embeddings and knowledge graphs. Finally, in terms of inference learning, we
propose a series of novel deep learning architectures that improve inference by
using syntactic dependencies, by ensembling role guided attention heads,
incorporating gating layers, and concatenating multiple heads in novel and
effective ways. This thesis consists of seven publications (five published and
two under review).
Comment: PhD thesis, the University of Copenhagen
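The multi-scale CNN contribution mentioned above can be illustrated with a minimal sketch. The kernel sizes (2, 3, 4) and random filters are assumptions for illustration only; the thesis learns its filters from external sources and uses them to infer fact-checking labels:

```python
import numpy as np

def multi_scale_features(embeddings, kernel_sizes=(2, 3, 4), rng=None):
    """1-D convolutions over a token-embedding sequence with several kernel
    sizes, each max-pooled over time and concatenated, so the feature vector
    mixes evidence from n-grams of different widths."""
    if rng is None:
        rng = np.random.default_rng(0)     # illustrative random filters
    seq_len, dim = embeddings.shape
    feats = []
    for k in kernel_sizes:
        filt = rng.standard_normal((k, dim))
        # valid convolution along the token axis: one response per window
        responses = [np.sum(embeddings[i:i + k] * filt)
                     for i in range(seq_len - k + 1)]
        feats.append(max(responses))       # max-over-time pooling
    return np.array(feats)
```

Each kernel size captures phrases of a different length, which is why combining several scales helps when the evidence for a claim spans both short and long word spans.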