NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval
Pseudo-relevance feedback (PRF) is commonly used to boost the performance of
traditional information retrieval (IR) models by using top-ranked documents to
identify and weight new query terms, thereby reducing the effect of
query-document vocabulary mismatches. While neural retrieval models have
recently demonstrated strong results for ad-hoc retrieval, combining them with
PRF is not straightforward due to incompatibilities between existing PRF
approaches and neural architectures. To bridge this gap, we propose an
end-to-end neural PRF framework that can be used with existing neural IR models
by embedding different neural models as building blocks. Extensive experiments
on two standard test collections confirm the effectiveness of the proposed NPRF
framework in improving the performance of two state-of-the-art neural IR
models.
Comment: Full paper in EMNLP 201
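For readers unfamiliar with pseudo-relevance feedback, the classical term-based idea described at the start of this abstract can be sketched as follows. This is only an illustration of traditional PRF query expansion, not the NPRF framework itself; the scoring function, the toy documents, and all names (`score`, `rank`, `expand_query`, `DOCS`) are assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()


def score(query_terms, doc_terms):
    """Toy relevance score: summed frequency of each query term in the doc."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in set(query_terms))


def rank(query_terms, docs):
    """Return document indices sorted by score (descending), ties by index."""
    scored = [(score(query_terms, tokenize(d)), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


def expand_query(query_terms, docs, top_docs=1, new_terms=2):
    """Add the most frequent non-query terms from the top-ranked documents."""
    feedback = Counter()
    for i in rank(query_terms, docs)[:top_docs]:
        feedback.update(t for t in tokenize(docs[i]) if t not in query_terms)
    # Sort by frequency, then alphabetically, so the result is deterministic.
    best = sorted(feedback.items(), key=lambda p: (-p[1], p[0]))[:new_terms]
    return list(query_terms) + [t for t, _ in best]


# Hypothetical toy collection: documents 1 and 3 never mention "car".
DOCS = [
    "car engine repair car maintenance",
    "automobile engine repair guide",
    "cooking pasta at home",
    "automobile maintenance schedule",
]

query = ["car"]
expanded = expand_query(query, DOCS)
before = [score(query, tokenize(d)) for d in DOCS]
after = [score(expanded, tokenize(d)) for d in DOCS]
```

In this toy setting the original query matches only the first document, while the expanded query also gives non-zero scores to the two "automobile" documents, which is the vocabulary-mismatch effect PRF is meant to reduce.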
Understanding Differential Search Index for Text Retrieval
The Differentiable Search Index (DSI) is a novel information retrieval (IR)
framework that utilizes a differentiable function to generate a sorted list of
document identifiers in response to a given query. However, due to the
black-box nature of the end-to-end neural architecture, it remains to be
understood to what extent DSI possesses the basic indexing and retrieval
abilities. To mitigate this gap, in this study, we define and examine three
important abilities that a functioning IR framework should possess, namely,
exclusivity, completeness, and relevance ordering. Our analytical
experimentation shows that while DSI demonstrates proficiency in memorizing the
unidirectional mapping from pseudo queries to document identifiers, it falls
short in distinguishing relevant documents from random ones, thereby negatively
impacting its retrieval effectiveness. To address this issue, we propose a
multi-task distillation approach to enhance the retrieval quality without
altering the structure of the model and successfully endow it with improved
indexing abilities. Through experiments conducted on various datasets, we
demonstrate that our proposed method outperforms previous DSI baselines.
Comment: Accepted to Findings of ACL 202
TreeGen: A Tree-Based Transformer Architecture for Code Generation
A code generation system generates programming language code based on an
input natural language description. State-of-the-art approaches rely on neural
networks for code generation. However, these code generators suffer from two
problems. One is the long dependency problem, where a code element often
depends on another far-away code element. A variable reference, for example,
depends on its definition, which may appear quite a few lines before. The other
problem is structure modeling, as programs contain rich structural information.
In this paper, we propose a novel tree-based neural architecture, TreeGen, for
code generation. TreeGen uses the attention mechanism of Transformers to
alleviate the long-dependency problem, and introduces a novel AST reader
(encoder) to incorporate grammar rules and AST structures into the network. We
evaluated TreeGen on a Python benchmark, HearthStone, and two semantic parsing
benchmarks, ATIS and GEO. TreeGen outperformed the previous state-of-the-art
approach by 4.5 percentage points on HearthStone, and achieved the best
accuracy among neural network-based approaches on ATIS (89.1%) and GEO (89.6%).
We also conducted an ablation test to better understand each component of our
model.
Generalized Equivariance and Preferential Labeling for GNN Node Classification
Existing graph neural networks (GNNs) largely rely on node embeddings, which
represent a node as a vector by its identity, type, or content. However, graphs
with unattributed nodes widely exist in real-world applications (e.g.,
anonymized social networks). Previous GNNs either assign random labels to nodes
(which introduces artefacts to the GNN) or assign one embedding to all nodes
(which fails to explicitly distinguish one node from another). Further, when
these GNNs are applied to unattributed node classification problems, they have
an undesired equivariance property, which makes them fundamentally unable to
handle data with multiple possible outputs. In this paper, we analyze the
limitations of existing approaches to node classification problems. Inspired by
our analysis, we propose a generalized equivariance property and a Preferential
Labeling technique that satisfies the desired property asymptotically.
Experimental results show that we achieve high performance in several
unattributed node classification tasks.
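The limitation mentioned in this abstract, that giving every node the same embedding cannot distinguish one node from another, can be illustrated with a minimal sketch. The mean-aggregation layer, the 4-cycle graph, and the names `message_passing` and `CYCLE` are assumptions chosen for the illustration; this is not the paper's Preferential Labeling technique.

```python
def message_passing(adj, feats, rounds=1):
    """Simple mean-aggregation layer: average own feature with neighbor mean."""
    for _ in range(rounds):
        feats = [
            0.5 * feats[v] + 0.5 * sum(feats[u] for u in adj[v]) / len(adj[v])
            for v in range(len(adj))
        ]
    return feats


# Hypothetical unattributed graph: a 4-cycle 0-1-2-3-0 (adjacency lists).
CYCLE = [[1, 3], [0, 2], [1, 3], [0, 2]]

# One shared embedding for all nodes: every node ends up with the same output.
shared_out = message_passing(CYCLE, [1.0, 1.0, 1.0, 1.0])

# Arbitrary distinct labels: outputs differ, but they depend on the labeling.
labeled_out = message_passing(CYCLE, [0.0, 1.0, 2.0, 3.0])
```

With the shared embedding, all four nodes receive identical outputs, so no classifier on top of them could assign different classes to different nodes. Distinct labels break the symmetry, at the cost of making the outputs depend on an arbitrary choice of labeling.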
Improving machine translation systems via isotopic replacement
Machine translation plays an essential role in people’s daily international communication. However, machine translation systems are far from perfect. To tackle this problem, researchers have proposed several approaches to testing machine translation. A promising trend among these approaches is to use word replacement, where only one word in the original sentence is replaced with another word to form a sentence pair. However, precise control of the impact of word replacement remains an outstanding issue in these approaches.
To address this issue, we propose CAT, a novel word-replacement-based approach, whose basic idea is to identify word replacement with controlled impact (referred to as isotopic replacement). To achieve this purpose, we use a neural-based language model to encode the sentence context, and design a neural-network-based algorithm to evaluate context-aware semantic similarity between two words. Furthermore, similar to TransRepair, a state-of-the-art word-replacement-based approach, CAT also provides automatic fixing of revealed bugs without model retraining.
Our evaluation on Google Translate and Transformer indicates that CAT achieves significant improvements over TransRepair. In particular, 1) CAT detects seven more types of bugs than TransRepair; 2) CAT detects 129% more translation bugs than TransRepair; 3) CAT repairs twice more bugs than TransRepair, many of which may bring serious consequences if left unfixed; and 4) CAT has better efficiency than TransRepair in input generation (0.01s vs. 0.41s) and comparable efficiency with TransRepair in bug repair (1.92s vs. 1.34s).