GNN-SL: Sequence Labeling Based on Nearest Examples via GNN
To better handle long-tail cases in the sequence labeling (SL) task, in this work we introduce graph neural network sequence labeling (GNN-SL), which augments the vanilla SL model output with similar tagging examples retrieved from the whole training set. Since not all retrieved tagging examples benefit the model prediction, we construct a heterogeneous graph and leverage graph neural networks (GNNs) to transfer information between the retrieved tagging examples and the input word sequence. The augmented nodes, which aggregate information from their neighbors, are used for prediction. This strategy enables the model to draw directly on similar tagging examples, improving the overall quality of its predictions. We conduct a variety of experiments on three typical sequence labeling tasks, Named Entity Recognition (NER), Part-of-Speech tagging (POS), and Chinese Word Segmentation (CWS), to demonstrate the strong performance of GNN-SL. Notably, GNN-SL achieves SOTA results of 96.9 (+0.2) on PKU, 98.3 (+0.4) on CITYU, 98.5 (+0.2) on MSR, and 96.9 (+0.2) on AS for the CWS task, and results comparable to SOTA performance on the NER and POS datasets.

Comment: preprint
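A minimal sketch of the retrieval-augmentation step described above, assuming each input token has already been encoded and k similar tagging examples retrieved for it: neighbor representations are aggregated with similarity-weighted message passing and combined with the token's own state before classification. Names, shapes, and the single message-passing round are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RetrievalAugmentedTagger(nn.Module):
    """One message-passing round from retrieved tagging examples to input tokens."""

    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        self.msg = nn.Linear(hidden_dim, hidden_dim)      # message from each neighbor
        self.upd = nn.Linear(2 * hidden_dim, hidden_dim)  # combine self + aggregate
        self.cls = nn.Linear(hidden_dim, num_labels)      # per-token label scores

    def forward(self, tokens, neighbors, sims):
        # tokens:    (seq_len, hidden_dim) encoded input words
        # neighbors: (seq_len, k, hidden_dim) the k retrieved examples per token
        # sims:      (seq_len, k) retrieval similarities, used as edge weights so
        #            unhelpful neighbors are downweighted rather than trusted equally
        weights = F.softmax(sims, dim=-1).unsqueeze(-1)    # (seq_len, k, 1)
        agg = (weights * self.msg(neighbors)).sum(dim=1)   # (seq_len, hidden_dim)
        augmented = torch.relu(self.upd(torch.cat([tokens, agg], dim=-1)))
        return self.cls(augmented)                         # (seq_len, num_labels)
```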
Answer Similarity Grouping and Diversification in Question Answering Systems
The rise in popularity of mobile and voice search has led to a shift in IR from document to passage retrieval for non-factoid questions. Various datasets, such as MSMarco, and efficient retrieval models have been developed to identify the single best answer passage for this task. However, such models do not specifically address questions which could have multiple or alternative answers. In this dissertation, we focus on this new research area, which involves studying answer passage relationships and how they can be applied to passage retrieval tasks.
We first create a high-quality dataset for the answer passage similarity task in the context of question answering. Manual annotation of passage pairs is performed to set the similarity labels, from which answer group information is automatically generated. We next investigate different types of representations that could be used to create effective clusters. We experiment with various unsupervised representations and show that distributional representations outperform term-based representations for this task. Next, weak supervision is leveraged to further improve cluster modeling performance. We use BERT as the underlying model for training and show the relative performance of various weak signals, such as GloVe and term-based language modeling, for this task. To apply these clusters to the answer passage retrieval task for multi-answer questions, we use a modified version of the Maximal Marginal Relevance (MMR) diversification model. We demonstrate that answers retrieved using this model are more diverse than the baselines, i.e., they cover more answer types with low redundancy while also maximizing relevance.

So far, passage clustering has served as a means to identify the answer groups corresponding to a question and to apply them in a question answering task. We extend this a step further by looking at related questions within a conversation. For this purpose, we expand the definition of Reciprocal Rank Fusion (RRF) and use it to identify pertinent history passages for such questions. Updated question rewrites generated using these passages are then used to improve the conversational search task. In addition to being the first work that looks at answer relationships, our specific contributions can be summarized as follows: (1) creation of new datasets with passage similarity and answer type information; (2) effective passage similarity clustering models using unsupervised representations and weak supervision methods; (3) applying the passage similarity/clustering information to a diversification framework; (4) identifying good response history candidates using answer passage clustering for the conversational search task.
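For reference, here is a minimal sketch of the standard Maximal Marginal Relevance greedy selection that the dissertation's modified diversification model builds on. The `relevance` and `similarity` scores are assumed to be precomputed, and the answer-cluster-aware modifications described above are not reproduced.

```python
def mmr_rank(candidates, relevance, similarity, lam=0.7, top_k=10):
    """Greedily reorder candidate passages to balance relevance and novelty.

    candidates: list of passage ids
    relevance:  dict passage_id -> query-passage relevance score
    similarity: dict (id_a, id_b) -> passage-passage similarity
    lam:        tradeoff; 1.0 = pure relevance, 0.0 = pure diversity
    """
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < top_k:
        def mmr_score(d):
            # Penalize a candidate by its similarity to anything already chosen.
            redundancy = max((similarity[(d, s)] for s in selected), default=0.0)
            return lam * relevance[d] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```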
A proposal for a coordinated effort for the determination of brainwide neuroanatomical connectivity in model organisms at a mesoscopic scale
In this era of complete genomes, our knowledge of neuroanatomical circuitry
remains surprisingly sparse. Such knowledge is however critical both for basic
and clinical research into brain function. Here we advocate for a concerted
effort to fill this gap, through systematic, experimental mapping of neural
circuits at a mesoscopic scale of resolution suitable for comprehensive,
brain-wide coverage, using injections of tracers or viral vectors. We detail
the scientific and medical rationale and briefly review existing knowledge and
experimental techniques. We define a set of desiderata, including brain-wide coverage; validated and extensible experimental techniques suitable for standardization and automation; a centralized, open-access data repository; compatibility with existing resources; and tractability with current informatics technology. We discuss a hypothetical but tractable plan for the mouse, additional efforts for the macaque, and technique development for humans. We estimate that the mouse connectivity project could be completed within five years on a comparatively modest budget.

Comment: 41 pages
Pretrained Transformers for Text Ranking: BERT and Beyond
The goal of text ranking is to generate an ordered list of texts retrieved
from a corpus in response to a query. Although the most common formulation of
text ranking is search, instances of the task can also be found in many natural
language processing applications. This survey provides an overview of text
ranking with neural network architectures known as transformers, of which BERT
is the best-known example. The combination of transformers and self-supervised
pretraining has been responsible for a paradigm shift in natural language
processing (NLP), information retrieval (IR), and beyond. In this survey, we
provide a synthesis of existing work as a single point of entry for
practitioners who wish to gain a better understanding of how to apply
transformers to text ranking problems and researchers who wish to pursue work
in this area. We cover a wide range of modern techniques, grouped into two
high-level categories: transformer models that perform reranking in multi-stage
architectures and dense retrieval techniques that perform ranking directly.
There are two themes that pervade our survey: techniques for handling long
documents, beyond typical sentence-by-sentence processing in NLP, and
techniques for addressing the tradeoff between effectiveness (i.e., result
quality) and efficiency (e.g., query latency, model and index size). Although
transformer architectures and pretraining techniques are recent innovations,
many aspects of how they are applied to text ranking are relatively well
understood and represent mature techniques. However, there remain many open
research questions, and thus in addition to laying out the foundations of
pretrained transformers for text ranking, this survey also attempts to
prognosticate where the field is heading.
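To make the survey's two high-level categories concrete, here is a minimal sketch of a multi-stage pipeline: a bi-encoder scores the corpus by vector similarity (dense retrieval), and a cross-encoder then reranks the shortlist. The callables `encode_query`, `encode_passage`, and `cross_score` stand in for pretrained transformer models; they are assumptions for illustration, not any particular library's API.

```python
import numpy as np

def dense_retrieve(query, passages, encode_query, encode_passage, k=100):
    # First stage: rank directly by inner product in a shared embedding space.
    # Passage vectors can be precomputed and indexed, which is what makes
    # this stage cheap enough to run over the whole corpus.
    q = encode_query(query)                              # shape (dim,)
    P = np.stack([encode_passage(p) for p in passages])  # shape (n, dim)
    top = np.argsort(-(P @ q))[:k]
    return [passages[i] for i in top]

def rerank(query, shortlist, cross_score):
    # Second stage: a cross-encoder reads each (query, passage) pair jointly,
    # which is more effective but too slow for the full corpus (the
    # effectiveness/efficiency tradeoff the survey highlights).
    return sorted(shortlist, key=lambda p: cross_score(query, p), reverse=True)
```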
A Unified Encoder-Decoder Framework with Entity Memory
Entities, as important carriers of real-world knowledge, play a key role in
many NLP tasks. We focus on incorporating entity knowledge into an
encoder-decoder framework for informative text generation. Existing approaches
tried to index, retrieve, and read external documents as evidence, but they
suffered from a large computational overhead. In this work, we propose an
encoder-decoder framework with an entity memory, namely EDMem. The entity
knowledge is stored in the memory as latent representations, and the memory is
pre-trained on Wikipedia along with encoder-decoder parameters. To precisely
generate entity names, we design three decoding methods to constrain entity
generation by linking entities in the memory. EDMem is a unified framework that
can be used on various entity-intensive question answering and generation
tasks. Extensive experimental results show that EDMem outperforms both
memory-based auto-encoder models and non-memory encoder-decoder models.

Comment: Accepted by the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
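A minimal sketch of the entity-memory access pattern described in the abstract, assuming the memory is a learned table of latent entity vectors that hidden states attend over. EDMem's Wikipedia pretraining and its three constrained decoding methods are not reproduced; shapes and names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EntityMemory(nn.Module):
    """Attend over a table of latent entity representations and mix them back in."""

    def __init__(self, num_entities: int, hidden_dim: int):
        super().__init__()
        # One latent vector per entity; in EDMem these would be pretrained on
        # Wikipedia alongside the encoder-decoder (here: randomly initialized).
        self.memory = nn.Parameter(torch.randn(num_entities, hidden_dim))
        self.query_proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, hidden):
        # hidden: (batch, seq_len, hidden_dim) encoder or decoder states
        q = self.query_proj(hidden)                  # (b, s, d)
        attn = F.softmax(q @ self.memory.T, dim=-1)  # (b, s, num_entities)
        retrieved = attn @ self.memory               # (b, s, d)
        # attn can also drive entity linking, e.g. constraining generation
        # to the name of the most-attended entity.
        return hidden + retrieved, attn
```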