404 research outputs found
Question Answering with Subgraph Embeddings
This paper presents a system which learns to answer questions on a broad
range of topics from a knowledge base using few hand-crafted features. Our
model learns low-dimensional embeddings of words and knowledge base
constituents; these representations are used to score natural language
questions against candidate answers. Training our system using pairs of
questions and structured representations of their answers, and pairs of
question paraphrases, yields competitive results on a competitive benchmark of
the literature
Memory Networks
We describe a new class of learning models called memory networks. Memory
networks reason with inference components combined with a long-term memory
component; they learn how to use these jointly. The long-term memory can be
read and written to, with the goal of using it for prediction. We investigate
these models in the context of question answering (QA) where the long-term
memory effectively acts as a (dynamic) knowledge base, and the output is a
textual response. We evaluate them on a large-scale QA task, and a smaller, but
more complex, toy task generated from a simulated world. In the latter, we show
the reasoning power of such models by chaining multiple supporting sentences to
answer questions that require understanding the intension of verbs
Affinity Weighted Embedding
Supervised (linear) embedding models like Wsabie and PSI have proven
successful at ranking, recommendation and annotation tasks. However, despite
being scalable to large datasets they do not take full advantage of the extra
data due to their linear nature, and typically underfit. We propose a new class
of models which aim to provide improved performance while retaining many of the
benefits of the existing class of embedding models. Our new approach works by
iteratively learning a linear embedding model where the next iteration's
features and labels are reweighted as a function of the previous iteration. We
describe several variants of the family, and give some initial results
A Neural Attention Model for Abstractive Sentence Summarization
Summarization based on text extraction is inherently limited, but
generation-style abstractive methods have proven challenging to build. In this
work, we propose a fully data-driven approach to abstractive sentence
summarization. Our method utilizes a local attention-based model that generates
each word of the summary conditioned on the input sentence. While the model is
structurally simple, it can easily be trained end-to-end and scales to a large
amount of training data. The model shows significant performance gains on the
DUC-2004 shared task compared with several strong baselines.Comment: Proceedings of EMNLP 201
Retrieve and Refine: Improved Sequence Generation Models For Dialogue
Sequence generation models for dialogue are known to have several problems:
they tend to produce short, generic sentences that are uninformative and
unengaging. Retrieval models on the other hand can surface interesting
responses, but are restricted to the given retrieval set leading to erroneous
replies that cannot be tuned to the specific context. In this work we develop a
model that combines the two approaches to avoid both their deficiencies: first
retrieve a response and then refine it -- the final sequence generator treating
the retrieval as additional context. We show on the recent CONVAI2 challenge
task our approach produces responses superior to both standard retrieval and
generation models in human evaluations
Reading Wikipedia to Answer Open-Domain Questions
This paper proposes to tackle open- domain question answering using Wikipedia
as the unique knowledge source: the answer to any factoid question is a text
span in a Wikipedia article. This task of machine reading at scale combines the
challenges of document retrieval (finding the relevant articles) with that of
machine comprehension of text (identifying the answer spans from those
articles). Our approach combines a search component based on bigram hashing and
TF-IDF matching with a multi-layer recurrent neural network model trained to
detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA
datasets indicate that (1) both modules are highly competitive with respect to
existing counterparts and (2) multitask learning using distant supervision on
their combination is an effective complete system on this challenging task.Comment: ACL2017, 10 page
- …