Efficient Natural Language Response Suggestion for Smart Reply
This paper presents a computationally efficient machine-learned method for
natural language response suggestion. Feed-forward neural networks using n-gram
embedding features encode messages into vectors which are optimized to give
message-response pairs a high dot-product value. An optimized search finds
response suggestions. The method is evaluated in a large-scale commercial
e-mail application, Inbox by Gmail. Compared to a sequence-to-sequence
approach, the new system achieves the same quality at a small fraction of the
computational requirements and latency.
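A minimal sketch of the dot-product ranking described above; the learned feed-forward encoder is replaced by fixed toy vectors, and the embedding dimension, names, and candidate texts are illustrative:

```python
import numpy as np

def suggest_responses(message_vec, response_vecs, responses, k=3):
    """Rank candidate responses by dot product with the message embedding.

    message_vec: (d,) embedding of the incoming message.
    response_vecs: (n, d) precomputed embeddings of candidate responses.
    """
    scores = response_vecs @ message_vec  # one dot product per candidate
    top = np.argsort(-scores)[:k]         # highest-scoring candidates first
    return [responses[i] for i in top]

# Toy 3-dimensional embeddings standing in for the learned encoder output.
msg = np.array([1.0, 0.0, 0.5])
cands = np.array([[0.9, 0.1, 0.4],    # close to the message
                  [-1.0, 0.2, 0.0],   # far from the message
                  [0.5, 0.0, 0.9]])
texts = ["Sounds good!", "No thanks.", "See you then."]
print(suggest_responses(msg, cands, texts, k=2))
```

Because the response embeddings can be precomputed, serving cost reduces to one matrix-vector product plus a top-k search, which is what makes the approach cheap relative to a sequence-to-sequence decoder.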
Improv Chat: Second Response Generation for Chatbot
Existing research on response generation for chatbots focuses on \textbf{First
Response Generation} which aims to teach the chatbot to say the first response
(e.g. a sentence) appropriate to the conversation context (e.g. the user's
query). In this paper, we introduce a new task \textbf{Second Response
Generation}, termed as Improv chat, which aims to teach the chatbot to say the
second response after saying the first response with respect to the conversation
context, so as to lighten the burden on the user to keep the conversation
going. Specifically, we propose a general learning based framework and develop
a retrieval based system which can generate the second responses with the
users' query and the chatbot's first response as input. We present the approach
to building the conversation corpus for Improv chat from public forums and
social networks, as well as the neural networks based models for response
matching and ranking. We include the preliminary experiments and results in
this paper. This work could be further advanced with better deep matching
models for retrieval-based systems or generative models for generation-based
systems, as well as extensive evaluations in real-life applications.
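The retrieval setup above scores second-response candidates against both the user's query and the chatbot's first response. A minimal sketch, assuming a simple weighted combination of the two context embeddings (the weighting scheme and all vectors here are illustrative, not the paper's model):

```python
import numpy as np

def second_response(query_vec, first_vec, cand_vecs, cands, alpha=0.5):
    """Pick a second response by scoring candidates against a combined
    context: the user's query plus the chatbot's first response.
    alpha weights the two contexts (illustrative choice)."""
    context = alpha * query_vec + (1 - alpha) * first_vec
    scores = cand_vecs @ context
    return cands[int(np.argmax(scores))]

# Toy 2-dimensional embeddings for the query, first response, and candidates.
q = np.array([1.0, 0.0])
f = np.array([0.0, 1.0])
cand_vecs = np.array([[1.0, 1.0],    # relevant to both contexts
                      [1.0, -1.0]])  # relevant to the query only
cands = ["By the way, any plans for the weekend?", "Never mind."]
print(second_response(q, f, cand_vecs, cands))
```

The matching and ranking networks in the paper would replace the fixed vectors and the linear combination with learned components.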
End-to-End Retrieval in Continuous Space
Most text-based information retrieval (IR) systems index objects by words or
phrases. These discrete systems have been augmented by models that use
embeddings to measure similarity in continuous space. But continuous-space
models are typically used just to re-rank the top candidates. We consider the
problem of end-to-end continuous retrieval, where standard approximate nearest
neighbor (ANN) search replaces the usual discrete inverted index, relying
entirely on distances between learned embeddings. By training simple models
specifically for retrieval, with an appropriate model architecture, we improve
on a discrete baseline by 8% and 26% (MAP) on two similar-question retrieval
tasks. We also discuss the problem of evaluation for retrieval systems, and
show how to modify existing pairwise similarity datasets for this purpose.
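End-to-end continuous retrieval as described above reduces to nearest-neighbor search over learned embeddings. The sketch below uses an exact brute-force scan as a stand-in for the ANN index (a real system would substitute an approximate structure; all vectors are toy data):

```python
import numpy as np

def retrieve(query_vec, doc_vecs, k=5):
    """Exact nearest-neighbor retrieval by Euclidean distance between
    embeddings. A production system would replace this linear scan with
    an ANN index; the ranking logic is the same."""
    dists = np.linalg.norm(doc_vecs - query_vec, axis=1)
    return np.argsort(dists)[:k]  # indices of the k closest documents

# Toy 2-dimensional document embeddings.
docs = np.array([[0.0, 0.0],
                 [1.0, 1.0],
                 [5.0, 5.0]])
print(retrieve(np.array([0.9, 0.9]), docs, k=2))
```

The point of the paper is that when the encoder is trained specifically for this retrieval objective, distances in the embedding space alone suffice, with no discrete inverted index in the loop.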
Universal Sentence Encoder
We present models for encoding sentences into embedding vectors that
specifically target transfer learning to other NLP tasks. The models are
efficient and result in accurate performance on diverse transfer tasks. Two
variants of the encoding models allow for trade-offs between accuracy and
compute resources. For both variants, we investigate and report the
relationship between model complexity, resource consumption, the availability
of transfer task training data, and task performance. Comparisons are made with
baselines that use word level transfer learning via pretrained word embeddings
as well as baselines that do not use any transfer learning. We find that transfer
learning using sentence embeddings tends to outperform word level transfer.
With transfer learning via sentence embeddings, we observe surprisingly good
performance with minimal amounts of supervised training data for a transfer
task. We obtain encouraging results on Word Embedding Association Tests (WEAT)
targeted at detecting model bias. Our pre-trained sentence encoding models are
made freely available for download and on TF Hub.
Learning Semantic Textual Similarity from Conversations
We present a novel approach to learn representations for sentence-level
semantic similarity using conversational data. Our method trains an
unsupervised model to predict conversational input-response pairs. The
resulting sentence embeddings perform well on the semantic textual similarity
(STS) benchmark and SemEval 2017's Community Question Answering (CQA) question
similarity subtask. Performance is further improved by introducing multitask
training combining the conversational input-response prediction task and a
natural language inference task. Extensive experiments show the proposed model
achieves the best performance among all neural models on the STS benchmark and
is competitive with the state-of-the-art feature engineered and mixed systems
in both tasks.
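Input-response prediction of the kind described above is commonly trained with in-batch negatives: each input's paired response is the positive, and the other responses in the batch serve as negatives under a dot-product softmax. A NumPy sketch of that loss (batch size, dimensions, and names are illustrative):

```python
import numpy as np

def in_batch_softmax_loss(input_vecs, response_vecs):
    """Cross-entropy over dot-product scores for a batch of (input, response)
    pairs: response j is the positive for input j, and the batch's other
    responses act as negatives. True pairs lie on the diagonal."""
    logits = input_vecs @ response_vecs.T        # (B, B) score matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

When inputs and their paired responses embed close together (and away from the other pairs), the diagonal dominates each row and the loss approaches zero.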
Lucene for Approximate Nearest-Neighbors Search on Arbitrary Dense Vectors
We demonstrate three approaches for adapting the open-source Lucene search
library to perform approximate nearest-neighbor search on arbitrary dense
vectors, using similarity search on word embeddings as a case study. At its
core, Lucene is built around inverted indexes of a document collection's
(sparse) term-document matrix, which is incompatible with the lower-dimensional
dense vectors that are common in deep learning applications. We evaluate three
techniques to overcome these challenges that can all be natively integrated
into Lucene: the creation of documents populated with fake words, LSH applied
to lexical realizations of dense vectors, and k-d trees coupled with
dimensionality reduction. Experiments show that the "fake words" approach
represents the best balance between effectiveness and efficiency. These
techniques are integrated into the Anserini open-source toolkit and made
available to the community.
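The "fake words" idea can be illustrated with a minimal sketch: each dimension of a dense vector is mapped to a synthetic token repeated in proportion to its quantized component value, so that ordinary term-frequency machinery approximates dot-product scoring. The quantization scheme and token names below are assumptions for illustration, not Anserini's exact implementation:

```python
def to_fake_words(vec, scale=10):
    """Encode a dense vector as a bag of synthetic tokens: dimension i with
    quantized magnitude m contributes the token 'f<i>' m times, so term
    frequency mirrors the component value. Negative components are dropped
    in this simplified sketch."""
    tokens = []
    for i, v in enumerate(vec):
        count = max(0, round(v * scale))
        tokens.extend([f"f{i}"] * count)
    return tokens

# A 3-dimensional vector becomes a tiny "document" of repeated fake words.
print(to_fake_words([0.3, 0.0, 0.12]))
```

Indexing such token bags lets an unmodified inverted-index engine like Lucene score dense-vector similarity with its standard retrieval pipeline.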
Unsupervised Machine Commenting with Neural Variational Topic Model
Article comments can provide supplementary opinions and facts for readers,
thereby increasing the appeal and engagement of articles. Automatic
commenting can therefore help keep communities such as online forums and
news websites active. Previous work shows that training an automatic
commenting system requires large parallel corpora. Although some articles
are naturally paired with comments on certain websites, most articles and
comments on the Internet are unpaired. To fully
exploit the unpaired data, we completely remove the need for parallel data and
propose a novel unsupervised approach to train an automatic article commenting
model, relying on nothing but unpaired articles and comments. Our model is
based on a retrieval-based commenting framework, which uses news to retrieve
comments based on the similarity of their topics. The topic representation is
obtained from a neural variational topic model, which is trained in an
unsupervised manner. We evaluate our model on a news comment dataset.
Experiments show that our proposed topic-based approach significantly
outperforms previous lexicon-based models. The model also profits from paired
corpora and achieves state-of-the-art performance under semi-supervised
scenarios.
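The retrieval step described above matches articles to comments by the similarity of their topic representations. A minimal sketch using cosine similarity over topic vectors (the vectors here are toy stand-ins for the neural variational topic model's output):

```python
import numpy as np

def rank_comments_by_topic(article_topics, comment_topics):
    """Rank candidate comments by cosine similarity between the article's
    topic vector and each comment's topic vector, as in a retrieval-based
    commenting setup. Returns candidate indices, best match first."""
    a = article_topics / np.linalg.norm(article_topics)
    c = comment_topics / np.linalg.norm(comment_topics, axis=1, keepdims=True)
    return np.argsort(-(c @ a))

# Toy 2-topic distributions: the article is mostly about topic 0.
article = np.array([1.0, 0.0])
comments = np.array([[0.9, 0.1],   # also mostly topic 0
                     [0.1, 0.9]])  # mostly topic 1
print(rank_comments_by_topic(article, comments))
```

Because the topic model is trained without paired data, this matching step is what removes the need for a parallel article-comment corpus.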
Predicting Floor-Level for 911 Calls with Neural Networks and Smartphone Sensor Data
In cities with tall buildings, emergency responders need an accurate floor
level location to find 911 callers quickly. We introduce a system to estimate a
victim's floor level via their mobile device's sensor data in a two-step
process. First, we train a neural network to determine when a smartphone enters
or exits a building via GPS signal changes. Second, we use a barometer equipped
smartphone to measure the change in barometric pressure from the entrance of
the building to the victim's indoor location. Unlike impractical previous
approaches, our system is the first that does not require the use of beacons,
prior knowledge of the building infrastructure, or knowledge of user behavior.
We demonstrate real-world feasibility through 63 experiments across five
different tall buildings throughout New York City where our system predicted
the correct floor level with 100% accuracy. (International Conference on
Learning Representations, ICLR 2018.)
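The pressure-to-floor step in the second stage can be sketched with the standard international barometric formula; the reference pressure, per-floor height, and atmospheric constants below are illustrative assumptions, not the paper's calibrated values:

```python
def floor_from_pressure(p_entrance_hpa, p_indoor_hpa, floor_height_m=3.0):
    """Estimate floor level from the barometric pressure drop between the
    building entrance and the caller's indoor location, converting pressure
    to altitude with the barometric formula (standard-atmosphere constants)."""
    def altitude(p_hpa, p0_hpa=1013.25):
        return 44330.0 * (1.0 - (p_hpa / p0_hpa) ** (1.0 / 5.255))

    delta_h = altitude(p_indoor_hpa) - altitude(p_entrance_hpa)
    return round(delta_h / floor_height_m)

# Pressure falls roughly 0.12 hPa per metre near sea level, so a drop of
# about 3.5 hPa corresponds to roughly 30 m of ascent.
print(floor_from_pressure(1013.25, 1009.7))
```

The entrance pressure is captured at the moment the GPS-based classifier detects building entry, which is why the first (entry-detection) step is needed before this conversion can be anchored.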
Deep Learning
Deep learning (DL) is a high-dimensional data reduction technique for
constructing high-dimensional predictors in input-output models. DL is a form
of machine learning that uses hierarchical layers of latent features. In this
article, we review the state-of-the-art of deep learning from a modeling and
algorithmic perspective. We provide a list of successful areas of applications
in Artificial Intelligence (AI), Image Processing, Robotics and Automation.
Deep learning is predictive in its nature rather than inferential and can be
viewed as a black-box methodology for high-dimensional function estimation.
Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model
A significant roadblock in multilingual neural language modeling is the lack
of labeled non-English data. One potential method for overcoming this issue is
learning cross-lingual text representations that can be used to transfer the
performance from training on English tasks to non-English tasks, despite little
to no task-specific non-English data. In this paper, we explore a natural setup
for learning cross-lingual sentence representations: the dual-encoder. We
provide a comprehensive evaluation of our cross-lingual representations on a
number of monolingual, cross-lingual, and zero-shot/few-shot learning tasks,
and also give an analysis of different learned cross-lingual embedding spaces.
(Accepted at the 4th Workshop on Representation Learning for NLP,
RepL4NLP-2019.)