Chatbots as Advisers: the Effects of Response Variability and Reply Suggestion Buttons
As chatbots gain popularity across a variety of applications, from
investment to health, they employ an increasing number of features
that can influence the perception of the system. Since chatbots often provide advice or guidance, we ask: do these features affect the
user’s decision to follow their advice? We focus on two chatbot
features that can influence user perception: 1) response variability
in answers and delays and 2) reply suggestion buttons. We report
on a between-subject study where participants made investment
decisions on a simulated social trading platform by interacting with
a chatbot providing advice. Performance-based study incentives
made the consequences of following the advice tangible to participants. We measured how often and to what extent participants
followed the chatbot’s advice compared to an alternative source
of information. Results indicate that both response variability and
reply suggestion buttons significantly increased the inclination to
follow the advice of the chatbot.
End-to-End Autoregressive Retrieval via Bootstrapping for Smart Reply Systems
Reply suggestion systems represent a staple component of many instant
messaging and email systems. However, the requirement to produce sets of
replies, rather than individual replies, makes the task poorly suited for
out-of-the-box retrieval architectures, which only consider individual
message-reply similarity. As a result, these systems often rely on additional
post-processing modules to diversify the outputs. However, these approaches are
ultimately bottlenecked by the performance of the initial retriever, which in
practice struggles to present a sufficiently diverse range of options to the
downstream diversification module, leading to the suggestions being less
relevant to the user. In this paper, we consider a novel approach that
radically simplifies this pipeline through an autoregressive text-to-text
retrieval model that learns the smart reply task end-to-end from a dataset of
(message, reply set) pairs obtained via bootstrapping. Empirical results show
this method consistently outperforms a range of state-of-the-art baselines
across three datasets, corresponding to a 5.1%-17.9% improvement in relevance,
and a 0.5%-63.1% improvement in diversity compared to the best baseline
approach. We make our code publicly available.
Comment: FINDINGS-EMNLP 202
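The bootstrapping step the abstract describes, turning ordinary (message, reply) pairs into (message, reply set) training targets that an autoregressive model can emit as one sequence, might be sketched as follows. This is a toy illustration, not the paper's implementation: the `jaccard` similarity, the `<sep>` separator token, and the `bootstrap_reply_sets` helper are all assumptions standing in for a learned retriever and the authors' actual serialization scheme.

```python
# Hypothetical sketch: build (message, serialized reply set) training
# examples via bootstrapping. A real system would rank candidate replies
# with a learned retriever; here a toy word-overlap similarity stands in.

SEP = " <sep> "  # assumed separator token for serializing reply sets


def jaccard(a: str, b: str) -> float:
    """Toy word-overlap similarity (placeholder for a learned scorer)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def bootstrap_reply_sets(pairs, k=3):
    """For each message, rank all observed replies by similarity to its
    gold reply and serialize the top-k (gold first, deduplicated) into a
    single target string, so a text-to-text model can learn to generate
    the whole reply set end-to-end."""
    replies = [r for _, r in pairs]
    dataset = []
    for msg, gold in pairs:
        ranked = sorted(replies, key=lambda r: jaccard(gold, r), reverse=True)
        reply_set, seen = [], set()
        for r in [gold] + ranked:          # gold reply always leads the set
            if r not in seen:
                seen.add(r)
                reply_set.append(r)
            if len(reply_set) == k:
                break
        dataset.append((msg, SEP.join(reply_set)))
    return dataset


pairs = [
    ("are you free for lunch?", "sure, what time?"),
    ("are you coming tonight?", "sure, see you there"),
    ("can you review my PR?", "yes, I'll take a look"),
]
data = bootstrap_reply_sets(pairs, k=2)
```

Each target sequence then serves as supervision for a standard seq2seq model, which removes the need for a separate post-hoc diversification module since diversity is baked into the serialized targets themselves.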