
    Deep Short Text Classification with Knowledge Powered Attention

    Short text classification is one of the important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous because they do not contain enough contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from an external knowledge source to enhance the semantic representation of short texts. We take conceptual information as a kind of knowledge and incorporate it into deep neural networks. To measure the importance of this knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge powered Attention (STCKA). We utilize Concept towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS) attention to acquire the weight of concepts from two aspects, and we classify a short text with the help of conceptual information. Unlike traditional approaches, our model acts like a human being who has the intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.
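
    A minimal sketch of the concept-attention idea described above, assuming a plain dot product as the attention scoring function (the paper uses learned scoring layers) and toy vector dimensions:

    ```python
    import numpy as np

    def concept_attention(short_text_vec, concept_vecs):
        """Weight retrieved concept vectors by their relevance to the short text.

        The dot product stands in for the learned C-ST scoring function; the
        weighted sum is a knowledge-enhanced concept representation that can be
        combined with the text encoding for classification.
        """
        scores = concept_vecs @ short_text_vec          # one relevance score per concept
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                        # softmax over concepts
        return weights @ concept_vecs                   # attention-weighted concept vector

    # Toy example: a 4-dimensional text encoding and three retrieved concepts.
    text_vec = np.array([0.2, 0.4, 0.1, 0.3])
    concepts = np.random.rand(3, 4)
    print(concept_attention(text_vec, concepts).shape)  # (4,)
    ```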

    KNET: A General Framework for Learning Word Embedding using Morphological Knowledge

    Neural network techniques are widely applied to obtain high-quality distributed representations of words, i.e., word embeddings, to address text mining, information retrieval, and natural language processing tasks. Recently, efficient methods have been proposed to learn word embeddings from context, capturing both semantic and syntactic relationships between words. However, it is challenging to handle unseen words or rare words with insufficient context. In this paper, inspired by the study of the word recognition process in cognitive psychology, we propose to take advantage of seemingly less obvious but essentially important morphological knowledge to address these challenges. In particular, we introduce a novel neural network architecture called KNET that leverages both contextual information and morphological word similarity built from morphological knowledge to learn word embeddings. Meanwhile, the learning architecture is also able to refine the pre-defined morphological knowledge and obtain more accurate word similarity. Experiments on an analogical reasoning task and a word similarity task both demonstrate that the proposed KNET framework can greatly enhance the effectiveness of word embeddings.
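
    KNET's similarity signal comes from a learned architecture; as a rough illustration of what "morphological word similarity" can look like, here is a sketch that scores two words by the overlap of their character n-grams. The n-gram length and the Jaccard measure are assumptions for illustration, not the paper's definition:

    ```python
    def char_ngrams(word, n=3):
        """Character n-grams of a word, padded with boundary markers."""
        padded = f"<{word}>"
        return {padded[i:i + n] for i in range(len(padded) - n + 1)}

    def morphological_similarity(w1, w2, n=3):
        """Jaccard overlap of character n-grams as a crude morphological similarity."""
        a, b = char_ngrams(w1, n), char_ngrams(w2, n)
        return len(a & b) / len(a | b)

    print(morphological_similarity("electricity", "electrical"))  # 0.5, many shared n-grams
    print(morphological_similarity("electricity", "banana"))      # 0.0, no shared n-grams
    ```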

    Solving Verbal Comprehension Questions in IQ Test by Knowledge-Powered Word Embedding

    An Intelligence Quotient (IQ) test is a set of standardized questions designed to evaluate human intelligence. Verbal comprehension questions appear very frequently in IQ tests; they measure verbal ability, including the understanding of words with multiple senses, of synonyms and antonyms, and of analogies among words. In this work, we explore whether such tests can be solved automatically by artificial intelligence technologies, especially the deep learning technologies that have recently been developed and successfully applied in a number of fields. However, we found the task quite challenging: simply applying existing technologies (e.g., word embedding) could not achieve good performance, mainly due to the multiple senses of words and the complex relations among words. To tackle these challenges, we propose a novel framework consisting of three components. First, we build a classifier to recognize the specific type of a verbal question (e.g., analogy, classification, synonym, or antonym). Second, we obtain distributed representations of words and relations by leveraging a novel word embedding method that considers the multi-sense nature of words and the relational knowledge among words (or their senses) contained in dictionaries. Third, for each type of question, we propose a specific solver based on the obtained distributed word representations and relation representations. Experimental results show that the proposed framework not only outperforms existing methods for solving verbal comprehension questions but also exceeds the average performance of the Amazon Mechanical Turk workers involved in the study. The results indicate that with appropriate use of deep learning technologies we might be a further step closer to human intelligence.
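
    The paper's analogy solver operates over its own multi-sense embeddings and relation representations; purely to illustrate the standard vector-offset idea behind such solvers, here is a sketch that picks the candidate whose vector is closest to vec(b) - vec(a) + vec(c). The tiny hand-written vectors are placeholders:

    ```python
    import numpy as np

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def solve_analogy(vecs, a, b, c, candidates):
        """'a is to b as c is to ?' -- choose the candidate closest to b - a + c."""
        target = vecs[b] - vecs[a] + vecs[c]
        return max(candidates, key=lambda w: cosine(vecs[w], target))

    # Placeholder embeddings; a real solver would use trained word vectors.
    vecs = {
        "king":  np.array([0.9, 0.8, 0.1]),
        "queen": np.array([0.9, 0.1, 0.8]),
        "man":   np.array([0.7, 0.9, 0.2]),
        "woman": np.array([0.7, 0.2, 0.9]),
        "apple": np.array([0.1, 0.5, 0.5]),
    }
    print(solve_analogy(vecs, "king", "queen", "man", ["woman", "apple"]))  # woman
    ```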

    Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents

    User interaction with voice-powered agents generates large amounts of unlabeled utterances. In this paper, we explore techniques to efficiently transfer the knowledge from these unlabeled utterances to improve model performance on Spoken Language Understanding (SLU) tasks. We use Embeddings from Language Models (ELMo) to take advantage of unlabeled data by learning contextualized word representations. Additionally, we propose ELMo-Light (ELMoL), a faster and simpler unsupervised pre-training method for SLU. Our findings suggest that unsupervised pre-training on a large corpus of unlabeled utterances leads to significantly better SLU performance than training from scratch, and that it can even outperform conventional supervised transfer. Additionally, we show that the gains from unsupervised transfer techniques can be further improved by supervised transfer. The improvements are more pronounced in low-resource settings: using only 1,000 labeled in-domain samples, our techniques match the performance of training from scratch on 10-15x more labeled in-domain data. Comment: To appear at AAAI 2019.
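
    The pre-training details of ELMoL are specific to the paper; as a loose, assumption-laden sketch of the downstream recipe (represent utterances with pre-trained embeddings, then fit a small classifier on a handful of labeled samples), here mean-pooled random vectors stand in for contextual representations and a nearest-centroid rule stands in for the SLU model:

    ```python
    import numpy as np

    def embed_utterance(tokens, word_vecs):
        """Mean-pool word vectors as a stand-in for pre-trained contextual embeddings."""
        return np.mean([word_vecs[t] for t in tokens], axis=0)

    # Stand-in "pre-trained" vectors; in the paper these come from ELMo/ELMoL
    # models trained on large amounts of unlabeled utterances.
    rng = np.random.default_rng(0)
    vocab = ["play", "some", "jazz", "set", "an", "alarm", "for", "six"]
    word_vecs = {w: rng.normal(size=8) for w in vocab}

    # A handful of labeled in-domain samples (the low-resource setting).
    train = [(["play", "some", "jazz"], "PlayMusic"),
             (["set", "an", "alarm", "for", "six"], "SetAlarm")]

    # Nearest-centroid intent classifier over the pooled representations.
    groups = {}
    for tokens, label in train:
        groups.setdefault(label, []).append(embed_utterance(tokens, word_vecs))
    centroids = {label: np.mean(vs, axis=0) for label, vs in groups.items()}

    def classify(tokens):
        v = embed_utterance(tokens, word_vecs)
        return min(centroids, key=lambda label: np.linalg.norm(v - centroids[label]))

    print(classify(["play", "jazz"]))  # PlayMusic (shares words with that intent)
    ```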

    Programming Bots by Synthesizing Natural Language Expressions into API Invocations

    At present, bots are still in their preliminary stages of development. Many are relatively simple, or developed ad hoc for a very specific use case. For this reason, they are typically programmed manually, or utilize machine-learning classifiers to interpret a fixed set of user utterances. In reality, conversations with humans require support for dynamically capturing users' expressions. Moreover, bots derive immense value from being able to invoke APIs for their results. Today, within the Web and mobile development community, complex applications are being strung together with a few lines of code, all made possible by APIs. Yet developers are not similarly empowered to program bots. To overcome this, we introduce BotBase, a bot programming platform that dynamically synthesizes natural language user expressions into API invocations. Our solution is two-faceted: first, we construct an API knowledge graph to encode and evolve APIs; second, leveraging this graph, we apply techniques from NLP, machine learning, and entity recognition to perform the required synthesis from natural language user expressions into API calls. Comment: The paper is published at ASE 2017 (the 32nd IEEE/ACM International Conference on Automated Software Engineering).
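
    BotBase's synthesis pipeline is built on an API knowledge graph plus NLP, ML, and entity-recognition components; the sketch below compresses that idea into keyword matching and a single regex-based slot extractor over a toy graph, purely to show the shape of the utterance-to-API mapping. All names and the graph structure are illustrative assumptions:

    ```python
    import re

    # A toy "API knowledge graph": each operation lists trigger keywords and the
    # entity slot it needs. Real systems would encode far richer structure.
    API_GRAPH = {
        "weather.get": {"keywords": {"weather", "forecast"}, "slot": "city"},
        "calendar.add": {"keywords": {"meeting", "schedule"}, "slot": "time"},
    }

    def synthesize(utterance):
        """Map a user expression to an API invocation via keyword and slot matching."""
        tokens = set(utterance.lower().split())
        for op, spec in API_GRAPH.items():
            if tokens & spec["keywords"]:
                # Crude entity recognition: take the word after "in"/"at" as the slot value.
                match = re.search(r"\b(?:in|at)\s+(\w+)", utterance, re.IGNORECASE)
                value = match.group(1) if match else "?"
                return f"{op}({spec['slot']}={value!r})"
        return None

    print(synthesize("What is the weather in Sydney"))  # weather.get(city='Sydney')
    ```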

    Which Emoji Talks Best for My Picture?

    Emojis have evolved as complementary sources for expressing emotion on social-media platforms where posts are mostly composed of text and images. In order to increase the expressiveness of social media posts, users associate relevant emojis with their posts. Incorporating domain knowledge has improved machine understanding of text. In this paper, we investigate whether domain knowledge for emoji can improve the accuracy of the emoji recommendation task for multimedia posts composed of an image and text. Our emoji recommendation system suggests accurate emojis by exploiting both the visual and textual content of social media posts as well as domain knowledge from EmojiNet. Experimental results using pre-trained image classifiers and pre-trained word embedding models on a Twitter dataset show that our results outperform the current state of the art by 9.6%. We also present a user study evaluation of our recommendation system on a set of images chosen from the MSCOCO dataset. Comment: Accepted at the 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI '18), December 3-6, 2018, Santiago de Chile.
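
    As a minimal sketch of the fusion idea described above, the snippet below concatenates placeholder visual and textual feature vectors and scores candidate emojis by dot product; the paper's actual model, features, and use of EmojiNet knowledge are richer than this:

    ```python
    import numpy as np

    def recommend_emoji(image_feats, text_feats, emoji_vecs):
        """Score each candidate emoji against the fused (image + text) post vector."""
        post_vec = np.concatenate([image_feats, text_feats])
        return max(emoji_vecs, key=lambda e: float(emoji_vecs[e] @ post_vec))

    rng = np.random.default_rng(1)
    image_feats = rng.normal(size=4)   # e.g. pooled features from an image classifier
    text_feats = rng.normal(size=4)    # e.g. averaged word embeddings of the caption
    emoji_vecs = {e: rng.normal(size=8) for e in ["😀", "😢", "🔥"]}
    print(recommend_emoji(image_feats, text_feats, emoji_vecs))
    ```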

    WordRep: A Benchmark for Research on Learning Word Representations

    WordRep is a benchmark collection for research on learning distributed word representations (or word embeddings), released by Microsoft Research. In this paper, we describe the details of the WordRep collection and show how to use it in different types of machine learning research related to word embedding. Specifically, we describe how the evaluation tasks in WordRep are selected, how the data are sampled, and how the evaluation tool is built. We then compare several state-of-the-art word representations on WordRep, report their evaluation performance, and discuss the results. After that, we discuss new potential research topics that can be supported by WordRep, in addition to algorithm comparison. We hope that this paper can help people gain a deeper understanding of WordRep and enable more interesting research on learning distributed word representations and related topics.
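
    WordRep's evaluation tool is its own; the sketch below only illustrates the generic analogy-style accuracy computation such benchmarks rely on, with a toy vocabulary and the common convention of excluding the three query words from the candidate set (an assumption, not WordRep's exact protocol):

    ```python
    import numpy as np

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def analogy_accuracy(vecs, quadruples):
        """Fraction of (a, b, c, d) items where b - a + c lands nearest to d."""
        correct = 0
        for a, b, c, d in quadruples:
            target = vecs[b] - vecs[a] + vecs[c]
            best = max((w for w in vecs if w not in (a, b, c)),
                       key=lambda w: cosine(vecs[w], target))
            correct += best == d
        return correct / len(quadruples)

    # Toy embeddings and a single capital-country style quadruple.
    vecs = {
        "paris":  np.array([1.0, 0.1, 0.0]),
        "france": np.array([1.0, 1.0, 0.0]),
        "rome":   np.array([0.0, 0.1, 1.0]),
        "italy":  np.array([0.0, 1.0, 1.0]),
    }
    print(analogy_accuracy(vecs, [("paris", "france", "rome", "italy")]))  # 1.0
    ```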

    Building Memory with Concept Learning Capabilities from Large-scale Knowledge Base

    We present a new perspective on neural knowledge base (KB) embeddings, from which we build a framework that can model symbolic knowledge in the KB together with its learning process. We show that this framework regularizes a previous neural KB embedding model well, yielding superior performance in reasoning tasks, while also being able to deal with unseen entities, that is, to learn their embeddings from natural language descriptions, much as humans learn semantic concepts. Comment: Accepted to the NIPS 2015 Cognitive Computation workshop (CoCo@NIPS 2015).
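
    A minimal sketch of the "embed an unseen entity from its description" idea: here the entity vector is simply the average of the word vectors of its description, an assumption standing in for the framework's learned encoder:

    ```python
    import numpy as np

    def entity_from_description(description, word_vecs, dim=5):
        """Embed an unseen KB entity by pooling word vectors of its text description.

        Averaging is only a stand-in for a learned encoder; words without a
        vector are skipped, and an all-zero vector is returned as a fallback.
        """
        vecs = [word_vecs[w] for w in description.lower().split() if w in word_vecs]
        return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

    rng = np.random.default_rng(2)
    word_vecs = {w: rng.normal(size=5) for w in ["a", "novel", "by", "jane", "austen"]}
    print(entity_from_description("A novel by Jane Austen", word_vecs).shape)  # (5,)
    ```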

    CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases

    How can we enable computers to automatically answer questions like "Who created the character Harry Potter"? Carefully built knowledge bases provide rich sources of facts. However, it remains a challenge to answer factoid questions raised in natural language, because one question can be expressed in numerous ways. In particular, we focus on the most common questions: those that can be answered with a single fact in the knowledge base. We propose CFO, a Conditional Focused neural-network-based approach to answering factoid questions with knowledge bases. Our approach first zooms in on a question to find the more probable candidate subject mentions, and then infers the final answer with a unified conditional probabilistic framework. Powered by deep recurrent neural networks and neural embeddings, CFO achieves an accuracy of 75.7% on a dataset of 108k questions, the largest public one to date. It outperforms the current state of the art by an absolute margin of 11.8%. Comment: Accepted by ACL 2016.
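
    The conditional, focused scoring can be read as ranking candidate single facts by p(subject | q) * p(relation | subject, q); the sketch below shows that factorization over a toy fact list with hand-set probability tables, which in CFO itself are produced by recurrent networks and neural embeddings:

    ```python
    # A toy knowledge base of (subject, relation, object) facts.
    KB = [
        ("J. K. Rowling", "created_character", "Harry Potter"),
        ("J. R. R. Tolkien", "created_character", "Frodo Baggins"),
    ]

    def answer(subject_scores, relation_scores):
        """Rank single-fact candidates by p(subject | q) * p(relation | subject, q)."""
        best = max(KB, key=lambda fact: subject_scores.get(fact[0], 0.0)
                                        * relation_scores.get((fact[0], fact[1]), 0.0))
        return best[2]  # the object of the best-scoring fact is the answer

    # Hand-set scores for the question "Who created the character Harry Potter?".
    subject_scores = {"J. K. Rowling": 0.9, "J. R. R. Tolkien": 0.1}
    relation_scores = {("J. K. Rowling", "created_character"): 0.8,
                       ("J. R. R. Tolkien", "created_character"): 0.7}
    print(answer(subject_scores, relation_scores))  # Harry Potter
    ```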

    SubGram: Extending Skip-gram Word Representation with Substrings

    Skip-gram (word2vec) is a recent method for creating vector representations of words ("distributed word representations") using a neural network. The representation gained popularity in various areas of natural language processing because it seems to capture syntactic and semantic information about words without any explicit supervision in this respect. We propose SubGram, a refinement of the Skip-gram model that also considers word structure during training, achieving large gains on the original Skip-gram test set. Comment: Published at TSD 2016.
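
    A minimal sketch of the subword idea: represent a word by its own token plus its substrings, so that morphologically related words share training units. The boundary markers and length bounds below are illustrative assumptions, not SubGram's exact substring inventory:

    ```python
    def substrings(word, min_n=3, max_n=6):
        """All substrings of bounded length, with boundary markers added to the word."""
        padded = f"<{word}>"
        return {padded[i:i + n]
                for n in range(min_n, max_n + 1)
                for i in range(len(padded) - n + 1)}

    # The word's training units: its full form plus its substrings, all of which
    # would receive the Skip-gram training signal for this word's contexts.
    units = {"cats"} | substrings("cats")
    print(sorted(units))
    ```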