61,837 research outputs found

    Deep Short Text Classification with Knowledge Powered Attention

    Short text classification is one of the important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous since they lack sufficient contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from an external knowledge source to enhance the semantic representation of short texts. We take conceptual information as a kind of knowledge and incorporate it into deep neural networks. To measure the importance of knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge powered Attention (STCKA). We utilize Concept towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS) attention to acquire the weight of concepts from two aspects, and we classify a short text with the help of conceptual information. Unlike traditional approaches, our model acts like a human being who has the intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.
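
    As a rough illustration of the two-signal attention the abstract describes, here is a minimal PyTorch sketch that scores each retrieved concept against the short-text vector (C-ST) and against the concept set itself (C-CS), then mixes the two scores. The module name, scoring functions, and the mixing weight alpha are assumptions for illustration, not the paper's exact STCKA formulation.

    import torch
    import torch.nn as nn

    class ConceptAttention(nn.Module):
        # Sketch: weight retrieved concepts with two attention signals,
        # C-ST (concept vs. the short-text vector) and C-CS (concept vs.
        # the concept set), then mix the scores with a scalar alpha.
        def __init__(self, dim, alpha=0.5):
            super().__init__()
            self.w_st = nn.Linear(2 * dim, 1)   # scores a concept against the text
            self.w_cs = nn.Linear(dim, 1)       # scores a concept against the set
            self.alpha = alpha

        def forward(self, text_vec, concept_vecs):
            # text_vec: (batch, dim); concept_vecs: (batch, n_concepts, dim)
            n = concept_vecs.size(1)
            text_rep = text_vec.unsqueeze(1).expand(-1, n, -1)
            st = self.w_st(torch.cat([concept_vecs, text_rep], dim=-1)).squeeze(-1)
            cs = self.w_cs(concept_vecs).squeeze(-1)
            attn = torch.softmax(self.alpha * st + (1 - self.alpha) * cs, dim=-1)
            # attention-weighted sum: knowledge-enhanced short-text representation
            return torch.bmm(attn.unsqueeze(1), concept_vecs).squeeze(1)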

    Short Text Topic Modeling Techniques, Applications, and Performance: A Survey

    Inferring discriminative and coherent latent topics from short texts is a critical and fundamental task, since many real-world applications require semantic understanding of short texts. Traditional long-text topic modeling algorithms (e.g., PLSA and LDA) based on word co-occurrences cannot solve this problem very well, since only very limited word co-occurrence information is available in short texts. Therefore, short text topic modeling has attracted much attention from the machine learning research community in recent years, aiming to overcome the problem of sparseness in short texts. In this survey, we conduct a comprehensive review of various short text topic modeling techniques proposed in the literature. We present three categories of methods, based on Dirichlet multinomial mixture, global word co-occurrences, and self-aggregation, with examples of representative approaches in each category and an analysis of their performance on various tasks. We also develop the first comprehensive open-source Java library, called STTM, which integrates all surveyed algorithms within a unified interface and provides benchmark datasets, to facilitate the development of new methods in this research field. Finally, we evaluate these state-of-the-art methods on many real-world datasets and compare their performance against one another and against long-text topic modeling algorithms. Comment: arXiv admin note: text overlap with arXiv:1808.02215 by other authors.
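
    The Dirichlet multinomial mixture category the survey lists assigns a single topic to each whole short text, unlike LDA. A minimal NumPy sketch of collapsed Gibbs sampling for such a model follows; the hyperparameters and the simplified sampling conditional (which ignores within-document repeat corrections) are assumptions for illustration, not the STTM library's implementation.

    import numpy as np

    def dmm_gibbs(docs, vocab_size, n_topics=20, alpha=0.1, beta=0.1, iters=30):
        # Collapsed Gibbs sampling for a Dirichlet multinomial mixture:
        # every short text is assigned exactly one topic.
        rng = np.random.default_rng(0)
        z = rng.integers(n_topics, size=len(docs))    # topic of each document
        m_z = np.zeros(n_topics)                      # documents per topic
        n_zw = np.zeros((n_topics, vocab_size))       # word counts per topic
        n_z = np.zeros(n_topics)                      # total words per topic
        for d, doc in enumerate(docs):
            m_z[z[d]] += 1
            n_z[z[d]] += len(doc)
            for w in doc:
                n_zw[z[d], w] += 1
        for _ in range(iters):
            for d, doc in enumerate(docs):
                k = z[d]
                m_z[k] -= 1
                n_z[k] -= len(doc)
                for w in doc:
                    n_zw[k, w] -= 1
                # simplified conditional: topic-size prior times word likelihoods
                log_p = np.log(m_z + alpha)
                for w in doc:
                    log_p = log_p + np.log(n_zw[:, w] + beta) - np.log(n_z + vocab_size * beta)
                p = np.exp(log_p - log_p.max())
                k = rng.choice(n_topics, p=p / p.sum())
                z[d] = k
                m_z[k] += 1
                n_z[k] += len(doc)
                for w in doc:
                    n_zw[k, w] += 1
        return z

    # toy usage: four short texts over a six-word vocabulary
    print(dmm_gibbs([[0, 1], [0, 2], [3, 4], [4, 5]], vocab_size=6, n_topics=2))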

    ConCET: Entity-Aware Topic Classification for Open-Domain Conversational Agents

    Identifying the topic (domain) of each user's utterance in open-domain conversational systems is a crucial step for all subsequent language understanding and response tasks. In particular, for complex domains, an utterance is often routed to a single component responsible for that domain. Thus, correctly mapping a user utterance to the right domain is critical. To address this problem, we introduce ConCET: a Concurrent Entity-aware conversational Topic classifier, which incorporates entity-type information together with the utterance content features. Specifically, ConCET utilizes entity information to enrich the utterance representation, combining character, word, and entity-type embeddings into a single representation. However, for rich domains with millions of available entities, unrealistic amounts of labeled training data would be required. To complement our model, we propose a simple and effective method for generating synthetic training data, to augment the typically limited amounts of labeled training data, using commonly available knowledge bases to generate additional labeled utterances. We extensively evaluate ConCET and our proposed training method first on an openly available human-human conversational dataset called Self-Dialogue, to calibrate our approach against previous state-of-the-art methods; second, we evaluate ConCET on a large dataset of human-machine conversations with real users, collected as part of the Amazon Alexa Prize. Our results show that ConCET significantly improves topic classification performance on both datasets, including 8-10% improvements over state-of-the-art deep learning methods. We complement our quantitative results with detailed analysis of system performance, which could be used for further improvements of conversational agents. Comment: CIKM 2019.
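
    A minimal PyTorch sketch of the kind of entity-aware utterance representation the abstract describes, concatenating averaged word, character, and entity-type embeddings before a topic classifier. Layer sizes, mean pooling, and the toy usage are assumptions for illustration rather than ConCET's actual architecture.

    import torch
    import torch.nn as nn

    class EntityAwareClassifier(nn.Module):
        # Combines word, character, and entity-type embeddings of an
        # utterance into one vector and predicts its topic (domain).
        def __init__(self, n_words, n_chars, n_ent_types, n_topics, dim=100):
            super().__init__()
            self.word_emb = nn.EmbeddingBag(n_words, dim, mode="mean")
            self.char_emb = nn.EmbeddingBag(n_chars, dim, mode="mean")
            self.type_emb = nn.EmbeddingBag(n_ent_types, dim, mode="mean")
            self.out = nn.Linear(3 * dim, n_topics)

        def forward(self, word_ids, char_ids, ent_type_ids):
            u = torch.cat([self.word_emb(word_ids),
                           self.char_emb(char_ids),
                           self.type_emb(ent_type_ids)], dim=-1)
            return self.out(u)                     # topic logits

    # toy usage with random id tensors
    model = EntityAwareClassifier(n_words=5000, n_chars=100, n_ent_types=50, n_topics=12)
    logits = model(torch.randint(0, 5000, (2, 10)),
                   torch.randint(0, 100, (2, 40)),
                   torch.randint(0, 50, (2, 3)))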

    Applying Social Media Intelligence for Predicting and Identifying On-line Radicalization and Civil Unrest Oriented Threats

    Research shows that various social media platforms on the Internet, such as Twitter and Tumblr (micro-blogging websites), Facebook (a popular social networking website), YouTube (the largest video sharing and hosting website), blogs, and discussion forums, are being misused by extremist groups for spreading their beliefs and ideologies, promoting radicalization, recruiting members, and creating online virtual communities sharing a common agenda. Popular micro-blogging websites such as Twitter are being used as a real-time platform for information sharing and communication during the planning and mobilization of civil unrest related events. Applying social media intelligence for predicting and identifying online radicalization and civil unrest oriented threats is an area that has attracted several researchers' attention over the past 10 years. Several algorithms, techniques, and tools have been proposed in the existing literature to counter and combat cyber-extremism and to predict protest related events well in advance. In this paper, we conduct a literature review of these existing techniques and perform a comprehensive analysis to understand the state of the art, trends, and research gaps. We present a one-class classification approach to collect scholarly articles targeting the topics and subtopics of our research scope. We perform characterization, classification, and an in-depth meta-analysis of about 100 conference and journal papers to gain a better understanding of the existing literature. Comment: 18 pages, 16 figures, 4 tables. This paper is a comprehensive and detailed literature survey to understand the current state of the art of Online Social Media Intelligence to counter and combat ISI related threats.
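
    A small scikit-learn sketch of the one-class classification idea used to collect in-scope scholarly articles: a seed set of relevant abstracts defines the single "relevant" class, and candidate papers flagged as outliers are dropped. The toy documents and hyperparameters are assumptions for illustration, not the authors' pipeline.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import OneClassSVM

    # Seed abstracts known to be in scope define the positive class.
    seed_abstracts = [
        "detecting online radicalization in extremist web forums",
        "identifying hate promoting communities on social media",
    ]
    candidate_abstracts = [
        "forecasting civil unrest events from twitter streams",
        "a new sorting algorithm for gpu architectures",
    ]

    vec = TfidfVectorizer()
    X_seed = vec.fit_transform(seed_abstracts)

    clf = OneClassSVM(kernel="rbf", nu=0.5, gamma="scale").fit(X_seed)
    # predict() returns +1 for in-scope papers and -1 for outliers to discard
    keep = clf.predict(vec.transform(candidate_abstracts)) == 1
    print(list(zip(candidate_abstracts, keep)))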

    A Survey of Document Grounded Dialogue Systems (DGDS)

    Dialogue systems (DS) attract great attention from industry and academia because of their wide application prospects. Researchers usually divide DS according to their function. However, many conversations require the DS to switch between different functions. For example, a movie discussion can change from chit-chat to QA, and a conversational recommendation can transform from chit-chat to recommendation. Therefore, classification according to function may not be enough to help us appreciate the current development trend. Instead, we classify DS based on their background knowledge. Specifically, we study the latest DS grounded in unstructured document(s). We define the Document Grounded Dialogue System (DGDS) as a DS in which the dialogues are centered on the given document(s). The DGDS can be used in scenarios such as discussing merchandise against its product manual, commenting on news reports, etc. We believe that exploiting information from unstructured document(s) is the future trend of DS, because a great amount of human knowledge lies in such documents. Research on the DGDS not only possesses broad application prospects but also helps AI better understand human knowledge and natural language. We analyze the classification, architecture, datasets, models, and future development trends of the DGDS, hoping to help researchers in this field. Comment: 30 pages, 4 figures, 13 tables.

    End-to-end Learning for Short Text Expansion

    Effectively making sense of short texts is a critical task for many real-world applications such as search engines, social media services, and recommender systems. The task is particularly challenging as a short text contains very sparse information, often too sparse for a machine learning algorithm to pick up useful signals. A common practice for analyzing short text is to first expand it with external information, which is usually harvested from a large collection of longer texts. In the literature, short text expansion has been done with all kinds of heuristics. We propose an end-to-end solution that automatically learns how to expand short text to optimize a given learning task. A novel deep memory network is proposed to automatically find relevant information from a collection of longer documents and reformulate the short text through a gating mechanism. Using short text classification as a demonstration task, we show that the deep memory network significantly outperforms classical text expansion methods in comprehensive experiments on real-world datasets. Comment: KDD'2017.
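
    A minimal PyTorch sketch of a memory read with a gating mechanism, in the spirit of the deep memory network described above: the short-text vector attends over a memory of longer-document vectors, and a learned gate mixes the retrieved evidence back into the representation. Module and tensor names are assumptions, not the paper's architecture.

    import torch
    import torch.nn as nn

    class MemoryExpansion(nn.Module):
        # Expands a short-text vector with an attention-weighted read over a
        # memory of long-document vectors, merged through a learned gate.
        def __init__(self, dim):
            super().__init__()
            self.gate = nn.Linear(2 * dim, dim)

        def forward(self, text_vec, memory):
            # text_vec: (batch, dim); memory: (batch, n_docs, dim)
            attn = torch.softmax(
                torch.bmm(memory, text_vec.unsqueeze(-1)).squeeze(-1), dim=-1)
            read = torch.bmm(attn.unsqueeze(1), memory).squeeze(1)   # retrieved evidence
            g = torch.sigmoid(self.gate(torch.cat([text_vec, read], dim=-1)))
            return g * text_vec + (1 - g) * read                     # expanded representation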

    RubyStar: A Non-Task-Oriented Mixture Model Dialog System

    RubyStar is a dialog system designed to create "human-like" conversation by combining different response generation strategies. RubyStar conducts a non-task-oriented conversation on general topics by using an ensemble of rule-based, retrieval-based, and generative methods. Topic detection, engagement monitoring, and context tracking are used for managing the interaction. Predictable elements of conversation, such as the bot's backstory and simple question answering, are handled by separate modules. We describe a rating scheme we developed for evaluating response generation. We find that a character-level RNN is an effective generation model for general responses with proper parameter settings; however, other kinds of conversation topics might benefit from using other models.
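
    A toy Python sketch of the ensemble idea: several response strategies each propose a reply with a confidence, and a dispatcher picks the most confident one for the turn. The confidence-based selection and the stub responders are assumptions for illustration, not RubyStar's actual ranking logic.

    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    @dataclass
    class Responder:
        name: str
        respond: Callable[[str], Tuple[str, float]]   # returns (reply, confidence)

    def rule_based(utterance: str) -> Tuple[str, float]:
        # backstory / simple QA handled by hand-written rules
        if "your name" in utterance.lower():
            return "I'm a chatbot built from several response strategies.", 0.9
        return "", 0.0

    def retrieval_based(utterance: str) -> Tuple[str, float]:
        return "I read something about that recently.", 0.4   # stub retriever

    def generative(utterance: str) -> Tuple[str, float]:
        return "Tell me more about that.", 0.3                # stub char-level model

    def reply(utterance: str, responders: List[Responder]) -> str:
        # pick the strategy with the highest confidence for this turn
        best = max((r.respond(utterance) for r in responders), key=lambda x: x[1])
        return best[0]

    bot = [Responder("rules", rule_based),
           Responder("retrieval", retrieval_based),
           Responder("generative", generative)]
    print(reply("What is your name?", bot))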

    Machine Learning with World Knowledge: The Position and Survey

    Machine learning has become pervasive in multiple domains, impacting a wide variety of applications, such as knowledge discovery and data mining, natural language processing, information retrieval, computer vision, social and health informatics, ubiquitous computing, etc. Two essential problems of machine learning are how to generate features and how to acquire labels for machines to learn. In particular, labeling large amounts of data for each domain-specific problem can be very time consuming and costly; it has become a key obstacle in making learning protocols realistic in applications. In this paper, we discuss how to use existing general-purpose world knowledge to enhance machine learning processes, by enriching the features or reducing the labeling work. We start from a comparison of world knowledge with domain-specific knowledge, and then introduce three key problems in using world knowledge in learning processes, i.e., explicit and implicit feature representation, inference for knowledge linking and disambiguation, and learning with direct or indirect supervision. Finally, we discuss the future directions of this research topic.
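
    A small scikit-learn sketch of the "explicit feature representation" idea mentioned above: short texts are enriched by appending concept features looked up in a world-knowledge mapping to the ordinary bag-of-words features. The tiny concept dictionary here stands in for a real knowledge base and is purely illustrative.

    from sklearn.feature_extraction.text import CountVectorizer
    from scipy.sparse import hstack

    # Toy "world knowledge": surface terms mapped to general-purpose concepts.
    concept_map = {"python": "programming_language", "jaguar": "animal", "paris": "city"}

    def concept_features(texts):
        return [" ".join(concept_map[w] for w in t.lower().split() if w in concept_map)
                for t in texts]

    texts = ["paris is lovely in spring", "python makes scripting easy"]
    word_vec = CountVectorizer().fit(texts)
    concept_vec = CountVectorizer().fit(concept_features(texts))

    # Explicit representation: raw word features concatenated with concept features.
    X = hstack([word_vec.transform(texts),
                concept_vec.transform(concept_features(texts))])
    print(X.shape)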

    Which Emoji Talks Best for My Picture?

    Emojis have evolved as complementary sources for expressing emotion on social-media platforms where posts are mostly composed of text and images. To increase the expressiveness of their posts, users associate relevant emojis with them. Incorporating domain knowledge has improved machine understanding of text. In this paper, we investigate whether domain knowledge for emojis can improve the accuracy of the emoji recommendation task for multimedia posts composed of an image and text. Our emoji recommendation system can suggest accurate emojis by exploiting both visual and textual content from social media posts as well as domain knowledge from EmojiNet. Experimental results using pre-trained image classifiers and pre-trained word embedding models on a Twitter dataset show that our results outperform the current state of the art by 9.6%. We also present a user study evaluation of our recommendation system on a set of images chosen from the MSCOCO dataset. Comment: Accepted at the 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI '18), December 3-6, 2018, Santiago de Chile.
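
    A minimal PyTorch sketch of fusing a pre-trained image feature vector with an averaged text embedding to score candidate emojis, roughly in the spirit of the system described above; the feature dimensions, hidden size, and top-5 output are assumptions, and EmojiNet knowledge is not modeled here.

    import torch
    import torch.nn as nn

    class EmojiRecommender(nn.Module):
        # Scores candidate emojis from the concatenation of an image feature
        # vector (e.g. from a pre-trained classifier) and an averaged
        # word-embedding vector of the post text.
        def __init__(self, img_dim=2048, txt_dim=300, n_emojis=64, hidden=256):
            super().__init__()
            self.fuse = nn.Sequential(
                nn.Linear(img_dim + txt_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, n_emojis),
            )

        def forward(self, img_feat, txt_feat):
            return self.fuse(torch.cat([img_feat, txt_feat], dim=-1))

    # toy usage: random features for one post, then take the top-5 emoji ids
    model = EmojiRecommender()
    scores = model(torch.randn(1, 2048), torch.randn(1, 300))
    top5 = scores.topk(5, dim=-1).indices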

    How did the discussion go: Discourse act classification in social media conversations

    We propose a novel attention-based hierarchical LSTM model to classify discourse act sequences in social media conversations, aimed at mining online discussion data using textual meaning beyond the sentence level. The uniqueness of the task lies in the complete categorization of possible pragmatic roles in informal textual discussions, in contrast to question-answer extraction, stance detection, or sarcasm identification, which are role-specific tasks. An early attempt was made on a Reddit discussion dataset. We train our model on the same data and present test results on two different datasets, one from Reddit and one from Facebook. Our proposed model outperforms the previous one in terms of domain independence; without using platform-dependent structural features, our hierarchical LSTM with a word relevance attention mechanism achieves F1-scores of 71% and 66%, respectively, in predicting the discourse roles of comments in Reddit and Facebook discussions. The efficiency of recurrent and convolutional architectures in learning discursive representations on the same task is also presented and analyzed, with different word and comment embedding schemes. Our attention mechanism enables us to inquire into the relevance ordering of text segments according to their roles in the discourse. We present a human annotator experiment to unveil important observations about modeling and data annotation. Equipped with our text-based discourse identification model, we investigate how heterogeneous non-textual features like location, time, leaning of information, etc. play their roles in characterizing online discussions on Facebook.
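
    A minimal PyTorch sketch of a hierarchical LSTM with word-level attention for tagging the discourse act of each comment in a thread: a word-level BiLSTM with attention builds comment vectors, and a comment-level LSTM labels each comment in sequence. Layer sizes and the single-thread batching are assumptions for illustration, not the paper's exact model.

    import torch
    import torch.nn as nn

    class HierarchicalDiscourseTagger(nn.Module):
        # Word-level BiLSTM with attention builds a comment vector; a
        # comment-level LSTM over the thread tags each comment's discourse act.
        def __init__(self, vocab, n_acts, emb=100, hid=64):
            super().__init__()
            self.emb = nn.Embedding(vocab, emb)
            self.word_lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
            self.attn = nn.Linear(2 * hid, 1)
            self.comment_lstm = nn.LSTM(2 * hid, hid, batch_first=True)
            self.out = nn.Linear(hid, n_acts)

        def forward(self, thread):
            # thread: (n_comments, n_words) word ids for one discussion
            h, _ = self.word_lstm(self.emb(thread))              # (c, w, 2*hid)
            a = torch.softmax(self.attn(h).squeeze(-1), dim=-1)  # word relevance weights
            comments = torch.bmm(a.unsqueeze(1), h).squeeze(1)   # (c, 2*hid)
            seq, _ = self.comment_lstm(comments.unsqueeze(0))    # (1, c, hid)
            return self.out(seq.squeeze(0))                      # per-comment act logits

    # toy usage: a 4-comment thread, 12 word ids per comment, 9 discourse acts
    logits = HierarchicalDiscourseTagger(1000, 9)(torch.randint(0, 1000, (4, 12)))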