Deep Short Text Classification with Knowledge Powered Attention
Short text classification is one of the important tasks in Natural Language
Processing (NLP). Unlike paragraphs or documents, short texts are more
ambiguous because they lack sufficient contextual information, which poses a
great challenge for classification. In this paper, we retrieve knowledge from
an external knowledge source to enhance the semantic representation of short
texts. We take conceptual information as a kind of knowledge and incorporate it
into deep neural networks. For the purpose of measuring the importance of
knowledge, we introduce attention mechanisms and propose deep Short Text
Classification with Knowledge powered Attention (STCKA). We utilize Concept
towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS)
attention to acquire the weights of concepts from two aspects, and we classify
a short text with the help of conceptual information. Unlike traditional
approaches, our model acts like a human being who has intrinsic ability to make
decisions based on observation (i.e., training data for machines) and pays more
attention to important knowledge. We also conduct extensive experiments on four
public datasets for different tasks. The experimental results and case studies
show that our model outperforms the state-of-the-art methods, justifying the
effectiveness of knowledge powered attention.
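The attention idea in this abstract can be sketched numerically: score each retrieved concept against the short-text vector, normalize with a softmax, and pool. The following is a minimal dot-product stand-in for the paper's C-ST/C-CS attention, not the authors' exact formulation; all dimensions and vectors are toy assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def concept_attention(text_vec, concept_vecs):
    """Weight each retrieved concept by its (scaled) dot-product relevance
    to the short text, then return the attention weights and the
    weighted-sum concept representation."""
    d = text_vec.shape[0]
    scores = concept_vecs @ text_vec / np.sqrt(d)   # one score per concept
    weights = softmax(scores)                       # importance of each concept
    return weights, weights @ concept_vecs          # pooled concept vector

rng = np.random.default_rng(0)
text_vec = rng.normal(size=8)            # toy short-text embedding
concept_vecs = rng.normal(size=(4, 8))   # four toy concept embeddings
weights, pooled = concept_attention(text_vec, concept_vecs)
```

The pooled concept vector would then be combined with the text representation before classification; the real model learns the scoring function rather than using a raw dot product.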
Short Text Topic Modeling Techniques, Applications, and Performance: A Survey
Inferring discriminative and coherent latent topics from short texts is a
critical and fundamental task, since many real-world applications require
semantic understanding of short texts. Traditional long text topic modeling
algorithms (e.g., PLSA and LDA) based on word co-occurrences cannot solve this
problem very well since only very limited word co-occurrence information is
available in short texts. Therefore, short text topic modeling has already
attracted much attention from the machine learning research community in recent
years, which aims at overcoming the problem of sparseness in short texts. In
this survey, we conduct a comprehensive review of various short text topic
modeling techniques proposed in the literature. We present three categories of
methods based on Dirichlet multinomial mixture, global word co-occurrences, and
self-aggregation, with examples of representative approaches in each category
and analysis of their performance on various tasks. We also develop the first
comprehensive open-source Java library, called STTM, which integrates all
surveyed algorithms within a unified interface and provides benchmark datasets,
to facilitate the development of new methods in this research field. Finally, we
evaluate these state-of-the-art methods on many real-world datasets and compare
their performance against one another and versus long text topic modeling
algorithms.
Comment: arXiv admin note: text overlap with arXiv:1808.02215 by other authors
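The sparsity problem this survey targets is easy to see concretely: the within-document word-pair counts that co-occurrence-based models such as PLSA and LDA rely on shrink quadratically with document length. A minimal illustration with hypothetical toy documents:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_pairs(docs):
    """Count unordered within-document word pairs -- the co-occurrence
    signal that conventional topic models rely on."""
    pairs = Counter()
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        pairs.update(combinations(words, 2))
    return pairs

# Toy documents: an 11-word "long" text vs a 3-word "short" text.
long_doc = ["topic models infer latent topics from word cooccurrence statistics in text"]
short_doc = ["short text topics"]

# An n-word document contributes n*(n-1)/2 pairs, so short texts
# contribute very little co-occurrence evidence.
n_long = len(cooccurrence_pairs(long_doc))   # 11 words -> 55 pairs
n_short = len(cooccurrence_pairs(short_doc)) # 3 words  -> 3 pairs
```

This is why the surveyed methods resort to Dirichlet multinomial mixtures, corpus-global co-occurrence statistics, or self-aggregation of short texts into longer pseudo-documents.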
ConCET: Entity-Aware Topic Classification for Open-Domain Conversational Agents
Identifying the topic (domain) of each user's utterance in open-domain
conversational systems is a crucial step for all subsequent language
understanding and response tasks. In particular, for complex domains, an
utterance is often routed to a single component responsible for that domain.
Thus, correctly mapping a user utterance to the right domain is critical. To
address this problem, we introduce ConCET: a Concurrent Entity-aware
conversational Topic classifier, which incorporates entity-type information
together with the utterance content features. Specifically, ConCET utilizes
entity information to enrich the utterance representation, combining character,
word, and entity-type embeddings into a single representation. However, for
rich domains with millions of available entities, unrealistic amounts of
labeled training data would be required. To complement our model, we propose a
simple and effective method for generating synthetic training data, to augment
the typically limited amounts of labeled training data, using commonly
available knowledge bases to generate additional labeled utterances. We
extensively evaluate ConCET and our proposed training method first on an openly
available human-human conversational dataset called Self-Dialogue, to calibrate
our approach against previous state-of-the-art methods; second, we evaluate
ConCET on a large dataset of human-machine conversations with real users,
collected as part of the Amazon Alexa Prize. Our results show that ConCET
significantly improves topic classification performance on both datasets,
including 8-10% improvements over state-of-the-art deep learning methods. We
complement our quantitative results with detailed analysis of system
performance, which could be used for further improvements of conversational
agents.
Comment: CIKM 201
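The representation-combination step the abstract describes can be sketched as plain concatenation of the three vector types. The dimensions below are hypothetical toy values, and ConCET's real model learns these embeddings jointly rather than taking them as fixed inputs.

```python
import numpy as np

# Hypothetical toy dimensions; the abstract does not give ConCET's actual sizes.
CHAR_DIM, WORD_DIM, ENTITY_DIM = 4, 6, 3

def utterance_representation(char_emb, word_emb, entity_type_emb):
    """Combine character-, word-, and entity-type-level vectors into a
    single utterance representation by concatenation."""
    return np.concatenate([char_emb, word_emb, entity_type_emb])

rep = utterance_representation(
    np.zeros(CHAR_DIM),        # e.g. pooled character-level features
    np.ones(WORD_DIM),         # e.g. averaged word embeddings
    np.full(ENTITY_DIM, 2.0),  # e.g. embedding of the detected entity type
)
```

A downstream topic classifier would then operate on this combined vector, which is how the entity-type signal enriches the purely textual features.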
Applying Social Media Intelligence for Predicting and Identifying On-line Radicalization and Civil Unrest Oriented Threats
Research shows that various social media platforms on the Internet such as
Twitter, Tumblr (micro-blogging websites), Facebook (a popular social
networking website), YouTube (largest video sharing and hosting website), Blogs
and discussion forums are being misused by extremist groups for spreading their
beliefs and ideologies, promoting radicalization, recruiting members and
creating online virtual communities sharing a common agenda. Popular
microblogging websites such as Twitter are being used as a real-time platform
for information sharing and communication during the planning and mobilization
of civil unrest related events. Applying social media intelligence for
predicting and identifying online radicalization and civil unrest oriented
threats is an area that has attracted many researchers' attention over the past
10 years. Several algorithms, techniques and tools have been proposed in the
existing literature to counter and combat cyber-extremism and to predict
protest-related events well in advance. In this paper, we conduct a literature
review of all these existing techniques and do a comprehensive analysis to
understand state-of-the-art, trends and research gaps. We present a one class
classification approach to collect scholarly articles targeting the topics and
subtopics of our research scope. We perform characterization, classification
and an in-depth meta-analysis of about 100 conference and journal papers to
gain a better understanding of the existing literature.
Comment: 18 pages, 16 figures, 4 tables. This paper is a comprehensive and
detailed literature survey to understand current state-of-the-art of Online
Social Media Intelligence to counter and combat ISI related threats
A Survey of Document Grounded Dialogue Systems (DGDS)
Dialogue systems (DS) attract great attention from industry and academia
because of its wide application prospects. Researchers usually divide the DS
according to the function. However, many conversations require the DS to switch
between different functions. For example, movie discussion can change from
chit-chat to QA, the conversational recommendation can transform from chit-chat
to recommendation, etc. Therefore, classification according to functions may
not be enough to help us appreciate the current development trend. Instead, we
classify the DS based on background knowledge. Specifically, we study the
latest DS based on unstructured document(s). We define the Document Grounded
Dialogue System (DGDS) as a DS in which the dialogues center on the given
document(s). The DGDS can be used in scenarios such as discussing merchandise
based on a product manual, commenting on news reports, etc. We believe that
extracting information from unstructured document(s) is the future trend of the DS because a
great amount of human knowledge lies in these document(s). The research of the
DGDS not only possesses a broad application prospect but also facilitates AI to
better understand human knowledge and natural language. We analyze the
classification, architecture, datasets, models, and future development trends
of the DGDS, hoping to help researchers in this field.
Comment: 30 pages, 4 figures, 13 tables
End-to-end Learning for Short Text Expansion
Effectively making sense of short texts is a critical task for many real
world applications such as search engines, social media services, and
recommender systems. The task is particularly challenging as a short text
contains very sparse information, often too sparse for a machine learning
algorithm to pick up useful signals. A common practice for analyzing short text
is to first expand it with external information, which is usually harvested
from a large collection of longer texts. In the literature, short text
expansion has been done with various heuristics. We propose an end-to-end solution
that automatically learns how to expand short text to optimize a given learning
task. A novel deep memory network is proposed to automatically find relevant
information from a collection of longer documents and reformulate the short
text through a gating mechanism. Using short text classification as a
demonstrating task, we show that the deep memory network significantly
outperforms classical text expansion methods with comprehensive experiments on
real world data sets.
Comment: KDD'201
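The gating mechanism the abstract mentions can be sketched as a learned sigmoid gate that decides, per dimension, how much retrieved long-document information to mix into the short-text vector. Here `W_g` and all dimensions are hypothetical stand-ins, not the paper's trained parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_expansion(short_vec, memory_vec, W_g):
    """Blend the original short-text vector with a memory vector retrieved
    from longer documents; the gate lies in (0, 1) per dimension, so the
    output is a convex combination of the two inputs."""
    gate = sigmoid(W_g @ np.concatenate([short_vec, memory_vec]))
    return gate * short_vec + (1.0 - gate) * memory_vec

rng = np.random.default_rng(1)
d = 6
short_vec = rng.normal(size=d)
memory_vec = rng.normal(size=d)        # e.g. read from a memory network
W_g = rng.normal(size=(d, 2 * d))      # hypothetical gate weight matrix
expanded = gated_expansion(short_vec, memory_vec, W_g)
```

In the end-to-end model, `W_g` would be trained jointly with the downstream classifier, so the gate learns how much expansion each dimension of the short text actually needs.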
RubyStar: A Non-Task-Oriented Mixture Model Dialog System
RubyStar is a dialog system designed to create "human-like" conversation by
combining different response generation strategies. RubyStar conducts a
non-task-oriented conversation on general topics by using an ensemble of
rule-based, retrieval-based and generative methods. Topic detection, engagement
monitoring, and context tracking are used for managing interaction. Predictable
elements of conversation, such as the bot's backstory and simple question
answering are handled by separate modules. We describe a rating scheme we
developed for evaluating response generation. We find that a character-level
RNN is an effective generation model for general responses, with proper
parameter settings; however, other kinds of conversation topics might benefit
from using other models.
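The ensemble routing described above reduces to choosing among candidate responses produced by the different strategies. The confidence scores and candidate texts below are hypothetical placeholders for whatever each module would actually report.

```python
def pick_response(candidates):
    """Select the response whose generating strategy reports the highest
    confidence -- a sketch of routing among rule-based, retrieval-based,
    and generative modules."""
    return max(candidates, key=lambda c: c["confidence"])

# Hypothetical candidates from the three strategy modules.
candidates = [
    {"strategy": "rule-based", "text": "My backstory? I grew up in a lab.", "confidence": 0.9},
    {"strategy": "retrieval", "text": "I saw that movie too!", "confidence": 0.4},
    {"strategy": "generative", "text": "yes i think so yes", "confidence": 0.2},
]
best = pick_response(candidates)
```

In practice the scoring would fold in the topic detection, engagement monitoring, and context tracking signals the abstract mentions, rather than a single static confidence.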
Machine Learning with World Knowledge: The Position and Survey
Machine learning has become pervasive in multiple domains, impacting a wide
variety of applications, such as knowledge discovery and data mining, natural
language processing, information retrieval, computer vision, social and health
informatics, ubiquitous computing, etc. Two essential problems of machine
learning are how to generate features and how to acquire labels for machines to
learn. In particular, labeling large amounts of data for each domain-specific
problem can be very time-consuming and costly, which has become a key obstacle
to making learning protocols realistic in applications. In this paper, we will
discuss how to use the existing general-purpose world knowledge to enhance
machine learning processes, by enriching the features or reducing the labeling
work. We start from the comparison of world knowledge with domain-specific
knowledge, and then introduce three key problems in using world knowledge in
learning processes, i.e., explicit and implicit feature representation,
inference for knowledge linking and disambiguation, and learning with direct or
indirect supervision. Finally, we discuss the future directions of this
research topic.
Which Emoji Talks Best for My Picture?
Emojis have evolved as complementary sources for expressing emotion in
social-media platforms where posts are mostly composed of texts and images. In
order to increase the expressiveness of the social media posts, users associate
relevant emojis with their posts. Incorporating domain knowledge has improved
machine understanding of text. In this paper, we investigate whether domain
knowledge for emoji can improve the accuracy of the emoji recommendation task
in the case of multimedia posts composed of an image and text. Our emoji recommendation
can suggest accurate emojis by exploiting both visual and textual content from
social media posts as well as domain knowledge from Emojinet. Experimental
results using pre-trained image classifiers and pre-trained word embedding
models on a Twitter dataset show that our approach outperforms the current
state-of-the-art by 9.6%. We also present a user study evaluation of our
recommendation system on a set of images chosen from the MSCOCO dataset.
Comment: Accepted at the 2018 IEEE/WIC/ACM International Conference on Web
Intelligence (WI '18), December 3-6, 2018, Santiago de Chil
How did the discussion go: Discourse act classification in social media conversations
We propose a novel attention-based hierarchical LSTM model to classify
discourse act sequences in social media conversations, aimed at mining data
from online discussions using textual meaning beyond the sentence level. What
makes the task unique is the complete categorization of possible pragmatic
roles in informal textual discussions, in contrast to question-answer
extraction, stance detection, or sarcasm identification, which are
role-specific tasks. An early attempt was made on a Reddit discussion
dataset. We train our model on the same data, and present test results on two
different datasets, one from Reddit and one from Facebook. Our proposed model
outperformed the previous one in terms of domain independence; without using
platform-dependent structural features, our hierarchical LSTM with word
relevance attention mechanism achieved F1-scores of 71% and 66%, respectively,
in predicting the discourse roles of comments in Reddit and Facebook
discussions. The efficiency of recurrent and convolutional architectures in
learning discursive representations on the same task is presented and analyzed,
with different word and comment embedding schemes. Our attention mechanism
enables us to inquire into relevance ordering of text segments according to
their roles in discourse. We present a human annotator experiment to unveil
important observations about modeling and data annotation. Equipped with our
text-based discourse identification model, we inquire into how heterogeneous
non-textual features like location, time, and leaning of information play
their roles in characterizing online discussions on Facebook.