Exploratory Analysis of Highly Heterogeneous Document Collections
We present an effective multifaceted system for exploratory analysis of
highly heterogeneous document collections. Our system is based on intelligently
tagging individual documents in a purely automated fashion and exploiting these
tags in a powerful faceted browsing framework. Tagging strategies employed
include both unsupervised and supervised approaches based on machine learning
and natural language processing. As one of our key tagging strategies, we
introduce the KERA algorithm (Keyword Extraction for Reports and Articles).
KERA extracts topic-representative terms from individual documents in a purely
unsupervised fashion and is shown to be significantly more effective than
state-of-the-art methods. Finally, we evaluate our system in its ability to
help users locate documents pertaining to military critical technologies buried
deep in a large heterogeneous sea of information.
Comment: 9 pages; KDD 2013: 19th ACM SIGKDD Conference on Knowledge Discovery
and Data Minin
Computational Content Analysis of Negative Tweets for Obesity, Diet, Diabetes, and Exercise
Social media based digital epidemiology has the potential to support faster
response and deeper understanding of public health related threats. This study
proposes a new framework to analyze unstructured health related textual data
via Twitter users' posts (tweets) to characterize the negative health sentiments
and non-health related concerns in relation to the corpus of negative
sentiments regarding Diet, Diabetes, Exercise, and Obesity (DDEO). Through the
collection of 6 million tweets over one month, this study identified the
prominent topics of users as they relate to the negative sentiments. Our
proposed framework uses two text mining methods, sentiment analysis and topic
modeling, to discover negative topics. The negative sentiments of Twitter users
support the literature narratives and the many morbidity issues that are
associated with DDEO and the linkage between obesity and diabetes. The
framework offers a potential method to understand the public's opinions and
sentiments regarding DDEO. More importantly, this research provides new
opportunities for computational social scientists, medical experts, and public
health professionals to collectively address DDEO-related issues.
Comment: The 2017 Annual Meeting of the Association for Information Science
and Technology (ASIST
Is That Twitter Hashtag Worth Reading
Online social media such as Twitter, Facebook, wikis, and LinkedIn have made a
great impact on the way we consume information in our day-to-day life. It
has become increasingly important that we come across appropriate content from
social media to avoid information overload. In the case of Twitter, popular
information can be tracked using hashtags. Studying the characteristics of
tweets containing hashtags becomes important for a number of tasks, such as
breaking news detection, personalized message recommendation, friends
recommendation, and sentiment analysis among others.
In this paper, we have analyzed Twitter data based on trending hashtags,
which are widely used nowadays. We have used event-based hashtags to learn users'
thoughts on those events and to decide whether other users might find
them interesting. We have used topic modeling, which reveals the hidden
thematic structure of the documents (tweets in this case), in addition to
sentiment analysis to explore and summarize the content of the documents. A
technique to find the interestingness of an event-based Twitter hashtag and the
associated sentiment has been proposed. The proposed technique helps Twitter
followers find relevant and interesting hashtags to read.
Comment: 10 pages, 6 figures, Presented at the Third International Symposium
on Women in Computing and Informatics (WCI-2015