Enhancing Twitter Data Analysis with Simple Semantic Filtering: Example in Tracking Influenza-Like Illnesses
Systems that exploit publicly available user generated content such as
Twitter messages have been successful in tracking seasonal influenza. We
developed a novel filtering method for Influenza-Like-Illnesses (ILI)-related
messages using 587 million messages from Twitter micro-blogs. We first filtered
messages based on syndrome keywords from the BioCaster Ontology, an extant
knowledge model of laymen's terms. We then filtered the messages according to
semantic features such as negation, hashtags, emoticons, humor and geography.
The data covered 36 weeks for the US 2009 influenza season from 30th August
2009 to 8th May 2010. Results showed that our system achieved the highest
Pearson correlation coefficient of 98.46% (p-value<2.2e-16), an improvement of
3.98% over the previous state-of-the-art method. The results indicate that
simple NLP-based enhancements to existing approaches to mine Twitter data can
increase the value of this inexpensive resource.
Comment: 10 pages, 5 figures, IEEE HISB 2012 conference, Sept 27-28, 2012, La
Jolla, California, U
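
The two-stage filtering and correlation step described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword set, the negation cues, and every function name here are hypothetical stand-ins (the paper takes its terms from the BioCaster Ontology and uses further features such as hashtags, emoticons, humor and geography, which are omitted).

```python
import math
import re
from collections import Counter

# Hypothetical keyword set; the paper uses terms from the BioCaster Ontology.
SYNDROME_KEYWORDS = {"flu", "fever", "cough", "sore throat"}

# Hypothetical negation cues, chosen only for illustration.
NEGATION_PATTERN = re.compile(r"\b(no|not|never|don't|without)\b", re.IGNORECASE)


def mentions_syndrome(text):
    """Stage 1: keep messages containing a syndrome keyword as a whole word."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", lowered)
        for keyword in SYNDROME_KEYWORDS
    )


def is_negated(text):
    """Stage 2 (one feature only): flag messages containing a negation cue."""
    return NEGATION_PATTERN.search(text) is not None


def keep_message(text):
    """A message survives if it mentions a syndrome and is not negated."""
    return mentions_syndrome(text) and not is_negated(text)


def weekly_counts(messages):
    """Count surviving messages per week; messages is a list of (week, text)."""
    counts = Counter(week for week, text in messages if keep_message(text))
    return dict(counts)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In such a setup, the per-week counts returned by `weekly_counts` would be compared against an official weekly ILI rate series using `pearson`; any reference series passed in would have to come from the surveillance data, which is not reproduced here.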
Using Twitter to learn about the autism community
Considering the rising socio-economic burden of autism spectrum disorder
(ASD), timely and evidence-driven public policy decision making and
communication of the latest guidelines pertaining to the treatment and
management of the disorder is crucial. Yet evidence suggests that policy makers
and medical practitioners do not always have a good understanding of the
practices and relevant beliefs of ASD-afflicted individuals' carers, who often
follow questionable recommendations and adopt advice poorly supported by
scientific data. The key goal of the present work is to explore the idea that
Twitter, as a highly popular platform for information exchange, could be used
as a data-mining source to learn about the population affected by ASD -- their
behaviour, concerns, needs etc. To this end, using a large data set of over 11
million harvested tweets as the basis for our investigation, we describe a
series of experiments which examine a range of linguistic and semantic aspects
of messages posted by individuals interested in ASD. Our findings, the first of
their nature in the published scientific literature, strongly motivate
additional research on this topic and present a methodological basis for
further work.
Comment: Social Network Analysis and Mining, 201