OMG U got flu? Analysis of shared health messages for bio-surveillance
Background: Micro-blogging services such as Twitter offer the potential to
crowdsource epidemics in real-time. However, Twitter posts ('tweets') are often
ambiguous and reactive to media trends. In order to ground user messages in
epidemic response we focused on tracking reports of self-protective behaviour
such as avoiding public gatherings or increased sanitation as the basis for
further risk analysis. Results: We created guidelines for tagging
self-protective behaviour based on Jones and Salathé's (2009) behaviour response
survey. Applying the guidelines to a corpus of 5283 Twitter messages related to
influenza-like illness showed a high level of inter-annotator agreement (kappa
0.86). We employed supervised learning using unigrams, bigrams and regular
expressions as features with two supervised classifiers (SVM and Naive Bayes)
to classify tweets into 4 self-reported protective behaviour categories plus a
self-reported diagnosis. In addition to classification performance we report a
moderately strong Spearman's rho correlation between classifier output and
WHO/NREVSS laboratory data for A(H1N1) in the USA during the 2009-2010
influenza season. Conclusions: The study adds to evidence supporting a high
degree of correlation between pre-diagnostic social media signals and
diagnostic influenza case data, pointing the way towards low cost sensor
networks. We believe that the signals we have modelled may be applicable to a
wide range of diseases.
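The comparison of classifier output against laboratory data described above rests on a rank correlation. The following is a minimal sketch of Spearman's rho, computed as the Pearson correlation of average ranks. The function names and the weekly counts are hypothetical, invented for illustration; they are not the study's data or code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical weekly counts, invented for illustration only.
tweets_flagged = [12, 30, 45, 80, 60, 25]
lab_positives = [3, 9, 20, 38, 41, 7]
rho = spearman_rho(tweets_flagged, lab_positives)
print(f"Spearman's rho = {rho:.3f}")
```

Because only ranks enter the calculation, the measure is insensitive to the very different scales of tweet counts and laboratory counts.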
Associations between exposure to and expression of negative opinions about Human Papillomavirus vaccines on social media: an observational study
Background
Groups and individuals that seek to negatively influence public opinion about the safety and value of vaccination are active in online and social media and may influence decision making within some communities.
Objective
We sought to measure whether exposure to negative opinions about human papillomavirus (HPV) vaccines in Twitter communities is associated with the subsequent expression of negative opinions by explicitly measuring potential information exposure over the social structure of Twitter communities.
Methods
We hypothesized that prior exposure to opinions rejecting the safety or value of HPV vaccines would be associated with an increased risk of posting similar opinions and tested this hypothesis by analyzing temporal sequences of messages posted on Twitter (tweets). The study design was a retrospective analysis of tweets related to HPV vaccines and the social connections between users. Between October 2013 and April 2014, we collected 83,551 English-language tweets that included terms related to HPV vaccines and the 957,865 social connections among 30,621 users posting or reposting the tweets. Tweets were classified as expressing negative or neutral/positive opinions using a machine learning classifier previously trained on a manually labeled sample.
Results
During the 6-month period, 25.13% (20,994/83,551) of tweets were classified as negative; among the 30,621 users that tweeted about HPV vaccines, 9046 (29.54%) were exposed to a majority of negative tweets. The likelihood of a user posting a negative tweet after exposure to a majority of negative opinions was 37.78% (2780/7361) compared to 10.92% (1234/11,296) for users who were exposed to a majority of positive and neutral tweets, corresponding to a relative risk of 3.46 (95% CI 3.25-3.67, P<.001).
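The relative risk above can be recomputed from the four reported counts. The sketch below assumes the common log-method (Katz) confidence interval with z = 1.96. The abstract does not state which interval method the authors used, so this choice and the function name are assumptions.

```python
import math


def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Risk ratio with a log-method (Katz) confidence interval (assumed method)."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts taken from the Results paragraph: 2780/7361 versus 1234/11,296.
rr, lower, upper = relative_risk(2780, 7361, 1234, 11296)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Under these assumptions the result comes out very close to the reported 3.46 (95% CI 3.25-3.67).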
Conclusions
The heterogeneous community structure on Twitter appears to skew the information to which users are exposed in relation to HPV vaccines. We found that among users that tweeted about HPV vaccines, those who were more often exposed to negative opinions were more likely to subsequently post negative opinions. Although this research may be useful for identifying individuals and groups currently at risk of disproportionate exposure to misinformation about HPV vaccines, there is a clear need for studies capable of determining the factors that affect the formation and adoption of beliefs about public health interventions.
Identifying Purpose Behind Electoral Tweets
Tweets pertaining to a single event, such as a national election, can number
in the hundreds of millions. Automatically analyzing them is beneficial in many
downstream natural language applications such as question answering and
summarization. In this paper, we propose a new task: identifying the purpose
behind electoral tweets--why do people post election-oriented tweets? We show
that identifying purpose is correlated with the related phenomenon of sentiment
and emotion detection, yet is significantly different. Detecting purpose has a
number of applications including detecting the mood of the electorate,
estimating the popularity of policies, identifying key issues of contention,
and predicting the course of events. We create a large dataset of electoral
tweets and annotate a few thousand tweets for purpose. We develop a system that
automatically classifies electoral tweets as per their purpose, obtaining an
accuracy of 43.56% on an 11-class task and an accuracy of 73.91% on a 3-class
task (both accuracies well above the most-frequent-class baseline). Finally, we
show that resources developed for emotion detection are also helpful for
detecting purpose.
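The accuracies above are reported relative to a most-frequent-class baseline. As a minimal sketch of how such a baseline is usually computed, the example below predicts the majority training label for every test item. The label names and data are hypothetical and do not come from the paper's dataset.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of items whose predicted label equals the gold label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def most_frequent_class_baseline(train_labels, test_labels):
    """Predict the majority training label for every test item."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    predictions = [majority_label] * len(test_labels)
    return majority_label, accuracy(test_labels, predictions)


# Hypothetical purpose labels, invented for illustration only.
train = ["support", "oppose", "support", "inform", "support", "oppose"]
test = ["support", "oppose", "support", "inform"]
label, baseline_acc = most_frequent_class_baseline(train, test)
print(f"baseline predicts '{label}' with accuracy {baseline_acc:.2%}")
```

A classifier is informative only to the extent that it beats this number, which is why accuracy on an 11-class task has to be read against the class distribution.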
Enhancing Twitter Data Analysis with Simple Semantic Filtering: Example in Tracking Influenza-Like Illnesses
Systems that exploit publicly available user generated content such as
Twitter messages have been successful in tracking seasonal influenza. We
developed a novel filtering method for Influenza-Like-Illnesses (ILI)-related
messages using 587 million messages from Twitter micro-blogs. We first filtered
messages based on syndrome keywords from the BioCaster Ontology, an extant
knowledge model of laymen's terms. We then filtered the messages according to
semantic features such as negation, hashtags, emoticons, humor and geography.
The data covered 36 weeks for the US 2009 influenza season from 30th August
2009 to 8th May 2010. Results showed that our system achieved the highest
Pearson correlation coefficient of 98.46% (p-value<2.2e-16), an improvement of
3.98% over the previous state-of-the-art method. The results indicate that
simple NLP-based enhancements to existing approaches to mine Twitter data can
increase the value of this inexpensive resource.
Comment: 10 pages, 5 figures, IEEE HISB 2012 conference, Sept 27-28, 2012, La
Jolla, California, U
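The two-stage filtering described in this abstract (keyword match first, semantic filtering second) can be sketched as follows. This illustration covers only the keyword and negation steps. The keyword list, negation cues, and window size are hypothetical stand-ins: they are not the BioCaster Ontology terms or the authors' rules, and the hashtag, emoticon, humor, and geography filters are not modelled.

```python
import re

# Hypothetical stand-ins for ontology-derived syndrome terms and negation cues.
SYNDROME_KEYWORDS = {"flu", "fever", "cough", "chills"}
NEGATION_CUES = {"no", "not", "never", "without", "don't", "dont"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def mentions_syndrome(tokens):
    """Stage 1: keep messages containing at least one syndrome keyword."""
    return any(tok in SYNDROME_KEYWORDS for tok in tokens)


def is_negated(tokens, window=3):
    """Stage 2: flag a keyword preceded by a negation cue within `window` tokens."""
    for i, tok in enumerate(tokens):
        if tok in SYNDROME_KEYWORDS:
            preceding = tokens[max(0, i - window):i]
            if any(t in NEGATION_CUES for t in preceding):
                return True
    return False


def filter_messages(messages):
    """Apply the keyword filter, then drop messages with a negated keyword."""
    kept = []
    for message in messages:
        tokens = tokenize(message)
        if mentions_syndrome(tokens) and not is_negated(tokens):
            kept.append(message)
    return kept


# Hypothetical messages, invented for illustration only.
sample = [
    "Stuck in bed with flu and a fever",
    "Glad I do not have the flu this year",
    "Great concert last night",
]
kept = filter_messages(sample)
print(kept)
```

Weekly counts of the messages that survive filtering would then be correlated with the reference surveillance data.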
The Addition Symptoms Parameter on Sentiment Analysis to Measure Public Health Concerns
Information about public health plays an important role not only for health practitioners but also for government. Health information can also affect the emotional state of the community, especially when there is news about the spread of an infectious disease (epidemic) in a particular area, such as the outbreaks of Ebola or MERS. According to data from Semiocast, Indonesia has the fifth-largest number of Twitter users in the world, and topics that are actively discussed there can influence global trending topics. This paper discusses the measurement of the level of public health concern (Degree of Concern) using sentiment analysis classification of Twitter statuses. The sentiment of each tweet was analyzed and assigned a value using a scoring method. The scoring equation (Kumar A. et al., 2012) is tested with an additional parameter, i.e. a symptoms parameter. The sentiment value for a Twitter user is determined from the adjectives, verbs, and adverbs contained in the sentence. We used a corpus-based method to find the semantic value of adjectives, and a dictionary-based method to find the semantic value of verbs and adverbs.