Global disease monitoring and forecasting with Wikipedia
Infectious disease is a leading threat to public health, economic stability,
and other key social structures. Efforts to mitigate these impacts depend on
accurate and timely monitoring to measure the risk and progress of disease.
Traditional, biologically-focused monitoring techniques are accurate but costly
and slow; in response, new techniques based on social internet data such as
social media and search queries are emerging. These efforts are promising, but
important challenges in the areas of scientific peer review, breadth of
diseases and countries, and forecasting hamper their operational usefulness.
We examine a freely available, open data source for this use: access logs
from the online encyclopedia Wikipedia. Using linear models, language as a
proxy for location, and a systematic yet simple article selection procedure, we
tested 14 location-disease combinations and demonstrate that these data
feasibly support an approach that overcomes these challenges. Specifically, our
proof-of-concept yields models with $r^2$ up to 0.92, forecasting value up to
the 28 days tested, and several pairs of models similar enough to suggest that
transferring models from one location to another without re-training is
feasible.
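
The idea of a linear model with a forecasting offset can be illustrated with a minimal sketch. It assumes a single predictor (one article's access counts) and ordinary least squares; the function name fit_lagged_linear and its parameters are hypothetical, and the abstract does not specify the model at this level of detail, so this is not the authors' implementation.

```python
from statistics import mean


def fit_lagged_linear(views, cases, horizon=0):
    """Fit cases[t + horizon] ~ intercept + slope * views[t] by least squares.

    Hypothetical helper for illustration only. `horizon` is measured in
    time steps; a positive value uses article traffic to predict later
    case counts. Returns (intercept, slope, r_squared).
    """
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    # Pair each traffic value with the case count `horizon` steps later.
    x = list(views[: len(views) - horizon])
    y = list(cases[horizon:])
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    if n < 2:
        raise ValueError("need at least two aligned observations")

    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("traffic series is constant; slope is undefined")

    slope = sxy / sxx
    intercept = my - slope * mx
    # For a one-predictor model, r^2 is the squared Pearson correlation.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared
```

Calling the function over a range of horizons and comparing the resulting r^2 values gives a rough picture of how far ahead the traffic series carries predictive signal.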
Based on these preliminary results, we close with a research agenda designed
to overcome these challenges and produce a disease monitoring and forecasting
system that is significantly more effective, robust, and globally comprehensive
than the current state of the art.
Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein
and adjust novelty claims accordingly; revise title; various revisions for
clarit
Social Networks: A Cradle of Globalized Culture in the Mediterranean Region
Memes have been defined by R. Dawkins as cultural phenomena that propagate through non-genetic means. In this paper we examine three very popular Internet memes and study their impact on society in Mediterranean countries. We use Google Trends as well as Topsy to quantify the impact of these memes on Mediterranean societies. The two tools yield quite different results, which we attempt to explain through the propagation characteristics of each meme. Our analysis shows the extent to which these memes cross borders and thus contribute to the creation of a globalized culture. We end the paper by identifying some of the impacts of the globalization of culture.
An approach for using Wikipedia to measure the flow of trends across countries
Wikipedia has grown to become the most successful online encyclopedia on the Web, containing over 24 million articles offered in over 240 languages. In just over 10 years Wikipedia has transformed from an encyclopedia of knowledge into a wealth of facts and information, with articles ranging from trivia, political issues, geographies, and demographics to popular culture, news, and social events. In this paper we explore the use of Wikipedia for identifying the flow of information and trends across the world. We start with the hypothesis that, because Wikipedia is globally available in different languages across countries, access to its articles could be a reflection of human activity. To explore this hypothesis we try to establish metrics on the use of Wikipedia in order to identify potential trends and to establish whether and how those trends flow from one country to another. We then compare the outcome of this analysis to that of more established methods based on online social media or traditional media. We apply our approach to a subset of Wikipedia articles and to a specific worldwide social phenomenon that occurred during 2012: we investigate whether access to relevant Wikipedia articles correlates with the viral success of the South Korean pop song “Gangnam Style” and the associated artist “PSY”, as evidenced by traditional and online social media. Our analysis demonstrates that Wikipedia can indeed provide a useful measure for detecting social trends and events and that, in the case we studied, it could have been possible to identify the trend more quickly than with other established trend identification services such as Google Trends.
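
One way to ask whether article access leads or trails another trend signal is to shift one series against the other and look for the shift with the highest correlation. The sketch below assumes two equally spaced series of the same length and uses Pearson correlation; the names pearson and best_lead are hypothetical, and the abstract does not state which metrics the authors actually used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_lead(wiki_views, reference, max_shift):
    """Return (shift, r) maximizing the correlation of wiki_views[t] with reference[t + shift].

    Hypothetical helper for illustration only. A positive shift means the
    Wikipedia series moves first; a negative shift means it trails.
    Returns None if no shift leaves at least three overlapping points.
    """
    best = None
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            a = wiki_views[: len(wiki_views) - shift]
            b = reference[shift:]
        else:
            a = wiki_views[-shift:]
            b = reference[: len(reference) + shift]
        n = min(len(a), len(b))
        if n < 3:
            continue  # too little overlap for a meaningful correlation
        r = pearson(a[:n], b[:n])
        if best is None or r > best[1]:
            best = (shift, r)
    return best
```

A positive best shift on daily data would suggest that page views rose some days before the reference signal did, which is the kind of early detection the abstract describes; it does not by itself establish causation.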
Understanding the topics and opinions from social media content
Social media has become an indispensable part of people’s daily lives: it records and reflects people’s opinions and events of interest, and it influences people’s perceptions. Text is the most commonly used and most easily accessed data format on social media, and much of this textual content is not only factual and objective but also rich in opinion. Thus, besides the topics that Internet users discuss in social media text, it is also of great importance to understand the opinions they express. In this thesis, I present broadly applicable text mining approaches for understanding the topics and opinions of user-generated texts on social media, in order to provide insights into the thoughts of Internet users on entities, events, etc. Specifically, I develop approaches to understand the semantic differences between language-specific editions of Wikipedia when they discuss certain entities, from the perspective of related topical aspects and from the perspective of aggregated sentiment bias. Moreover, I employ effective features to detect the reputation-influential sentences for person and company entities in Wikipedia articles, which lead to the detected sentiment bias. Furthermore, I propose neural network models with different levels of attention mechanism to detect the stances of tweets towards any given target. I also introduce an online timeline generation approach that detects and summarises the relevant sub-topics in the tweet stream, in order to give Internet users insight into the evolution of the major events they are interested in.