19,976 research outputs found
Global disease monitoring and forecasting with Wikipedia
Infectious disease is a leading threat to public health, economic stability,
and other key social structures. Efforts to mitigate these impacts depend on
accurate and timely monitoring to measure the risk and progress of disease.
Traditional, biologically-focused monitoring techniques are accurate but costly
and slow; in response, new techniques based on social internet data such as
social media and search queries are emerging. These efforts are promising, but
important challenges in the areas of scientific peer review, breadth of
diseases and countries, and forecasting hamper their operational usefulness.
We examine a freely available, open data source for this use: access logs
from the online encyclopedia Wikipedia. Using linear models, language as a
proxy for location, and a systematic yet simple article selection procedure, we
tested 14 location-disease combinations and demonstrate that these data
feasibly support an approach that overcomes these challenges. Specifically, our
proof-of-concept yields models with up to 0.92, forecasting value up to
the 28 days tested, and several pairs of models similar enough to suggest that
transferring models from one location to another without re-training is
feasible.
Based on these preliminary results, we close with a research agenda designed
to overcome these challenges and produce a disease monitoring and forecasting
system that is significantly more effective, robust, and globally comprehensive
than the current state of the art.Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein
and adjust novelty claims accordingly; revise title; various revisions for
clarit
Measuring internet activity: a (selective) review of methods and metrics
Two Decades after the birth of the World Wide Web, more than two billion people around the world are Internet users. The digital landscape is littered with hints that the affordances of digital communications are being leveraged to transform life in profound and important ways. The reach and influence of digitally mediated activity grow by the day and touch upon all aspects of life, from health, education, and commerce to religion and governance. This trend demands that we seek answers to the biggest questions about how digitally mediated communication changes society and the role of different policies in helping or hindering the beneficial aspects of these changes. Yet despite the profusion of data the digital age has brought upon usâwe now have access to a flood of information about the movements, relationships, purchasing decisions, interests, and intimate thoughts of people around the worldâthe distance between the great questions of the digital age and our understanding of the impact of digital communications on society remains large. A number of ongoing policy questions have emerged that beg for better empirical data and analyses upon which to base wider and more insightful perspectives on the mechanics of social, economic, and political life online. This paper seeks to describe the conceptual and practical impediments to measuring and understanding digital activity and highlights a sample of the many efforts to fill the gap between our incomplete understanding of digital life and the formidable policy questions related to developing a vibrant and healthy Internet that serves the public interest and contributes to human wellbeing. Our primary focus is on efforts to measure Internet activity, as we believe obtaining robust, accurate data is a necessary and valuable first step that will lead us closer to answering the vitally important questions of the digital realm. Even this step is challenging: the Internet is difficult to measure and monitor, and there is no simple aggregate measure of Internet activityâno GDP, no HDI. In the following section we present a framework for assessing efforts to document digital activity. The next three sections offer a summary and description of many of the ongoing projects that document digital activity, with two final sections devoted to discussion and conclusions
Mapping bilateral information interests using the activity of Wikipedia editors
We live in a global village where electronic communication has eliminated the
geographical barriers of information exchange. The road is now open to
worldwide convergence of information interests, shared values, and
understanding. Nevertheless, interests still vary between countries around the
world. This raises important questions about what today's world map of in-
formation interests actually looks like and what factors cause the barriers of
information exchange between countries. To quantitatively construct a world map
of information interests, we devise a scalable statistical model that
identifies countries with similar information interests and measures the
countries' bilateral similarities. From the similarities we connect countries
in a global network and find that countries can be mapped into 18 clusters with
similar information interests. Through regression we find that language and
religion best explain the strength of the bilateral ties and formation of
clusters. Our findings provide a quantitative basis for further studies to
better understand the complex interplay between shared interests and conflict
on a global scale. The methodology can also be extended to track changes over
time and capture important trends in global information exchange.Comment: 11 pages, 3 figures in Palgrave Communications 1 (2015
Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor
User-generated content sites routinely block contributions from users of
privacy-enhancing proxies like Tor because of a perception that proxies are a
source of vandalism, spam, and abuse. Although these blocks might be effective,
collateral damage in the form of unrealized valuable contributions from
anonymity seekers is invisible. One of the largest and most important
user-generated content sites, Wikipedia, has attempted to block contributions
from Tor users since as early as 2005. We demonstrate that these blocks have
been imperfect and that thousands of attempts to edit on Wikipedia through Tor
have been successful. We draw upon several data sources and analytical
techniques to measure and describe the history of Tor editing on Wikipedia over
time and to compare contributions from Tor users to those from other groups of
Wikipedia users. Our analysis suggests that although Tor users who slip through
Wikipedia's ban contribute content that is more likely to be reverted and to
revert others, their contributions are otherwise similar in quality to those
from other unregistered participants and to the initial contributions of
registered users.Comment: To appear in the IEEE Symposium on Security & Privacy, May 202
Thought for Food: the impact of ICT on agribusiness
This report outlines the impact of ICT on the food economy. On the basis of a literature review from four disciplines - knowledge management, management information systems, operations research and logistics, and economics - the demand for new ICT applications, the supply of new applications and the match between demand and supply are identified. Subsequently the impact of new ICT applications on the food economy is discussed. The report relates the development of new technologies to innovation and adoption processes and economic growth, and to concepts of open innovations and living lab
Cultural Diffusion and Trends in Facebook Photographs
Online social media is a social vehicle in which people share various moments
of their lives with their friends, such as playing sports, cooking dinner or
just taking a selfie for fun, via visual means, that is, photographs. Our study
takes a closer look at the popular visual concepts illustrating various
cultural lifestyles from aggregated, de-identified photographs. We perform
analysis both at macroscopic and microscopic levels, to gain novel insights
about global and local visual trends as well as the dynamics of interpersonal
cultural exchange and diffusion among Facebook friends. We processed images by
automatically classifying the visual content by a convolutional neural network
(CNN). Through various statistical tests, we find that socially tied
individuals more likely post images showing similar cultural lifestyles. To
further identify the main cause of the observed social correlation, we use the
Shuffle test and the Preference-based Matched Estimation (PME) test to
distinguish the effects of influence and homophily. The results indicate that
the visual content of each user's photographs are temporally, although not
necessarily causally, correlated with the photographs of their friends, which
may suggest the effect of influence. Our paper demonstrates that Facebook
photographs exhibit diverse cultural lifestyles and preferences and that the
social interaction mediated through the visual channel in social media can be
an effective mechanism for cultural diffusion.Comment: 10 pages, To appear in ICWSM 2017 (Full Paper
Structuring visual exploratory analysis of skill demand
The analysis of increasingly large and diverse data for meaningful interpretation and question answering is handicapped by human cognitive limitations. Consequently, semi-automatic abstraction of complex data within structured information spaces becomes increasingly important, if its knowledge content is to support intuitive, exploratory discovery. Exploration of skill demand is an area where regularly updated, multi-dimensional data may be exploited to assess capability within the workforce to manage the demands of the modern, technology- and data-driven economy. The knowledge derived may be employed by skilled practitioners in defining career pathways, to identify where, when and how to update their skillsets in line with advancing technology and changing work demands. This same knowledge may also be used to identify the combination of skills essential in recruiting for new roles. To address the challenges inherent in exploring the complex, heterogeneous, dynamic data that feeds into such applications, we investigate the use of an ontology to guide structuring of the information space, to allow individuals and institutions to interactively explore and interpret the dynamic skill demand landscape for their specific needs. As a test case we consider the relatively new and highly dynamic field of Data Science, where insightful, exploratory data analysis and knowledge discovery are critical. We employ context-driven and task-centred scenarios to explore our research questions and guide iterative design, development and formative evaluation of our ontology-driven, visual exploratory discovery and analysis approach, to measure where it adds value to usersâ analytical activity. Our findings reinforce the potential in our approach, and point us to future paths to build on
- âŠ