Multi-scale Population and Mobility Estimation with Geo-tagged Tweets
Recent outbreaks of Ebola and Dengue viruses have again elevated the
significance of the capability to quickly predict disease spread in an emergent
situation. However, existing approaches usually rely heavily on the
time-consuming census processes, or the privacy-sensitive call logs, leading to
their unresponsive nature when facing the abruptly changing dynamics in the
event of an outbreak. In this paper we study the feasibility of using
large-scale Twitter data as a proxy of human mobility to model and predict
disease spread. We report that for Australia, Twitter users' distribution
correlates well with the census-based population distribution, and that the Twitter
users' travel patterns appear to loosely follow the gravity law at multiple
scales of geographic distances, i.e. national level, state level and
metropolitan level. The radiation model is also evaluated on this dataset
though it has shown inferior fitness as a result of Australia's sparse
population and large landmass. The outcomes of the study form the cornerstones
for future work towards a model-based, responsive prediction method from
Twitter data for disease spread.
Comment: 1st International Workshop on Big Data Analytics for Biosecurity
(BioBAD2015), 4 pages
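For readers unfamiliar with the two mobility models named in the abstract above, the following is a minimal sketch of their standard textbook forms. The abstract gives neither the exact functional form nor the fitted parameters used in the paper, so the function names, the default exponents, and the sample numbers are illustrative assumptions only.

```python
def gravity_flow(m_i, m_j, d_ij, c=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Gravity-law flow estimate between two places.

    T_ij = c * m_i^alpha * m_j^beta / d_ij^gamma
    m_i, m_j: populations (or user counts) of origin and destination.
    d_ij: distance between them. c, alpha, beta, gamma are free parameters
    that would normally be fitted to data; the defaults here are assumed.
    """
    return c * (m_i ** alpha) * (m_j ** beta) / (d_ij ** gamma)


def radiation_flow(t_i, m_i, n_j, s_ij):
    """Radiation-model flow estimate between two places.

    T_ij = t_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))
    t_i: total number of trips leaving the origin.
    m_i, n_j: populations of origin and destination.
    s_ij: population inside the circle of radius d_ij centred on the
    origin, excluding the origin and destination themselves.
    """
    return t_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))


# Hypothetical example values, not taken from the paper.
print(gravity_flow(100, 200, 10))
print(radiation_flow(10, 100, 100, 0))
```

The gravity form depends on distance directly, while the radiation form depends only on the intervening population s_ij, which is one way to see why a sparsely populated country with long distances between cities could fit the two models differently.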
Creating Full Individual-level Location Timelines from Sparse Social Media Data
In many domain applications, a continuous timeline of human locations is
critical; for example for understanding possible locations where a disease may
spread, or the flow of traffic. While data sources such as GPS trackers or Call
Data Records are temporally-rich, they are expensive, often not publicly
available or garnered only in select locations, restricting their wide use.
Conversely, geo-located social media data are publicly and freely available,
but present challenges especially for full timeline inference due to their
sparse nature. We propose a stochastic framework, Intermediate Location
Computing (ILC), which uses prior knowledge about human mobility patterns to
predict every missing location from an individual's social media timeline. We
compare ILC with a state-of-the-art RNN baseline as well as methods that are
optimized for next-location prediction only. For three major cities, ILC
predicts the top 1 location for all missing locations in a timeline, at 1 and
2-hour resolution, with up to 77.2% accuracy (up to 6% better accuracy than all
compared methods). ILC also outperforms the RNN in low-data settings, both
with very few users (under 50) and with more users but sparser
timelines. In general, the RNN model
needs a higher number of users to achieve the same performance as ILC. Overall,
this work illustrates the tradeoff between prior knowledge of heuristics and
more data, for an important societal problem of filling in entire timelines
using freely available, but sparse social media data.
Comment: 10 pages, 8 figures, 2 tables
Autonomous surveillance for biosecurity
The global movement of people and goods has increased the risk of biosecurity
threats and their potential to incur large economic, social, and environmental
costs. Conventional manual biosecurity surveillance methods are limited by
their scalability in space and time. This article focuses on autonomous
surveillance systems, comprising sensor networks, robots, and intelligent
algorithms, and their applicability to biosecurity threats. We discuss the
spatial and temporal attributes of autonomous surveillance technologies and map
them to three broad categories of biosecurity threat: (i) vector-borne
diseases; (ii) plant pests; and (iii) aquatic pests. Our discussion reveals a
broad range of opportunities to serve biosecurity needs through autonomous
surveillance.
Comment: 26 pages, Trends in Biotechnology, 3 March 2015, ISSN 0167-7799,
http://dx.doi.org/10.1016/j.tibtech.2015.01.003.
(http://www.sciencedirect.com/science/article/pii/S0167779915000190)
Human dynamics in the age of big data: a theory-data-driven approach
The revolution of information and communication technology (ICT) in the past two decades has transformed the world and people’s lives, along with the ways that knowledge is produced. With the advancements in location-aware technologies, a large volume of data, so-called “big data”, is now available through various sources to explore the world. This dissertation examines the potential use of such data in understanding human dynamics by focusing on both theory- and data-driven approaches. Specifically, human dynamics represented by communication and activities is linked to geographic concepts of space and place through social media data to set a research platform for effective use of social media as an information system. Three case studies covering these conceptual linkages are presented to (1) identify communication patterns on social media; (2) identify spatial patterns of activities in urban areas and detect events; and (3) explore urban mobility patterns. The first case study examines the use of and communication dynamics on Twitter during Hurricane Sandy utilizing survey and data analytics techniques. Twitter was identified as a valuable source of disaster-related information. Additionally, the results shed light on the most significant information that can be derived from Twitter during disasters and the need for establishing bi-directional communications during such events to achieve effective communication. The second case study examines the potential of Twitter in identifying activities and events and exploring movements during Hurricane Sandy utilizing both time-geographic information and qualitative social media text data. The study provides insights for enhancing situational awareness during natural disasters. The third case study examines the potential of Twitter in modeling commuting trip distribution in New York City.
By integrating both traditional and social media data and utilizing machine learning techniques, the study identified Twitter as a valuable source for transportation modeling. Despite the limitations of social media such as the accuracy issue, there is tremendous opportunity for geographers to enrich their understanding of human dynamics in the world. However, we will need new research frameworks, which integrate geographic concepts with information systems theories to theorize the process. Furthermore, integrating various data sources is the key to future research and will need new computational approaches. Addressing these computational challenges, therefore, will be a crucial step to extend the frontier of big data knowledge from a geographic perspective.
KEYWORDS: Big data, social media, Twitter, human dynamics, VGI, natural disasters, Hurricane Sandy, transportation modeling, machine learning, situational awareness, NYC, GI