100 research outputs found
City-level Geolocation of Tweets for Real-time Visual Analytics
Real-time tweets can provide useful information on evolving events and
situations. Geotagged tweets are especially useful, as they indicate the
location of origin and provide geographic context. However, only a small
portion of tweets are geotagged, limiting their use for situational awareness.
In this paper, we adapt, improve, and evaluate a state-of-the-art deep learning
model for city-level geolocation prediction, and integrate it with a visual
analytics system tailored for real-time situational awareness. We provide
computational evaluations to demonstrate the superiority and utility of our
geolocation prediction model within an interactive system.
Comment: 4 pages, 2 tables, 1 figure, SIGSPATIAL GeoAI Workshop
Can we predict a riot? Disruptive event detection using Twitter
In recent years, there has been increased interest in real-world event detection using publicly accessible data made available through Internet technology such as Twitter, Facebook, and YouTube. In these highly interactive systems, the general public are able to post real-time reactions to “real world” events, thereby acting as social sensors of terrestrial activity. Automatically detecting and categorizing events, particularly small-scale incidents, using streamed data is a non-trivial task but would be of high value to public safety organisations such as local police, who need to respond accordingly. To address this challenge, we present an end-to-end integrated event detection framework that comprises five main components: data collection, pre-processing, classification, online clustering, and summarization. The integration between classification and clustering enables events to be detected, as well as related smaller-scale “disruptive events,” smaller incidents that threaten social safety and security or could disrupt social order. We present an evaluation of the effectiveness of detecting events using a variety of features derived from Twitter posts, namely temporal, spatial, and textual content. We evaluate our framework on a large-scale, real-world dataset from Twitter. Furthermore, we apply our event detection system to a large corpus of tweets posted during the August 2011 riots in England. We use ground-truth data based on intelligence gathered by the London Metropolitan Police Service, which provides a record of actual terrestrial events and incidents during the riots, and show that our system can perform as well as terrestrial sources, and even better in some cases.
Extracting News Events from Microblogs
The Twitter stream has become a large source of information for many people, but
the magnitude of tweets and the noisy nature of its content have made
harvesting the knowledge from Twitter a challenging task for researchers for a
long time. Aiming at overcoming some of the main challenges of extracting the
hidden information from tweet streams, this work proposes a new approach for
real-time detection of news events from the Twitter stream. We divide our
approach into three steps. The first step is to use a neural network or deep
learning to detect news-relevant tweets from the stream. The second step is to
apply a novel streaming data clustering algorithm to the detected news tweets
to form news events. The third and final step is to rank the detected events
based on the size of the event clusters and growth speed of the tweet
frequencies. We evaluate the proposed system on a large, publicly available
corpus of annotated news events from Twitter. As part of the evaluation, we
compare our approach with a related state-of-the-art solution. Overall, our
experiments and user-based evaluation show that our approach to detecting
current (real) news events delivers state-of-the-art performance.
Mining microblogs for culture-awareness in web adaptation
Prior studies in sociology and human-computer interaction indicate that persons
from different countries and cultural origins tend to have their preferences in real-life
communication and the usage of web and social media applications. With Twitter
data, statistical and machine learning tools, this study advances our understanding
of microblogging in respect of cultural differences and demonstrates possible
solutions for inferring and exploiting cultural origins for building adaptive web
applications. Our findings reveal statistically significant differences in Twitter feature
usage in respect of geographic locations of users. These differences in microblogger
behaviour and user language defined in user profiles enabled us to infer user country
origins with an accuracy of more than 90%. Other user origin predictive solutions
we proposed do not require other data sources and human involvement for training
the models, enabling the high accuracy of user country inference when exploiting
information extracted from a user followers’ network, or with data derived from
Twitter profiles. With origin predictive models, we analysed communication and
privacy preferences and built a culture-aware recommender system. Our analysis of
friend responses shows that Twitter users tend to communicate mostly within their
cultural regions. Usage of privacy settings showed that privacy perceptions differ
across cultures. Finally, we created and evaluated movie recommendation strategies
considering user cultural groups, and addressed a cold-start scenario with a new
user. We believe that the findings discussed offer insights for sociological and
web research, in particular on cultural differences in online communication.
Data-Centric Epidemic Forecasting: A Survey
The COVID-19 pandemic has brought forth the importance of epidemic
forecasting for decision makers in multiple domains, ranging from public health
to the economy as a whole. While forecasting epidemic progression is frequently
conceptualized as being analogous to weather forecasting, it has some
key differences and remains a non-trivial task. The spread of diseases is
subject to multiple confounding factors spanning human behavior, pathogen
dynamics, weather and environmental conditions. Research interest has been
fueled by the increased availability of rich data sources capturing previously
unobservable facets and also due to initiatives from government public health
and funding agencies. This has resulted, in particular, in a spate of work on
'data-centered' solutions which have shown potential in enhancing our
forecasting capabilities by leveraging non-traditional data sources as well as
recent innovations in AI and machine learning. This survey delves into various
data-driven methodological and practical advancements and introduces a
conceptual framework to navigate through them. First, we enumerate the large
number of epidemiological datasets and novel data streams that are relevant to
epidemic forecasting, capturing various factors like symptomatic online
surveys, retail and commerce, mobility, genomics data and more. Next, we
discuss methods and modeling paradigms focusing on the recent data-driven
statistical and deep-learning based methods as well as on the novel class of
hybrid models that combine domain knowledge of mechanistic models with the
effectiveness and flexibility of statistical approaches. We also discuss
experiences and challenges that arise in real-world deployment of these
forecasting systems including decision-making informed by forecasts. Finally,
we highlight some challenges and open problems found across the forecasting
pipeline.
Comment: 67 pages, 12 figures
Temporal Signature Modeling and Analysis
A vast amount of digital satellite and aerial images are collected over time, which calls for techniques to extract useful high-level information, such as recognizable events. One part of this thesis proposes a framework for streaming analysis of the time series, which can recognize events without supervision and memorize them by building the temporal contexts. The memorized historical data is then used to predict the future and detect anomalies. A new incremental clustering method is proposed to recognize the event without training. A memorization method of double localization, including relative and absolute localization, is proposed to model the temporal context. Finally, the predictive model is built based on the method of memorization. The Edinburgh Pedestrian Dataset, which offers about 1000 observed trajectories of pedestrians detected in camera images each working day for several months, is used as an example to illustrate the framework.
Although a large amount of image data is captured, most of it is not available to the public. The other part of this thesis developed a method of generating spatial-spectral-temporal synthetic images by enhancing the capacity of a current tool called DIRSIG (Digital Imaging and Remote Sensing Image Generation). Currently, DIRSIG can only model limited temporal signatures. In order to observe general temporal changes in a process within the scene, a process model, which links the observable signatures of interest temporally, should be developed and incorporated into DIRSIG. The sub-process models can be categorized into two types. In one, the process model drives the properties of each facet of the object as they change over time; in the other, it drives the geometric location of the object in the scene as a function of time. Two example process models are used to show how process models can be incorporated into DIRSIG.