100 research outputs found

    City-level Geolocation of Tweets for Real-time Visual Analytics

    Real-time tweets can provide useful information on evolving events and situations. Geotagged tweets are especially useful, as they indicate the location of origin and provide geographic context. However, only a small portion of tweets are geotagged, limiting their use for situational awareness. In this paper, we adapt, improve, and evaluate a state-of-the-art deep learning model for city-level geolocation prediction, and integrate it with a visual analytics system tailored for real-time situational awareness. We provide computational evaluations to demonstrate the superiority and utility of our geolocation prediction model within an interactive system. Comment: 4 pages, 2 tables, 1 figure, SIGSPATIAL GeoAI Workshop

    Can we predict a riot? Disruptive event detection using Twitter

    In recent years, there has been increased interest in real-world event detection using publicly accessible data made available through Internet technology such as Twitter, Facebook, and YouTube. In these highly interactive systems, the general public are able to post real-time reactions to “real world” events, thereby acting as social sensors of terrestrial activity. Automatically detecting and categorizing events, particularly small-scale incidents, using streamed data is a non-trivial task but would be of high value to public safety organisations such as local police, who need to respond accordingly. To address this challenge, we present an end-to-end integrated event detection framework that comprises five main components: data collection, pre-processing, classification, online clustering, and summarization. The integration between classification and clustering enables events to be detected, as well as related smaller-scale “disruptive events,” smaller incidents that threaten social safety and security or could disrupt social order. We present an evaluation of the effectiveness of detecting events using a variety of features derived from Twitter posts, namely temporal, spatial, and textual content. We evaluate our framework on a large-scale, real-world dataset from Twitter. Furthermore, we apply our event detection system to a large corpus of tweets posted during the August 2011 riots in England. We use ground-truth data based on intelligence gathered by the London Metropolitan Police Service, which provides a record of actual terrestrial events and incidents during the riots, and show that our system can perform as well as terrestrial sources, and even better in some cases.

    Extracting News Events from Microblogs

    The Twitter stream has become a large source of information for many people, but the volume of tweets and the noisy nature of their content have long made harvesting knowledge from Twitter a challenging task for researchers. Aiming to overcome some of the main challenges of extracting the hidden information from tweet streams, this work proposes a new approach for real-time detection of news events from the Twitter stream. We divide our approach into three steps. The first step is to use a neural network or deep learning to detect news-relevant tweets from the stream. The second step is to apply a novel streaming data clustering algorithm to the detected news tweets to form news events. The third and final step is to rank the detected events based on the size of the event clusters and the growth speed of the tweet frequencies. We evaluate the proposed system on a large, publicly available corpus of annotated news events from Twitter. As part of the evaluation, we compare our approach with a related state-of-the-art solution. Overall, our experiments and user-based evaluation show that our approach to detecting current (real) news events delivers state-of-the-art performance.
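
    The third step above (ranking by cluster size and growth speed) can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything here is an assumption: the `EventCluster` type, the `growth_weight` parameter, and the score itself (total size scaled by the relative growth between the last two time windows) are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EventCluster:
    """Hypothetical container for one detected event."""
    label: str
    window_counts: List[int]  # tweets per time window, oldest first


def growth_rate(counts: List[int]) -> float:
    """Relative change between the last two windows (assumed definition)."""
    if len(counts) < 2:
        return 0.0
    prev, last = counts[-2], counts[-1]
    return (last - prev) / max(prev, 1)  # avoid division by zero


def score(cluster: EventCluster, growth_weight: float = 1.0) -> float:
    """Assumed score: cluster size boosted by non-negative growth."""
    size = sum(cluster.window_counts)
    growth = max(growth_rate(cluster.window_counts), 0.0)
    return size * (1.0 + growth_weight * growth)


def rank_events(clusters: List[EventCluster],
                growth_weight: float = 1.0) -> List[EventCluster]:
    """Return clusters sorted from highest to lowest score."""
    return sorted(clusters, key=lambda c: score(c, growth_weight), reverse=True)
```

    With `growth_weight=0` the ranking reduces to cluster size alone; larger values favour small but fast-growing events.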

    Mining microblogs for culture-awareness in web adaptation

    Prior studies in sociology and human-computer interaction indicate that persons from different countries and cultural origins tend to have their own preferences in real-life communication and in the usage of web and social media applications. With Twitter data, statistical and machine learning tools, this study advances our understanding of microblogging in respect of cultural differences and demonstrates possible solutions for inferring and exploiting cultural origins to build adaptive web applications. Our findings reveal statistically significant differences in Twitter feature usage in respect of geographic locations of users. These differences in microblogger behaviour and user language defined in user profiles enabled us to infer user country origins with an accuracy of more than 90%. Other user origin predictive solutions we proposed do not require other data sources and human involvement for training the models, enabling the high accuracy of user country inference when exploiting information extracted from a user followers’ network, or with data derived from Twitter profiles. With origin predictive models, we analysed communication and privacy preferences and built a culture-aware recommender system. Our analysis of friend responses shows that Twitter users tend to communicate mostly within their cultural regions. Usage of privacy settings showed that privacy perceptions differ across cultures. Finally, we created and evaluated movie recommendation strategies considering user cultural groups, and addressed a cold-start scenario with a new user. We believe that the findings discussed give insights into sociological and web research, in particular on cultural differences in online communication.

    Unsupervised learning on social data


    Data-Centric Epidemic Forecasting: A Survey

    The COVID-19 pandemic has brought forth the importance of epidemic forecasting for decision makers in multiple domains, ranging from public health to the economy as a whole. While forecasting epidemic progression is frequently conceptualized as being analogous to weather forecasting, it has some key differences and remains a non-trivial task. The spread of diseases is subject to multiple confounding factors spanning human behavior, pathogen dynamics, weather and environmental conditions. Research interest has been fueled by the increased availability of rich data sources capturing previously unobservable facets and also by initiatives from government public health and funding agencies. This has resulted, in particular, in a spate of work on 'data-centered' solutions which have shown potential in enhancing our forecasting capabilities by leveraging non-traditional data sources as well as recent innovations in AI and machine learning. This survey delves into various data-driven methodological and practical advancements and introduces a conceptual framework to navigate through them. First, we enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting, capturing various factors like symptomatic online surveys, retail and commerce, mobility, genomics data and more. Next, we discuss methods and modeling paradigms focusing on the recent data-driven statistical and deep-learning based methods as well as on the novel class of hybrid models that combine domain knowledge of mechanistic models with the effectiveness and flexibility of statistical approaches. We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems including decision-making informed by forecasts. Finally, we highlight some challenges and open problems found across the forecasting pipeline. Comment: 67 pages, 12 figures

    Temporal Signature Modeling and Analysis

    A vast number of digital satellite and aerial images are collected over time, which calls for techniques to extract useful high-level information, such as recognizable events. One part of this thesis proposes a framework for streaming analysis of the time series, which can recognize events without supervision and memorize them by building the temporal contexts. The memorized historical data is then used to predict the future and detect anomalies. A new incremental clustering method is proposed to recognize the event without training. A memorization method of double localization, including relative and absolute localization, is proposed to model the temporal context. Finally, the predictive model is built based on the method of memorization. The Edinburgh Pedestrian Dataset, which offers about 1000 observed trajectories of pedestrians detected in camera images each working day for several months, is used as an example to illustrate the framework. Although a large amount of image data is captured, most of it is not available to the public. The other part of this thesis develops a method of generating spatial-spectral-temporal synthetic images by enhancing the capacity of a current tool called DIRSIG (Digital Imaging and Remote Sensing Image Generation). Currently, DIRSIG can only model limited temporal signatures. In order to observe general temporal changes in a process within the scene, a process model, which links the observable signatures of interest temporally, should be developed and incorporated into DIRSIG. The sub-process models can be categorized into two types. In the first, the process model drives the property of each facet of the object as it changes over time; in the second, it drives the geometric location of the object in the scene as a function of time. Two example process models are used to show how process models can be incorporated into DIRSIG.
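
    The idea of clustering a stream incrementally, without a training phase, can be sketched as follows. The abstract does not describe the thesis's actual algorithm, so this swaps in a generic leader-style scheme as an assumption: each incoming point joins the nearest existing cluster if it lies within a fixed `radius` (a hypothetical parameter), and otherwise starts a new cluster. The class name `IncrementalClusterer` is also hypothetical.

```python
import math
from typing import List, Sequence


class IncrementalClusterer:
    """Leader-style incremental clustering (illustrative, not the thesis's method)."""

    def __init__(self, radius: float) -> None:
        self.radius = radius                   # assumed distance threshold
        self.centroids: List[List[float]] = []
        self.counts: List[int] = []

    def add(self, point: Sequence[float]) -> int:
        """Assign one streamed point and return its cluster index."""
        best, best_dist = None, math.inf
        for i, centroid in enumerate(self.centroids):
            d = math.dist(centroid, point)
            if d < best_dist:
                best, best_dist = i, d

        # No cluster is close enough: the point starts a new one.
        if best is None or best_dist > self.radius:
            self.centroids.append([float(x) for x in point])
            self.counts.append(1)
            return len(self.centroids) - 1

        # Otherwise update the matched centroid as a running mean.
        self.counts[best] += 1
        n = self.counts[best]
        self.centroids[best] = [
            c + (x - c) / n for c, x in zip(self.centroids[best], point)
        ]
        return best
```

    In a scheme like this, a point that opens a new cluster after a long stable period is one simple candidate signal for an anomaly, although the thesis may define anomalies differently.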

    WELLNESS PROFILING ON SOCIAL NETWORKS

    Ph.D. (Doctor of Philosophy)