192 research outputs found

    Image Analysis Enhanced Event Detection from Geo-tagged Tweet Streams

    Events detected from social media streams often include early signs of accidents, crimes or disasters. Therefore, they can be used by related parties for timely and efficient response. Although significant progress has been made on event detection from tweet streams, most existing methods have not considered the images posted in tweets, which provide richer information than the text and can potentially be a reliable indicator of whether an event has occurred. In this paper, we design an event detection algorithm that combines textual, statistical and image information, following an unsupervised machine learning approach. Specifically, the algorithm starts with semantic and statistical analyses to obtain a list of tweet clusters, each of which corresponds to an event candidate, and then performs image analysis to separate events from non-events: a convolutional autoencoder is trained for each cluster as an anomaly detector, where a part of the images is used as the training data and the remaining images are used as the test instances. Our experiments on multiple datasets verify that when an event occurs, the mean reconstruction errors of the training and test images are much closer, compared with the case where the candidate is a non-event cluster. Based on this finding, the algorithm rejects a candidate if the difference is larger than a threshold. Experimental results over millions of tweets demonstrate that this image analysis enhanced approach can significantly increase the precision with minimum impact on the recall. Comment: 12 pages, 4 figures
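
The rejection rule described in this abstract can be illustrated with a minimal sketch. It assumes the per-image reconstruction errors have already been computed by the cluster's autoencoder, and it takes the "difference" to be the absolute gap between the two means; the function name `is_event` and the threshold value are hypothetical and not taken from the paper.

```python
from statistics import mean

def is_event(train_errors, test_errors, threshold):
    """Return True when a candidate cluster is kept as an event.

    Assumption: the candidate is rejected when the absolute gap between
    the mean reconstruction errors of the training and test images is
    larger than the threshold.
    """
    gap = abs(mean(test_errors) - mean(train_errors))
    return gap <= threshold

# Illustrative, made-up error values for two candidate clusters.
print(is_event([0.10, 0.12], [0.11, 0.13], threshold=0.05))
print(is_event([0.10, 0.12], [0.40, 0.50], threshold=0.05))
```

In the first call the means are close, so the candidate is kept; in the second the gap exceeds the threshold, so it is rejected.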

    Visual analytics of location-based social networks for decision support

    Recent advances in technology have enabled people to add location information to social networks, called Location-Based Social Networks (LBSNs), where people share their communication and whereabouts not only in their daily lives, but also during abnormal situations, such as crisis events. However, since the volume of the data exceeds the boundaries of human analytical capabilities, it is almost impossible to perform a straightforward qualitative analysis of the data. The emerging field of visual analytics has been introduced to tackle such challenges by integrating approaches from statistical data analysis and human-computer interaction into highly interactive visual environments. Based on the idea of visual analytics, this research contributes techniques of knowledge discovery in social media data for providing comprehensive situational awareness. We extract valuable hidden information from the huge volume of unstructured social media data and model the extracted information for visualization through user-centered interactive interfaces. We develop visual analytics techniques and systems for spatial decision support by coupling the modeling of spatiotemporal social media data with scalable and interactive visual environments. These systems allow analysts to detect and examine abnormal events within social media data by integrating automated analytical techniques and visual methods. We provide a comprehensive analysis of public behavior response in disaster events by exploring and examining the spatial and temporal distribution of LBSNs. We also propose trajectory-based visual analytics of LBSNs for anomalous human movement analysis during crises by incorporating a novel classification technique. Finally, we introduce a visual analytics approach for forecasting the overall flow of human crowds.

    Real-time topic detection with bursty n-grams: RGU's submission to the 2014 SNOW challenge.

    Twitter is becoming an ever more popular platform for discovering and sharing information about current events, both personal and global. The scale and diversity of messages make the discovery and analysis of breaking news very challenging. Nonetheless, journalists and other news consumers are increasingly relying on tools to help them make sense of Twitter. Here, we describe a fully-automated system capable of detecting trends related to breaking news in real-time. It identifies words or phrases that 'burst' with sudden increased frequencies, and groups these into topics. It identifies a diverse set of recent tweets that are related to these topics, and uses these to create a suitable human-readable headline. In addition, images coming from the diverse tweets are also added to the topic. Our system was evaluated using 24 hours of tweets as part of the Social News On the Web (SNOW) 2014 data challenge.
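
The idea of phrases that "burst" with sudden increased frequency can be sketched as follows. This is not the submitted system's scoring: it assumes a simple smoothed ratio of n-gram counts between a current window and an earlier window of comparable size, and the names `bursty_ngrams`, `min_ratio` and `min_count` are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bursty_ngrams(current_tweets, past_tweets, n=2, min_ratio=3.0, min_count=2):
    """Rank n-grams whose count in the current window jumped versus the past.

    Assumption: burstiness is current_count / (past_count + 1); the +1
    avoids division by zero for n-grams never seen before.
    """
    cur = Counter(g for t in current_tweets for g in ngrams(t.lower().split(), n))
    past = Counter(g for t in past_tweets for g in ngrams(t.lower().split(), n))
    scores = {}
    for gram, count in cur.items():
        if count < min_count:
            continue  # too rare in the current window to count as a burst
        ratio = count / (past[gram] + 1)
        if ratio >= min_ratio:
            scores[gram] = ratio
    # Highest burst score first.
    return sorted(scores, key=scores.get, reverse=True)

# Made-up tweets for illustration only.
current = ["earthquake hits city", "earthquake hits coast",
           "earthquake hits hard", "good morning all"]
past = ["good morning all", "good morning everyone"]
print(bursty_ngrams(current, past))
```

Grouping the flagged n-grams into topics and generating headlines, as the abstract describes, would be separate steps on top of this filter.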

    Event Modeling in Social Media with Application to Disaster Damage Assessment

    This thesis addresses the modeling of events in social media, with an emphasis on the detection, tracking, and analysis of disaster-related events like the 2011 Tohoku Earthquake in Japan. Successful event modeling is critical for many applications including information search, entity extraction, disaster assessment, and emergency monitoring. However, modeling events in social media is challenging since: (i) social media is noisy and oftentimes incomplete, in the sense that users provide only partial evidence of their participation in an event; (ii) messages in social media are usually short, providing only little textual narrative (thereby making event detection difficult); and (iii) the size of short-lived events typically changes rapidly, growing and shrinking in sharp bursts. With these challenges in mind, this thesis proposes a framework for event modeling in social media and makes three major contributions. The first contribution is a signal processing-inspired approach for event detection from social media. Concretely, this research proposes an iterative spatial-temporal event mining algorithm for identifying and extracting topics from social media. One of the key aspects of the proposed algorithm is a signal processing-inspired approach for viewing spatial-temporal term occurrences as signals, analyzing the noise contained in the signals, and applying noise filters to improve the quality of event extraction from these signals. The second contribution is a new model of population dynamics of event-related crowds in social media as they first form, evolve, and eventually dissolve. Toward robust population modeling, a duration model is proposed to predict the time users spend in a particular crowd. Then, a time-evolving population model is designed for estimating the number of people departing a crowd, which enables the prediction of the total population remaining in a crowd. The third contribution of this thesis is a set of methods for event analytics for leveraging social media in an earthquake damage assessment scenario. First, the difference between text tweets and image tweets is investigated, and then three features (tweet density, re-tweet density, and user tweeting count) are extracted to model the intensity attenuation of earthquakes. The observed relationship between social media activity and loss/damage attenuation suggests that social media following a catastrophic event can provide rapid insight into the extent of damage.

    Context-Aware Message-Level Rumour Detection with Weak Supervision

    Social media has become the main source of all sorts of information beyond a communication medium. Its intrinsic nature can allow a continuous and massive flow of misinformation to make a severe impact worldwide. In particular, rumours emerge unexpectedly and spread quickly. It is challenging to track down their origins and stop their propagation. One of the most effective solutions is to identify rumour-mongering messages as early as possible, which is commonly referred to as "Early Rumour Detection (ERD)". This dissertation focuses on researching ERD on social media by exploiting weak supervision and contextual information. Weak supervision is a branch of ML where noisy and less precise sources (e.g. data patterns) are leveraged to learn limited high-quality labelled data (Ratner et al., 2017). This is intended to reduce the cost and increase the efficiency of the hand-labelling of large-scale data. This thesis aims to study whether identifying rumours before they go viral is possible and to develop an architecture for ERD at individual post level. To this end, it first explores major bottlenecks of current ERD. It also uncovers a research gap between system design and its applications in the real world, which has received less attention from the ERD research community. One bottleneck is limited labelled data. Weakly supervised methods to augment limited labelled training data for ERD are introduced. The other bottleneck is the enormous amount of noisy data. A framework unifying burst detection based on temporal signals and burst summarisation is investigated to identify potential rumours (i.e. input to rumour detection models) by filtering out uninformative messages. Finally, a novel method which jointly learns rumour sources and their contexts (i.e. conversational threads) for ERD is proposed. An extensive evaluation setting for ERD systems is also introduced.