386 research outputs found

    Time Aware Knowledge Extraction for Microblog Summarization on Twitter

    Microblogging services like Twitter and Facebook collect millions of user-generated posts every moment about trending news, ongoing events, and so on. Nevertheless, finding information of interest is difficult amid the huge number of available posts, which are often noisy and redundant. In general, social media analytics services have attracted increasing attention from both research and industry. Specifically, the dynamic context of microblogging requires managing not only the meaning of information but also the evolution of knowledge over the timeline. This work defines the Time Aware Knowledge Extraction (TAKE) methodology, which relies on a temporal extension of Fuzzy Formal Concept Analysis. In particular, a microblog summarization algorithm has been defined that filters the concepts organized by TAKE in a time-dependent hierarchy. The algorithm addresses topic-based summarization on Twitter. Besides considering the timing of the concepts, another distinguishing feature of the proposed microblog summarization framework is the possibility of producing a more or less detailed summary, according to the user's needs, with good levels of quality and completeness, as highlighted in the experimental results.
    Comment: 33 pages, 10 figures
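As a rough illustration of the idea behind the abstract above, the sketch below builds formal concepts from a small fuzzy tweet/term context and attaches a time span to each concept. It is not the TAKE algorithm: the threshold (alpha-cut) step, the brute-force concept enumeration, the integer timestamps, and all names (`alpha_cut`, `formal_concepts`, `time_span`, `FUZZY`, `TIMES`) are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical fuzzy context: tweet id -> {term: membership degree in [0, 1]}.
FUZZY = {
    "t1": {"storm": 0.9, "flood": 0.8},
    "t2": {"storm": 0.7, "power": 0.6},
    "t3": {"flood": 0.9, "power": 0.2},
}
# Hypothetical posting times (arbitrary integer units).
TIMES = {"t1": 1, "t2": 2, "t3": 3}


def alpha_cut(fuzzy_context, threshold):
    """Keep only the terms whose membership degree reaches the threshold."""
    return {
        obj: frozenset(t for t, degree in terms.items() if degree >= threshold)
        for obj, terms in fuzzy_context.items()
    }


def formal_concepts(crisp):
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Exponential in the number of objects, so only suitable for toy inputs.
    """
    objects = list(crisp)
    all_terms = frozenset().union(*crisp.values())

    def intent_of(objs):
        # Terms shared by every object; the empty set of objects shares all terms.
        sets = [crisp[o] for o in objs]
        return frozenset.intersection(*sets) if sets else all_terms

    def extent_of(terms):
        # Objects that carry every term in the given set.
        return frozenset(o for o in objects if terms <= crisp[o])

    found = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = intent_of(subset)
            found.add((extent_of(intent), intent))
    return found


def time_span(extent, timestamps):
    """Earliest and latest timestamp among the tweets of a concept."""
    times = [timestamps[o] for o in extent]
    return (min(times), max(times)) if times else None


if __name__ == "__main__":
    for extent, intent in formal_concepts(alpha_cut(FUZZY, 0.5)):
        print(sorted(extent), sorted(intent), time_span(extent, TIMES))
```

Raising the threshold drops weaker tweet/term associations before the concepts are built, which changes how many concepts survive; this is one simple way to picture the "more or less detailed" view the abstract mentions, under the assumptions stated above.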

    Interpretable classification and summarization of crisis events from microblogs

    The widespread use of social media platforms has created convenient ways to obtain and spread up-to-date information during crisis events such as disasters. Time-critical analysis of crisis-related information helps humanitarian organizations and governmental bodies gain actionable information and plan for aid response. However, situational information is often immersed in a high volume of irrelevant content. Moreover, crisis-related messages also vary greatly in terms of information types, ranging from general situational awareness - such as information about warnings, infrastructure damage, and casualties - to individual needs. Different humanitarian organizations or governmental bodies usually demand information of different types for various tasks such as crisis preparation, resource planning, and aid response. To cope with information overload and efficiently support stakeholders in crisis situations, it is necessary to (a) classify data posted during crisis events into fine-grained humanitarian categories, and (b) summarize the situational data in near real-time. In this thesis, we tackle the aforementioned problems and propose novel methods for the classification and summarization of user-generated posts from microblogs. Previous studies have introduced various machine learning techniques to assist humanitarian or governmental bodies, but they primarily focused on model performance. Unlike those works, we develop interpretable machine-learning models which can provide explanations of model decisions. Generally, we focus on three methods for reducing information overload in crisis situations: (i) post classification, (ii) post summarization, and (iii) interpretable models for post classification and summarization. We evaluate our methods using posts from the microblogging platform Twitter, so-called tweets. First, we expand publicly available labeled datasets with rationale annotations. Each tweet is annotated with a class label and rationales, which are short snippets from the tweet that explain its assigned label. Using the data, we develop trustworthy classification methods that give the best tradeoff between model performance and interpretability. Rationale snippets usually convey essential information in the tweets. Hence, we propose an integer linear programming-based summarization method that maximizes the coverage of rationale phrases to generate summaries of class-level tweet data. Next, we introduce an approach that can enhance latent embedding representations of tweets in vector space. Our approach helps improve the classification performance-interpretability tradeoff and detect near duplicates for designing a summarization model with low computational complexity. Experiments show that rationale labels are helpful for developing interpretable-by-design models. However, annotations are not always available, especially in real-time situations for new tasks and crisis events. In the last part of the thesis, we propose a two-stage approach to extract the rationales under minimal human supervision.
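The coverage objective mentioned in the abstract above can be pictured with a small sketch. The thesis's actual ILP formulation is not given in the abstract, so the following is an assumption: a word budget as the only constraint, one binary choice per tweet, and an exhaustive search standing in for an ILP solver (it finds the same optimum on toy inputs but does not scale). The function name `best_summary` and the sample tweets are hypothetical.

```python
from itertools import combinations

# Hypothetical class-level tweets: (text, set of annotated rationale phrases).
TWEETS = [
    ("bridge collapsed near river road", {"bridge collapsed"}),
    ("bridge collapsed and ten people injured", {"bridge collapsed", "people injured"}),
    ("shelter open at city hall", {"shelter open"}),
    ("thoughts and prayers everyone", set()),
]


def best_summary(tweets, budget):
    """Pick the tweet subset that covers the most distinct rationale phrases.

    Assumed model (equivalent to a small 0/1 program):
      maximize   number of distinct phrases covered by the chosen tweets
      subject to total word count of the chosen tweets <= budget
    Ties are broken in favour of the shorter selection.
    Returns (tuple of chosen indices, number of phrases covered).
    """
    best_idx, best_cov, best_len = (), 0, 0
    n = len(tweets)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            length = sum(len(tweets[i][0].split()) for i in idx)
            if length > budget:
                continue  # violates the length constraint
            covered = set().union(*(tweets[i][1] for i in idx))
            if (len(covered), -length) > (best_cov, -best_len):
                best_idx, best_cov, best_len = idx, len(covered), length
    return best_idx, best_cov


if __name__ == "__main__":
    chosen, covered = best_summary(TWEETS, budget=11)
    for i in chosen:
        print(TWEETS[i][0])
    print("phrases covered:", covered)
```

For realistic input sizes the same objective and constraint would be handed to an ILP solver rather than enumerated; the enumeration is used here only to keep the example dependency-free.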

    Can we predict a riot? Disruptive event detection using Twitter

    In recent years, there has been increased interest in real-world event detection using publicly accessible data made available through Internet technology such as Twitter, Facebook, and YouTube. In these highly interactive systems, the general public are able to post real-time reactions to “real world” events, thereby acting as social sensors of terrestrial activity. Automatically detecting and categorizing events, particularly small-scale incidents, using streamed data is a non-trivial task but would be of high value to public safety organisations such as local police, who need to respond accordingly. To address this challenge, we present an end-to-end integrated event detection framework that comprises five main components: data collection, pre-processing, classification, online clustering, and summarization. The integration between classification and clustering enables events to be detected, as well as related smaller-scale “disruptive events,” smaller incidents that threaten social safety and security or could disrupt social order. We present an evaluation of the effectiveness of detecting events using a variety of features derived from Twitter posts, namely temporal, spatial, and textual content. We evaluate our framework on a large-scale, real-world dataset from Twitter. Furthermore, we apply our event detection system to a large corpus of tweets posted during the August 2011 riots in England. We use ground-truth data based on intelligence gathered by the London Metropolitan Police Service, which provides a record of actual terrestrial events and incidents during the riots, and show that our system can perform as well as terrestrial sources, and even better in some cases.
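The online clustering component named in the abstract above is not specified further, so the sketch below only shows one generic way such a step can work: a single pass over the stream in which each post joins its most similar existing cluster or starts a new one. The Jaccard similarity over word sets, the 0.3 threshold, the textual-only features, and the names `jaccard` and `online_cluster` are all assumptions for illustration, not the framework's actual design.

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def online_cluster(posts, threshold=0.3):
    """Single-pass clustering: each post is seen once, in arrival order.

    A post joins the most similar cluster if the similarity reaches the
    threshold; otherwise it opens a new cluster (a candidate new event).
    """
    clusters = []  # each cluster: {"tokens": set of words, "members": list of posts}
    for post in posts:
        tokens = set(post.lower().split())
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(tokens, cluster["tokens"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(post)
            best["tokens"] |= tokens  # grow the cluster vocabulary
        else:
            clusters.append({"tokens": tokens, "members": [post]})
    return clusters


if __name__ == "__main__":
    stream = [
        "fire on high street",
        "big fire high street shops",
        "road closed near station",
        "station road closed now",
    ]
    for cluster in online_cluster(stream):
        print(cluster["members"])
```

A single pass keeps the per-post cost proportional to the number of open clusters, which is why this style of clustering is commonly used on streams; temporal and spatial features, which the abstract also evaluates, are left out here.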

    Leveraging Social Media and Web of Data for Crisis Response Coordination

    There is an ever-increasing number of users in social media (1B+ Facebook users, 500M+ Twitter users) and ubiquitous mobile access (6B+ mobile phone subscribers) who share their observations and opinions. In addition, the Web of Data and existing knowledge bases keep growing at a rapid pace. In this scenario, we have unprecedented opportunities to improve crisis response by extracting social signals, creating spatio-temporal mappings, performing analytics on social data and the Web of Data, and supporting a variety of applications. Such applications can help provide situational awareness during an emergency, improve preparedness, and assist during the rebuilding/recovery phase of a disaster. Data mining can provide valuable insights to support emergency responders and other stakeholders during a crisis. However, there are a number of challenges, and existing computing technology may not work in all cases. Therefore, our objective here is to characterize such data mining tasks and the challenges that need further research attention.