1,558 research outputs found

    Scalable Privacy-Compliant Virality Prediction on Twitter

    The digital town hall of Twitter has become a preferred medium of communication for individuals and organizations across the globe. Some of them reach audiences of millions, while others struggle to get noticed. Given the impact of social media, the question remains more relevant than ever: how to model the dynamics of attention on Twitter. Researchers around the world turn to machine learning to predict the most influential tweets and authors, navigating the volume, velocity, and variety of social big data, with many compromises. In this paper, we revisit content popularity prediction on Twitter. We argue that strict alignment of data acquisition, storage and analysis algorithms is necessary to avoid the common trade-offs between scalability, accuracy and privacy compliance. We propose a new framework for the rapid acquisition of large-scale datasets, a high-accuracy supervisory signal and multilanguage sentiment prediction while respecting every applicable privacy request. We then apply a novel gradient boosting framework to achieve state-of-the-art results in virality ranking, even before including a tweet's visual or propagation features. Our Gradient Boosted Regression Tree is the first to offer explainable, strong ranking performance on benchmark datasets. Since the analysis focused on features available early, the model is immediately applicable to incoming tweets in 18 languages. Comment: AffCon@AAAI-19 Best Paper Award; Presented at AAAI-19 W1: Affective Content Analysi

    CrisisMMD: Multimodal Twitter Datasets from Natural Disasters

    During natural and man-made disasters, people use social media platforms such as Twitter to post textual and multimedia content to report updates about injured or dead people, infrastructure damage, and missing or found people among other information types. Studies have revealed that this online information, if processed in a timely and effective manner, is extremely useful for humanitarian organizations to gain situational awareness and plan relief operations. In addition to the analysis of textual content, recent studies have shown that imagery content on social media can boost disaster response significantly. Despite extensive research that mainly focuses on textual content to extract useful information, limited work has focused on the use of imagery content or the combination of both content types. One of the reasons is the lack of labeled imagery data in this domain. Therefore, in this paper, we aim to tackle this limitation by releasing a large multi-modal dataset collected from Twitter during different natural disasters. We provide three types of annotations, which are useful to address a number of crisis response and management tasks for different humanitarian organizations. Comment: 9 page

    Crisis Analytics: Big Data Driven Crisis Response

    Disasters have long been a scourge for humanity. With the advances in technology (in terms of computing, communications, and the ability to process and analyze big data), our ability to respond to disasters is at an inflection point. There is great optimism that big data tools can be leveraged to process the large amounts of crisis-related data (in the form of user generated data in addition to the traditional humanitarian data) to provide an insight into the fast-changing situation and help drive an effective disaster response. This article introduces the history and the future of big crisis data analytics, along with a discussion on its promise, challenges, and pitfalls

    Multi-source Multimodal Data and Deep Learning for Disaster Response: A Systematic Review.

    Mechanisms for sharing information in a disaster situation have drastically changed due to new technological innovations throughout the world. The use of social media applications and collaborative technologies for information sharing has become increasingly popular. With these advancements, the amount of data collected increases daily in different modalities, such as text, audio, video, and images. However, to date, practical Disaster Response (DR) activities mostly depend on textual information, such as situation reports and email content, and the benefit of other media is often not realised. Deep Learning (DL) algorithms have recently demonstrated promising results in extracting knowledge from multiple modalities of data, but the use of DL approaches for DR tasks has thus far mostly been pursued in an academic context. This paper conducts a systematic review of 83 articles to identify the successes, current and future challenges, and opportunities in using DL for DR tasks. Our analysis is centred around the components of learning, a set of aspects that govern the application of Machine Learning (ML) for a given problem domain. A flowchart and guidance for future research are developed as an outcome of the analysis to ensure the benefits of DL for DR activities are utilized.

    Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

    Machine Learning has been a big success story during the AI resurgence. One particular standout success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques. Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770