356 research outputs found

    What’s Happening Around the World? A Survey and Framework on Event Detection Techniques on Twitter

    © 2019, Springer Nature B.V. In the last few years, Twitter has become a popular platform for sharing opinions, experiences, news, and views in real time. Twitter presents an interesting opportunity for detecting events happening around the world. The content (tweets) published on Twitter is short and poses diverse challenges for detecting and interpreting event-related information. This article provides insights into ongoing research and helps in understanding recent research trends and techniques used for event detection using Twitter data. We classify techniques and methodologies according to event types, orientation of content, event detection tasks, their evaluation, and common practices. We highlight the limitations of existing techniques and accordingly propose solutions to address the shortcomings. We propose a framework called EDoT based on the research trends, common practices, and techniques used for detecting events on Twitter. EDoT can serve as a guideline for developing event detection methods, especially for researchers who are new to this area. We also describe and compare data collection techniques, the effectiveness and shortcomings of various Twitter and non-Twitter-based features, and discuss various evaluation measures and benchmarking methodologies. Finally, we discuss the trends, limitations, and future directions for detecting events on Twitter.

    A data analysis approach to study events’ influence in social networks

    Master's dissertation in Computer Science. Nowadays, the assimilation of web content by each individual has a considerable impact on our everyday life. With the undeniable success of online social networks and microblogs, such as Facebook, Instagram and Twitter, the phenomenon of influence exerted by users of such platforms on other users, and how it propagates in the network, has for some years attracted computer scientists, information technicians, and marketing specialists. Increased connectivity, multi-modal access and the rise of social media have shortened the distance between almost every person in the world, and more and more content is generated. Extracting and analyzing a significant amount of data is not a trivial task; Big Data techniques are essential. Analysis of this interaction, an exchange of information and feelings, can plausibly help in understanding complex human behaviours and so support the decision-making of diverse organizations. Influence maximization and viral marketing are among the possibilities. This work studies the impact and role of an event's social influence and how it propagates, particularly in its surrounding territory. This influence is inferred by analysing the online platform's data with intelligent techniques right after its extraction. The final step is to validate the results with data from different sources. Helping businesses through actionable and valuable knowledge is the ultimate goal. This document comprises an introductory section in which the subject of study and its state of the art are addressed. Next, the problem and the direction to take to solve it are discussed.

    A Data-driven, High-performance and Intelligent CyberInfrastructure to Advance Spatial Sciences

    In the field of Geographic Information Science (GIScience), we have witnessed an unprecedented data deluge brought about by the rapid advancement of high-resolution data observing technologies. For example, with the advancement of Earth Observation (EO) technologies, a massive amount of EO data, including remote sensing data and other sensor observation data about earthquakes, climate, oceans, hydrology, volcanoes, glaciers, etc., is being collected on a daily basis by a wide range of organizations. In addition to the observation data, human-generated data including microblogs, photos, consumption records, evaluations, unstructured webpages and other Volunteered Geographical Information (VGI) are incessantly generated and shared on the Internet. Meanwhile, the emerging cyberinfrastructure rapidly increases our capacity for handling such massive data with regard to data collection and management, data integration and interoperability, data transmission and visualization, high-performance computing, etc. Cyberinfrastructure (CI) consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high-performance networks to improve research productivity and enable breakthroughs that are not otherwise possible. The Geospatial CI (GCI, or CyberGIS), as the synthesis of CI and GIScience, has inherent advantages in enabling computationally intensive spatial analysis and modeling (SAM) and collaborative geospatial problem solving and decision making. This dissertation is dedicated to addressing several critical issues and improving the performance of existing methodologies and systems in the field of CyberGIS.
My dissertation will include three parts. The first part focuses on developing methodologies to help public researchers efficiently and effectively find appropriate open geospatial datasets among millions of records provided by thousands of organizations scattered around the world. Machine learning and semantic search methods will be utilized in this research. The second part develops an interoperable and replicable geoprocessing service by synthesizing the high-performance computing (HPC) environment, the core spatial statistics/analysis algorithms from the widely adopted open-source Python package Python Spatial Analysis Library (PySAL), and rich datasets acquired in the first part. The third part is dedicated to studying optimization strategies for feature data transmission and visualization. This study is intended to solve the performance issue in transmitting large feature data over the Internet and visualizing it on the client (browser) side. Taken together, the three parts constitute an endeavor towards the methodological improvement and implementation practice of a data-driven, high-performance and intelligent CI to advance spatial sciences. Dissertation/Thesis. Doctoral Dissertation. Geography 201

    Extracting Temporal Expressions from Unstructured Open Resources

    AETAS is an end-to-end system with an SOA approach that retrieves plain text data from web and blog news and represents and stores it in RDF, with a special focus on its temporal dimension. The system allows users to acquire, browse and query Linked Data obtained from unstructured sources.

    The Web of False Information: Rumors, Fake News, Hoaxes, Clickbait, and Various Other Shenanigans

    A new era of Information Warfare has arrived. Various actors, including state-sponsored ones, are weaponizing information on Online Social Networks to run false information campaigns with targeted manipulation of public opinion on specific topics. These false information campaigns can have dire consequences for the public: mutating their opinions and actions, especially with respect to critical world events like major elections. Evidently, the problem of false information on the Web is a crucial one, and needs increased public awareness, as well as immediate attention from law enforcement agencies, public institutions, and in particular, the research community. In this paper, we make a step in this direction by providing a typology of the Web's false information ecosystem, comprising various types of false information, actors, and their motives. We report a comprehensive overview of existing research on the false information ecosystem by identifying several lines of work: 1) how the public perceives false information; 2) understanding the propagation of false information; 3) detecting and containing false information on the Web; and 4) false information on the political stage. In this work, we pay particular attention to political false information as: 1) it can have dire consequences for the community (e.g., when election results are mutated) and 2) previous work shows that this type of false information propagates faster and further when compared to other types of false information. Finally, for each of these lines of work, we report several future research directions that can help us better understand and mitigate the emerging problem of false information dissemination on the Web.

    Context-Preserving Visual Analytics of Multi-Scale Spatial Aggregation.

    Spatial datasets (e.g., location-based social media, crime incident reports, and demographic data) often exhibit varied distribution patterns at multiple spatial scales. Examining these patterns across different scales enhances the understanding from global to local perspectives and offers new insights into the nature of various spatial phenomena. Conventional navigation techniques in such multi-scale data-rich spaces are often inefficient, require users to choose between an overview or detailed information, and do not support identifying spatial patterns at varying scales. In this work, we present a context-preserving visual analytics technique that aggregates spatial datasets into hierarchical clusters and visualizes the multi-scale aggregates in a single visual space. We design a boundary distortion algorithm to minimize the visual clutter caused by overlapping aggregates and explore visual encoding strategies including color, transparency, shading, and shapes, in order to illustrate the hierarchical and statistical patterns of the multi-scale aggregates. We also propose a transparency-based technique that maintains a smooth visual transition as the users navigate across adjacent scales. To further support effective semantic exploration in the multi-scale space, we design a set of text-based encoding and layout methods that draw textual labels along the boundaries of, or filled within, the aggregates. The text itself not only summarizes the semantics at each scale, but also indicates the spatial coverage of the aggregates and their hierarchical relationships. We demonstrate the effectiveness of the proposed approaches through real-world application examples and user studies.