26 research outputs found

Towards reproducible research of event detection techniques for Twitter

Author: Grossniklaus Michael
Kircher Lukas
Schilling Harry
Weiler Andreas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Crossref

ZHAW digitalcollection

Searching for superspreaders of information in real-world social media

Author: Andrade Jr. Jose S.
Makse Hernan A.
Muchnik Lev
Pei Sen
Zheng Zhiming
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/07/2014

A number of predictors have been suggested to detect the most influential spreaders of information in online social media across various domains such as Twitter or Facebook. In particular, degree, PageRank, k-core and other centralities have been adopted to rank the spreading capability of users in information dissemination media. So far, validation of the proposed predictors has been done by simulating the spreading dynamics rather than following real information flow in social networks. Consequently, only model-dependent contradictory results have been achieved so far for the best predictor. Here, we address this issue directly. We search for influential spreaders by following the real spreading dynamics in a wide range of networks. We find that the widely-used degree and PageRank fail in ranking users' influence. We find that the best spreaders are consistently located in the k-core across dissimilar social platforms such as Twitter, Facebook, Livejournal and scientific publishing in the American Physical Society. Furthermore, when the complete global network structure is unavailable, we find that the sum of the nearest neighbors' degree is a reliable local proxy for a user's influence. Our analysis provides practical instructions for optimal design of strategies for "viral" information dissemination in relevant applications. Comment: 12 pages, 7 figures.
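
The two structural measures named in this abstract, the k-core index and the sum of the nearest neighbors' degrees, can be illustrated with a minimal sketch. This is not the authors' code: the toy graph, the node names and the helper functions (`build_graph`, `core_numbers`, `neighbor_degree_sum`) are assumptions made for illustration, and the simple peeling loop favours readability over speed.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def core_numbers(adj):
    """Return the k-core index of every node by repeatedly peeling
    the remaining node with the smallest remaining degree."""
    degree = {node: len(neighbors) for node, neighbors in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])  # the core index never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for neighbor in adj[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return core


def neighbor_degree_sum(adj):
    """Local proxy: for each node, the sum of the degrees of its direct neighbors."""
    return {
        node: sum(len(adj[neighbor]) for neighbor in neighbors)
        for node, neighbors in adj.items()
    }


# Hypothetical toy network: a 4-clique (a, b, c, d) plus a hub h with five leaves.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("h", "a"),
    ("h", "l1"), ("h", "l2"), ("h", "l3"), ("h", "l4"), ("h", "l5"),
]
adj = build_graph(edges)
core = core_numbers(adj)
nds = neighbor_degree_sum(adj)

for node in ("h", "a", "b"):
    print(node, "degree:", len(adj[node]), "k-core:", core[node], "neighbor degree sum:", nds[node])
```

In this toy graph the hub `h` has the largest degree (6) but a k-core index of only 1, while the clique members sit in the 3-core. The sketch only shows that the measures can rank nodes differently; it says nothing about which one predicts real spreading.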

arXiv.org e-Print Archive

City University of New York

PubMed Central

Inverted Index Entry Invalidation Strategy for Real Time Search

Author: Ríssola Esteban A.
Tolosa Gabriel Hernán
Publication date: 01/10/2015

The impressive rise of user-generated content on the web, driven by sites like Twitter, imposes new challenges on search systems. The concept of real-time search emerges, increasing the role that efficient indexing and retrieval algorithms play in this scenario. Thousands of new updates need to be processed at the very moment they are generated, and users expect content to be "searchable" within seconds. This has led to the development of efficient data structures and algorithms that can meet this challenge. In this work, we introduce the concept of the index entry invalidator, a strategy responsible for keeping track of the evolution of the underlying vocabulary and selectively invalidating and evicting those inverted index entries whose removal does not considerably degrade retrieval effectiveness. Consequently, the index becomes smaller and overall efficiency may increase. We study the dynamics of the vocabulary using a real dataset and also provide an evaluation of the proposed strategy using a search engine specifically designed for real-time indexing and search. XII Workshop Bases de Datos y Minería de Datos (WBDDM). Red de Universidades con Carreras en Informática (RedUNCI).

Creating extended gender labelled datasets of Twitter users

Author: Batista F.
Carvalho J. P.
Vicente M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The gender information of a Twitter user is not known a priori when analysing Twitter data, because user registration does not include gender information. This paper proposes an approach for creating extended gender labelled datasets of Twitter users. The process involves creating a smaller database of active Twitter users and manually labelling their gender. The process continues by extracting features from unstructured information found on each user profile and by creating a gender classification model. The model is then applied to a larger dataset, thus providing automatic labels and corresponding confidence scores, which can be used to estimate the most accurately labelled users. The resulting databases can be further enriched with additional information extracted, for example, from the profile picture and from the user location. The proposed approach was successfully applied to English and Portuguese users, leading to two large datasets containing more than 57K labelled users each.

Crossref

Repositório Institucional do ISCTE-IUL

Detecting the Influence of Spreading in Social Networks with Excitable Sensor Networks

Author: Pei Sen
Tang Shaoting
Zheng Zhiming
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 02/04/2015

Detecting spreading outbreaks in social networks with sensors is of great significance in applications. Inspired by the formation mechanism of human physical sensations in response to external stimuli, we propose a new method to detect the influence of spreading by constructing excitable sensor networks. Exploiting the amplifying effect of excitable sensor networks, our method can better detect small-scale spreading processes. At the same time, it can also distinguish large-scale diffusion instances due to the self-inhibition effect of excitable elements. Through simulations of diverse spreading dynamics on typical real-world social networks (Facebook, coauthor and email social networks), we find that the excitable sensor networks are capable of detecting and ranking spreading processes in a much wider range of influence than other commonly used sensor placement methods, such as random, targeted, acquaintance and distance strategies. In addition, we validate the efficacy of our method with diffusion data from a real-world online social system, Twitter. We find that our method can detect more spreading topics in practice. Our approach provides a new direction in spreading detection and should be useful for designing effective detection methods.

arXiv.org e-Print Archive

Public Library of Science (PLOS)

FigShare

Report on the Evaluation-as-a-Service (EaaS) Expert Workshop

Author: Allan Hanbury
Anastasia Krithara
Balikas Georgios
Frank Hopfgartner
Henning Müller
Ivan Eggel
Jayashree Kalpathy-Cramer
Jimmy Lin
Krisztian Balog
Martin Potthast
Noriko Kando
Ounis Iadh
Simon Mercer
Tim Gollub
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/06/2015

In this report, we summarize the outcome of the "Evaluation-as-a-Service" workshop that was held on the 5th and 6th of March 2015 in Sierre, Switzerland. The objective of the meeting was to bring together initiatives that use cloud infrastructures, virtual machines, APIs (Application Programming Interfaces) and related projects that provide evaluation of information retrieval or machine learning tools as a service.

Crossref

Hes-so: ArODES Open Archive (University of Applied Sciences and Arts Western Switzerland / Haute école spécialisée de Suisse occidentale / FH Westschweiz)

Enlighten

Inverted Index Entry Invalidation Strategy for Real Time Search

Author: Ríssola Esteban A.
Tolosa Gabriel Hernán
Publication date: 01/10/2015

Servicio de Difusión de la Creación Intelectual

This Account Doesn’t Exist: Tweet Decay and the Politics of Deletion in the Brexit Debate

Author: Bastos M. T.
Publication venue: 'SAGE Publications'
Publication date: 01/05/2021

Literature on influence operations has identified metrics that are indicative of social media manipulation, but few studies have explored the lifecycle of low-quality information. We contribute to this literature by reconstructing nearly 3 million messages posted by 1 million users in the last days of the Brexit referendum campaign. While previous studies have found that on average only 4% of tweets disappear, we found that 33% of the tweets leading up to the referendum vote are no longer available. Only about half of the most active accounts that tweeted about the referendum continue to operate publicly, and 20% of all accounts are no longer active. We tested whether partisan content was more likely to disappear and found that more messages from the Leave campaign disappeared than the entire universe of tweets affiliated with the Remain campaign. We compare these results with an assorted set of 45 hashtags posted in the same period and find that political campaigns present much higher ratios of user and tweet decay. These results are validated by inspecting 2 million Brexit-related tweets posted over a period of nearly 4 years. The article concludes with an overview of these findings and recommendations for future research.

City Research Online

Tools for the Analysis and Visualization of Twitter Language Data

Author: Burghardt Manuel
Publication venue: Martin-Luther-Universität Halle-Wittenberg, Institut für Anglistik und Amerikanistik
Publication date: 01/01/2015

The microblogging service Twitter provides vast amounts of user-generated language data. In this article I give an overview of related work on Twitter as an object of study. I also describe the anatomy of a Twitter message and discuss typical uses of the Twitter platform. The Twitter Application Programming Interface (API) will be introduced in a generic, non-technical way to provide a basic understanding of existing opportunities but also limitations when working with Twitter data. I propose a basic classification system for existing tools that can be used for collecting and analyzing Twitter data and introduce some exemplary tools for each category. Then, I present a more comprehensive workflow for conducting studies with Twitter data, which comprises the following steps: crawling, annotation, analysis and visualization. Finally, I illustrate the generic workflow by describing an exemplary study from the context of social TV research. At the end of the article, the main issues concerning tools and methods for the analysis of Twitter data are briefly addressed.

University of Regensburg Publication Server