Search CORE

2,397 research outputs found

Tracking the History and Evolution of Entities: Entity-centric Temporal Analysis of Large Social Media Archives

Author: Fafalios Pavlos
Iosifidis Vasileios
Ntoutsi Eirini
Stefanidis Kostas
Publication venue
Publication date: 24/10/2018
Field of study

How did the popularity of the Greek Prime Minister evolve in 2015? How did the predominant sentiment about him vary during that period? Were there any controversial sub-periods? What other entities were related to him during these periods? To answer these questions, one needs to analyze archived documents and data about the query entities, such as old news articles or social media archives. In particular, user-generated content posted in social networks, like Twitter and Facebook, can be seen as a comprehensive documentation of our society, and thus meaningful analysis methods over such archived data are of immense value for sociologists, historians and other interested parties who want to study the history and evolution of entities and events. To this end, in this paper we propose an entity-centric approach to analyze social media archives and we define measures that allow studying how entities were reflected in social media in different time periods and under different aspects, like popularity, attitude, controversiality, and connectedness with other entities. A case study using a large Twitter archive of four years illustrates the insights that can be gained by such an entity-centric and multi-aspect analysis.Comment: This is a preprint of an article accepted for publication in the International Journal on Digital Libraries (2018

arXiv.org e-Print Archive

Trepo - Institutional Repository of Tampere University

Cashtag piggybacking: uncovering spam and bot activity in stock microblogs on Twitter

Author: Cresci Stefano
Lillo Fabrizio
Regoli Daniele
Tardelli Serena
Tesconi Maurizio
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/07/2018
Field of study

Microblogs are increasingly exploited for predicting prices and traded volumes of stocks in financial markets. However, it has been demonstrated that much of the content shared in microblogging platforms is created and publicized by bots and spammers. Yet, the presence (or lack thereof) and the impact of fake stock microblogs has never systematically been investigated before. Here, we study 9M tweets related to stocks of the 5 main financial markets in the US. By comparing tweets with financial data from Google Finance, we highlight important characteristics of Twitter stock microblogs. More importantly, we uncover a malicious practice - referred to as cashtag piggybacking - perpetrated by coordinated groups of bots and likely aimed at promoting low-value stocks by exploiting the popularity of high-value ones. Among the findings of our study is that as much as 71% of the authors of suspicious financial tweets are classified as bots by a state-of-the-art spambot detection algorithm. Furthermore, 37% of them were suspended by Twitter a few months after our investigation. Our results call for the adoption of spam and bot detection techniques in all studies and applications that exploit user-generated content for predicting the stock market

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A unified view of data-intensive flows in business intelligence systems : a survey

Author: Abelló Gamazo Alberto
Jovanovic Petar
Romero Moral Óscar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Islamic view towards Bitcoin

Author: JARBOUI Anis
MNIF Emna
Publication venue: University of Turin
Publication date: 11/10/2020
Field of study

This paper proposes to analyze the agent behavior by means of big data extracted from the search engine « Google trends » and Twitter API to visualize the emotions and the manner of thinking about « Bitcoin » in the Islamic context. Two kinds of sentiment measures are constructed. The first is based on the search query of the word « Bitcoin » with religious connotation all over the world from 14/04/2017 to 14/04/2018 in weekly frequency. The second is built on twitter data from 03/04/2018 to 13/04/2018, by using a Bayesian machine learning device exploiting deep natural language processing modules to assign emotions and sentiment orientations. In the next step, the Granger causality analysis is used to investigate the hypothesis that this sentiment causes the volatility and the returns of « Bitcoin ». The results show that, at a first-level that twitter users of the word « Islamic Bitcoin » improve positive sentiment. Secondly, the Twitter sentiment measure has a significant effect on lagged Bitcoin returns and volatility. Furthermore, this sentimental variable Granger causes Bitcoin returns and volatility.  This study contributes to the literature by studying the influence of the doctrinal view towards Bitcoin on his prices dynamics. Knowing that Bitcoin is a new financial asset and there is a large debate on his compliance with sharia

Università degli studi di Torino: [email protected] - SIstema RIviste Open access

We usually don't like going to the dentist : using common sense to detect irony on Twitter

Author: Hoste Veronique
Lefever Els
Van Hee Cynthia
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2018
Field of study

Ghent University Academic Bibliography

Archivsystem Ask23

A novel Big Data analytics and intelligent technique to predict driver's intent

Author: Abtahi
Adam Grzywaczewski
Agrawal
Al-Sultan
Asimov
Bernardo
Bezdek
Bhavsar
Bostrom
Chang
Chen
Dawson
De Domenico
Diaz-Cabrera
Doctor
Doctor
Dreier
Faiyaz Doctor
Filev
Froehlich
Gerhardt
Grudin
Grzywaczewski
Hashem
Hawkins
Hawkins
Haykin
Hirsch
Huang
Huang
Iqbal
Jaguar Land Rover Limited
Jain
James
Kaisler
Kapicioglu
Karyotis
Karyotis
Kotsiantis
Kumar
Kumar
Kurihata
Lech Birek
Liao
Liu
Luukka
Mahmud
Maniak
Maniak
McFarland
McInerney
Mitchell
Nasoz
Noulas
Palen
Pang
Parpinelli
Poli
Quercia
Rahat Iqbal
Rainville
Reininger
Richards
Rish
Sagiroglu
Simmons
Sun
Suthaharan
Tan
Tran
Utgoff
Victor Chang
Wang
Warren
Wells-Parker
Whitley
Zadeh
Publication venue: 'Elsevier BV'
Publication date: 06/04/2018
Field of study

Modern age offers a great potential for automatically predicting the driver's intent through the increasing miniaturization of computing technologies, rapid advancements in communication technologies and continuous connectivity of heterogeneous smart objects. Inside the cabin and engine of modern cars, dedicated computer systems need to possess the ability to exploit the wealth of information generated by heterogeneous data sources with different contextual and conceptual representations. Processing and utilizing this diverse and voluminous data, involves many challenges concerning the design of the computational technique used to perform this task. In this paper, we investigate the various data sources available in the car and the surrounding environment, which can be utilized as inputs in order to predict driver's intent and behavior. As part of investigating these potential data sources, we conducted experiments on e-calendars for a large number of employees, and have reviewed a number of available geo referencing systems. Through the results of a statistical analysis and by computing location recognition accuracy results, we explored in detail the potential utilization of calendar location data to detect the driver's intentions. In order to exploit the numerous diverse data inputs available in modern vehicles, we investigate the suitability of different Computational Intelligence (CI) techniques, and propose a novel fuzzy computational modelling methodology. Finally, we outline the impact of applying advanced CI and Big Data analytics techniques in modern vehicles on the driver and society in general, and discuss ethical and legal issues arising from the deployment of intelligent self-learning cars

University of Essex Research Repository

Crossref

Teeside University's Research Repository

Coventry University Pure Portal