9 research outputs found

Active learning in annotating micro-blogs dealing with e-reputation

Author: Cossu Jean-Valère
Molina-Villegas Alejandro
Tello-Signoret Mariana
Publication date: 25/09/2017

Elections unleash strong political views on Twitter, but what do people really think about politics? Opinion and trend mining on micro-blogs dealing with politics has recently attracted researchers in several fields, including Information Retrieval and Machine Learning (ML). Since the performance of ML and Natural Language Processing (NLP) approaches is limited by the amount and quality of data available, one promising alternative for some tasks is the automatic propagation of expert annotations. This paper intends to develop a so-called active learning process for automatically annotating French-language tweets that deal with the image (i.e., representation, web reputation) of politicians. Our main focus is on the methodology followed to build an original annotated dataset expressing opinions on two French politicians over time. We therefore review state-of-the-art NLP-based ML algorithms to automatically annotate tweets, using a manual initiation step as bootstrap. This paper focuses on key issues in active learning when building a large annotated dataset in the presence of noise, which is introduced by human annotators, the abundance of data, and the label distribution across data and entities. In turn, we show that Twitter characteristics such as the author's name or hashtags can serve as a basis not only for improving automatic systems for Opinion Mining (OM) and Topic Classification but also for reducing noise in human annotations. However, a later thorough analysis shows that reducing noise might induce the loss of crucial information.

Comment: Journal of Interdisciplinary Methodologies and Issues in Science - Vol 3 - Contextualisation digitale - 201
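The abstract above does not say which query strategy the active learning loop uses. As a minimal sketch only, assuming margin-based uncertainty sampling (a common choice, not confirmed by the source), the step that picks which tweets go back to human annotators could look like this. The function name, the tweet ids, and the probability values are all hypothetical.

```python
def select_for_annotation(probabilities, k):
    """Pick the k items the current classifier is least sure about.

    probabilities: dict mapping an item id to a list of class
    probabilities (at least two classes) predicted by the current model.
    Assumption: uncertainty is measured by the margin between the two
    most probable classes; a small margin means an uncertain prediction.
    """
    def margin(probs):
        top = sorted(probs, reverse=True)
        return top[0] - top[1]

    # Sort item ids from smallest margin (most uncertain) to largest.
    ranked = sorted(probabilities, key=lambda item: margin(probabilities[item]))
    return ranked[:k]


# Hypothetical model outputs for three unlabeled tweets
# (class order: negative, neutral, positive).
predicted = {
    "t1": [0.90, 0.05, 0.05],
    "t2": [0.40, 0.35, 0.25],
    "t3": [0.60, 0.30, 0.10],
}
to_annotate = select_for_annotation(predicted, k=2)
```

In a full loop, the selected tweets would be labeled by experts, added to the training set, and the classifier retrained before the next selection round.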

arXiv.org e-Print Archive

Episciences.org

The Effects of Twitter Sentiment on Stock Price Returns

Author: Gabriele Ranco
Darko Aleksovski
Guido Caldarelli
Miha Grčar
Igor Mozetič
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015

Social media are increasingly reflecting and influencing the behavior of other complex systems. In this paper we investigate the relations between the well-known micro-blogging platform Twitter and financial markets. In particular, we consider, over a period of 15 months, the Twitter volume and sentiment about the 30 stock companies that form the Dow Jones Industrial Average (DJIA) index. We find a relatively low Pearson correlation and Granger causality between the corresponding time series over the entire time period. However, we find a significant dependence between the Twitter sentiment and abnormal returns during the peaks of Twitter volume. This is valid not only for the expected Twitter volume peaks (e.g., quarterly announcements), but also for peaks corresponding to less obvious events. We formalize the procedure by adapting the well-known "event study" from economics and finance to the analysis of Twitter data. The procedure makes it possible to automatically identify events as Twitter volume peaks, to compute the prevailing sentiment (positive or negative) expressed in tweets at these peaks, and finally to apply the "event study" methodology to relate them to stock returns. We show that sentiment polarity of Twitter peaks implies the direction of cumulative abnormal returns. The amount of cumulative abnormal returns is relatively low (about 1-2%), but the dependence is statistically significant for several days after the events.
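The three steps named in the abstract (detect volume peaks, compute the prevailing sentiment, relate it to abnormal returns) can be illustrated with a small sketch. This is not the authors' implementation: the peak rule (a z-score against a trailing window), the polarity formula, and the market-adjusted definition of abnormal return are simplifying assumptions, and all data values are invented for illustration.

```python
from statistics import mean, stdev


def find_volume_peaks(volume, window=3, threshold=2.0):
    """Return indices where volume exceeds the trailing-window mean by more
    than `threshold` trailing standard deviations (assumed peak rule)."""
    peaks = []
    for t in range(window, len(volume)):
        history = volume[t - window:t]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (volume[t] - mu) / sigma > threshold:
            peaks.append(t)
    return peaks


def sentiment_polarity(positive, negative):
    """Polarity in [-1, 1] from counts of positive and negative tweets."""
    total = positive + negative
    return 0.0 if total == 0 else (positive - negative) / total


def cumulative_abnormal_return(stock, market, event, horizon):
    """Sum of (stock - market) daily returns from the event day through
    `horizon` days after it. Assumption: a market-adjusted model, which is
    simpler than the regression-based models often used in event studies."""
    end = min(event + horizon, len(stock) - 1)
    return sum(stock[t] - market[t] for t in range(event, end + 1))


# Invented daily data for one company.
tweet_volume = [100, 110, 105, 102, 400, 120, 108]
stock_returns = [0.0, 0.0, 0.0, 0.0, 0.02, 0.01, 0.0]
market_returns = [0.0] * 7

events = find_volume_peaks(tweet_volume)
polarity = sentiment_polarity(positive=30, negative=10)
car = cumulative_abnormal_return(stock_returns, market_returns, events[0], horizon=2)
```

With real data, one would group events by the sign of their polarity and test whether the average cumulative abnormal return of each group differs significantly from zero.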

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

Digital repository of Slovenian research organizations

IMT Institutional Repository

FigShare

Multilingual Twitter Sentiment Classification: The Role of Human Annotators

Author: Grčar Miha
Mozetič Igor
Smailović Jasmina
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 23/02/2016

What are the limits of automated Twitter sentiment classification? We analyze a large set of manually labeled tweets in different languages, use them as training data, and construct automated classification models. It turns out that the quality of classification models depends much more on the quality and size of training data than on the type of the model trained. Experimental results indicate that there is no statistically significant difference between the performance of the top classification models. We quantify the quality of training data by applying various annotator agreement measures, and identify the weakest points of different datasets. We show that the model performance approaches the inter-annotator agreement when the size of the training set is sufficiently large. However, it is crucial to regularly monitor the self- and inter-annotator agreements since this improves the training datasets and consequently the model performance. Finally, we show that there is strong evidence that humans perceive the sentiment classes (negative, neutral, and positive) as ordered.
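The abstract refers to "various annotator agreement measures" without naming them here. As one illustrative example, and not necessarily a measure used in the paper, Cohen's kappa compares the observed agreement between two annotators with the agreement expected by chance. The sketch below treats the labels as unordered categories; the label values are invented.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)

    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for everything.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented labels from two annotators for four tweets.
annotator_1 = ["neg", "neu", "pos", "pos"]
annotator_2 = ["neg", "neu", "pos", "neu"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Because the abstract reports that the sentiment classes are perceived as ordered, a measure that penalizes negative-versus-positive disagreements more than negative-versus-neutral ones would be a natural refinement of this unordered version.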

arXiv.org e-Print Archive

Common Language Resources and Technology Infrastructure - Slovenia

Directory of Open Access Journals

PubMed Central

Digital repository of Slovenian research organizations

The linguistic structure of tweets during a presidential campaign (La structure linguistique de tweets en campagne présidentielle)

Author: Ana Zwitter Vitez
Publication venue: 'University of Ljubljana'
Publication date: 01/12/2022

The aim of this study is to carry out a systematic analysis of tweets published by Emmanuel Macron and Marine Le Pen during the 2022 presidential campaign. The analysis is performed at the textual, syntactic, enunciative, and thematic levels. The results show slight differences at the textual and syntactic levels and salient differences at the enunciative level (value judgments, modal verbs, emphasis) and the thematic level. The proposed methodology allows linguists without computational skills to obtain results that are quantifiable and at the same time interpretable in terms of traditional linguistic categories.

Directory of Open Access Journals

Parallel data processing, analysis and visualization using high scalability mechanisms

Author: Gačnik Matevž
Publication date: 24/08/2016

In this work we present a conceptual and implementation model for the scalable, distributed, and balanced execution of a large number of compute operations running on multiple processing units in the cloud. We provide system development methods for large-scale processing with minimal time constraints and limitations with regard to increasing scale-out parallelism in the cloud. Implementation details regarding the elastic adjustment of processing units are discussed in relation to the processing power required in a cloud environment. The work provides approaches for filtering useful data in the described problem domain. We present options for advanced data filtering in multiple stages, which correspond to the requirements of the needed analyses. At the end of this work we present ways of visualizing advanced analyses of the gathered data in the form of intuitive and interactive UI components, graphs, word clouds, and other user-friendly views.

Repository of the University of Ljubljana

ePrints.FRI

Monitoring the Twitter sentiment during the Bulgarian elections

Author: Grčar Miha
Kranjc Janez
Mozetič Igor
Smailović Jasmina
Žnidaršič Martin
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 05/02/2016

Crossref

Digital repository of Slovenian research organizations