Multilingual Twitter Sentiment Classification: The Role of Human Annotators
What are the limits of automated Twitter sentiment classification? We analyze
a large set of manually labeled tweets in different languages, use them as
training data, and construct automated classification models. It turns out that
the quality of classification models depends much more on the quality and size
of training data than on the type of the model trained. Experimental results
indicate that there is no statistically significant difference between the
performance of the top classification models. We quantify the quality of
training data by applying various annotator agreement measures, and identify
the weakest points of different datasets. We show that the model performance
approaches the inter-annotator agreement when the size of the training set is
sufficiently large. However, it is crucial to regularly monitor the self- and
inter-annotator agreements since this improves the training datasets and
consequently the model performance. Finally, we show that there is strong
evidence that humans perceive the sentiment classes (negative, neutral, and
positive) as ordered.
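The inter-annotator agreement idea central to this abstract can be illustrated with Cohen's kappa, one common chance-corrected agreement statistic (the paper applies several such measures; the statistic choice and the labels below are illustrative, not taken from the paper):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence of the two label distributions.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    classes = set(labels_a) | set(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in classes) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical sentiment labels from two annotators
# (-1 = negative, 0 = neutral, +1 = positive).
a = [-1, 0, 1, 1, 0, -1, 0, 1]
b = [-1, 0, 1, 0, 0, -1, 1, 1]
print(round(cohens_kappa(a, b), 3))  # → 0.619
```

A model whose accuracy approaches this kappa-style ceiling is, as the abstract argues, about as good as the labels allow; for ordered classes a weighted variant that penalizes negative-vs-positive confusions more than neutral ones would be the natural refinement.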
Hierarchical Character-Word Models for Language Identification
Social media messages' brevity and unconventional spelling pose a challenge
to language identification. We introduce a hierarchical model that learns
character and contextualized word-level representations for language
identification. Our method performs well against strong baselines, and can
also reveal code-switching.
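For contrast with the hierarchical character-word model, a minimal character n-gram baseline of the kind such models are typically compared against might look like this (the training samples and two-language setup are invented for illustration):

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Character n-grams with padding so word boundaries are captured."""
    text = f"  {text.lower()} "
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def build_profile(samples, n=3):
    """Relative n-gram frequencies over a language's training samples."""
    counts = Counter()
    for s in samples:
        counts.update(char_ngrams(s, n))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def score(text, profile, n=3):
    # Sum of profile frequencies of the text's n-grams (higher = better match).
    return sum(profile.get(g, 0.0) for g in char_ngrams(text, n))

# Hypothetical toy training data for two languages.
en = build_profile(["the quick brown fox", "see you there", "this is the way"])
es = build_profile(["el zorro rápido", "nos vemos allí", "este es el camino"])

msg = "the fox is here"
lang = "en" if score(msg, en) > score(msg, es) else "es"
print(lang)
```

Character n-grams cope with the unconventional spelling the abstract mentions because they need no word segmentation; the paper's hierarchical model goes further by composing such character evidence into contextualized word representations, which also lets it localize code-switching within a message.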
Predicting Rising Follower Counts on Twitter Using Profile Information
When evaluating the causes of one's popularity on Twitter, one factor is
commonly considered the main driver: many tweets. There is debate about the
kind of tweet one should publish, but little attention is paid to anything
beyond tweets. Of particular interest is the information provided on each
Twitter user's profile page. One of these features is the given name on the
profile. Studies in psychology and economics have identified correlations
between one's first name and, e.g., one's school marks or chances of getting
a job interview in the US. We are therefore interested in the influence of
this profile information on the follower count. We addressed this question by
analyzing the profiles of about six million Twitter users. All profiles are
separated into three groups: users whose name field contains a first name,
English words, or neither of the two. The assumption is that names and words
influence the discoverability of a user and consequently their follower
count. We propose a classifier that labels users who will increase their
follower count within a month, applying different models based on the user's
group. The classifiers are evaluated with the area under the receiver
operating characteristic curve and achieve a score above 0.800.
Comment: 10 pages, 3 figures, 8 tables, WebSci '17, June 25--28, 2017, Troy,
NY, US
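The evaluation metric used here, area under the receiver operating characteristic curve, equals the probability that a randomly chosen positive example receives a higher classifier score than a randomly chosen negative one. A minimal sketch of that rank-based computation, with hypothetical classifier scores (not the paper's data):

```python
def roc_auc(labels, scores):
    """AUC as P(random positive outranks random negative); ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = follower count rose within a month) and scores.
y = [1, 0, 1, 1, 0, 0]
s = [0.9, 0.3, 0.8, 0.4, 0.5, 0.2]
print(roc_auc(y, s))  # ≈ 0.889
```

Because AUC is rank-based, it is insensitive to the choice of classification threshold and to class imbalance in the score scale, which makes the reported value above 0.800 comparable across the three profile groups.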