42 research outputs found

ConStance: Modeling Annotation Contexts to Improve Stance Classification

Author: Friedland Lisa
Hobbs William
Joseph Kenneth
Lazer David
Tsur Oren
Publication date: 01/01/2017

Manual annotations are a prerequisite for many applications of machine learning. However, weaknesses in the annotation process itself are easy to overlook. In particular, scholars often choose what information to give to annotators without examining these decisions empirically. For subjective tasks such as sentiment analysis, sarcasm, and stance detection, such choices can impact results. Here, for the task of political stance detection on Twitter, we show that providing too little context can result in noisy and uncertain annotations, whereas providing too strong a context may cause it to outweigh other signals. To characterize and reduce these biases, we develop ConStance, a general model for reasoning about annotations across information conditions. Given conflicting labels produced by multiple annotators seeing the same instances with different contexts, ConStance simultaneously estimates gold standard labels and also learns a classifier for new instances. We show that the classifier learned by ConStance outperforms a variety of baselines at predicting political stance, while the model's interpretable parameters shed light on the effects of each context. Comment: To appear at EMNLP 201

arXiv.org e-Print Archive

Crossref

Global Contagion of Non-Viral Information

Author: Bartal Alon
Ravid Gilad
Tsur Oren
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2020

Contagion in Online Social Networks (OSN) is typically measured by the tendency of users to re-post information or to adopt a new behavior after exposure to that information or behavior. Most contagion research is limited to modeling (i) only local neighbor-to-neighbor contagion and (ii) the spread of viral information. However, most contagion events are non-viral and can also occur globally, between non-neighbors, through, for example, exposure to information by exploratory browsing or by content recommendation algorithms. This study is the first to address the phenomenon of both global and local contagion of non-viral information in a quantitative way. Analysis of Twitter networks reveals the prevalence of global contagion, the different temporal patterns of global and local contagion, and the ways contagion varies across topical categories. An interesting finding is that users who retweeted due to global contagion have more followers than those who retweeted due to local contagion.

Crossref

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)
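
The abstract above distinguishes local (neighbor-to-neighbor) from global (non-neighbor) contagion but does not say how a retweet is assigned to either class. The Python sketch below shows one plausible rule, assumed here purely for illustration: a retweet counts as local if an account the user follows posted the item earlier, and as global otherwise. The function name `classify_retweet`, the data layout, and the rule itself are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def classify_retweet(
    user: str,
    time: float,
    earlier_posts: List[Tuple[str, float]],
    follows: Dict[str, Set[str]],
) -> str:
    """Label one retweet as 'local' or 'global' (hypothetical rule).

    earlier_posts holds (poster, timestamp) pairs for the same item.
    follows maps each user to the set of accounts that user follows.
    """
    followees = follows.get(user, set())
    for poster, posted_at in earlier_posts:
        # Exposure through a followed account that posted before this retweet.
        if posted_at < time and poster in followees:
            return "local"
    # No followed account posted the item earlier, so exposure came from elsewhere.
    return "global"


# Toy data: bob follows alice, carol follows dave, and only alice posted the item.
follows = {"bob": {"alice"}, "carol": {"dave"}}
posts = [("alice", 1.0)]
labels = {
    "bob": classify_retweet("bob", 2.0, posts, follows),
    "carol": classify_retweet("carol", 3.0, posts, follows),
}
```

In this toy data, bob's retweet is labeled local because he follows alice, while carol's is labeled global because nobody she follows posted the item.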

Sharp power in social media: Patterns from datasets across electoral campaigns

Author: Hanouna Simo
Neu Omer
Pardo Sharon
Tsur Oren
Zahavi Hila
Publication venue: The European Studies Association of Australia and New Zealand (ESAANZ)
Publication date: 05/02/2021

Using Christopher Walker’s and Jessica Ludwig’s ‘sharp power’ theoretical framework, and based on some preliminary findings from the May 2019 European Parliament election and the two 2019 rounds of elections in Israel, this article describes a novel method for the automatic detection of political trolls and bots active on Twitter in the October 2019 federal election in Canada. The research identified thousands of accounts invested in Canadian politics that presented a unique activity pattern, significantly different from accounts in a control group. The large-scale cross-sectional approach enabled a distinctive perspective on foreign political meddling on Twitter during the recent federal election campaign. This foreign political meddling, we argue, aims at manipulating and poisoning the democratic process and can challenge democracies and their values, as well as their societal resilience.

The University of Sydney: Sydney eScholarship Journals online

Predicting Rising Follower Counts on Twitter Using Profile Information

Author: Bandari Roja
Gaudeul Alexia
Kaiser Astrid
Noro Tomoya
Oliver J. Eric
Razis Gerasimos
Srinivasan M. S.
Tsur Oren
Twitter
Publication venue: Association for Computing Machinery (ACM)
Publication date: 09/05/2017

When evaluating the cause of one's popularity on Twitter, one thing is considered to be the main driver: many tweets. There is debate about the kind of tweet one should publish, but little beyond tweets. Of particular interest is the information provided by each Twitter user's profile page. One of these features is the given name on the profile. Studies in psychology and economics have identified correlations between the first name and, e.g., one's school marks or chances of getting a job interview in the US. Therefore, we are interested in the influence of this profile information on the follower count. We addressed this question by analyzing the profiles of about 6 million Twitter users. All profiles are separated into three groups: users that have a first name, English words, or neither in their name field. The assumption is that names and words influence the discoverability of a user and subsequently his/her follower count. We propose a classifier that labels users who will increase their follower count within a month by applying different models based on the user's group. The classifiers are evaluated with the area under the receiver operating characteristic curve and achieve a score above 0.800. Comment: 10 pages, 3 figures, 8 tables, WebSci '17, June 25-28, 2017, Troy, NY, US

arXiv.org e-Print Archive

Crossref
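
The abstract above reports its results as the area under the receiver operating characteristic curve (AUC). As a reminder of what that score measures, here is a minimal Python sketch of the standard pairwise definition: the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and have nothing to do with the paper's data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: P(score of a random positive > score of a random negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 1 = follower count rose within the month, 0 = it did not.
y_true = [1, 0, 1, 0, 1]
y_score = [0.9, 0.4, 0.35, 0.2, 0.8]
auc = roc_auc(y_true, y_score)
```

The quadratic loop is kept for clarity. For data at the scale described in the abstract, a rank-based computation or a library routine would be the practical choice.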

Proceedings of the Second Workshop on Natural Language Processing and Computational Social Science

Author: Bamman David
Doğruöz A. Seza
Hovy Dirk
Jurgens David
O'Connor Brendan
Tsur Oren
Volkova Svitlana
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2017

Ghent University Academic Bibliography

Proceedings of the First Workshop on NLP and Computational Social Science

Author: Bamman David
Doğruöz A. Seza
Eisenstein Jacob
Hovy Dirk
Jurgens David
O'Connor Brendan
Oh Alice
Tsur Oren
Volkova Svitlana
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2016

Ghent University Academic Bibliography

Negative Confidence-Aware Weakly Supervised Binary Classification for Effective Review Helpfulness Classification

Author: Bao Han
Devlin Jacob
Diaz Gerardo Ocampo
Ishida Takashi
Jain Shantanu
Khan Shehroz S
Kiryo Ryuichi
Liu Jingjing
Pennington Jeffrey
Plessis Marthinus Du
Scott Clayton
Sra Suvrit
Tsur Oren
Veropoulos Konstantinos
Publication venue: Association for Computing Machinery (ACM)
Publication date: 14/08/2020

The incompleteness of positive labels and the presence of many unlabelled instances are common problems in binary classification applications such as review helpfulness classification. Various studies from the classification literature consider all unlabelled instances as negative examples. However, a classification model that learns from incomplete positive labels while assuming all unlabelled data to be negative examples will often be biased. In this work, we propose a novel Negative Confidence-aware Weakly Supervised approach (NCWS), which customises a binary classification loss function by discriminating the unlabelled examples with different negative confidences during the classifier's training. Once integrated into various binary classifiers from the literature, including SVM, CNN and BERT-based classifiers, NCWS makes it possible to identify and separate positive and negative instances effectively and without bias. We use review helpfulness classification as a test case for examining the effectiveness of our NCWS approach. We thoroughly evaluate NCWS using three different datasets, namely one from Yelp (venue reviews) and two from Amazon (Kindle and Electronics reviews). Our results show that NCWS outperforms strong baselines from the literature, including an existing SVM-based approach (i.e. SVM-P), the positive and unlabelled learning-based approach (i.e. C-PU) and the positive confidence-based approach (i.e. P-conf), in addressing the classifier's bias problem. Moreover, we further examine the effectiveness of NCWS by using its classified helpful reviews in a state-of-the-art review-based venue recommendation model (i.e. DeepCoNN) and demonstrate the benefits of using NCWS in enhancing venue recommendation effectiveness in comparison to the baselines.

arXiv.org e-Print Archive

Crossref

Enlighten
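
The abstract above describes a loss that treats unlabelled examples as negatives with differing confidence, but it does not give the formula. The Python sketch below illustrates the general idea with a confidence-weighted binary cross-entropy. This is an assumed simplification, not the NCWS loss from the paper: the function name `confidence_weighted_loss`, the per-example weights in `neg_confidence`, and the way those weights enter the loss are all hypothetical.

```python
import math
from typing import Sequence


def confidence_weighted_loss(
    probs: Sequence[float],
    is_positive: Sequence[bool],
    neg_confidence: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Mean cross-entropy where each unlabelled example is down-weighted.

    probs          predicted probability of the positive class
    is_positive    True for labelled positives, False for unlabelled examples
    neg_confidence assumed weight in [0, 1]: how sure we are that an
                   unlabelled example is truly negative (ignored for positives)
    """
    if len(probs) == 0:
        raise ValueError("need at least one example")
    total = 0.0
    for p, pos, c in zip(probs, is_positive, neg_confidence):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if pos:
            total += -math.log(p)
        else:
            # Unlabelled example: negative-class loss scaled by its confidence.
            total += -c * math.log(1.0 - p)
    return total / len(probs)


probs = [0.9, 0.2, 0.7]
is_positive = [True, False, False]
# The third example looks helpful to the model and has low negative confidence.
weighted = confidence_weighted_loss(probs, is_positive, [0.0, 1.0, 0.1])
# Baseline that treats every unlabelled example as a certain negative.
naive = confidence_weighted_loss(probs, is_positive, [0.0, 1.0, 1.0])
```

Under these assumptions, the weighted loss penalises the model less than the naive baseline for scoring a possibly mislabelled example highly, which is the kind of bias the abstract says the approach targets.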

What’s in a Hashtag? Content based Prediction of the Spread of Ideas in Microblogging Communities

Author: Oren Tsur
Publication date: 01/01/2012

Current social media research mainly focuses on temporal trends of the information flow and on the topology of the social graph that facilitates the propagation of information. In this paper we study the effect of the content of the idea on the information propagation. We present an efficient hybrid approach based on linear regression for predicting the spread of an idea in a given time frame. We show that a combination of content features with temporal and topological features minimizes prediction error. Our algorithm is evaluated on Twitter hashtags extracted from a dataset of more than 400 million tweets. We analyze the contribution and the limitations of the various feature types to the spread of information, demonstrating that content aspects can be used as strong predictors and thus should not be disregarded. We also study the dependencies between global features such as graph topology and content features.

CiteSeerX