Seminar Users in the Arabic Twitter Sphere
We introduce the notion of "seminar users", who are social media users
engaged in propaganda in support of a political entity. We develop a framework
that can identify such users with 84.4% precision and 76.1% recall. While our
dataset is from the Arab region, omitting language-specific features has only a
minor impact on classification performance, and thus, our approach could work
for detecting seminar users in other parts of the world and in other languages.
We further explored a controversial political topic to observe the prevalence
and potential potency of such users. In our case study, we found that 25% of
the users engaged in the topic are in fact seminar users and their tweets make up
nearly a third of the on-topic tweets. Moreover, they are often successful in
affecting mainstream discourse with coordinated hashtag campaigns.
Comment: to appear in SocInfo 201
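The precision and recall reported in the abstract above can be combined into a single F1 score. The following is a minimal sketch in Python; the F1 value is derived here from the two reported figures and is not stated in the abstract, and the function name `f1_score` is an illustrative choice.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract: 84.4% precision, 76.1% recall.
f1 = f1_score(0.844, 0.761)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.800"
```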
Trolling and online dependence: what is the relationship?
Detecting Abusive Language on Online Platforms: A Critical Analysis
Abusive language on online platforms is a major societal problem, often
leading to further harms such as the marginalisation of
underrepresented minorities. There are many different forms of abusive language
such as hate speech, profanity, and cyber-bullying, and online platforms seek
to moderate it in order to limit societal harm, to comply with legislation, and
to create a more inclusive environment for their users. Within the field of
Natural Language Processing, researchers have developed different methods for
automatically detecting abusive language, often focusing on specific
subproblems or on narrow communities, as what is considered abusive language
very much differs by context. We argue that there is currently a dichotomy
between what types of abusive language online platforms seek to curb, and what
research efforts there are to automatically detect abusive language. We thus
survey existing methods as well as content moderation policies by online
platforms in this light, and we suggest directions for future work.
Detection of Troll Accounts with Machine Learning Algorithms
Social media use is growing day by day and brings many new problems with it. On these platforms, where people can easily share their thoughts, feelings, and ideas, accounts that make demeaning and offensive attacks have recently become common. Preventing the harm that this behaviour, known as cyberbullying, and the troll accounts that engage in it do to people's individual and social lives has become a necessity. Messages from such users disturb people, and when their numbers grow too large to track manually, they need to be detected by software and, where necessary, blocked and restricted. In this study, we used machine learning methods to detect user accounts that exhibit troll behaviour on Twitter. Using Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest Regression (RFR), we carried out comprehensive experiments on data we collected from Twitter, with features extracted from the messages of troll users. In our results, we succeeded in detecting and blocking troll accounts at rates of up to 93.93%.
Case studies on the use of sentiment analysis to assess the effectiveness and safety of health technologies : a scoping review
A health technology assessment (HTA) is commonly defined as a multidisciplinary approach used to evaluate medical, social, economic, and ethical issues related to the use of a health technology in a systematic, transparent, unbiased, robust manner. To help inform HTA recommendations, the surveillance of social media platforms can provide important insights to the clinical community and to decision makers on the effectiveness and safety of the use of health technologies on a patient. A scoping review of the published literature was performed to gain some insight on the accuracy and automation of sentiment analysis (SA) used to assess public opinion on the use of health technologies. A literature search of major databases was conducted. The main search concepts were SA, social media, and patient perspective. Among the 1,776 unique citations identified, 12 studies that described the use of SA methods to evaluate public opinion on or experiences with the use of health technologies as posted on social media platforms were included. The SA methods used were either lexicon- or machine learning-based. Two studies focused on medical devices, three examined HPV vaccination, and the remaining studies targeted drug therapies. Due to the limitations and inherent differences among SA tools, the outcomes of these applications should be considered exploratory. The results of our study can initiate discussions on how the automation of algorithms to interpret public opinion of health technologies should be further developed to optimize the use of data available on social media.
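To illustrate what a lexicon-based SA method of the kind mentioned in the abstract above does, here is a minimal sketch in Python. It is not taken from any of the reviewed studies: the word list `LEXICON`, its polarity values, and the function name `lexicon_sentiment` are all hypothetical, and real lexicons are far larger and usually handle negation and intensity.

```python
import re

# Hypothetical toy lexicon mapping words to a polarity of +1 or -1.
LEXICON = {
    "effective": 1, "helped": 1, "relief": 1, "safe": 1,
    "painful": -1, "nausea": -1, "worse": -1, "failed": -1,
}


def lexicon_sentiment(text: str) -> str:
    """Label a post by summing the polarity of the lexicon words it contains."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(LEXICON.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_sentiment("The new inhaler helped and gave quick relief"))
print(lexicon_sentiment("The implant was painful and made things worse"))
```

A sketch this simple also shows why the abstract calls such outcomes exploratory: a post like "not effective" would be scored as positive because the negation is ignored.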
Trollthrottle -- Raising the Cost of Astroturfing
Astroturfing, i.e., the fabrication of public discourse by private or
state-controlled sponsors via the creation of fake online accounts, has become
incredibly widespread in recent years. It gives a disproportionally strong
voice to wealthy and technology-savvy actors, permits targeted attacks on
public forums and could in the long run harm the trust users have in the
internet as a communication platform. Countering these efforts without
deanonymising the participants has not yet proven effective; however, we can
raise the cost of astroturfing. Following the principle 'one person, one
voice', we introduce Trollthrottle, a protocol that limits the number of
comments a single person can post on participating websites. Using direct
anonymous attestation and a public ledger, the user is free to choose any
nickname, but the number of comments is aggregated over all posts on all
websites, no matter which nickname was used. We demonstrate the deployability
of Trollthrottle by retrofitting it to the popular news aggregator website
Reddit and by evaluating the cost of deployment for the scenario of a national
newspaper (168k comments per day), an international newspaper (268k c/d) and
Reddit itself (4.9M c/d).
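The counting rule described in the abstract above, where comments are aggregated per person regardless of website or nickname, can be sketched as follows. This is only an illustration of the limit itself, under stated assumptions: the actual protocol relies on direct anonymous attestation and a public ledger so that the person stays anonymous, whereas this sketch uses a plain `person_id`. The class name `CommentBudget`, the daily window, and the threshold value are all hypothetical.

```python
from collections import defaultdict


class CommentBudget:
    """Toy comment limit shared across all sites and nicknames of one person."""

    def __init__(self, max_per_day: int):
        self.max_per_day = max_per_day  # assumed threshold, not from the paper
        self._counts = defaultdict(int)  # (person_id, day) -> comments posted

    def try_post(self, person_id: str, day: str, site: str, nickname: str) -> bool:
        """Count the comment against the person; site and nickname do not matter."""
        key = (person_id, day)
        if self._counts[key] >= self.max_per_day:
            return False  # budget exhausted for this person on this day
        self._counts[key] += 1
        return True


budget = CommentBudget(max_per_day=2)
first = budget.try_post("p1", "2024-01-01", "news.example", "alice")
second = budget.try_post("p1", "2024-01-01", "forum.example", "bob")
third = budget.try_post("p1", "2024-01-01", "blog.example", "carol")
print(first, second, third)  # prints "True True False"
```

Switching nickname or website does not reset the count, which is the property that raises the cost of running many fake personas.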
Exploring online misogyny: a synthesis of the qualitative evidence on hate speech