8 research outputs found

    Seminar Users in the Arabic Twitter Sphere

    We introduce the notion of "seminar users", who are social media users engaged in propaganda in support of a political entity. We develop a framework that can identify such users with 84.4% precision and 76.1% recall. While our dataset is from the Arab region, omitting language-specific features has only a minor impact on classification performance, and thus our approach could work for detecting seminar users in other parts of the world and in other languages. We further explored a controversial political topic to observe the prevalence and potential potency of such users. In our case study, we found that 25% of the users engaged in the topic are in fact seminar users and that their tweets make up nearly a third of the on-topic tweets. Moreover, they are often successful in affecting mainstream discourse with coordinated hashtag campaigns. Comment: to appear in SocInfo 201
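    The abstract reports precision and recall separately. As a small worked example, the sketch below combines the two reported figures into an F1 score (their harmonic mean). The F1 value is derived here for illustration and is not a number stated in the abstract; the function name `f1_score` is a hypothetical helper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (returns 0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions.
precision = 0.844
recall = 0.761

# Derived value, not reported by the authors.
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.800"
```

    Because the harmonic mean is pulled toward the lower of the two inputs, the result sits closer to the recall figure than a plain average would.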

    Trolling e dependência online: que relação? [Trolling and online dependence: what relationship?]

    Detecting Abusive Language on Online Platforms: A Critical Analysis

    Abusive language on online platforms is a major societal problem, often leading to serious harms such as the marginalisation of underrepresented minorities. There are many different forms of abusive language, such as hate speech, profanity, and cyber-bullying, and online platforms seek to moderate it in order to limit societal harm, to comply with legislation, and to create a more inclusive environment for their users. Within the field of Natural Language Processing, researchers have developed different methods for automatically detecting abusive language, often focusing on specific subproblems or on narrow communities, as what is considered abusive language differs greatly by context. We argue that there is currently a dichotomy between the types of abusive language that online platforms seek to curb and the research efforts devoted to detecting abusive language automatically. We thus survey existing methods as well as content moderation policies by online platforms in this light, and we suggest directions for future work.

    Makine Öğrenmesi Algoritmaları ile Trol Hesapların Tespiti [Detection of Troll Accounts with Machine Learning Algorithms]

    Social media use is growing day by day and brings many new problems with it. In these environments, where people can easily share their thoughts, feelings, and opinions, accounts that carry out insulting and offensive attacks have recently become common. Preventing the harm that this behavior, known as cyberbullying, and the troll accounts that engage in it do to people's individual and social lives has become a necessity. Messages from such users disturb people, and when their numbers grow too large to track, they need to be detected by software and, where necessary, blocked and restricted. In this study, we used machine learning methods to detect user accounts that exhibit troll behavior on Twitter. We carried out comprehensive experiments with Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest Regression (RFR), using data we collected from Twitter and features we extracted from the messages of troll users. In our results, we succeeded in detecting and blocking troll accounts at rates of up to 93.93%.

    Case studies on the use of sentiment analysis to assess the effectiveness and safety of health technologies: a scoping review

    A health technology assessment (HTA) is commonly defined as a multidisciplinary approach used to evaluate medical, social, economic, and ethical issues related to the use of a health technology in a systematic, transparent, unbiased, and robust manner. To help inform HTA recommendations, the surveillance of social media platforms can provide important insights to the clinical community and to decision makers on the effectiveness and safety of the use of health technologies on a patient. A scoping review of the published literature was performed to gain some insight into the accuracy and automation of sentiment analysis (SA) used to assess public opinion on the use of health technologies. A literature search of major databases was conducted. The main search concepts were SA, social media, and patient perspective. Among the 1,776 unique citations identified, 12 studies were included; each described the use of SA methods to evaluate public opinion on, or experiences with, the use of health technologies as posted on social media platforms. The SA methods used were either lexicon- or machine learning-based. Two studies focused on medical devices, three examined HPV vaccination, and the remaining studies targeted drug therapies. Due to the limitations and inherent differences among SA tools, the outcomes of these applications should be considered exploratory. The results of our study can initiate discussions on how the automation of algorithms to interpret public opinion of health technologies should be further developed to optimize the use of data available on social media.

    Trollthrottle -- Raising the Cost of Astroturfing

    Astroturfing, i.e., the fabrication of public discourse by private or state-controlled sponsors via the creation of fake online accounts, has become incredibly widespread in recent years. It gives a disproportionately strong voice to wealthy and technology-savvy actors, permits targeted attacks on public forums, and could in the long run harm the trust users have in the internet as a communication platform. Countering these efforts without deanonymising the participants has not yet proven effective; however, we can raise the cost of astroturfing. Following the principle 'one person, one voice', we introduce Trollthrottle, a protocol that limits the number of comments a single person can post on participating websites. Using direct anonymous attestation and a public ledger, the user is free to choose any nickname, but the number of comments is aggregated over all posts on all websites, no matter which nickname was used. We demonstrate the deployability of Trollthrottle by retrofitting it to the popular news aggregator website Reddit and by evaluating the cost of deployment for the scenario of a national newspaper (168k comments per day), an international newspaper (268k c/d), and Reddit itself (4.9M c/d).
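    To put the three deployment scenarios on a common scale, the sketch below converts the daily comment volumes quoted in the abstract into average comments per second. It assumes load spread evenly over 24 hours, which is a simplification made here and not a claim from the paper; the variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily comment volumes quoted in the abstract.
daily_comments = {
    "national newspaper": 168_000,
    "international newspaper": 268_000,
    "Reddit": 4_900_000,
}

# Average rate, assuming comments are spread evenly across the day
# (an assumption; real traffic is bursty).
per_second = {
    name: count / SECONDS_PER_DAY for name, count in daily_comments.items()
}

for name, rate in per_second.items():
    print(f"{name}: {rate:.1f} comments/s on average")
```

    Under that assumption the averages come to roughly 1.9, 3.1, and 56.7 comments per second, so the Reddit scenario is about 29 times the national-newspaper load.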