Understanding the Roots of Radicalisation on Twitter
In an increasingly digital world, identifying signs of online extremism sits at the top of the priority list for counter-extremist agencies. Researchers and governments are investing in the creation of advanced information technologies to identify and counter extremism through intelligent large-scale analysis of online data. However, to the best of our knowledge, these technologies are neither based on, nor do they take advantage of, the existing theories and studies of radicalisation. In this paper we propose a computational approach for detecting and predicting the radicalisation influence a user is exposed to, grounded in the notion of "roots of radicalisation" from social science models. This approach has been applied to analyse and compare the radicalisation level of 112 pro-ISIS vs. 112 "general" Twitter users. Our results show the effectiveness of our proposed algorithms in detecting and predicting radicalisation influence, obtaining up to 0.9 F1 measure for detection and between 0.7 and 0.8 precision for prediction. While this is an initial attempt towards the effective combination of social and computational perspectives, more work is needed to bridge these disciplines, and to build on their strengths to target the problem of online radicalisation.
Sensitive-video analysis (Análise de vídeo sensível)
Advisors: Anderson de Rezende Rocha, Siome Klein Goldenstein. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação.
Abstract: Sensitive video can be defined as any motion picture that may pose threats to its audience. Typical representatives include, but are not limited to, pornography, violence, child abuse, cruelty to animals, etc.
Nowadays, with the ever more pervasive role of digital data in our lives, sensitive-content analysis represents a major concern to law enforcers, companies, tutors, and parents, due to the potential harm of such content to minors, students, workers, etc. Notwithstanding, the employment of human mediators for constantly analyzing huge troves of sensitive data often leads to stress and trauma, justifying the search for computer-aided analysis. In this work, we tackle this problem in two ways. In the first one, we aim at deciding whether or not a video stream presents sensitive content, which we refer to as sensitive-video classification. In the second one, we aim at finding the exact moments a stream starts and ends displaying sensitive content, at frame level, which we refer to as sensitive-content localization. For both cases, we aim at designing and developing effective and efficient methods, with low memory footprint and suitable for deployment on mobile devices. In this vein, we provide four major contributions. The first one is a novel Bag-of-Visual-Words-based pipeline for efficient time-aware sensitive-video classification. The second is a novel high-level multimodal fusion pipeline for sensitive-content localization. The third, in turn, is a novel spatio-temporal video interest point detector and video content descriptor. Finally, the fourth contribution comprises a frame-level annotated 140-hour pornographic video dataset, which is the first one in the literature that is appropriate for pornography localization. An important aspect of the first three contributions is their generality, in the sense that they can be employed, without modifications to their steps, for the detection of diverse sensitive content types, such as the previously mentioned ones. For validation, we choose pornography and violence, two of the most common types of inappropriate material, as target representatives of sensitive content.
We therefore perform classification and localization experiments, and report results for both types of content. The proposed solutions present an accuracy of 93% in pornography classification, and allow the correct localization of 91% of pornographic content within a video stream. The results for violence are also compelling: with the proposed approaches, we reached second place in an international competition on violent scene detection. Putting both in perspective, we learned that pornography detection is easier than its violence counterpart, opening several opportunities for additional investigations by the research community. The main reason for such difference is related to the distinct levels of subjectivity that are inherent to each concept. While pornography is usually more explicit, violence presents a broader spectrum of possible manifestations.
Doctorate. Computer Science. Doctor in Computer Science. 1572763, 1197473 CAPE
Visual Event Cueing in Linked Spatiotemporal Data
Abstract: The media disperses a large amount of information daily pertaining to political events, social movements, and societal conflicts. Media pertaining to these topics, no matter the format of publication used, are framed in a particular way. Framing is used not just for guiding audiences to desired beliefs, but also to fuel societal change or legitimize/delegitimize social movements. For this reason, tools that can help to clarify when changes in social discourse occur and identify their causes are of great use. This thesis presents a visual analytics framework that allows for the exploration and visualization of changes that occur in social climate with respect to space and time. Focusing on the links between data from the Armed Conflict Location and Event Data Project (ACLED) and a streaming RSS news data set, users can be cued into interesting events, enabling them to form and explore hypotheses. This visual analytics framework also focuses on improving intervention detection, allowing users to hypothesize about correlations between events and happiness levels, and supports collaborative analysis.
Dissertation/Thesis. Masters Thesis, Computer Science 201
A Weakly Supervised Classifier and Dataset of White Supremacist Language
We present a dataset and classifier for detecting the language of white
supremacist extremism, a growing issue in online hate speech. Our weakly
supervised classifier is trained on large datasets of text from explicitly
white supremacist domains paired with neutral and anti-racist data from similar
domains. We demonstrate that this approach improves generalization performance
to new domains. Incorporating anti-racist texts as counterexamples to white
supremacist language mitigates bias.
Comment: ACL 2023 shor
Deep Learning for User Comment Moderation
Experimenting with a new dataset of 1.6M user comments from a Greek news
portal and existing datasets of English Wikipedia comments, we show that an RNN
outperforms the previous state of the art in moderation. A deep,
classification-specific attention mechanism further improves the overall
performance of the RNN. We also compare against a CNN and a word-list baseline,
considering both fully automatic and semi-automatic moderation.
State of the art 2015: a literature review of social media intelligence capabilities for counter-terrorism
Overview
This paper reviews how information and insight can be drawn from open social media sources. It focuses on the specific research techniques that have emerged, the capabilities they provide, the possible insights they offer, and the ethical and legal questions they raise. These techniques are considered relevant and valuable insofar as they can help to maintain public safety by preventing terrorism, preparing for it, protecting the public from it and pursuing its perpetrators. The report also considers how far this can be achieved against the backdrop of radically changing technology and public attitudes towards surveillance. This is an updated version of a 2013 paper on the same subject, State of the Art. Since 2013, there have been significant changes in social media, in how it is used by terrorist groups, and in the methods being developed to make sense of it.
The paper is structured as follows:
Part 1 is an overview of social media use, focused on how it is used by groups of interest to those involved in counter-terrorism. This includes new sections on trends in social media platforms and a new section on Islamic State (IS).
Part 2 provides an introduction to the key approaches of social media intelligence (henceforth "SOCMINT") for counter-terrorism.
Part 3 sets out a series of SOCMINT techniques. For each technique, a series of capabilities and insights is considered, the validity and reliability of the method are assessed, and its possible application to counter-terrorism work is explored.
Part 4 outlines a number of important legal, ethical and practical considerations when undertaking SOCMINT work.