6,382 research outputs found

    Exploring Cyberterrorism, Topic Models and Social Networks of Jihadists Dark Web Forums: A Computational Social Science Approach

    Get PDF
    This three-article dissertation focuses on cyber-related topics on terrorist groups, specifically Jihadists’ use of technology, the application of natural language processing, and social networks in analyzing text data derived from terrorists\u27 Dark Web forums. The first article explores cybercrime and cyberterrorism. As technology progresses, it facilitates new forms of behavior, including tech-related crimes known as cybercrime and cyberterrorism. In this article, I provide an analysis of the problems of cybercrime and cyberterrorism within the field of criminology by reviewing existing literature focusing on (a) the issues in defining terrorism, cybercrime, and cyberterrorism, (b) ways that cybercriminals commit a crime in cyberspace, and (c) ways that cyberterrorists attack critical infrastructure, including computer systems, data, websites, and servers. The second article is a methodological study examining the application of natural language processing computational techniques, specifically latent Dirichlet allocation (LDA) topic models and topic network analysis of text data. I demonstrate the potential of topic models by inductively analyzing large-scale textual data of Jihadist groups and supporters from three Dark Web forums to uncover underlying topics. The Dark Web forums are dedicated to Islam and the Islamic world discussions. Some members of these forums sympathize with and support terrorist organizations. Results indicate that topic modeling can be applied to analyze text data automatically; the most prevalent topic in all forums was religion. Forum members also discussed terrorism and terrorist attacks, supporting the Mujahideen fighters. A few of the discussions were related to relationships and marriages, advice, seeking help, health, food, selling electronics, and identity cards. LDA topic modeling is significant for finding topics from larger corpora such as the Dark Web forums. Implications for counterterrorism include the use of topic modeling in real-time classification and removal of online terrorist content and the monitoring of religious forums, as terrorist groups use religion to justify their goals and recruit in such forums for supporters. The third article builds on the second article, exploring the network structures of terrorist groups on the Dark Web forums. The two Dark Web forums\u27 interaction networks were created, and network properties were measured using social network analysis. A member is considered connected and interacting with other forum members when they post in the same threads forming an interaction network. Results reveal that the network structure is decentralized, sparse, and divided based on topics (religion, terrorism, current events, and relationships) and the members\u27 interests in participating in the threads. As participation in forums is an active process, users tend to select platforms most compatible with their views, forming a subgroup or community. However, some members are essential and influential in the information and resources flow within the networks. The key members frequently posted about religion, terrorism, and relationships in multiple threads. Identifying key members is significant for counterterrorism, as mapping network structures and key users are essential for removing and destabilizing terrorist networks. Taken together, this dissertation applies a computational social science approach to the analysis of cyberterrorism and the use of Dark Web forums by jihadists

    Using Natural Language Processing to Mine Multiple Perspectives from Social Media and Scientific Literature.

    Full text link
    This thesis studies how Natural Language Processing techniques can be used to mine perspectives from textual data. The first part of the thesis focuses on analyzing the text exchanged by people who participate in discussions on social media sites. We particularly focus on threaded discussions that discuss ideological and political topics. The goal is to identify the different viewpoints that the discussants have with respect to the discussion topic. We use subjectivity and sentiment analysis techniques to identify the attitudes that the participants carry toward one another and toward the different aspects of the discussion topic. This involves identifying opinion expressions and their polarities, and identifying the targets of opinion. We use this information to represent discussions in one of two representations: discussant attitude vectors or signed attitude networks. We use data mining and network analysis techniques to analyze these representations to detect rifts in discussion groups and study how the discussants split into subgroups with contrasting opinions. In the second part of the thesis, we use linguistic analysis to mine scholars perspectives from scientific literature through the lens of citations. We analyze the text adjacent to reference anchors in scientific articles as a means to identify researchers' viewpoints toward previously published work. We propose methods for identifying, extracting, and cleaning citation text. We analyze this text to identify the purpose (author's intention) and polarity (author's sentiment) of citation. Finally, we present several applications that can benefit from this analysis such as generating multi-perspective summaries of scientific articles and predicting future prominence of publications.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/99934/1/amjbara_1.pd

    Exploring community smells in open-source:an automated approach

    Get PDF
    Software engineering is now more than ever a community effort. Its success often weighs on balancing distance, culture, global engineering practices and more. In this scenario many unforeseen socio-technical events may result into additional project cost or ?social" debt, e.g., sudden, collective employee turnover. With industrial research we discovered community smells, that is, sub-optimal patterns across the organisational and social structure in a software development community that are precursors of such nasty socio-technical events. To understand the impact of community smells at large, in this paper we first introduce CodeFace4Smells, an automated approach able to identify four community smell types that reflect socio-technical issues that have been shown to be detrimental both the software engineering and organisational research fields. Then, we perform a large-scale empirical study involving over 100 years worth of releases and communication structures data of 60 open-source communities: we evaluate (i) their diffuseness, i.e., how much are they distributed in open-source, (ii) how developers perceive them, to understand whether practitioners recognize their presence and their negative effects in practice, and (iii) how community smells relate to existing socio-technical factors, with the aim of assessing the inter-relations between them. The key findings of our study highlight that community smells are highly diffused in open-source and are perceived by developers as relevant problems for the evolution of software communities. Moreover, a number of state-of-the-art socio-technical indicators (e.g., socio-technical congruence) can be used to monitor how healthy a community is and possibly avoid the emergence of social debt.</p

    What Goes Around Comes Around: Learning Sentiments in Online Medical Forums

    Get PDF
    Currently 19%-28% of Internet users participate in online health discussions. A 2011 survey of the US population estimated that 59% of all adults have looked online for information about health topics such as a specific disease or treatment. Although empirical evidence strongly supports the importance of emotions in health-related messages, there are few studies of the relationship between a subjective lan-guage and online discussions of personal health. In this work, we study sentiments expressed on online medical forums. As well as considering the predominant sentiments expressed in individual posts, we analyze sequences of sentiments in online discussions. Individual posts are classified into one of five categories. We identified three categories as sentimental (encouragement, gratitude, confusion) and two categories as neutral (facts, endorsement). 1438 messages from 130 threads were annotated manually by two annotators with a strong inter-annotator agreement (Fleiss kappa = 0.737 and 0.763 for posts in se-quence and separate posts respectively). The annotated posts were used to analyse sentiments in consec-utive posts. In four multi-class classification problems, we assessed HealthAffect, a domain-specific af-fective lexicon, as well general sentiment lexicons in their ability to represent messages in sentiment recognition

    Mining user viewpoints in online discussions

    Get PDF
    • …
    corecore