5,420 research outputs found

    Exploring Users’ Interactive Behaviors in Online Group: A Case Study of QQ Group “TuanRenTang”

    This paper analyzes users’ interactive behaviors in online group chat. Accurate identification of user interaction can provide methodological support for mining user interests and for crowd labeling. Using social network analysis, the study takes the QQ Group “TuanRenTang” as an example to analyze users’ interactive behaviors, discover interaction relationships, construct interaction networks, and explore interaction types and community detection. The findings suggest that both explicit and implicit interaction exist within the same topic discussion. Based on their level of activity, users can be classified into four categories: active interaction, general interaction, passive interaction, and lurking interaction. In addition, twenty “experts” and eight communities were identified from the interaction networks built on the sample of “TuanRenTang” chat records

    Solutions to Detect and Analyze Online Radicalization: A Survey

    Online Radicalization (also called Cyber-Terrorism, Extremism, Cyber-Racism, or Cyber-Hate) is widespread and has become a major and growing concern to society, governments, and law enforcement agencies around the world. Research shows that various Internet platforms, which offer a low barrier to publishing content, allow anonymity, provide exposure to millions of users, and enable very quick and widespread diffusion of messages, are being misused for malicious intent. Examples include YouTube (a popular video sharing website), Twitter (an online micro-blogging service), Facebook (a popular social networking website), online discussion forums, and the blogosphere. Such platforms are being used to form hate groups and racist communities, spread extremist agendas, incite anger or violence, promote radicalization, recruit members, and create virtual organizations and communities. Automatic detection of online radicalization is a technically challenging problem because of the vast amount of data, unstructured and noisy user-generated content, dynamically changing content, and adversarial behavior. Several solutions have been proposed in the literature to combat and counter cyber-hate and cyber-extremism. In this survey, we review solutions to detect and analyze online radicalization. We review 40 papers published at 12 venues from June 2003 to November 2011. We present a novel classification scheme to classify these papers. We analyze these techniques, perform trend analysis, discuss limitations of existing techniques, and identify research gaps

    Turning Unstructured and Incoherent Group Discussion into DATree: A TBL Coherence Analysis Approach

    Despite the rapid growth of user-generated unstructured text from online group discussions, business decision-makers are facing the challenge of understanding its highly incoherent content. Coherence analysis attempts to reconstruct the order of discussion messages. However, existing methods only focus on system and cohesion features. While they work with asynchronous discussions, they fail with synchronous discussions because these features rarely appear. We believe that discussion logic features play an important role in coherence analysis. Therefore, we propose a TCA method for coherence analysis, which is composed of a novel message similarity measure algorithm, a subtopic segmentation algorithm and a TBL-based classification algorithm. System, cohesion and discussion logic features are all incorporated into our TCA method. Results from experiments showed that the TCA method achieved significantly better performance than existing methods. Furthermore, we illustrate that the DATree generated by the TCA method can enhance decision-makers’ content analysis capability

    Utilizing Multi-modal Weak Signals to Improve User Stance Inference in Social Media

    Social media has become an integral component of daily life. Millions of pieces of content of various types are released into social networks every day, which offers an interesting view into users' perspectives on everyday life. Exploring the opinions of social media users has long been an interesting subject for Natural Language Processing researchers. Knowing the opinions of a population at large allows anyone to make informed policy or marketing decisions, which is why comprehensive social opinions are desirable. The nature of social media is complex, so obtaining the social opinion is a challenging task. Diverse and complex as they are, social media networks typically mirror actual social connections on a digital platform: just as users make friends and companions in the real world, digital platforms enable them to form similar social connections. This work mainly looks at how to obtain a comprehensive social opinion from a social media network. Typical social opinion quantifiers look at users' text contributions to find their opinions. This is currently challenging because the majority of social media users consume content rather than express their opinions, which leaves natural language processing based methods without linguistic features to work on and makes them impractical. In our work we look to improve a method named stance inference, which can utilize multi-domain features to extract the social opinion. We also introduce a method that can expose users' opinions even when they have no on-topic content. We also note how introducing weak supervision to the unsupervised task of stance inference can improve performance. The weak supervision we bring into the pipeline comes from hashtags. We show that hashtags are contextual indicators added by humans and are much likelier to be related than indicators derived from a topic model. Lastly, we introduce disentanglement methods for chronological social media networks, which allow the methods introduced above to be applied on these types of platforms

    Discovering topics in Slack message streams

    Slack is an instant messaging platform intended for the internal communications of companies and other organizations. For organizations that use Slack extensively it may provide an interesting source of insight, but the raw data is difficult to analyze. Topic modeling, primarily latent Dirichlet allocation (LDA), is commonly used to summarize textual data in a meaningful way. Instant messages tend to be very short, which causes problems for conventional topic modeling methods such as LDA. This data sparsity problem can be tackled with data expansion and data combination techniques. For instant messages, data combination is particularly attractive because the messages are not independent of each other but form implicit, and sometimes explicit, threads as the participants reply to each other. Most of the threads in the Slack data are not explicit and must be “untangled” from the message stream if they are to be used as the basis for a data combination scheme. In this thesis we study the possibility of detecting implicit threads in a Slack message stream and leveraging the threads as a data combination scheme in topic modeling. The threads are detected using a hierarchical clustering algorithm that uses word mover’s distance, latent semantic analysis, and metadata to compute the distances between messages. The messages in each cluster are then concatenated and used as the input for LDA. It is shown that, on a dataset gathered from the Gofore Oyj Slack workspace, the cluster-based model improves on the message-based model but falls short of being practical
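
    The thread-detection step described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: Jaccard token distance stands in for word mover’s distance and latent semantic analysis, a capped time gap stands in for the metadata features, and every function name, weight, and threshold below is a hypothetical choice made for illustration. The LDA step itself is omitted; the sketch only produces the concatenated pseudo-documents that would be fed to it.

```python
from itertools import combinations


def tokens(text):
    """Lowercase whitespace tokenization; a real pipeline would do more."""
    return set(text.lower().split())


def message_distance(a, b, time_scale=300.0, text_weight=0.7):
    """Blend lexical distance with a capped time-gap penalty (each in [0, 1]).

    Assumption: Jaccard distance replaces word mover's distance / LSA, and
    the timestamp gap is the only metadata feature considered.
    """
    ta, tb = tokens(a["text"]), tokens(b["text"])
    union = ta | tb
    jaccard = 1.0 - (len(ta & tb) / len(union) if union else 0.0)
    gap = min(abs(a["ts"] - b["ts"]) / time_scale, 1.0)
    return text_weight * jaccard + (1.0 - text_weight) * gap


def detect_threads(messages, threshold=0.6):
    """Average-linkage agglomerative clustering, stopped at a distance threshold."""
    clusters = [[i] for i in range(len(messages))]

    def linkage(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(message_distance(messages[i], messages[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters (x < y by construction).
        (x, y), best = min(
            (((p, q), linkage(clusters[p], clusters[q]))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


def pseudo_documents(messages, clusters):
    """Concatenate each detected thread into one document for topic modeling."""
    return [" ".join(messages[i]["text"] for i in c) for c in clusters]


# Toy interleaved stream: two conversations mixed in one channel.
messages = [
    {"ts": 0, "text": "is the deploy pipeline broken again"},
    {"ts": 20, "text": "lunch at the thai place today"},
    {"ts": 40, "text": "yes the deploy pipeline failed on staging"},
    {"ts": 60, "text": "thai place sounds good for lunch"},
]
threads = detect_threads(messages)
docs = pseudo_documents(messages, threads)
```

    On this toy stream the two interleaved conversations are separated into two threads, so each pseudo-document is longer than any single message, which is the point of the data combination scheme. The pairwise search is quadratic per merge and is only suitable for small examples.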