64,867 research outputs found

    Topological Feature Based Classification

    There has been considerable interest in developing algorithms to extract clusters or communities from networks. This work proposes a method, based on blockmodelling, for leveraging communities and other topological features for use in a predictive classification task. Motivated by the issues faced by the field of community detection and inspired by recent advances in Bayesian topic modelling, the presented model automatically discovers topological features relevant to a given classification task. In this way, rather than attempting to identify some universal best set of clusters for an undefined goal, the aim is to find the best set of clusters for a particular purpose. Using this method, topological features can be validated and assessed within a given context by their predictive performance. The proposed model differs from other relational and semi-supervised learning models in that it identifies topological features to explain the classification decision. In a demonstration on a number of real networks, the predictive capability of the topological features is shown to rival the performance of content-based relational learners. Additionally, the model is shown to outperform graph-based semi-supervised methods on directed and approximately bipartite networks.
    Comment: Awarded 3rd Best Student Paper at 14th International Conference on Information Fusion 201

    Distinguishing Topical and Social Groups Based on Common Identity and Bond Theory

    Social groups play a crucial role in social media platforms because they form the basis for user participation and engagement. Groups are created explicitly by members of the community, but also form organically as members interact. Due to their importance, they have been studied widely (e.g., community detection, evolution, and activity). One of the key questions for understanding how such groups evolve is whether there are different types of groups and how they differ. In sociology, theories have been proposed to help explain how such groups form. In particular, the common identity and common bond theory states that people join groups based on identity (i.e., interest in the topics discussed) or bond attachment (i.e., social relationships). The theory has been applied qualitatively to small groups to classify them as either topical or social. We use the identity and bond theory to define a set of features to classify groups into those two categories. Using a dataset from Flickr, we extract user-defined groups and automatically detected groups, the latter obtained from a community detection algorithm. We discuss the process of manually labeling groups as social or topical and present results of predicting the group label based on the defined features. We directly validate the predictions of the theory, showing that the metrics are able to forecast the group type with high accuracy. In addition, we present a comparison between declared and detected groups along topicality and sociality dimensions.
    Comment: 10 pages, 6 figures, 2 tables

    How did the discussion go: Discourse act classification in social media conversations

    We propose a novel attention-based hierarchical LSTM model to classify discourse act sequences in social media conversations, aimed at mining data from online discussions using textual meaning beyond the sentence level. What makes the task unique is the complete categorization of possible pragmatic roles in informal textual discussions, in contrast to question-answer extraction, stance detection, or sarcasm identification, which are very much role-specific tasks. An earlier attempt was made on a Reddit discussion dataset. We train our model on the same data and present test results on two different datasets, one from Reddit and one from Facebook. Our proposed model outperformed the previous one in terms of domain independence; without using platform-dependent structural features, our hierarchical LSTM with a word relevance attention mechanism achieved F1-scores of 71% and 66%, respectively, in predicting discourse roles of comments in Reddit and Facebook discussions. We also present and analyze the efficiency of recurrent and convolutional architectures in learning discursive representations on the same task, with different word and comment embedding schemes. Our attention mechanism enables us to inquire into the relevance ordering of text segments according to their roles in discourse. We present a human annotator experiment to unveil important observations about modeling and data annotation. Equipped with our text-based discourse identification model, we inquire into how heterogeneous non-textual features like location, time, leaning of information, etc. play their roles in characterizing online discussions on Facebook.

    A customisable pipeline for continuously harvesting socially-minded Twitter users

    On social media platforms, and Twitter in particular, specific classes of users such as influencers have been given satisfactory operational definitions in terms of network and content metrics. Others, for instance online activists, are no less important, but their characterisation still requires experimentation. We hypothesise that such interesting users can be found within temporally and spatially localised contexts, i.e., small but topical fragments of the network containing interactions about social events or campaigns with a significant footprint on Twitter. To explore this hypothesis, we have designed a continuous user profile discovery pipeline that produces an ever-growing dataset of user profiles by harvesting and analysing contexts from the Twitter stream. The profiles dataset includes key network and content-based user metrics, enabling experimentation with user-defined score functions that characterise specific classes of online users. The paper describes the design and implementation of the pipeline and its empirical evaluation on a case study consisting of healthcare-related campaigns in the UK, showing how it supports the operational definition of online activism by comparing three experimental ranking functions. The code is publicly available.
    Comment: Procs. ICWE 2019, June 2019, Kore