15 research outputs found

    Understanding Factors Influencing Users’ Retweeting Behavior---A Theoretical Perspective

    Currently, a large percentage of tweets on micro-blogging platforms are retweets. In this study, we propose to examine the factors that motivate users’ retweeting behavior, leading users to prefer forwarding others’ tweets over posting their own. We suggest that Information Sharing Self-Efficacy, Attachment Motivation and Critical Mass are the three antecedents contributing to users’ retweeting behavior. Both theoretical and practical implications of this study are also discussed.



    Identification of Online Users' Social Status via Mining User-Generated Data

    With the burst of available online user-generated data, identifying online users’ social status by mining that data can play a significant role in commercial applications, research and policy-making in many domains. Social status, the position of a person in relation to others within a society, is an abstract concept; its concrete definition depends on the indicator used to measure it. For example, opinion leadership measures individual social status in terms of influence and expertise in an online society, while socioeconomic status characterizes personal real-life social status based on social and economic factors. Because traditional survey methods are time-consuming, expensive and sometimes difficult to carry out, some efforts have been made to identify a specific social status of users from specific user-generated data using classic machine learning methods. However, each combination of social status and data source poses its own challenges, and the classic machine learning methods in existing work fail to address them, which leads to low identification accuracy. Given the importance of improving identification accuracy, this thesis studies three specific cases of identifying online and offline social status. For each case, this thesis proposes a novel, effective identification method that addresses the specific challenges in order to improve accuracy. The first work aims at identifying users’ online social status in terms of topic-sensitive influence and knowledge authority in social community question answering sites, namely identifying topical opinion leaders who are both influential and expert.
Social community question answering (SCQA) sites, an innovative kind of community question answering platform, not only offer traditional question answering (QA) services but also integrate an online social network where users can follow each other. Identifying topical opinion leaders in SCQA has become an important research area due to the significant role these leaders play. However, most previous related work either focuses on using knowledge expertise to find experts for improving the quality of answers, or aims at measuring user influence to identify influential users. In order to identify the true topical opinion leaders, we propose a topical opinion leader identification framework called QALeaderRank, which takes account of both topic-sensitive influence and topical knowledge expertise. In the proposed framework, to measure the topic-sensitive influence of each user, we design a novel influence measure algorithm that exploits both the social and QA features of SCQA, taking into account social network structure, topical similarity and knowledge authority. In addition, we propose three topic-relevant metrics to infer the topical expertise of each user. Extensive experiments along with an online user study show that the proposed QALeaderRank achieves significant improvement compared with the state-of-the-art methods. Furthermore, we analyze how users’ topic interests change over time and examine the predictability of user topic interest through experiments. The second work focuses on predicting individual socioeconomic status from mobile phone data. Socioeconomic status (SES) is an important social and economic attribute of wide concern. Assessing individual SES can assist related organizations in making a variety of policy decisions. The traditional approach suffers from the extremely high cost of collecting large-scale SES-related survey data.
With the ubiquity of smart phones, mobile phone data has become a novel data source for predicting individual SES at low cost. However, the task of predicting individual SES from mobile phone data also poses some new challenges, including sparse individual records, scarce explicit relationships and limited labeled samples, which were not addressed in prior work restricted to regional or household-oriented SES prediction. To address these issues, we propose a semi-supervised Hypergraph based Factor Graph Model (HyperFGM) for individual SES prediction. HyperFGM is able to efficiently capture the associations between SES and individual mobile phone records to handle the individual record sparsity. For the scarce explicit relationships, HyperFGM models implicit high-order relationships among users on the hypergraph structure. In addition, HyperFGM makes use of both the limited labeled data and the unlabeled data in a semi-supervised way. Experimental results on a set of anonymized real mobile phone data show that HyperFGM greatly outperforms the baseline methods on individual SES prediction. The third work is to predict social media users’ socioeconomic status based on their social media content, which is useful for related organizations and companies in a range of applications, such as economic and social policy-making. Previous work leverages manually defined textual features and platform-based user level attributes from social media content and feeds them into a machine learning based classifier for SES prediction. However, it ignores some important information in social media content, including the order and the hierarchical structure of social media text as well as the relationships among user level attributes.
To this end, we propose a novel coupled social media content representation model for individual SES prediction, which not only utilizes a hierarchical neural network to incorporate the order and the hierarchical structure of social media text but also employs a coupled attribute representation method to take into account intra-coupled and inter-coupled interaction relationships among user level attributes. The experimental results show that the proposed model significantly outperforms other state-of-the-art models on a real dataset, which validates the efficiency and robustness of the proposed model.

    User behavior in microblogs with a cultural emphasis

    The main objective of this thesis is to carry out a multidisciplinary study of the behavior of microblog users. To that end, we first explore several user behavior patterns employing data mining techniques. Then we use social science theories of culture and socio-economic indicators to better understand differences and similarities of user behavior across countries. We found several insights on user behavior: (i) social link recommendations made by current friends have a large effect on link formation, and the accepted recommendations have more longevity than other links; (ii) as users mature, they evolve to adopt microblogs as a news medium rather than a social network; (iii) the collective behavior of users from some countries stands out, based on certain special characteristics such as conversations, reciprocity, etc.; (iv) national culture determines the temporal patterns with which users post, or the extent to which they mention, follow, recommend and befriend others; and (v) socio-economic and cultural features improve the prediction of communication strength among users from different countries.

    Association Rules Mining among Interests and Applications for Users on Social Networks

    Interest is an important concept in psychology and pedagogy and is widely studied in many fields. Especially in recent years, the widespread use of many interest-based recommendation systems has greatly promoted research on interest modeling and mining on social networks. However, the existing studies have rarely tried to explore the relationships among interests and their application value, and most similar studies analyze user behavior data. In this paper, we propose and verify two hypotheses about the interests of social network users. We then use association rules to mine users' interests from LinkedIn users' profiles. Finally, based on the interest association rules and user interest distribution on Twitter, we design an approach to mine interests for Twitter users and conduct two experiments to systematically demonstrate the approach's effectiveness. According to our research, we found that there are a large number of association rules between human interests. These rules play a considerable role in our method of interest mining. Our research work not only provides new ideas for interest mining but also reveals the internal relationship between interest and its application value. The research work has certain theoretical and practical value

    Predictive Analysis on Twitter: Techniques and Applications

    Predictive analysis of social media data has attracted considerable attention from the research community as well as the business world because of the essential and actionable information it can provide. Over the years, extensive experimentation and analysis for insights have been carried out using Twitter data in various domains such as healthcare, public health, politics, social sciences, and demographics. In this chapter, we discuss techniques, approaches and state-of-the-art applications of predictive analysis of Twitter data. Specifically, we present fine-grained analysis involving aspects such as sentiment, emotion, and the use of domain knowledge in the coarse-grained analysis of Twitter data for making decisions and taking actions, and relate a few success stories

    Proceedings of the Making Sense of Microposts Workshop (#Microposts2015) at the World Wide Web Conference


    Mining microblogs for culture-awareness in web adaptation

    Prior studies in sociology and human-computer interaction indicate that persons from different countries and cultural origins tend to have their own preferences in real-life communication and in the usage of web and social media applications. Using Twitter data with statistical and machine learning tools, this study advances our understanding of microblogging with respect to cultural differences and demonstrates possible solutions for inferring and exploiting cultural origins to build adaptive web applications. Our findings reveal statistically significant differences in Twitter feature usage with respect to the geographic locations of users. These differences in microblogger behaviour, together with the user language defined in user profiles, enabled us to infer user country origins with an accuracy of more than 90%. The other origin prediction solutions we proposed do not require other data sources or human involvement for training the models, and achieve high accuracy of user country inference when exploiting information extracted from a user’s follower network or data derived from Twitter profiles. With origin predictive models, we analysed communication and privacy preferences and built a culture-aware recommender system. Our analysis of friend responses shows that Twitter users tend to communicate mostly within their cultural regions. Usage of privacy settings showed that privacy perceptions differ across cultures. Finally, we created and evaluated movie recommendation strategies considering user cultural groups, and addressed a cold-start scenario with a new user. We believe that the findings discussed give insights into sociological and web research, in particular on cultural differences in online communication.

    Social informatics

    5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings

    Information consumption on social media: efficiency, divisiveness, and trust

    Over the last decade, the advent of social media has profoundly changed the way people produce and consume information online. On these platforms, users themselves play a role in selecting the sources from which they consume information, overthrowing traditional journalistic gatekeeping. Moreover, advertisers can target users with news stories using users’ personal data. This new model has many advantages: the propagation of news is faster, the number of news sources is large, and the topics covered are diverse. However, in this new model, users are often overloaded with redundant information, and they can get trapped in filter bubbles by consuming divisive and potentially false information. To tackle these concerns, in my thesis, I address the following important questions: (i) How efficient are users at selecting their information sources? We have defined three intuitive notions of users’ efficiency in social media: link, in-flow, and delay efficiency. We use these three measures to assess how good users are at selecting whom to follow within the social media system in order to acquire information most efficiently. (ii) How can we break the filter bubbles that users get trapped in? Users on social media sites such as Twitter often get trapped in filter bubbles by being exposed to radical, highly partisan, or divisive information. To prevent users from getting trapped in filter bubbles, we propose an approach to inject diversity into users’ information consumption by identifying non-divisive yet informative information. (iii) How can we design an efficient framework for fact-checking? Proliferation of false information is a major problem in social media. To counter it, social media platforms typically rely on expert fact-checkers to detect false news. However, human fact-checkers can realistically only cover a tiny fraction of all stories. So, it is important to automatically prioritize and select a small number of stories for humans to fact-check.
However, the goals for prioritizing stories for fact-checking are unclear. We identify three desired objectives to prioritize news for fact-checking. These objectives are based on the users’ perception of truthfulness of stories. Our key finding is that these three objectives are incompatible in practice.