124,842 research outputs found

    Understanding Social Media Users via Attributes and Links

    With the rise of social media, hundreds of millions of people all over the globe spend countless hours on social media to connect, interact, share, and create user-generated data. This rich environment provides tremendous opportunities for many different players to easily and effectively reach out to people, interact with them, influence them, or get their opinions. Two pieces of information attract the most attention on social media sites: user preferences and interactions. Businesses and organizations use this information to better understand social media users and therefore provide them with customized services. This data can be used for different purposes, such as targeted advertising, product recommendation, or even opinion mining. Social media sites use this information to better serve their users. Despite the importance of personal information, in many cases people do not reveal this information to the public. Predicting the hidden or missing information is a common response to this challenge. In this thesis, we address the problem of predicting user attributes and future or missing links using an egocentric approach. The current research proposes novel concepts and approaches to better understand social media users in two respects: (a) their attributes, preferences, and interests, and (b) their future or missing connections and interactions. More specifically, the contributions of this dissertation are (1) proposing a framework to study social media users through their attributes and link information, (2) proposing a scalable algorithm to predict user preferences, and (3) proposing a novel approach to predict attributes and links with limited information. The proposed algorithms use an egocentric approach to improve the state-of-the-art algorithms in two directions: first, by improving the prediction accuracy, and second, by increasing the scalability of the algorithms.
    Dissertation/Thesis. Doctoral Dissertation, Computer Science, 201

    Identification of Online Users' Social Status via Mining User-Generated Data

    With the burst of available online user-generated data, identifying online users’ social status by mining that data can play a significant role in commercial applications, research, and policy-making in many domains. Social status, an abstract concept, refers to the position of a person in relation to others within a society. Its concrete definition depends on the indicator used to measure it. For example, opinion leadership measures individual social status in terms of influence and expertise in an online society, while socioeconomic status characterizes real-life social status based on social and economic factors. As an alternative to the traditional survey method, which is time-consuming, expensive, and sometimes difficult, some efforts have been made to identify a specific social status of users from specific user-generated data using classic machine learning methods. However, each combination of social status and data source poses its own challenges, and the classic machine learning methods in existing work fail to address them, which leads to low identification accuracy. Given the importance of improving identification accuracy, this thesis studies three specific cases of identifying online and offline social status. For each, it proposes a novel and effective identification method that addresses the specific challenges to improve accuracy. The first work aims at identifying users’ online social status in terms of topic-sensitive influence and knowledge authority in social community question answering sites, namely identifying topical opinion leaders who are both influential and expert.
A social community question answering (SCQA) site, an innovative community question answering platform, not only offers traditional question answering (QA) services but also integrates an online social network where users can follow each other. Identifying topical opinion leaders in SCQA has become an important research area due to the significant role these leaders play. However, most previous related work either focuses on using knowledge expertise to find experts for improving the quality of answers, or aims at measuring user influence to identify influential users. In order to identify the true topical opinion leaders, we propose a topical opinion leader identification framework called QALeaderRank, which takes account of both topic-sensitive influence and topical knowledge expertise. In the proposed framework, to measure the topic-sensitive influence of each user, we design a novel influence measure algorithm that exploits both the social and QA features of SCQA, taking into account social network structure, topical similarity, and knowledge authority. In addition, we propose three topic-relevant metrics to infer the topical expertise of each user. Extensive experiments along with an online user study show that the proposed QALeaderRank achieves significant improvement compared with the state-of-the-art methods. Furthermore, we analyze how users’ topic interests change over time and examine the predictability of user topic interest through experiments. The second work focuses on predicting individual socioeconomic status from mobile phone data. Socioeconomic status (SES) is an important social and economic indicator of wide concern. Assessing individual SES can assist related organizations in making a variety of policy decisions. The traditional approach suffers from the extremely high cost of collecting large-scale SES-related survey data.
With the ubiquity of smartphones, mobile phone data has become a novel data source for predicting individual SES at low cost. However, the task of predicting individual SES from mobile phone data also poses some new challenges, including sparse individual records, scarce explicit relationships, and limited labeled samples, none of which were addressed in prior work, which was restricted to regional or household-oriented SES prediction. To address these issues, we propose a semi-supervised Hypergraph based Factor Graph Model (HyperFGM) for individual SES prediction. HyperFGM is able to efficiently capture the associations between SES and individual mobile phone records to handle the individual record sparsity. For the scarce explicit relationships, HyperFGM models implicit high-order relationships among users on the hypergraph structure. In addition, HyperFGM makes use of the limited labeled data together with unlabeled data in a semi-supervised way. Experimental results on a set of anonymized real mobile phone data show that HyperFGM greatly outperforms the baseline methods on individual SES prediction. The third work is to predict social media users’ socioeconomic status based on their social media content, which is useful for related organizations and companies in a range of applications, such as economic and social policy-making. Previous work leverages manually defined textual features and platform-based user-level attributes from social media content and feeds them into a machine learning based classifier for SES prediction. However, it ignores some important information in social media content, including the order and the hierarchical structure of social media text as well as the relationships among user-level attributes.
To this end, we propose a novel coupled social media content representation model for individual SES prediction, which not only utilizes a hierarchical neural network to incorporate the order and the hierarchical structure of social media text but also employs a coupled attribute representation method to take into account intra-coupled and inter-coupled interaction relationships among user-level attributes. The experimental results show that the proposed model significantly outperforms other state-of-the-art models on a real dataset, which validates the efficiency and robustness of the proposed model.

    MUFFLE: Multi-Modal Fake News Influence Estimator on Twitter

    To alleviate the impact of fake news on our society, predicting the popularity of fake news posts on social media is a crucial problem worthy of study. However, most related studies on fake news emphasize detection only. In this paper, we focus on the issue of fake news influence prediction, i.e., inferring how popular a fake news post might become on social platforms. To achieve our goal, we propose a comprehensive framework, MUFFLE, which captures multi-modal dynamics by encoding the representation of news-related social networks, user characteristics, and content in text. The attention mechanism developed in the model can provide explainability for social or psychological analysis. To examine the effectiveness of MUFFLE, we conducted extensive experiments on real-world datasets. The experimental results show that our proposed method outperforms both state-of-the-art methods of popularity prediction and machine-based baselines in top-k NDCG and hit rate. Through the experiments, we also analyze the feature importance for predicting fake news influence via the explainability provided by MUFFLE.

    A Prediction Model for Initial Trust Formation in B2C eCommerce

    This study investigates initial trust formation with an unknown online company. Based on data collected from 628 respondents, the results indicate significant direct effects for trust in the Internet infrastructure, susceptibility to the social influence of media, and the presence of influential site characteristics, on user willingness to provide personal information to unknown Internet firms. This study extends the research on trust in e-commerce by providing a prediction model that is demonstrated to calculate the probability of user willingness to provide information. The utility of the model for identifying the relative importance of factors and predicting outcomes lends insight into important issues in online trust formation.

    Study of Approaches to Predict Personality Using Digital Twin

    With a growing proportion of online activity taking place on social networking sites such as Facebook, Instagram, Twitter, and LinkedIn, the need for personality prediction based on this online behavior has also increased significantly. User-generated content on social media can be effectively leveraged to record, analyze, and predict personality through different psychological models such as MBTI, Big Five, and DISC. Personality prediction has shown influence in many domains, such as career choice, political influence, brand inclination, customized advertising, improving learning outcomes, and recommender system algorithms. The objective of this paper is to provide an overview of the different strategies used by researchers to predict personality based on social media usage and user-generated content across prominent social media platforms. It was observed that personality traits can be accurately inferred from user behavior reflected on social media through attributes such as statuses posted, pictures uploaded, number of friends, groups joined, network density, and liked content. As of now, Facebook, followed by Twitter, is the most prominent social media platform for conducting such studies; however, the use of other social media platforms such as Instagram and LinkedIn for personality prediction studies is expected to increase exponentially.

    Influence Level Prediction on Social Media through Multi-Task and Sociolinguistic User Characteristics Modeling

    Prediction of a user’s influence level on social networks has attracted a lot of attention as human interactions move online. Influential users have the ability to influence others’ behavior to achieve their own agenda. As a result, predicting users’ level of influence online can help to understand social networks, forecast trends, prevent misinformation, etc. The research on user influence in social networks has attracted much attention across multiple disciplines, from social sciences to mathematics, yet it is still not well understood. One of the difficulties is that the definition of influence is specific to a particular problem or a domain, and it does not generalize well. Another challenge arises from the fact that all user interactions occur through text. Textual data limits access to non-verbal communication such as voice. These facts make the problem challenging. In this work, we define user influence level as a function of community endorsement, create a strong baseline, and develop new methods that significantly outperform our baseline by leveraging demographic and personality data. This dissertation is divided into three parts. In part one, we introduce the problem of influence level prediction, review influential research across different disciplines, and introduce our hypothesis that leverages user-centric information to improve user influence level prediction on social media. In part two, we answer the question of whether language provides sufficient information to predict user-related information. We develop new methods that achieve good results on three tasks: relationship prediction, demographic prediction, and hedge sentence detection. In part three, we introduce our dataset, a new ranking algorithm, RankDCG, to assess the performance of ranking problems, and develop new user-centric models for user influence level prediction.
These models show significant improvements across eight different domains ranging from politics and news to fitness.