1,263 research outputs found

    Signed Latent Factors for Spamming Activity Detection

    Due to the increasing trend of performing spamming activities (e.g., Web spam, deceptive reviews, fake followers) on various online platforms to gain undeserved benefits, spam detection has emerged as a hot research issue. Previous attempts to combat spam mainly employ features related to metadata, user behaviors, or relational ties. These works have made considerable progress in understanding and filtering spamming campaigns. However, the problem remains far from fully solved. Almost all the proposed features focus on a limited number of observed attributes or explainable phenomena, making it difficult for existing methods to improve further. To broaden the perspective on the spam problem and address long-standing challenges in the spam detection area (class imbalance and graph incompleteness), we propose utilizing signed latent factors to filter fraudulent activities. In this scenario, the spam-contaminated relational datasets of multiple online applications are interpreted by a unified signed network. Two competitive and highly dissimilar latent factors mining (LFM) models are designed, based on multi-relational likelihoods estimation (LFM-MRLE) and signed pairwise ranking (LFM-SPR), respectively. We then explore how to apply the mined latent factors to spam detection tasks. Experiments on real-world datasets from different kinds of Web applications (social media and Web forum) indicate that LFM models outperform state-of-the-art baselines in detecting spamming activities. By specifically manipulating experimental data, the effectiveness of our methods in dealing with incomplete and imbalanced data is validated.

    Link Prediction with Personalized Social Influence

    Link prediction in social networks aims to infer the new links likely to form next or to reconstruct links that are currently missing. Link prediction has attracted great interest recently, since one of the most important goals of social networks is to connect people so that they can interact with their real-world friends or make new friends through the Internet. Predicted links can therefore help people connect with each other. Beyond the pure topological network structure, social networks also hold rich information about each user's social activities, such as tweeting, retweeting, and replying. Social science theories, such as social influence, suggest that social activities can have an impact on neighbors, and that links in social networks result from the impacts taking place between different users. This motivates us to perform link prediction by taking advantage of activity information. Many methods have been proposed to measure social influence through user activity information. However, traditional methods assign social influence measures to users universally based on their social activities, such as the number of retweets or mentions the users have. But the social influence of one user on others may not remain the same across different neighbors, which demands a personalized learning schema. Moreover, learning social influence from heterogeneous social activities is a nontrivial problem, since the information carried in the social activities is implicit and sometimes even noisy. Motivated by time-series analysis, we investigate the potential of modeling influence patterns based on pure timestamps, i.e., we aim to simplify the problem of processing heterogeneous social activities to a sequence of timestamps. We then use timestamps as an abstraction of each activity to calculate the reduction in uncertainty about one user's social activities given knowledge of another's. The key idea is that, if a user i has an impact on another user j, then given the activity timestamps of user i, the uncertainty in user j’s activity timestamps could be reduced. The uncertainty is measured by entropy from information theory, which has proven useful for detecting significant influence flow in time-series signals in information-theoretic applications. By employing the proposed influence metric, we incorporate the social activity information into the network structure and learn a unified low-dimensional representation for all users. We can then perform link prediction effectively based on the learned representation. Through comprehensive experiments, we demonstrate that the proposed method can perform better than state-of-the-art methods in different real-world link prediction tasks.
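The idea of measuring influence as a reduction in entropy can be illustrated with a small sketch. It is not the metric from the abstract above, whose exact definition is not given here. It assumes a transfer-entropy-style quantity over activity indicators in fixed-width time bins, and the function names (`binarize`, `uncertainty_reduction`) and the bin width are hypothetical choices made for illustration.

```python
import math
from collections import Counter

def binarize(timestamps, start, end, width):
    """Turn raw timestamps into a 0/1 list: 1 if the user was active in that bin."""
    n_bins = int(math.ceil((end - start) / width))
    bins = [0] * n_bins
    for t in timestamps:
        if start <= t < end:
            bins[int((t - start) // width)] = 1
    return bins

def entropy(counter):
    """Shannon entropy (in bits) of the empirical distribution in a Counter."""
    total = sum(counter.values())
    return -sum(c / total * math.log2(c / total) for c in counter.values())

def conditional_entropy(pairs):
    """H(Y | X) = H(X, Y) - H(X), estimated from a list of (x, y) samples."""
    joint = Counter(pairs)
    marginal = Counter(x for x, _ in pairs)
    return entropy(joint) - entropy(marginal)

def uncertainty_reduction(source, target):
    """Assumed influence score of source on target:
    H(target_t | target_{t-1}) - H(target_t | target_{t-1}, source_{t-1})."""
    steps = range(1, len(target))
    without_source = [(target[t - 1], target[t]) for t in steps]
    with_source = [((target[t - 1], source[t - 1]), target[t]) for t in steps]
    return conditional_entropy(without_source) - conditional_entropy(with_source)

# Toy data: user j repeats user i's activity one bin later.
user_i = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
user_j = [0] + user_i[:-1]
print(uncertainty_reduction(user_i, user_j))
```

In this toy case, knowing user i's previous bin removes all remaining uncertainty about user j, so the score is positive. A source that is never active yields a score of zero. Scores computed for each ordered pair of neighbors would then be directional and pair-specific, which matches the personalized view of influence described above.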

    Fewer Flops at the Top: Accuracy, Diversity, and Regularization in Two-Class Collaborative Filtering

    In most existing recommender systems, implicit or explicit interactions are treated as positive links and all unknown interactions are treated as negative links. The goal is to suggest new links that will be perceived as positive by users. However, as signed social networks and newer content services become common, it is important to distinguish between positive and negative preferences. Even in existing applications, the cost of a negative recommendation can be high when people are looking for new jobs, friends, or places to live. In this work, we develop novel probabilistic latent factor models to recommend positive links and compare them with existing methods on five openly available datasets. Our models produce better ranking lists and are effective at ranking positive links at the top, with fewer negative links (flops). Moreover, we find that modeling signed social networks and user preferences this way has the advantage of increasing the diversity of recommendations. We also investigate the effect of regularization on the quality of recommendations, a matter that has not received enough attention in the literature. We find that the regularization parameter heavily affects the quality of recommendations in terms of both accuracy and diversity.

    Unbiased Learning for the Causal Effect of Recommendation

    Increasing users' positive interactions, such as purchases or clicks, is an important objective of recommender systems. Recommenders typically aim to select items that users will interact with. If the recommended items are purchased, an increase in sales is expected. However, the items could have been purchased even without recommendation. Thus, we want to recommend items that result in purchases caused by recommendation. This can be formulated as a ranking problem in terms of the causal effect. Despite its importance, this problem has not been well explored in the related research. It is challenging because the ground truth of the causal effect is unobservable, and estimating the causal effect is prone to bias arising from currently deployed recommenders. This paper proposes an unbiased learning framework for the causal effect of recommendation. Based on the inverse propensity scoring technique, the proposed framework first constructs unbiased estimators for ranking metrics. Then, it conducts empirical risk minimization on the estimators with propensity capping, which reduces variance under finite training samples. Based on the framework, we develop an unbiased learning method for the causal effect extension of a ranking metric. We theoretically analyze the unbiasedness of the proposed method and empirically demonstrate that it outperforms other biased learning methods in various settings. Comment: accepted at RecSys 2020, updated several experiments
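A minimal sketch may help show how inverse propensity scoring and propensity capping interact. It is a generic capped estimator of an average causal effect from logged data, not the ranking-metric estimator of the abstract above. The function name `capped_ips_effect`, the argument layout, and the default cap of 0.1 are assumptions made for illustration.

```python
def capped_ips_effect(outcomes, recommended, propensities, cap=0.1):
    """Capped inverse-propensity estimate of the average effect of recommending.

    outcomes     -- 1 if the item was purchased, else 0
    recommended  -- 1 if the item was recommended in the log, else 0
    propensities -- logged probability that the item was recommended
    cap          -- lower bound applied to every propensity (hypothetical default)
    """
    total = 0.0
    for y, z, p in zip(outcomes, recommended, propensities):
        if z:
            # Recommended case: weight by 1 / P(recommended), capped from below.
            total += y / max(p, cap)
        else:
            # Not-recommended case: weight by 1 / P(not recommended), capped.
            total -= y / max(1.0 - p, cap)
    return total / len(outcomes)

# Toy log: four impressions, each recommended with probability 0.5.
print(capped_ips_effect([1, 0, 1, 1], [1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5]))
```

Without the cap, a single logged purchase with propensity 0.01 would carry a weight of 100. With `cap=0.1` its weight is limited to 10, trading a small bias for lower variance when the training sample is finite.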
