
    Protecting attributes and contents in online social networks

    With the extreme popularity of online social networks, security and privacy issues have become critical. In particular, it is important to protect users' privacy without preventing them from socializing normally. User privacy in the context of data publishing and structural re-identification attacks has been well studied. However, protection of attributes and data content has been mostly neglected in the research community. While social network data is rarely published, billions of messages are shared in various social networks on a daily basis. Therefore, it is more important to protect attributes and textual content in social networks. We first study the vulnerabilities of user attributes and content, in particular the identifiability of users when the adversary learns a small piece of information about the target. We present two attribute re-identification attacks that exploit information retrieval and web search techniques. We show that a large portion of users with an online presence are highly identifiable, even from a small piece of seed information, and even when that seed information is inaccurate. To protect user attributes and content, we adopt the social circle model derived from the concepts of "privacy as user perception" and "information boundary". Users have different social circles and share different information in different circles. We introduce a social circle discovery approach using multi-view clustering. We present our observations on the key features of social circles, including friendship links, content similarity and social interactions. We treat each feature as one view and propose a one-side co-trained spectral clustering technique, which is tailored to the sparse nature of our data. We also propose two evaluation measurements. One is based on the quantitative measure of similarity ratio, while the other employs human evaluators to examine pairs of users, who are selected by the max-risk active evaluation approach. We evaluate our approach on ego networks of Twitter users and present our clustering results. We also compare our proposed clustering technique with single-view clustering and the original co-trained spectral clustering technique. Our results show that multi-view clustering is more accurate for social circle detection, and that our proposed approach achieves a significantly higher similarity ratio than the original multi-view clustering approach. In addition, we build a proof-of-concept implementation of automatic circle detection and recommendation methods. For a given user, the system returns the circle detection result from our proposed multi-view clustering technique, along with the keywords for each circle. Users can also enter a message they want to post, and the system will suggest which circle to disseminate it to.

    Recommender Systems for Online and Mobile Social Networks: A survey

    Recommender Systems (RS) currently represent a fundamental tool in online services, especially with the advent of Online Social Networks (OSN). In this setting, users generate huge amounts of content and can quickly be overloaded by useless information. At the same time, social media represent an important source of information for characterizing content and users' interests. RS can exploit this information to further personalize suggestions and improve the recommendation process. In this paper we present a survey of Recommender Systems designed and implemented for Online and Mobile Social Networks, highlighting how the use of social context information improves the recommendation task, and how standard algorithms must be enhanced and optimized to run in a fully distributed environment, such as opportunistic networks. We describe the advantages and drawbacks of these systems in terms of algorithms, target domains, evaluation metrics and performance evaluations. Finally, we present some open research challenges in this area.

    Homophily and Contagion Are Generically Confounded in Observational Social Network Studies

    We consider processes on social networks that can potentially involve three factors: homophily, or the formation of social ties due to matching individual traits; social contagion, also known as social influence; and the causal effect of an individual's covariates on their behavior or other measurable responses. We show that, generically, all of these are confounded with each other. Distinguishing them from one another requires strong assumptions on the parametrization of the social process or on the adequacy of the covariates used (or both). In particular we demonstrate, with simple examples, that asymmetries in regression coefficients cannot identify causal effects, and that very simple models of imitation (a form of social contagion) can produce substantial correlations between an individual's enduring traits and their choices, even when there is no intrinsic affinity between them. We also suggest some possible constructive responses to these results. Comment: 27 pages, 9 figures. V2: Revised in response to referees. V3: Ditt

    How to Hide One's Relationships from Link Prediction Algorithms

    Our private connections can be exposed by link prediction algorithms. To date, this threat has only been addressed from the perspective of a central authority, completely neglecting the possibility that members of the social network can themselves mitigate such threats. We fill this gap by studying how an individual can rewire her own network neighborhood to hide her sensitive relationships. We prove that the optimization problem faced by such an individual is NP-complete, meaning that any attempt to identify an optimal way to hide one’s relationships is futile. Based on this, we shift our attention towards developing effective, albeit not optimal, heuristics that are readily applicable by users of existing social media platforms to conceal any connections they deem sensitive. Our empirical evaluation reveals that it is more beneficial to focus on “unfriending” carefully chosen individuals rather than befriending new ones. In fact, by avoiding communication with just 5 individuals, it is possible for one to hide some of her relationships in a massive, real-life telecommunication network, consisting of 829,725 phone calls between 248,763 individuals. Our analysis also shows that link prediction algorithms are more susceptible to manipulation in smaller and denser networks. Evaluating the error vs. attack tolerance of link prediction algorithms reveals that rewiring connections randomly may end up exposing one’s sensitive relationships, highlighting the importance of the strategic aspect. In an age where personal relationships continue to leave digital traces, our results empower the general public to proactively protect their private relationships. M.W. was supported by the Polish National Science Centre grant 2015/17/N/ST6/03686. T.P.M. was supported by the Polish National Science Centre grants 2016/23/B/ST6/03599 and 2014/13/B/ST6/01807 (for this and the previous versions of this article, respectively). Y.V. and K.Z. were supported by ARO MURI (grant #W911NF1810208). Y.V. was also supported by the U.S. National Science Foundation (CAREER award IIS-1905558 and grant IIS-1526860). E.M. acknowledges funding by Ministerio de Economía y Competitividad (Spain) through grant FIS2016-78904-C3-3-P.
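
    The "unfriending" idea in the abstract above can be illustrated with a minimal sketch. It assumes the common-neighbours score as the link prediction measure and a hand-made toy graph; the abstract does not say which predictors or heuristics the authors use, so the function names, the node names and the choice of which edge to drop are all hypothetical.

```python
def common_neighbours(adj, u, v):
    """Common-neighbours link prediction score for the pair (u, v)."""
    return len(adj[u] & adj[v])


def remove_edge(adj, a, b):
    """Return a copy of the graph with the undirected edge a-b removed."""
    new_adj = {node: set(nbrs) for node, nbrs in adj.items()}
    new_adj[a].discard(b)
    new_adj[b].discard(a)
    return new_adj


# Toy undirected graph (hypothetical data). The sensitive edge alice-bob is
# assumed to be already withheld, so a predictor must infer it from the rest.
graph = {
    "alice": {"carol", "dave", "erin"},
    "bob": {"carol", "dave"},
    "carol": {"alice", "bob"},
    "dave": {"alice", "bob"},
    "erin": {"alice"},
}

score_before = common_neighbours(graph, "alice", "bob")

# "Unfriend" one shared neighbour: alice drops her tie to carol.
rewired = remove_edge(graph, "alice", "carol")
score_after = common_neighbours(rewired, "alice", "bob")

print(score_before, score_after)
```

    In this toy case, dropping a tie to a shared neighbour lowers the score of the hidden pair, whereas an edge to a non-shared neighbour (here, erin) would leave it unchanged. That is one way to see why the choice of whom to unfriend matters.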

    Sparsity-aware neural user behavior modeling in online interaction platforms

    Modern online platforms offer users an opportunity to participate in a variety of content-creation, social networking, and shopping activities. With the rapid proliferation of such online services, learning data-driven user behavior models is indispensable to enable personalized user experiences. Recently, representation learning has emerged as an effective strategy for user modeling, powered by neural networks trained over large volumes of interaction data. Despite their enormous potential, we encounter the unique challenge of data sparsity for a vast majority of entities, e.g., sparsity in ground-truth labels for entities and in entity-level interactions (cold-start users, items in the long-tail, and ephemeral groups). In this dissertation, we develop generalizable neural representation learning frameworks for user behavior modeling designed to address different sparsity challenges across applications. Our problem settings span transductive and inductive learning scenarios, where transductive learning models entities seen during training and inductive learning targets entities that are only observed during inference. We leverage different facets of information reflecting user behavior (e.g., interconnectivity in social networks, temporal and attributed interaction information) to enable personalized inference at scale. Our proposed models are complementary to concurrent advances in neural architectural choices and are adaptive to the rapid addition of new applications in online platforms. First, we examine two transductive learning settings: inference and recommendation in graph-structured and bipartite user-item interactions. In chapter 3, we formulate user profiling in social platforms as semi-supervised learning over graphs given sparse ground-truth labels for node attributes. 
We present a graph neural network framework that exploits higher-order connectivity structures (network motifs) to learn attributed structural roles of nodes that identify structurally similar nodes with co-varying local attributes. In chapter 4, we design neural collaborative filtering models for few-shot recommendations over user-item interactions. To address item interaction sparsity due to heavy-tailed distributions, our proposed meta-learning framework learns to recommend few-shot items by knowledge transfer from arbitrary base recommenders. We show that our framework consistently outperforms state-of-the-art approaches on overall recommendation (by 5% Recall) while achieving significant gains (of 60-80% Recall) for tail items with fewer than 20 interactions. Next, we explore three inductive learning settings: modeling the spread of user-generated content in social networks; item recommendations for ephemeral groups; and friend ranking in large-scale social platforms. In chapter 5, we focus on diffusion prediction in social networks where a vast population of users rarely post content. We introduce a deep generative modeling framework that models users as probability distributions in the latent space with variational priors parameterized by graph neural networks. Our approach enables massive performance gains (over 150% recall) for users with sparse activities while being faster than state-of-the-art neural models by an order of magnitude. In chapter 6, we examine item recommendations for ephemeral groups with limited or no historical interactions together. To overcome group interaction sparsity, we present self-supervised learning strategies that exploit the preference co-variance in observed group memberships for group recommender training. Our framework achieves significant performance gains (over 30% NDCG) over prior state-of-the-art group recommendation models. 
In chapter 7, we introduce multi-modal inference with graph neural networks that captures knowledge from multiple feature modalities and user interactions for multi-faceted friend ranking. Our approach achieves notably higher performance gains for critical populations of less-active and low-degree users.

    Detecting Overlapping Communities in ISEBEL

    Real-world complex networks are evident in social networks, the internet, and biological networks. The complexity of most real-world networks gives rise to community structures, in which the network can be divided into subgroups based on statistical features or on how strongly the vertices are connected. A vertex may be a member of more than one community, resulting in overlapping communities in such a real-world network. The ISEBEL story network is another such complex network, containing a variety of folklore about werewolves, witches, and legends, which form communities with overlapping vertices. Many algorithms for detecting overlapping communities in real-world networks exist. In this work, we propose a framework built on Apache Spark, using the BigClam algorithm, that is able to detect overlapping communities in the ISEBEL dataset.
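
    As a rough illustration of the model behind BigClam, each vertex carries a vector of non-negative community affiliation weights, and two vertices are linked with probability 1 - exp(-F_u · F_v). The sketch below only evaluates that formula and reads off overlapping memberships from given weights. It does not fit the weights and does not use Apache Spark, and the story names, the weights and the membership threshold are hypothetical.

```python
import math


def edge_probability(f_u, f_v):
    """BigClam edge probability: 1 - exp(-(F_u . F_v))."""
    dot = sum(a * b for a, b in zip(f_u, f_v))
    return 1.0 - math.exp(-dot)


def memberships(affiliations, threshold):
    """Map each community index to the nodes whose weight reaches the threshold."""
    n_communities = len(next(iter(affiliations.values())))
    result = {c: set() for c in range(n_communities)}
    for node, weights in affiliations.items():
        for c, weight in enumerate(weights):
            if weight >= threshold:
                result[c].add(node)
    return result


# Hypothetical affiliation weights for three stories and two communities.
affiliations = {
    "story_a": [1.2, 0.0],
    "story_b": [0.9, 0.8],
    "story_c": [0.0, 1.5],
}

circles = memberships(affiliations, threshold=0.5)
p_ab = edge_probability(affiliations["story_a"], affiliations["story_b"])
p_ac = edge_probability(affiliations["story_a"], affiliations["story_c"])
print(circles, p_ab, p_ac)
```

    Here story_b belongs to both communities, which is the overlapping case the abstract describes, while story_a and story_c share no community and therefore get an edge probability of zero.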

    Statistical Challenges in Online Controlled Experiments: A Review of A/B Testing Methodology

    The rise of internet-based services and products in the late 1990s brought about an unprecedented opportunity for online businesses to engage in large-scale, data-driven decision making. Over the past two decades, organizations such as Airbnb, Alibaba, Amazon, Baidu, Booking, Alphabet's Google, LinkedIn, Lyft, Meta's Facebook, Microsoft, Netflix, Twitter, Uber, and Yandex have invested tremendous resources in online controlled experiments (OCEs) to assess the impact of innovation on their customers and businesses. Running OCEs at scale has presented a host of challenges requiring solutions from many domains. In this paper we review challenges that require new statistical methodologies. In particular, we discuss the practice and culture of online experimentation, as well as its statistics literature, placing the current methodologies within their relevant statistical lineages and providing illustrative examples of OCE applications. Our goal is to raise academic statisticians' awareness of these new research opportunities and to increase collaboration between academia and the online industry.