
    An Accuracy-Assured Privacy-Preserving Recommender System for Internet Commerce

    Recommender systems, tools for predicting users' potential preferences from historical data and users' interests, are increasingly important in Internet applications such as online shopping. As a well-known recommendation method, neighbourhood-based collaborative filtering has attracted considerable attention recently. The risk of revealing users' private information during the filtering process has also attracted noticeable research interest. Among the current solutions, probabilistic techniques have shown a powerful privacy-preserving effect. However, when facing the k Nearest Neighbour (kNN) attack, none of the existing methods provides a data utility guarantee, because they introduce global randomness. In this paper, to overcome the problem of recommendation accuracy loss, we propose a novel approach, Partitioned Probabilistic Neighbour Selection, to ensure a required prediction accuracy while maintaining high security against the kNN attack. We define the sum of the k neighbours' similarity as the accuracy metric alpha, and the number of user partitions, across which we select the k neighbours, as the security metric beta. We generalise the k Nearest Neighbour attack to the beta k Nearest Neighbours attack. Differing from the existing approach that selects neighbours across the entire candidate list randomly, our method selects neighbours from each exclusive partition of size k with a decreasing probability. Theoretical and experimental analysis show that, to provide an accuracy-assured recommendation, our Partitioned Probabilistic Neighbour Selection method yields a better trade-off between recommendation accuracy and system security. Comment: replacement for the previous version
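A minimal sketch of the partitioned selection idea described in this abstract, not the authors' implementation. It ranks candidates by similarity, cuts the ranking into exclusive partitions of size k, and draws the k neighbours from the first beta partitions, favouring earlier partitions. The geometric `decay` weighting, the function names, and the dictionary input format are all assumptions made for illustration; the abstract does not state the actual probabilities.

```python
import random


def partitioned_neighbour_selection(similarities, k, beta, decay=0.5, rng=None):
    """Pick k neighbours from the top `beta` partitions of size k.

    similarities: dict mapping candidate id -> similarity to the target user.
    decay: hypothetical geometric factor; partition i is weighted decay**i,
           so earlier (more similar) partitions are chosen more often.
    """
    rng = rng or random.Random()
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    # Exclusive partitions of size k over the ranked candidate list,
    # truncated to the first `beta` partitions.
    pools = [ranked[i:i + k] for i in range(0, len(ranked), k)][:beta]
    chosen = []
    while len(chosen) < k and any(pools):
        live = [i for i, pool in enumerate(pools) if pool]
        weights = [decay ** i for i in live]
        idx = rng.choices(live, weights=weights, k=1)[0]
        pool = pools[idx]
        # Remove the pick so no neighbour is selected twice.
        chosen.append(pool.pop(rng.randrange(len(pool))))
    return chosen


def accuracy_alpha(similarities, neighbours):
    """Accuracy metric alpha: the sum of the selected neighbours' similarities."""
    return sum(similarities[n] for n in neighbours)
```

With `beta=1` the sketch reduces to plain kNN selection, which maximises alpha. Larger `beta` spreads the choice over more partitions, trading some alpha for less predictable neighbour sets.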

    Recommending with an Agenda: Active Learning of Private Attributes using Matrix Factorization

    Recommender systems leverage user demographic information, such as age, gender, etc., to personalize recommendations and better place their targeted ads. Oftentimes, users do not volunteer this information due to privacy concerns, or due to a lack of initiative in filling out their online profiles. We illustrate a new threat in which a recommender learns private attributes of users who do not voluntarily disclose them. We design both passive and active attacks that solicit ratings for strategically selected items, and could thus be used by a recommender system to pursue this hidden agenda. Our methods are based on a novel usage of Bayesian matrix factorization in an active learning setting. Evaluations on multiple datasets illustrate that such attacks are indeed feasible and use significantly fewer rated items than static inference methods. Importantly, they succeed without sacrificing the quality of recommendations to users. Comment: This is the extended version of a paper that appeared in ACM RecSys 201

    Probabilistic Perspectives on Collecting Human Uncertainty in Predictive Data Mining

    In many areas of data mining, data is collected from human beings. In this contribution, we ask how people actually respond to ordinal scales. The main problem observed is that users tend to be volatile in their choices, i.e. complex cognitions do not always lead to the same decisions, but to distributions of possible decision outputs. This human uncertainty can have a considerable impact on common data mining approaches, and thus the question of how to model it effectively emerges naturally. Our contribution introduces two different approaches for modelling the human uncertainty of user responses. In doing so, we develop techniques to measure this uncertainty at the level of user inputs as well as at the level of user cognition. Supported by comprehensive user experiments and large-scale simulations, we systematically compare both methodologies along with their implications for personalisation approaches. Our findings demonstrate that a significant number of users submit something completely different (action) from what they really have in mind (cognition). Moreover, we demonstrate that statistically sound evidence with respect to algorithm assessment becomes quite hard to obtain, especially when explicit rankings are to be built.
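As a rough illustration of why volatile responses complicate algorithm assessment, the sketch below treats each submitted rating as a draw from a distribution around a latent opinion. It is not one of the two models from the paper. The Gaussian noise, the 1 to 5 scale, the volatility value of 0.6, and the "recommender" being scored are all assumptions chosen for the example.

```python
import random
import statistics


def submit_rating(cognition, volatility, rng, scale=(1, 5)):
    """Draw one submitted rating around the latent opinion, snapped to the ordinal scale."""
    lo, hi = scale
    raw = rng.gauss(cognition, volatility)
    return min(hi, max(lo, round(raw)))


def rmse(predictions, ratings):
    """Root mean squared error between predictions and submitted ratings."""
    return (sum((p - r) ** 2 for p, r in zip(predictions, ratings)) / len(ratings)) ** 0.5


rng = random.Random(42)
# Latent opinions (cognition) for 200 hypothetical user-item pairs.
cognitions = [rng.uniform(1, 5) for _ in range(200)]
# A hypothetical recommender whose predictions stay fixed across trials.
predictions = [c + rng.gauss(0, 0.3) for c in cognitions]

# Re-collect the same ratings 50 times; only the human noise changes.
scores = []
for _ in range(50):
    ratings = [submit_rating(c, 0.6, rng) for c in cognitions]
    scores.append(rmse(predictions, ratings))

# Spread of the evaluation score caused purely by response volatility.
spread = statistics.stdev(scores)
```

Under these assumptions the same predictions receive a different RMSE on every re-collection, so two algorithms whose scores differ by less than `spread` cannot be ranked reliably from a single round of ratings.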

    Collaborative Deep Learning for Recommender Systems

    Collaborative filtering (CF) is a successful approach commonly used by many recommender systems. Conventional CF-based methods use the ratings given to items by users as the sole source of information for learning to make recommendations. However, the ratings are often very sparse in many applications, causing CF-based methods to degrade significantly in their recommendation performance. To address this sparsity problem, auxiliary information such as item content information may be utilized. Collaborative topic regression (CTR) is an appealing recent method that takes this approach, tightly coupling the two components that learn from two different sources of information. Nevertheless, the latent representation learned by CTR may not be very effective when the auxiliary information is very sparse. To address this problem, we generalize recent advances in deep learning from i.i.d. input to non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian model called collaborative deep learning (CDL), which jointly performs deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix. Extensive experiments on three real-world datasets from different domains show that CDL can significantly advance the state of the art.