
    Understanding Shilling Attacks and Their Detection Traits: A Comprehensive Survey

    The internet is home to huge volumes of useful data that are constantly being created, making it difficult for users to find information relevant to them. A recommendation system is a special type of information filtering system adopted by online vendors to provide recommendations to their customers based on their requirements. Collaborative filtering is one of the most widely used recommendation approaches; unfortunately, it is prone to shilling/profile injection attacks. Such attacks alter the recommendation process to promote or demote a particular product. Over the years, multiple attack models and detection techniques have been developed to mitigate the problem. This paper aims to be a comprehensive survey of shilling attack models, detection attributes, and detection algorithms. Additionally, we unravel and classify the intrinsic traits of the injected profiles that are exploited by the detection algorithms, which has not been explored in previous works. We also briefly discuss recent work on the development of robust algorithms that alleviate the impact of shilling attacks, on attacks on multi-criteria systems, and on intrinsic feedback based collaborative filtering methods.
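To make the attack model concrete, here is a minimal sketch (in Python, with illustrative names and parameters, not taken from the survey) of the widely studied "average" shilling attack, in which an injected profile gives the target item the maximum rating while rating filler items near their observed means so the profile blends in with genuine users:

```python
import random

def build_average_attack_profile(item_means, target_item, filler_size, r_max=5):
    """Construct one 'average attack' shill profile: the target item gets the
    maximum rating, and a random set of filler items are rated close to their
    observed mean rating so the profile resembles a genuine user."""
    profile = {target_item: r_max}
    candidates = [i for i in item_means if i != target_item]
    for item in random.sample(candidates, filler_size):
        # Small Gaussian noise around the item mean; clamp to the rating scale.
        noisy = item_means[item] + random.gauss(0, 0.5)
        profile[item] = min(r_max, max(1, round(noisy)))
    return profile

# Illustrative item-mean statistics an attacker might estimate from public data.
item_means = {"i1": 3.2, "i2": 4.1, "i3": 2.5, "i4": 3.8, "i5": 1.9}
shill = build_average_attack_profile(item_means, target_item="i3", filler_size=3)
```

The closeness of filler ratings to item means is precisely the trait that makes such profiles hard to separate from genuine users, and it is one of the intrinsic traits detection attributes try to exploit.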

    Cross-System Personalization

    The World Wide Web provides access to a wealth of information and services to a huge and heterogeneous user population on a global scale. One important and successful design mechanism in dealing with this diversity of users is to personalize Web sites and services, i.e. to customize system content, characteristics, or appearance with respect to a specific user. Each system independently builds up user profiles and uses this information to personalize the service offering. Such isolated approaches have two major drawbacks: firstly, investments of users in personalizing a system either through explicit provision of information or through long and regular use are not transferable to other systems. Secondly, users have little or no control over the information that defines their profile, since user data are deeply buried in personalization engines running on the server side. Cross system personalization (CSP) (Mehta, Niederee, & Stewart, 2005) allows for sharing information across different information systems in a user-centric way and can overcome the aforementioned problems. Information about users, which is originally scattered across multiple systems, is combined to obtain maximum leverage and reuse of information. Our initial approaches to cross system personalization relied on each user having a unified profile which different systems can understand. The unified profile contains facets modeling aspects of a multidimensional user which is stored inside a "Context Passport" that the user carries along in his/her journey across information space. The user’s Context Passport is presented to a system, which can then understand the context in which the user wants to use the system. The basis of ’understanding’ in this approach is of a semantic nature, i.e. the semantics of the facets and dimensions of the unified profile are known, so that the latter can be aligned with the profiles maintained internally at a specific site.
The results of the personalization process are then transferred back to the user’s Context Passport via a protocol understood by both parties. The main challenge in this approach is to establish some common and globally accepted vocabulary and to create a standard every system will comply with. Machine learning techniques provide an alternative approach to enable CSP without the need for accepted semantic standards or ontologies. The key idea is that one can try to learn dependencies between profiles maintained within one system and profiles maintained within a second system, based on data provided by users who use both systems and who are willing to share their profiles across systems – which we assume is in the interest of the user. Here, instead of requiring a common semantic framework, it is only required that a sufficient number of users cross between systems and that there is enough regularity among users that one can learn within a user population, a fact that is commonly exploited in collaborative filtering. In this thesis, we aim to provide a principled approach towards achieving cross system personalization. We describe both semantic and learning approaches, with a stronger emphasis on the learning approach. We also investigate the privacy and scalability aspects of CSP and provide solutions to these problems. Finally, we also explore in detail the aspect of robustness in recommender systems. We motivate several approaches for robustifying collaborative filtering and provide the best performing algorithm for detecting malicious attacks reported so far.
The personalization of software systems is of steadily increasing importance, particularly in connection with Web applications such as search engines, community portals, or electronic commerce sites that address large, highly diversified user groups. Since explicit personalization typically involves considerable time and effort for the user, many applications fall back on implicit techniques for automatic personalization, in particular recommender systems, which typically use methods such as collaborative or social filtering. While these methods require no explicit construction of user profiles through answering questions and giving explicit feedback, the quality of the implicit personalization depends strongly on the volume of available data, such as transaction, query, or click logs. If, in this sense, little is known about a user, no reliable personal adaptations or recommendations can be made. The present dissertation addresses the question of how personalization can be enabled and supported across system boundaries ("cross-system"), discussing mainly implicit personalization techniques but, to a limited extent, also explicit approaches such as the semantic Context Passport. The dissertation thus treats an important research question of high practical relevance that has been solved only incompletely and unsatisfactorily in the recent scientific literature on this topic. Automatic recommender systems using social filtering techniques have been popularized since the mid-1990s with the advent of the first e-commerce wave, in particular through projects such as Information Tapestry, GroupLens, and Firefly. In the late 1990s and the early years of the following decade, the main focus of the research literature was on improved statistical methods and advanced inference techniques by which implicit observations can be mapped to concrete adaptation or recommendation actions. In recent years, questions of how personalization systems can be better adapted to the practical requirements of particular applications have come to the fore, in particular the suitable adaptation and extension of existing techniques. It is within this context that the present thesis positions itself.
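The learning-based CSP idea can be illustrated with a toy example. Assuming profiles are plain numeric vectors and the cross-system dependency happens to be linear, a mapping can be estimated by least squares from the users shared by both systems and then applied to a user known only to the first system (synthetic data; the thesis's actual models are richer):

```python
import numpy as np

# Sketch: learn a linear mapping W from profiles in system A to profiles in
# system B using the users present in both systems, then apply it to a new
# user known only to system A. All sizes and data here are synthetic.
rng = np.random.default_rng(0)
n_shared, dim_a, dim_b = 50, 8, 6

profiles_a = rng.normal(size=(n_shared, dim_a))  # shared users, system A
true_map = rng.normal(size=(dim_a, dim_b))       # unknown real dependency
profiles_b = profiles_a @ true_map               # the same users, system B

# Least-squares estimate of the cross-system mapping from the shared users.
W, *_ = np.linalg.lstsq(profiles_a, profiles_b, rcond=None)

new_user_a = rng.normal(size=dim_a)
predicted_b = new_user_a @ W  # personalized profile for the unseen system
```

With enough "crossing" users and enough regularity in the population, the mapping generalizes to users the second system has never seen, which is exactly the collaborative-filtering-style assumption the text describes.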

    Trust-Based Rating Prediction and Malicious Profile Detection in Online Social Recommender Systems

    Online social networks and recommender systems have become an effective channel for influencing millions of users by facilitating the exchange and spread of information. This dissertation addresses multiple challenges faced by online social recommender systems, namely: i) finding the extent of information spread; ii) predicting the rating of a product; and iii) detecting malicious profiles. Most research in this area does not capture social interactions and relies on empirical or statistical approaches without considering temporal aspects. We capture the temporal spread of information using a probabilistic model and use non-linear differential equations to model the diffusion process. To predict the rating of a product, we propose a social trust model and use the matrix factorization method to estimate a user's taste by incorporating the user-item rating matrix. The effect of the tastes of a user's friends is captured using a trust model based on similarities between users and their centralities. Similarity is modeled using the Vector Space Similarity and Pearson Correlation Coefficient algorithms, whereas degree, eigenvector, Katz, and PageRank are used to model centrality. As the rating of a product has tremendous influence on its saleability, social recommender systems are vulnerable to profile injection attacks that shift users' opinions towards favorable or unfavorable recommendations for a product. We propose a classification approach for detecting attackers based on attributes that estimate the likelihood that a user profile is that of an attacker. To evaluate the performance, we inject push and nuke attacks, and use precision and recall to identify the attackers. All proposed models have been validated using datasets from Facebook, Epinions, and Digg. Results show that the proposed models are able to better predict information spread and product ratings, and to identify malicious user profiles with high accuracy and low false positives.
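As an illustration of the similarity component only (the trust model and the centrality measures are not shown, and the names are illustrative), a minimal sketch of the Pearson Correlation Coefficient computed over co-rated items:

```python
from math import sqrt

def pearson_similarity(ratings_u, ratings_v):
    """Pearson Correlation Coefficient over items both users rated, one of
    the two similarity measures the abstract mentions (the other being
    Vector Space Similarity, i.e. cosine). Ratings are dicts item -> score."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0  # not enough overlap to correlate
    mu_u = sum(ratings_u[i] for i in common) / len(common)
    mu_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mu_u) * (ratings_v[i] - mu_v) for i in common)
    den = sqrt(sum((ratings_u[i] - mu_u) ** 2 for i in common)) * \
          sqrt(sum((ratings_v[i] - mu_v) ** 2 for i in common))
    return num / den if den else 0.0

alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3, "d": 5}
# alice and bob deviate identically from their own means on co-rated items,
# so their similarity is 1.0 despite bob rating everything one point lower
```

Centering on each user's mean is the design choice that distinguishes Pearson from plain cosine: it cancels out systematic differences in how harshly two users rate.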

    Contributions to outlier detection and recommendation systems

    Data mining, also called "knowledge discovery in databases", is a young interdisciplinary research field. Data mining studies the processes of analyzing large data sets to extract knowledge, and of transforming that knowledge into structures that are easy for humans to understand and use. This thesis studies two important data mining tasks: outlier detection and product recommendation. Outlier detection is the identification of data that do not conform to normal observations. Product recommendation is the prediction of a customer's level of interest in products, based on past purchase data and socio-economic data. More precisely, this thesis addresses 1) outlier detection in large categorical data sets; and 2) recommendation techniques for skewed rating data. Outlier detection in large-scale categorical data is an important problem that is far from solved. Existing methods in this field suffer from low efficiency and effectiveness due to the high dimensionality of the data, the large size of the databases, the high complexity of statistical tests, and inadequate proximity measures. This thesis proposes a formal definition of outliers in categorical data, as well as two effective and efficient algorithms for outlier detection in large data sets. These algorithms require only one parameter: the number of outliers. To determine the value of this parameter, we developed a criterion based on a new concept, holo-entropy.
Much previous research on recommender systems has neglected a type of rating that is widespread in Web applications such as e-commerce (e.g. Amazon, Taobao) and content-provider sites (e.g. YouTube). The rating data collected by these sites differ from movie and music ratings in their highly skewed distribution. This thesis proposes a framework better suited to estimating ratings and higher-order quantitative preferences for skewed rating data. The framework makes it possible to build new recommendation models based on matrix factorization or on neighborhood estimation. Experimental results on skewed data sets indicate that models built with this framework outperform conventional models not only for rating prediction, but also for prediction of the Top-N product list.
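To illustrate the entropy-based idea in the spirit of the holo-entropy criterion (not its exact formulation, which the thesis defines), the following sketch scores each categorical record by how much its removal reduces the summed per-attribute entropy of the dataset; the records whose removal reduces entropy most are the outlier candidates:

```python
from collections import Counter
from math import log2

def entropy_reduction_scores(records):
    """Score each categorical record (a tuple of attribute values) by how
    much deleting it would lower the total per-attribute Shannon entropy
    of the dataset. High score = removing the record makes the remaining
    data more regular, i.e. the record is an outlier candidate."""
    def total_entropy(rows):
        if not rows:
            return 0.0
        h = 0.0
        for col in range(len(rows[0])):
            counts = Counter(r[col] for r in rows)
            n = len(rows)
            h -= sum(c / n * log2(c / n) for c in counts.values())
        return h

    base = total_entropy(records)
    return [base - total_entropy(records[:i] + records[i + 1:])
            for i in range(len(records))]

data = [("red", "small")] * 6 + [("blue", "large")]
scores = entropy_reduction_scores(data)
# removing the lone ("blue", "large") record leaves a perfectly uniform
# dataset (entropy 0), so that record receives the highest score
```

This naive version recomputes the full entropy once per record; the efficiency concerns the abstract raises are exactly about avoiding such recomputation at scale.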

    Using Data Mining for Facilitating User Contributions in the Social Semantic Web

    This thesis utilizes recommender systems to aid the user in contributing to the Social Semantic Web. In this work, we propose a framework that maps domain properties to recommendation technologies. Next, we develop novel recommendation algorithms for improving personalized tag recommendation and for recommendation of semantic relations. Finally, we introduce a framework to analyze different types of potential attacks against social tagging systems and evaluate their impact on those systems.

    Blockchain-based recommender systems: Applications, challenges and future opportunities

    Recommender systems have been widely used in different application domains including energy preservation, e-commerce, healthcare, social media, etc. Such applications require the analysis and mining of massive amounts of various types of user data, including demographics, preferences, social interactions, etc., in order to develop accurate and precise recommender systems. Such datasets often include sensitive information, yet most recommender systems focus on model accuracy and ignore issues related to security and user privacy. Despite efforts to overcome these problems using different risk reduction techniques, none has been completely successful in ensuring cryptographic security and protection of users' private information. To bridge this gap, blockchain technology is presented as a promising strategy to promote security and privacy preservation in recommender systems, not only because of its salient security and privacy features, but also due to its resilience, adaptability, fault tolerance, and trust characteristics. This paper presents a holistic review of blockchain-based recommender systems covering challenges, open issues, and solutions. Accordingly, a well-designed taxonomy is introduced to describe the security and privacy challenges, overview existing frameworks, and discuss their applications and benefits when using blockchain, before indicating opportunities for future research. © 2021 Elsevier Inc. This paper was made possible by National Priorities Research Program (NPRP) grant No. 10-0130-170288 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.

    Addressing the Issues of Coalitions and Collusion in Multiagent Systems

    In the field of multiagent systems, trust and reputation systems are intended to assist agents in finding trustworthy partners with whom to interact. Earlier work of ours identified in theory a number of security vulnerabilities in trust and reputation systems, weaknesses that might be exploited by malicious agents to bypass the protections offered by such systems. In this work, we begin by developing the TREET testbed, a simulation platform that allows for extensive evaluation and flexible experimentation with trust and reputation technologies. We use this testbed to experimentally validate the practicality and gravity of attacks against vulnerabilities. Of particular interest are attacks that are collusive in nature: groups of agents (coalitions) working together to improve their expected rewards. But the issue of coalitions is not unique to trust and reputation; rather, it cuts across a range of fields in multiagent systems and beyond. In some scenarios, coalitions may be unwanted or forbidden; in others they may be benign or even desirable. In this document, we propose a method for detecting coalitions and identifying coalition members, a capability that is likely to be valuable in many of the diverse fields where coalitions may be of interest. Our method makes use of clustering in benefit space (a high-dimensional space reflecting how agents benefit others in the system) in order to identify groups of agents who benefit similar sets of agents. A statistical technique is then used to identify which clusters contain coalitions. Experimentation using the TREET platform verifies the effectiveness of this approach. A series of enhancements to our method are also introduced, which improve the accuracy and robustness of the algorithm. To demonstrate how this broadly-applicable tool can be used to address domain-specific problems, we focus again on trust and reputation systems. 
We show how, by incorporating our work into one such system (the existing Beta Reputation System), we can provide resistance to collusion. We conclude with a detailed discussion of the value of our work for a wide range of environments, including a variety of multiagent systems and real-world settings.
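A toy sketch of the benefit-space idea: each agent is represented by a vector recording how much it has benefited every other agent (e.g. positive ratings given), so colluders who concentrate benefits on the same accomplices land close together and a clustering step can expose them. The data, distance threshold, and single-pass clustering below are illustrative, not the thesis's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

# 8 honest agents with diffuse benefit patterns over 5 beneficiaries,
# plus 3 colluders whose benefit vectors are concentrated and near-identical.
honest = rng.uniform(0, 10, size=(8, 5))
coalition = np.tile([5.0, 5.0, 0.0, 0.0, 0.0], (3, 1)) \
    + rng.normal(0, 0.1, size=(3, 5))
agents = np.vstack([honest, coalition])

def cluster(points, eps=1.0):
    """Greedy single-pass clustering: agents whose benefit vectors lie
    within distance eps of a cluster seed share that cluster's label."""
    labels = [-1] * len(points)
    next_label = 0
    for i in range(len(points)):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        for j in range(i + 1, len(points)):
            if labels[j] == -1 and np.linalg.norm(points[i] - points[j]) < eps:
                labels[j] = next_label
        next_label += 1
    return labels

labels = cluster(agents)
# the three coalition agents (indices 8, 9, 10) end up in one shared cluster,
# while the honest agents' diffuse vectors each form singleton clusters
```

In the thesis this clustering is followed by a statistical test to decide which clusters actually contain coalitions, since tight clusters can also arise benignly.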

    Combating Threats to the Quality of Information in Social Systems

    Many large-scale social systems such as Web-based social networks, online social media sites and Web-scale crowdsourcing systems have been growing rapidly, enabling millions of human participants to generate, share and consume content on a massive scale. This reliance on users can lead to many positive effects, including large-scale growth in the size and content in the community, bottom-up discovery of “citizen-experts”, serendipitous discovery of new resources beyond the scope of the system designers, and new social-based information search and retrieval algorithms. But the relative openness and reliance on users coupled with the widespread interest and growth of these social systems carries risks and raises growing concerns over the quality of information in these systems. In this dissertation research, we focus on countering threats to the quality of information in self-managing social systems. Concretely, we identify three classes of threats to these systems: (i) content pollution by social spammers, (ii) coordinated campaigns for strategic manipulation, and (iii) threats to collective attention. To combat these threats, we propose three inter-related methods for detecting evidence of these threats, mitigating their impact, and improving the quality of information in social systems. We augment this three-fold defense with an exploration of their origins in “crowdturfing” – a sinister counterpart to the enormous positive opportunities of crowdsourcing. In particular, this dissertation research makes four unique contributions:
    ‱ The first contribution of this dissertation research is a framework for detecting and filtering social spammers and content polluters in social systems. To detect and filter individual social spammers and content polluters, we propose and evaluate a novel social honeypot-based approach.
    ‱ Second, we present a set of methods and algorithms for detecting coordinated campaigns in large-scale social systems. We propose and evaluate a content-driven framework for effectively linking free text posts with common “talking points” and extracting campaigns from large-scale social systems.
    ‱ Third, we present a dual study of the robustness of social systems to collective attention threats through both a data-driven modeling approach and deployment over a real system trace. We evaluate the effectiveness of countermeasures deployed based on the first moments of a bursting phenomenon in a real system.
    ‱ Finally, we study the underlying ecosystem of crowdturfing for engaging in each of the three threat types. We present a framework for “pulling back the curtain” on crowdturfers to reveal their underlying ecosystem on both crowdsourcing sites and social media.