Trust-Based Rating Prediction and Malicious Profile Detection in Online Social Recommender Systems
Online social networks and recommender systems have become an effective channel for influencing millions of users by facilitating the exchange and spread of information. This dissertation addresses multiple challenges faced by online social recommender systems: i) finding the extent of information spread; ii) predicting the rating of a product; and iii) detecting malicious profiles. Most research in this area does not capture social interactions and relies on empirical or statistical approaches without considering temporal aspects. We capture the temporal spread of information using a probabilistic model and use non-linear differential equations to model the diffusion process. To predict the rating of a product, we propose a social trust model and use matrix factorization on the user-item rating matrix to estimate a user's taste. The effect of the tastes of a user's friends is captured using a trust model based on similarities between users and their centralities. Similarity is modeled using Vector Space Similarity and the Pearson Correlation Coefficient, whereas degree, eigenvector, Katz, and PageRank measures are used to model centrality. Because a product's rating has tremendous influence on its saleability, social recommender systems are vulnerable to profile injection attacks that push users' opinions toward favorable or unfavorable recommendations for a product. We propose a classification approach for detecting attackers based on attributes that indicate the likelihood that a user profile belongs to an attacker. To evaluate performance, we inject push and nuke attacks and use precision and recall to measure how well attackers are identified. All proposed models have been validated using datasets from Facebook, Epinions, and Digg. Results show that the proposed models better predict information spread and product ratings, and identify malicious user profiles with high accuracy and few false positives.
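The two similarity measures named in this abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the dictionary-based rating format, and the choice to compare users only on items both have rated are assumptions made for illustration.

```python
import math


def vector_space_similarity(a, b):
    """Cosine of the angle between two users' rating vectors.

    Assumption: only items rated by both users are compared.
    `a` and `b` map item ids to numeric ratings.
    """
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation coefficient over co-rated items.

    Returns 0.0 when fewer than two items are shared or when
    either user's co-rated ratings have zero variance.
    """
    common = set(a) & set(b)
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Hypothetical ratings for two users on a 1-5 scale.
alice = {"i1": 5, "i2": 3, "i3": 4}
bob = {"i1": 4, "i2": 2, "i3": 3, "i4": 5}

print(vector_space_similarity(alice, bob))
print(pearson_correlation(alice, bob))
```

Pearson correlation subtracts each user's mean rating first, so two users whose ratings differ only by a constant offset still count as perfectly similar, while the cosine measure is sensitive to that offset.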
Robust Recommender System: A Survey and Future Directions
With the rapid growth of information, recommender systems have become
integral for providing personalized suggestions and overcoming information
overload. However, their practical deployment often encounters "dirty" data,
where noise or malicious information can lead to abnormal recommendations.
Research on improving recommender systems' robustness against such dirty data
has thus gained significant attention. This survey provides a comprehensive
review of recent work on recommender systems' robustness. We first present a
taxonomy to organize current techniques for withstanding malicious attacks and
natural noise. We then explore state-of-the-art methods in each category,
including fraudster detection, adversarial training, certifiable robust
training against malicious attacks, and regularization, purification, and
self-supervised learning against natural noise. Additionally, we summarize
evaluation metrics and common datasets used to assess robustness. We discuss
robustness across varying recommendation scenarios and its interplay with other
properties like accuracy, interpretability, privacy, and fairness. Finally, we
delve into open issues and future research directions in this emerging field.
Our goal is to equip readers with a holistic understanding of robust
recommender systems and spotlight pathways for future research and development
Contributions to outlier detection and recommendation systems
Data mining, also known as "knowledge discovery in databases", is a young interdisciplinary research field. Data mining studies the processes of analyzing large data sets to extract knowledge, and the processes of transforming that knowledge into structures that are easy for humans to understand and use. This thesis studies two important tasks in data mining: outlier detection and product recommendation. Outlier detection is the identification of data that do not conform to normal observations. Product recommendation is the prediction of a customer's level of interest in products based on past purchase data and socio-economic data. More precisely, this thesis addresses 1) outlier detection in large categorical data sets; and 2) recommendation techniques for skewed rating data. Outlier detection in large-scale categorical data is an important problem that is far from solved. Existing methods in this area suffer from low efficiency and effectiveness because of the high dimensionality of the data, the large size of the databases, the high complexity of statistical tests, and inadequate proximity measures. This thesis proposes a formal definition of an outlier in categorical data, along with two effective and efficient algorithms for outlier detection in large data sets. These algorithms require a single parameter: the number of outliers. To determine the value of this parameter, we developed a criterion based on a new concept, holo-entropy.
Many previous studies on recommender systems have neglected a type of rating that is widespread in Web applications such as e-commerce (e.g., Amazon, Taobao) and content provider sites (e.g., YouTube). The rating data collected by these sites differ from movie and music rating data in their highly skewed distribution. This thesis proposes a better-suited framework for estimating ratings and higher-order quantitative preferences for skewed rating data. The framework makes it possible to build new recommendation models based on matrix factorization or neighborhood estimation. Experimental results on skewed data sets indicate that models built with this framework outperform conventional models not only for rating prediction but also for Top-N product list prediction.
Acquisition Data Analytics for Supply Chain Cybersecurity
Acquisition Research Program Sponsored Report Series. Sponsored Acquisition Research & Technical Reports. Cybersecurity is a national priority, but the analysis required for acquisition personnel to objectively assess the integrity of the supply chain for cyber compromise is highly complex. This paper presents a process for supply chain data analytics for acquisition decision makers, addressing data collection, assessment, and reporting. The method includes workflows from initial purchase request through vendor selection and maintenance to audits across the lifecycle of an asset. Artificial intelligence can help acquisition decision makers automate the complexity of supply chain information assurance. Approved for public release; distribution is unlimited.
Understanding and Mitigating Multi-sided Exposure Bias in Recommender Systems
Fairness is a critical system-level objective in recommender systems that has
been the subject of extensive recent research. It is especially important in
multi-sided recommendation platforms where it may be crucial to optimize
utilities not just for the end user, but also for other actors such as item
sellers or producers who desire a fair representation of their items. Existing
solutions do not properly address various aspects of multi-sided fairness in
recommendations as they may either solely have one-sided view (i.e. improving
the fairness only for one side), or do not appropriately measure the fairness
for each actor involved in the system. In this thesis, I first aim to
investigate the impact of unfair recommendations on the system and how these
unfair recommendations can negatively affect major actors in the system. Then,
I seek to propose solutions to tackle the unfairness of recommendations. I
propose a rating transformation technique that works as a pre-processing step
before building the recommendation model to alleviate the inherent popularity
bias in the input data and consequently to mitigate the exposure unfairness for
items and suppliers in the recommendation lists. Also, as another solution, I
propose a general graph-based solution that works as a post-processing approach
after recommendation generation for mitigating the multi-sided exposure bias in
the recommendation results. For evaluation, I introduce several metrics for
measuring the exposure fairness for items and suppliers, and show that these
metrics better capture the fairness properties in the recommendation results. I
perform extensive experiments to evaluate the effectiveness of the proposed
solutions. The experiments on different publicly-available datasets and
comparison with various baselines confirm the superiority of the proposed
solutions in improving the exposure fairness for items and suppliers. Comment: Doctoral thesi
RecAD: Towards A Unified Library for Recommender Attack and Defense
In recent years, recommender systems have become a ubiquitous part of our
daily lives, yet they face a high risk of attack because of their
growing commercial and social value. Despite significant research progress in
recommender attack and defense, there is a lack of a widely-recognized
benchmarking standard in the field, leading to unfair performance comparison
and limited credibility of experiments. To address this, we propose RecAD, a
unified library aiming at establishing an open benchmark for recommender attack
and defense. RecAD takes an initial step to set up a unified benchmarking
pipeline for reproducible research by integrating diverse datasets, standard
source codes, hyper-parameter settings, running logs, attack knowledge, attack
budget, and evaluation results. The benchmark is designed to be comprehensive
and sustainable, covering attack, defense, and evaluation tasks, enabling
more researchers to easily follow and contribute to this promising field. RecAD
will drive more solid and reproducible research on recommender systems attack
and defense, reduce the redundant efforts of researchers, and ultimately
increase the credibility and practical value of recommender attack and defense.
The project is released at https://github.com/gusye1234/recad
Understanding User Intent Modeling for Conversational Recommender Systems: A Systematic Literature Review
Context: User intent modeling is a crucial process in Natural Language
Processing that aims to identify the underlying purpose behind a user's
request, enabling personalized responses. With a vast array of approaches
introduced in the literature (over 13,000 papers in the last decade),
understanding the related concepts and commonly used models in AI-based systems
is essential. Method: We conducted a systematic literature review to gather
data on models typically employed in designing conversational recommender
systems. From the collected data, we developed a decision model to assist
researchers in selecting the most suitable models for their systems.
Additionally, we performed two case studies to evaluate the effectiveness of
our proposed decision model. Results: Our study analyzed 59 distinct models and
identified 74 commonly used features. We provided insights into potential model
combinations, trends in model selection, quality concerns, evaluation measures,
and frequently used datasets for training and evaluating these models.
Contribution: Our study contributes practical insights and a comprehensive
understanding of user intent modeling, empowering the development of more
effective and personalized conversational recommender systems. With the
Conversational Recommender System, researchers can perform a more systematic
and efficient assessment of fitting intent modeling frameworks