2 research outputs found

    Noisy but non-malicious user detection in social recommender systems

    Social recommender systems rely largely on user-contributed data to infer users' preferences. While this feature has enabled many interesting applications in social networking services, it also makes recommenders unreliable, because users can insert data freely. Although detecting malicious attacks from social spammers has been studied for years, little work has been done on detecting Noisy but Non-Malicious Users (NNMUs), that is, genuine users who provide some untruthful data because of their imperfect behavior. Unlike colluding malicious attackers, who can be detected by finding similarly behaving user profiles, NNMUs are harder to identify because their profiles are neither similar to nor correlated with one another. In this article, we study how to detect NNMUs in social recommender systems. Based on the assumption that the ratings the same user gives to closely correlated items should be similar, we propose an effective method for NNMU detection that captures and accumulates a user's "self-contradictions", i.e., cases in which a user gives very different ratings to closely correlated items. We show that capturing self-contradictions can be formulated as a constrained quadratic optimization problem w.r.t. a set of slack variables, which can then be used to quantify the underlying noise in each test user profile. We test the proposed method empirically on three real-world data sets. The experimental results show that our method (i) is effective in real-world NNMU detection scenarios, (ii) significantly outperforms other noisy-user detection methods, and (iii) improves recommendation performance for other users once the detected NNMUs are removed from the recommender system. © 2012 Springer Science+Business Media, LLC
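The self-contradiction idea in this abstract can be illustrated with a much simpler heuristic than the paper's constrained quadratic program. The sketch below is only an assumption-laden stand-in: it averages a user's absolute rating gaps over pairs of closely correlated items. The function name `self_contradiction_score`, the `item_sim` lookup table, and the `sim_threshold` value are all hypothetical and do not come from the paper.

```python
from itertools import combinations


def self_contradiction_score(user_ratings, item_sim, sim_threshold=0.8):
    """Mean absolute rating gap over closely correlated item pairs.

    user_ratings: dict mapping item id -> rating given by one user.
    item_sim: dict mapping frozenset({a, b}) -> item-item similarity
              (hypothetical, assumed to be precomputed elsewhere).
    A simplified heuristic, NOT the slack-variable formulation of the paper.
    """
    gaps = []
    for a, b in combinations(sorted(user_ratings), 2):
        # Only pairs of closely correlated items can produce a contradiction.
        if item_sim.get(frozenset((a, b)), 0.0) >= sim_threshold:
            gaps.append(abs(user_ratings[a] - user_ratings[b]))
    # A user with no rated pair of correlated items gets a neutral score.
    return sum(gaps) / len(gaps) if gaps else 0.0


# Toy data: items A/B and C/D are closely correlated, A/C are not.
item_sim = {
    frozenset(("A", "B")): 0.9,
    frozenset(("C", "D")): 0.85,
    frozenset(("A", "C")): 0.1,
}
consistent_user = {"A": 5, "B": 5, "C": 1}
noisy_user = {"A": 5, "B": 1, "C": 2, "D": 2}

consistent_score = self_contradiction_score(consistent_user, item_sim)
noisy_score = self_contradiction_score(noisy_user, item_sim)
```

Under these assumptions a higher score suggests a noisier profile; the paper instead accumulates slack variables from an optimization problem, which this sketch does not attempt to reproduce.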

    Considerando o ruído no aprendizado de modelos preditivos robustos para a filtragem colaborativa (Considering noise when learning robust predictive models for collaborative filtering)

    In recommender systems, the inconsistencies introduced by a user are called natural noise. These inconsistencies degrade the overall performance of the recommender. So far, data cleansing proposals have emerged that identify inconsistent ratings and correct them. However, approaches that take noise into account during learning achieve higher quality. In this context, procedures for modifying the cost function have arisen, such that minimizing the modified function on noisy data yields the same solution as minimizing the original function on noiseless data. These procedures, however, depend on prior knowledge of the noise distribution, and estimating that distribution requires certain assumptions about the data. These conditions are not satisfied in collaborative filtering. In this work, we propose using these cost functions to build a predictive model that accounts for noise during learning. In addition, we present: (a) a class-noise generation heuristic for collaborative filtering problems; (b) an analysis of the amount of noise in the datasets; (c) a robustness analysis of predictive models. To validate the proposal, the three datasets most representative of the problem were selected and used to compare against state-of-the-art methods. Our results indicate that the proposal achieves higher prediction quality than the other methods on all datasets and remains competitively robust even when compared with a model that knows the noise generator a priori. Finally, this opens a new direction for methods that incorporate noise into the learning process of predictive models for collaborative filtering, and research in this direction should be considered.
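The cost-function modification described in this abstract can be illustrated with the standard unbiased-estimator correction for binary labels under class-conditional noise. This is a sketch under stated assumptions, not the thesis's model: labels are taken to be in {-1, +1}, the flip rates `rho_pos` and `rho_neg` are assumed to be known (the very prior knowledge the abstract says is unavailable in collaborative filtering), and the function names are hypothetical.

```python
import math


def logistic_loss(score, label):
    """Logistic loss for a real-valued score and a label in {-1, +1}."""
    return math.log1p(math.exp(-label * score))


def noise_corrected_loss(score, noisy_label, rho_pos, rho_neg, loss=logistic_loss):
    """Modified loss whose expectation over label noise equals the clean loss.

    rho_pos: assumed probability that a true +1 label is flipped to -1.
    rho_neg: assumed probability that a true -1 label is flipped to +1.
    """
    if rho_pos + rho_neg >= 1.0:
        raise ValueError("flip rates must satisfy rho_pos + rho_neg < 1")
    # Flip rate of the observed class and of the opposite class.
    rho_same = rho_pos if noisy_label == 1 else rho_neg
    rho_other = rho_neg if noisy_label == 1 else rho_pos
    numerator = ((1.0 - rho_other) * loss(score, noisy_label)
                 - rho_same * loss(score, -noisy_label))
    return numerator / (1.0 - rho_pos - rho_neg)


# Check the defining property for a true +1 label: averaging the corrected
# loss over the noisy label recovers the loss on the clean label.
score, rho_pos, rho_neg = 0.7, 0.2, 0.1
expected_corrected = ((1.0 - rho_pos) * noise_corrected_loss(score, 1, rho_pos, rho_neg)
                      + rho_pos * noise_corrected_loss(score, -1, rho_pos, rho_neg))
clean = logistic_loss(score, 1)
```

Because the corrected loss matches the clean loss in expectation, minimizing it on noisy data targets the same solution as minimizing the original loss on noiseless data, which is the property the abstract refers to. Ratings in collaborative filtering are usually multi-valued, so a real model would need a multi-class analogue.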