3,717 research outputs found

    Recommender system to support comprehensive exploration of large scale scientific datasets

    Databases for scientific entities, such as chemical compounds, diseases, and astronomical objects, are growing in size and complexity, reaching billions of items per database. Researchers need new tools to help them choose among these items. This work proposes using Recommender Systems (RS) to help researchers find items of interest. We identified the lack of standard, open-access datasets with information about user preferences as one of the major challenges for applying RS in scientific fields. To overcome this challenge, we developed a methodology called LIBRETTI (LIterature Based RecommEndaTion of scienTific Items), whose goal is to create datasets related to scientific fields. These datasets are built from the scientific literature, the major knowledge resource that science has. The LIBRETTI methodology allowed the development and testing of new recommender algorithms specific to each field. Besides LIBRETTI, the main contributions of this thesis are standard and sequence-aware recommendation datasets in the fields of Astronomy, Chemistry, and Health (related to COVID-19); a hybrid semantic recommender system for chemical compounds in large-scale datasets; a hybrid approach based on sequential enrichment (SeEn) for sequence-aware recommendations; and a multi-field semantic-based pipeline for recommending biomedical entities related to COVID-19.

    Scalable and interpretable product recommendations via overlapping co-clustering

    We consider the problem of generating interpretable recommendations by identifying overlapping co-clusters of clients and products, based only on positive or implicit feedback. Our approach is applicable to very large datasets because its complexity is almost linear in the number of input examples and the number of co-clusters. We show, both on real industrial data and on publicly available datasets, that the recommendation accuracy of our algorithm is competitive with that of state-of-the-art matrix factorization techniques. In addition, our technique has the advantage of offering recommendations that are textually and visually interpretable. Finally, we examine how to implement our technique efficiently on Graphical Processing Units (GPUs). Comment: In IEEE International Conference on Data Engineering (ICDE) 201

    A Systematic Literature Review of Linked Data-based Recommender Systems

    Recommender Systems (RS) are software tools that use analytic technologies to suggest items of interest to an end user. Linked Data is a set of best practices for publishing and connecting structured data on the Web. This paper presents a systematic literature review that summarizes the state of the art in recommender systems that use structured data published as Linked Data to recommend items from diverse domains. It considers the most relevant research problems addressed and classifies RS according to how Linked Data has been used to provide recommendations. Furthermore, it analyzes contributions, limitations, application domains, evaluation techniques, and directions proposed for future research. We found that many open challenges remain before RS based on Linked Data can be efficient in real applications. The main ones are personalization of recommendations; use of more datasets, considering the heterogeneity this introduces; creation of new hybrid RS for adding information; definition of more advanced similarity measures that take into account the large amount of data in Linked Data datasets; and implementation of testbeds to study evaluation techniques and to assess the accuracy, scalability, and computational complexity of RS.

    Biases in scholarly recommender systems: impact, prevalence, and mitigation

    We create a simulated financial market and examine the effect of different levels of active and passive investment on fundamental market efficiency. In our simulated market, active, passive, and random investors interact with each other by issuing orders. Active and passive investors select their portfolio weights by optimizing Markowitz-based utility functions. We find that higher fractions of active investment within a market lead to increased fundamental market efficiency. The marginal increase in fundamental market efficiency per additional active investor is lower in markets with higher levels of active investment. Furthermore, we find that a large fraction of passive investors within a market may facilitate technical price bubbles, resulting in market failure. By examining the effect of specific parameters on market outcomes, we find that lower transaction costs, lower individual forecasting errors of active investors, and less restrictive portfolio constraints tend to increase fundamental market efficiency.