
    Modelling and analysis of temporal preference drifts using a component-based factorised latent approach

    In recommender systems, human preferences are identified by a number of individual components with complicated interactions and properties. Recently, the dynamicity of preferences has been the focus of several studies. Changes in user preferences can originate from substantial reasons, like a personality shift, or from transient and circumstantial ones, like seasonal changes in item popularity. Disregarding these temporal drifts when modelling user preferences can result in unhelpful recommendations. Moreover, different temporal patterns can be associated with different preference domains, and with preference components and their combinations. These components comprise preferences over features, preferences over feature values, conditional dependencies between features, socially-influenced preferences, and bias. For example, in the movies domain, a user can change their rating behaviour (bias shift), change their preference for genre over language (feature preference shift), or start favouring drama over comedy (feature value preference shift). In this paper, we propose a novel latent factor model to capture the domain-dependent, component-specific temporal patterns in preferences. The component-based approach followed in modelling the aspects of preferences and their temporal effects enables us to arbitrarily switch components on and off. We evaluate the proposed method on three popular recommendation datasets and show that it significantly outperforms the most accurate state-of-the-art static models. The experiments also demonstrate the greater robustness and stability of the proposed dynamic model in comparison with the most successful models to date. We also analyse the temporal behaviour of different preference components and their combinations and show that the dynamic behaviour of preference components is highly dependent on the preference dataset and domain. The results therefore highlight the importance of modelling temporal effects and underline the advantages of a component-based architecture, which is better suited to capturing domain-specific balances in the contributions of the preference aspects.
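The idea of switching preference components on and off can be illustrated with a toy predictor. This is only a minimal sketch under stated assumptions, not the model proposed in the paper: the class name `ToyPreferenceModel`, the component names, and the linear form of the bias drift are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPreferenceModel:
    """Hypothetical toy model: rating = global mean + enabled components."""

    global_mean: float
    user_bias: dict        # user -> static rating bias
    user_bias_drift: dict  # user -> bias change per time unit (assumed linear)
    user_factors: dict     # user -> list of latent factors
    item_factors: dict     # item -> list of latent factors
    # Names of the components that contribute to the prediction.
    enabled: set = field(
        default_factory=lambda: {"bias", "bias_drift", "factors"}
    )

    def predict(self, user, item, t):
        """Predict a rating for (user, item) at time t."""
        score = self.global_mean
        if "bias" in self.enabled:
            score += self.user_bias.get(user, 0.0)
        if "bias_drift" in self.enabled:
            # Temporal component: the user's bias shifts linearly with time.
            score += self.user_bias_drift.get(user, 0.0) * t
        if "factors" in self.enabled:
            # Static latent-factor interaction (dot product).
            p = self.user_factors.get(user, [])
            q = self.item_factors.get(item, [])
            score += sum(a * b for a, b in zip(p, q))
        return score
```

Restricting `enabled` to a subset such as `{"bias"}` drops the other terms from the prediction, which is one simple way to compare how much each component contributes on a given dataset.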

    Improving average ranking precision in user searches for biomedical research datasets

    Availability of research datasets is a keystone for health and life science study reproducibility and scientific progress. Due to the heterogeneity and complexity of these data, a main challenge to be overcome by research data management systems is to provide users with the best answers for their search queries. In the context of the 2016 bioCADDIE Dataset Retrieval Challenge, we investigate a novel ranking pipeline to improve the search of datasets used in biomedical experiments. Our system comprises a query expansion model based on word embeddings, a similarity measure algorithm that takes into consideration the relevance of the query terms, and a dataset categorisation method that boosts the rank of datasets matching query constraints. The system was evaluated using a corpus with 800k datasets and 21 annotated user queries. Our system provides competitive results when compared to the other challenge participants. In the official run, it achieved the highest infAP among the participants, 22.3% higher than the median infAP of the participants' best submissions. Overall, it is ranked in the top 2 if an aggregated metric using the best official measures per participant is considered. The query expansion method had a positive impact on the system's performance, improving on our baseline by up to 5.0% and 3.4% for the infAP and infNDCG metrics, respectively. Our similarity measure algorithm seems to be robust, in particular compared to the Divergence From Randomness framework, showing smaller performance variations under different training conditions. Finally, the result categorisation did not have a significant impact on the system's performance. We believe that our solution could be used to enhance biomedical dataset management systems. In particular, the use of data-driven query expansion methods could be an alternative to the complexity of biomedical terminologies.
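Query expansion based on word embeddings can be sketched as adding, for each query term, its nearest neighbours in the embedding space. The following is a minimal illustration and not the challenge system itself: the function names, the cosine-similarity criterion, the parameters `k` and `min_sim`, and the tiny embedding table in the usage example are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_query(terms, embeddings, k=2, min_sim=0.7):
    """Append up to k embedding neighbours per term (hypothetical helper).

    terms      -- list of query terms
    embeddings -- dict mapping word -> vector (assumed to be pre-trained)
    k          -- maximum number of neighbours added per term
    min_sim    -- minimum cosine similarity for a neighbour to be added
    """
    expanded = list(terms)
    for term in terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are kept but not expanded
        scored = [
            (cosine(embeddings[term], vec), word)
            for word, vec in embeddings.items()
            if word != term
        ]
        scored.sort(reverse=True)  # most similar neighbours first
        for sim, word in scored[:k]:
            if sim >= min_sim and word not in expanded:
                expanded.append(word)
    return expanded
```

With a toy table such as `{"cancer": [1.0, 0.0], "tumor": [0.9, 0.1], "mouse": [0.0, 1.0]}`, calling `expand_query(["cancer"], table, k=1)` adds `"tumor"` because it is the closest vector, while `"mouse"` falls below the similarity threshold.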