92 research outputs found

    IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models

    This paper provides a unified account of two schools of thinking in information retrieval modelling: the generative retrieval focusing on predicting relevant documents given a query, and the discriminative retrieval focusing on predicting relevancy given a query-document pair. We propose a game theoretical minimax game to iteratively optimise both models. On one hand, the discriminative model, aiming to mine signals from labelled and unlabelled data, provides guidance to train the generative model towards fitting the underlying relevance distribution over documents given the query. On the other hand, the generative model, acting as an attacker to the current discriminative model, generates difficult examples for the discriminative model in an adversarial way by minimising its discrimination objective. With the competition between these two models, we show that the unified framework takes advantage of both schools of thinking: (i) the generative model learns to fit the relevance distribution over documents via the signals from the discriminative model, and (ii) the discriminative model is able to exploit the unlabelled data selected by the generative model to achieve a better estimation for document ranking. Our experimental results have demonstrated significant performance gains as much as 23.96% on Precision@5 and 15.50% on MAP over strong baselines in a variety of applications including web search, item recommendation, and question answering. Comment: 12 pages; appendix adde
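
The minimax game described in the abstract above can be written down in the usual adversarial form. The following is a sketch only: the symbols are assumptions that do not appear in the abstract, with $q_n$ the $n$-th query, $p_{\text{true}}(d \mid q_n, r)$ the underlying relevance distribution over documents, $p_{\theta}(d \mid q_n, r)$ the generative retrieval model, and $D(d \mid q_n)$ the discriminator's estimated probability that document $d$ is relevant to $q_n$ (for instance a sigmoid of a scoring function $f_{\phi}(d, q_n)$).

```latex
J^{G^*, D^*} = \min_{\theta} \max_{\phi} \sum_{n=1}^{N} \Big(
  \mathbb{E}_{d \sim p_{\text{true}}(d \mid q_n, r)} \big[ \log D(d \mid q_n) \big]
  + \mathbb{E}_{d \sim p_{\theta}(d \mid q_n, r)} \big[ \log \big( 1 - D(d \mid q_n) \big) \big]
\Big)
```

Under this reading, the discriminator parameters $\phi$ are updated to maximise the objective, while the generator parameters $\theta$ are updated to minimise it, which corresponds to the generator selecting documents that the current discriminator finds hard to tell apart from truly relevant ones.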

    A scalable recommender system : using latent topics and alternating least squares techniques

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. A recommender system is one of the major techniques for handling the information overload problem in Information Retrieval. It improves access to information and proactively recommends relevant items to each user, based on their preferences and objectives. During the implementation and planning phases, designers have to cope with several issues and challenges that need proper attention. This thesis aims to show the issues and challenges in developing high-quality recommender systems. It addresses a current research problem in the field of job recommendations using a distributed algorithmic framework built on top of Spark for parallel computation, which allows the algorithm to scale linearly with the growing number of users. The final solution consists of two different recommenders which could be utilised for different purposes. The first method is mainly driven by latent topics among users, while the second technique utilises a latent factor algorithm that directly addresses the preference-confidence paradigm.
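
The preference-confidence paradigm mentioned in the abstract above is commonly associated with alternating least squares (ALS) on implicit feedback, where each interaction count yields a binary preference and a confidence weight. The following is a minimal single-machine NumPy sketch of that general idea, not the thesis's Spark implementation. The function names `implicit_als` and `weighted_loss`, the linear confidence rule `1 + alpha * r`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def weighted_loss(R, X, Y, alpha, reg):
    """Confidence-weighted squared error plus L2 regularisation."""
    P = (R > 0).astype(float)   # binary preference: 1 if any interaction
    C = 1.0 + alpha * R         # assumed confidence rule, grows with the count
    err = P - X @ Y.T
    return float(np.sum(C * err ** 2) + reg * (np.sum(X ** 2) + np.sum(Y ** 2)))


def implicit_als(R, factors=2, alpha=40.0, reg=0.1, iterations=10, seed=0):
    """Alternate exact ridge-regression solves for user and item factors."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = (R > 0).astype(float)
    C = 1.0 + alpha * R
    X = rng.normal(scale=0.1, size=(n_users, factors))  # user factors
    Y = rng.normal(scale=0.1, size=(n_items, factors))  # item factors
    ridge = reg * np.eye(factors)
    for _ in range(iterations):
        # Fix item factors, solve a weighted least-squares problem per user.
        for u in range(n_users):
            YtC = Y.T * C[u]    # scales column i of Y.T by confidence c_ui
            X[u] = np.linalg.solve(YtC @ Y + ridge, YtC @ P[u])
        # Fix user factors, solve a weighted least-squares problem per item.
        for i in range(n_items):
            XtC = X.T * C[:, i]
            Y[i] = np.linalg.solve(XtC @ X + ridge, XtC @ P[:, i])
    return X, Y


# Toy interaction counts: two groups of users with disjoint item interests.
R = np.array([[5, 3, 0, 0],
              [4, 2, 0, 0],
              [0, 0, 4, 5],
              [0, 0, 3, 4]], dtype=float)
X, Y = implicit_als(R)
scores = X @ Y.T  # predicted preference for every user-item pair
```

Because each per-user and per-item solve is independent of the others within a half-step, this structure is what makes the method straightforward to parallelise across a cluster; the sketch above runs the solves sequentially for clarity.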

    BLC: Private Matrix Factorization Recommenders via Automatic Group Learning

    We propose a privacy-enhanced matrix factorization recommender that exploits the fact that users can often be grouped together by interest. This allows a form of “hiding in the crowd” privacy. We introduce a novel matrix factorization approach suited to making recommendations in a shared group (or “nym”) setting and the BLC algorithm for carrying out this matrix factorization in a privacy-enhanced manner. We demonstrate that the increased privacy does not come at the cost of reduced recommendation accuracy

    soMLier: A South African Wine Recommender System

    Though several commercial wine recommender systems exist, they are largely tailored to consumers outside of South Africa (SA). Consequently, these systems are of limited use to novice wine consumers in SA. To address this, the aim of this research is to develop a system for South African consumers that yields high-quality wine recommendations, maximises the accuracy of predicted ratings for those recommendations and provides insights into why those suggestions were made. To achieve this, a hybrid system “soMLier” (pronounced “sommelier”) is built in this thesis that makes use of two datasets. Firstly, a database containing several attributes of South African wines such as the chemical composition, style, aroma, price and description was supplied by wine.co.za (a SA wine retailer). Secondly, for each wine in that database, the numeric 5-star ratings and textual reviews made by users worldwide were further scraped from Vivino.com to serve as a dataset of user preferences. Together, these are used to develop and compare several systems, the most optimal of which are combined in the final system. Item-based collaborative filtering methods are investigated first along with model-based techniques (such as matrix factorisation and neural networks) when applied to the user rating dataset to generate wine recommendations through the ranking of rating predictions. Respectively, these methods are determined to excel at generating lists of relevant wine recommendations and producing accurate corresponding predicted ratings. Next, the wine attribute data is used to explore the efficacy of content-based systems. Numeric features (such as price) are compared along with categorical features (such as style) using various distance measures and the relationships between the textual descriptions of the wines are determined using natural language processing methods. These methods are found to be most appropriate for explaining wine recommendations. 
    Hence, the final hybrid system makes use of collaborative filtering to generate recommendations, matrix factorisation to predict user ratings, and content-based techniques to rationalise the wine suggestions made. This thesis contributes the “soMLier” system that is of specific use to SA wine consumers as it bridges the gap between the technologies used by highly-developed existing systems and the SA wine market. Though this final system would benefit from more explicit user data to establish a richer model of user preferences, it can ultimately assist consumers in exploring unfamiliar wines, discovering wines they will likely enjoy, and understanding their preferences for SA wine.

    Collaborative personalised dynamic faceted search

    Information retrieval systems face challenges due to the overwhelming volume of information available online. This creates a need for search features that can provide relevant information to searchers. Dynamic faceted search is one potential tool: it provides a list of multiple facets that searchers can use to filter content. However, because the system is dynamic, it can produce irrelevant or unimportant facets. To develop an effective dynamic faceted search, personalised facet selection is an important mechanism for creating an appropriate personalised facet list. Most current systems derive searchers' interests from their own profiles. However, past interests may not be adequate to predict current interests, owing to human information-seeking behaviour. An alternative way to develop personalisation, called the Collaborative approach, incorporates the current interests reflected in other people's opinions to predict the interests of an individual. This research aims to investigate the incorporation of a Collaborative approach to personalise facet selection. This study introduces the Artificial Neural Network (ANN)-based collaborative personalisation architecture framework and the Relation-aware Collaborative AutoEncoder model (RCAE) with an embedding methodology for modelling and predicting interests in multiple facets. The study showed that incorporating the collaborative approach into the proposed framework for facet selection can enhance the performance of the personalisation model in facet selection in comparison with state-of-the-art techniques.

    Improving cold-start recommendations using item-based stereotypes

    Recommender systems (RSs) have become key components driving the success of e-commerce and other platforms where revenue and customer satisfaction are dependent on the user’s ability to discover desirable items in large catalogues. As the number of users and items on a platform grows, the computational complexity and the sparsity problem constitute important challenges for any recommendation algorithm. In addition, the most widely studied filtering-based RSs, while effective in providing suggestions for established users and items, are known for their poor performance for the new user and new item (cold-start) problems. Stereotypical modelling of users and items is a promising approach to solving these problems. A stereotype represents an aggregation of the characteristics of the items or users which can be used to create general user or item classes. We propose a set of methodologies for the automatic generation of stereotypes to address the cold-start problem. The novelty of the proposed approach rests on the findings that stereotypes built independently of the user-to-item ratings improve both recommendation metrics and computational performance during cold-start phases. The resulting RS can be used with any machine learning algorithm as a solver, and the improved performance gains due to rate-agnostic stereotypes are orthogonal to the gains obtained using more sophisticated solvers. The paper describes how such item-based stereotypes can be evaluated via a series of statistical tests prior to being used for recommendation. The proposed approach improves recommendation quality under a variety of metrics and significantly reduces the dimension of the recommendation model.