Deriving item features relevance from collaborative domain knowledge
An Item based recommender system works by computing a similarity between
items, which can exploit past user interactions (collaborative filtering) or
item features (content based filtering). Collaborative algorithms have been
proven to achieve better recommendation quality than content based algorithms
in a variety of scenarios, being more effective in modeling user behaviour.
However, they cannot be applied when items have no interactions at all, i.e.
cold start items. Content based algorithms, which are applicable to cold start
items, often require a lot of feature engineering in order to generate useful
recommendations. This issue is specifically relevant as the content descriptors
become large and heterogeneous. The focus of this paper is on how to use a
collaborative model's domain-specific knowledge to build a wrapper feature
weighting method which embeds collaborative knowledge in a content based
algorithm. We present a comparative study of different state-of-the-art
algorithms and present a more general model. This machine learning approach to
feature weighting shows promising results and high flexibility.
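As a minimal sketch of how feature weights can enter a content based item similarity, the example below computes a feature-weighted cosine similarity between items. The toy item-feature matrix, the hand-picked weight vector and the function name are all assumptions made for illustration; in the approach described above the weights would be derived from a collaborative model rather than set by hand.

```python
import numpy as np

def weighted_item_similarity(item_features, feature_weights):
    """Cosine similarity between items after scaling each feature column by its weight."""
    weighted = item_features * feature_weights  # broadcast weights over columns
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for items left with no features
    normalized = weighted / norms
    similarity = normalized @ normalized.T
    np.fill_diagonal(similarity, 0.0)  # an item should not be its own neighbour
    return similarity

# Toy data (hypothetical): 3 items described by 3 binary features
ICM = np.array([[1, 0, 1],
                [1, 1, 0],
                [0, 1, 1]], dtype=float)
# Hand-picked weights (assumption): the second feature is treated as irrelevant
weights = np.array([1.0, 0.0, 1.0])

S = weighted_item_similarity(ICM, weights)
print(S)
```

With the second feature weighted to zero, items 1 and 2 (zero-based) no longer share any relevant feature, so their similarity drops to zero even though they share a raw feature.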
Eigenvalue analogy for confidence estimation in item-based recommender systems
Item-item collaborative filtering (CF) models are a well-known and well-studied
family of recommender systems; however, the current literature does not provide any
theoretical explanation of the conditions under which item-based
recommendations will succeed or fail.
We investigate the existence of an ideal item-based CF method able to make
perfect recommendations. This CF model is formalized as an eigenvalue problem,
where estimated ratings are equivalent to the true (unknown) ratings multiplied
by a user-specific eigenvalue of the similarity matrix. Preliminary experiments
show that the magnitude of the eigenvalue is proportional to the accuracy of
recommendations for that user and therefore it can provide a reliable measure of
confidence.
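The eigenvalue relation can be illustrated numerically. The sketch below is not the paper's method: it assumes a small, hand-written symmetric similarity matrix and simply takes one of its eigenvectors to play the role of a user's true rating vector, then checks that the item-based estimate (ratings multiplied by the similarity matrix) equals the true ratings scaled by the corresponding eigenvalue.

```python
import numpy as np

# Hypothetical symmetric item-item similarity matrix for 3 items
S = np.array([[0.0, 0.5, 0.2],
              [0.5, 0.0, 0.3],
              [0.2, 0.3, 0.0]])

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(S)
lam = eigenvalues[-1]          # largest eigenvalue
r_true = eigenvectors[:, -1]   # its eigenvector, standing in for the true ratings

# Item-based estimate: the user's ratings multiplied by the similarity matrix
r_est = r_true @ S

# The estimate is the true rating vector scaled by the eigenvalue
print(np.allclose(r_est, lam * r_true))
```

For a rating vector that is not an eigenvector of the similarity matrix, the estimate is no longer a scaled copy of the true ratings, which is the sense in which the ideal model is formalized as an eigenvalue problem.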
Replication of collaborative filtering generative adversarial networks on recommender systems
CFGAN and its family of models (TagRec, MTPR, and CRGAN) learn to generate personalized and fake-but-realistic preferences for top-N recommendations by solely using previous interactions. The work discusses the impact of certain differences between the CFGAN framework and the model used in the original evaluation. The absence of random noise and the use of real user profiles as condition vectors leave the generator prone to learning a degenerate solution in which the output vector is identical to the input vector, so that it behaves essentially as a simple auto-encoder. This work further expands the experimental analysis by comparing CFGAN against a selection of simple, well-known and properly optimized baselines, observing that CFGAN is not consistently competitive against them despite its high computational cost.
A Troubling Analysis of Reproducibility and Progress in Recommender Systems Research
The design of algorithms that generate personalized ranked item lists is a
central topic of research in the field of recommender systems. In the past few
years, in particular, approaches based on deep learning (neural) techniques
have become dominant in the literature. For all of them, substantial progress
over the state-of-the-art is claimed. However, indications exist of certain
problems in today's research practice, e.g., with respect to the choice and
optimization of the baselines used for comparison, raising questions about the
published claims. In order to obtain a better understanding of the actual
progress, we have tried to reproduce recent results in the area of neural
recommendation approaches based on collaborative filtering. The worrying
outcome of the analysis of these recent works (all were published at prestigious
scientific conferences between 2015 and 2018) is that 11 out of the 12
reproducible neural approaches can be outperformed by conceptually simple
methods, e.g., based on nearest-neighbor heuristics. None of the
computationally complex neural methods was actually consistently better than
already existing learning-based techniques, e.g., using matrix factorization or
linear models. In our analysis, we discuss common issues in today's research
practice, which, despite the many papers that are published on the topic, have
apparently led the field to a certain level of stagnation.
Comment: Source code and full results available at:
https://github.com/MaurizioFD/RecSys2019_DeepLearning_Evaluatio
Replication of recommender systems with impressions
Impressions are a novel data type in Recommender Systems containing the previously-exposed items, i.e., what was shown on-screen. Due to their novelty, the current literature lacks both a characterization of impressions and replications of previous experiments. Also, previous research works have mainly used impressions in industrial contexts or recommender systems competitions, such as the ACM RecSys Challenges. This work is part of an ongoing study about impressions in recommender systems. It presents an evaluation of impressions recommenders on current open datasets, comparing the recommendation quality of impressions recommenders against strong baselines and also determining whether previous progress claims can be replicated.
Impression-Aware Recommender Systems
Novel data sources bring new opportunities to improve the quality of
recommender systems. Impressions are a novel data source containing past
recommendations (shown items) and traditional interactions. Researchers may use
impressions to refine user preferences and overcome the current limitations in
recommender systems research. The relevance and interest of impressions have
increased over the years; hence the need for a review of relevant work on this
type of recommender. We present a systematic literature review on recommender
systems using impressions, focusing on three fundamental angles in research:
recommenders, datasets, and evaluation methodologies. We provide three
categorizations of papers describing recommenders using impressions, present
each reviewed paper in detail, describe datasets with impressions, and analyze
the existing evaluation methodologies. Lastly, we present open questions and
future directions of interest, highlighting aspects missing in the literature
that can be addressed in future works.
Comment: 34 pages, 103 references, 6 tables, 2 figures, ACM UNDER REVIE
Virtual Network Function Embedding with Quantum Annealing
In recent years, the growing number of devices connected to the internet has led network operators to continuously expand their own infrastructures. In order to simplify this scaling process, the research community is currently investigating the opportunity to move the complexity from a hardware to a software domain, through the introduction of a new paradigm called Network Functions Virtualisation (NFV). It considers standard hardware platforms where many virtual instances are allocated to implement specific network services. However, despite the theoretical benefits, the mapping of the different virtual instances to the available physical resources represents a complex problem that is difficult to solve classically. The present work proposes a Quadratic Unconstrained Binary Optimisation (QUBO) formulation of this embedding process, exploring the implementation possibilities on D-Wave’s Quantum Annealers. Many test cases, with realistic constraints, have been considered to validate and characterise the potential of the model, and the promising results achieved are discussed throughout the document. The technical discussion is enriched with comparisons of the results obtained through heuristic algorithms, highlighting the strengths and the limitations in the resolution of the proposed QUBO formulation on current quantum machines.
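To make the idea of a QUBO formulation concrete, the sketch below solves a tiny, hypothetical placement problem by exhaustive search. It is not the formulation proposed in the paper and does not use a quantum annealer: the two-server scenario, the costs, the penalty weight and the function name are assumptions chosen only to show how a constraint ("place the function on exactly one server") is folded into the matrix Q as a quadratic penalty.

```python
import itertools
import numpy as np

def solve_qubo_brute_force(Q):
    """Return the binary vector x minimizing x^T Q x, found by exhaustive search."""
    n = Q.shape[0]
    best_x, best_energy = None, float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        energy = x @ Q @ x
        if energy < best_energy:
            best_x, best_energy = x, energy
    return best_x, best_energy

# Toy problem (hypothetical): x[i] = 1 means "the virtual function runs on server i".
costs = np.array([1.0, 2.0])  # assumed placement cost per server
P = 5.0                       # assumed penalty weight, larger than any cost

# Constraint x0 + x1 = 1 as penalty P * (x0 + x1 - 1)^2.
# Using x^2 = x for binary variables this expands to
# P * (-x0 - x1 + 2*x0*x1) + P, and the constant P can be dropped.
Q = np.array([[costs[0] - P, 2 * P],
              [0.0,          costs[1] - P]])

best_x, best_energy = solve_qubo_brute_force(Q)
print(best_x, best_energy + P)  # adding P back recovers the placement cost
```

Exhaustive search scales as 2^n and is only viable for toy sizes, which is why heuristic solvers and annealers are of interest for realistic instances.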
Towards Evaluating User Profiling Methods Based on Explicit Ratings on Item Features
In order to improve the accuracy of recommendations, many recommender systems
nowadays use side information beyond the user rating matrix, such as item
content. These systems build user profiles as estimates of users' interest in
content (e.g., movie genre, director or cast) and then evaluate the performance
of the recommender system as a whole, e.g., by its ability to recommend
relevant and novel items to the target user. The user profile modelling stage,
which is a key stage in content-driven RS, is rarely evaluated properly due to
the lack of publicly available datasets that contain user preferences on
content features of items.
To raise awareness of this fact, we investigate differences between explicit
user preferences and implicit user profiles. We create a dataset of explicit
preferences towards content features of movies, which we release publicly. We
then compare the collected explicit user feature preferences and implicit user
profiles built via state-of-the-art user profiling models. Our results show a
maximum average pairwise cosine similarity of 58.07% between the explicit
feature preferences and the implicit user profiles modelled by the best
investigated profiling method when considering movies' genres only. For actors
and directors, this maximum similarity is only 9.13% and 17.24%,
respectively. This low similarity between explicit and implicit preference
models encourages a more in-depth study to investigate and improve this
important user profile modelling step, which will eventually translate into
better recommendations.
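A comparison of this kind can be sketched as follows. The example assumes that "average pairwise cosine similarity" means the cosine similarity between each user's explicit and implicit feature vectors, averaged over users; that reading, the toy matrices and the function name are assumptions, not details taken from the paper.

```python
import numpy as np

def mean_pairwise_cosine(explicit, implicit):
    """Average, over users, of the cosine similarity between matching rows."""
    dots = np.sum(explicit * implicit, axis=1)
    norms = np.linalg.norm(explicit, axis=1) * np.linalg.norm(implicit, axis=1)
    valid = norms > 0  # skip users whose profile is all zeros in either matrix
    return float(np.mean(dots[valid] / norms[valid]))

# Toy profiles (hypothetical): 2 users x 2 content features
explicit = np.array([[1.0, 0.0],
                     [1.0, 1.0]])
implicit = np.array([[1.0, 0.0],
                     [1.0, 0.0]])

score = mean_pairwise_cosine(explicit, implicit)
print(f"{100 * score:.2f}%")
```

Here the first user's two profiles agree exactly while the second user's implicit profile misses one stated preference, so the average falls below 100%.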
ContentWise Impressions: An Industrial Dataset with Impressions Included
In this article, we introduce the ContentWise Impressions dataset, a
collection of implicit interactions and impressions of movies and TV series
from an Over-The-Top media service, which delivers its media contents over the
Internet. The dataset is distinguished from other already available multimedia
recommendation datasets by the availability of impressions, i.e., the
recommendations shown to the user, its size, and by being open-source. We
describe the data collection process, the preprocessing applied, its
characteristics, and statistics when compared to other commonly used datasets.
We also highlight several possible use cases and research questions that can
benefit from the availability of user impressions in an open-source dataset.
Furthermore, we release software tools to load and split the data, as well as
examples of how to use both user interactions and impressions in several common
recommendation algorithms.
Comment: 8 pages, 2 figure