BanditMF: Multi-Armed Bandit Based Matrix Factorization Recommender System
Multi-armed bandits (MAB) provide a principled online learning approach to
balancing exploration and exploitation. Because they perform well and learn
from limited feedback, without having to learn how to act in multiple
situations, multi-armed bandits have drawn widespread attention in applications
such as recommender systems. Within recommender systems, collaborative
filtering (CF) is arguably the earliest and most influential method. New users
and an ever-changing pool of recommended items are the key challenges that
recommender systems need to address. The classical approach to collaborative
filtering is to train the model offline and then test it online, but this
approach cannot handle dynamic changes in user preferences, the so-called
cold start problem. How, then, can items be recommended effectively to users
in the absence of useful information? To address these problems, a
multi-armed bandit based collaborative filtering recommender system, named
BanditMF, has been proposed. BanditMF is designed to address two challenges in
multi-armed bandit algorithms and collaborative filtering: (1) how to solve the
cold start problem for collaborative filtering when valid information is
scarce, and (2) how to solve the sub-optimality of bandit algorithms in domains
with strong social relations, which is caused by independently estimating the
unknown parameters associated with each user and ignoring correlations between
users.
Comment: MSc dissertatio
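As a rough illustration of the exploration/exploitation balance mentioned in this abstract, the sketch below runs a plain epsilon-greedy bandit on simulated Bernoulli arms. It is not BanditMF itself: the abstract does not say which bandit algorithm is used, and the function name `epsilon_greedy`, the arm probabilities, and the parameter values are all assumptions chosen for the example.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit on Bernoulli arms (illustrative only).

    true_means: hypothetical per-arm probability of a reward of 1.
    Returns (pull counts, estimated mean reward per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms    # how often each arm was pulled
    values = [0.0] * n_arms  # running mean reward of each arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best current estimate.
            arm = max(range(n_arms), key=lambda a: values[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return counts, values, total


counts, values, total = epsilon_greedy([0.2, 0.5, 0.8])
```

With a small `epsilon`, most pulls go to the arm whose estimated reward is highest, while the occasional random pull keeps the other estimates from going stale. Note that each arm here is estimated independently, which is the kind of per-user independence the abstract identifies as a source of sub-optimality.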
Connections Between Adaptive Control and Optimization in Machine Learning
This paper demonstrates many immediate connections between adaptive control
and optimization methods commonly employed in machine learning. Starting from
common output error formulations, similarities in update law modifications are
examined. Concepts in stability, performance, and learning that are common to
both fields are then discussed. Building on the similarities in update laws and
common concepts, new intersections and opportunities for improved algorithm
analysis are provided. In particular, a specific problem related to higher
order learning is solved through insights obtained from these intersections.
Comment: 18 page
A Latent Source Model for Online Collaborative Filtering
Despite the prevalence of collaborative filtering in recommendation systems,
there has been little theoretical development on why and how well it works,
especially in the "online" setting, where items are recommended to users over
time. We address this theoretical gap by introducing a model for online
recommendation systems, cast item recommendation under the model as a learning
problem, and analyze the performance of a cosine-similarity collaborative
filtering method. In our model, each of $n$ users either likes or dislikes each
of $m$ items. We assume there to be $k$ types of users, and all the users of a
given type share a common string of probabilities determining the chance of
liking each item. At each time step, we recommend an item to each user, where a
key distinction from related bandit literature is that once a user consumes an
item (e.g., watches a movie), then that item cannot be recommended to the same
user again. The goal is to maximize the number of likable items recommended to
users over time. Our main result establishes that after nearly $\log(km)$
initial learning time steps, a simple collaborative filtering algorithm
achieves essentially optimal performance without knowing $k$. The algorithm has
an exploitation step that uses cosine similarity and two types of exploration
steps, one to explore the space of items (standard in the literature) and the
other to explore similarity between users (novel to this work).
Comment: Advances in Neural Information Processing Systems (NIPS 2014
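The exploitation step described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the paper's algorithm: both exploration steps are omitted, and the rating encoding (+1 like, -1 dislike, 0 not yet consumed), the `threshold` value, and the function names `cosine` and `exploit` are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two rating vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def exploit(ratings, user, threshold=0.2):
    """Pick the unconsumed item that similar users rated most favourably.

    ratings[i][j] is +1 (like), -1 (dislike), or 0 (user i has not consumed
    item j). Returns an item index, or None if the user consumed everything.
    """
    me = ratings[user]
    # Neighbours: other users whose ratings are similar enough to this user's.
    neighbors = [r for i, r in enumerate(ratings)
                 if i != user and cosine(me, r) >= threshold]
    best_item, best_score = None, -math.inf
    for item, mine in enumerate(me):
        if mine != 0:
            continue  # already consumed: cannot be recommended again
        score = sum(r[item] for r in neighbors)  # net votes from neighbours
        if score > best_score:
            best_item, best_score = item, score
    return best_item


ratings = [
    [1, -1, 0, 0],   # user 0 has not yet consumed items 2 and 3
    [1, -1, 1, -1],  # user 1 agrees with user 0 so far
    [-1, 1, -1, 1],  # user 2 disagrees with user 0
]
recommended = exploit(ratings, user=0)
```

Skipping consumed items reflects the constraint that sets this setting apart from the standard bandit formulation: an item cannot be recommended to the same user twice.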