33,204 research outputs found
Probabilistic Models for Unified Collaborative and Content-Based Recommendation in Sparse-Data Environments
Recommender systems leverage product and community information to target
products to consumers. Researchers have developed collaborative recommenders,
content-based recommenders, and (largely ad-hoc) hybrid systems. We propose a
unified probabilistic framework for merging collaborative and content-based
recommendations. We extend Hofmann's [1999] aspect model to incorporate
three-way co-occurrence data among users, items, and item content. The relative
influence of collaboration data versus content data is not imposed as an
exogenous parameter, but rather emerges naturally from the given data sources.
Global probabilistic models coupled with standard Expectation Maximization (EM)
learning algorithms tend to drastically overfit in sparse-data situations, as
is typical in recommendation applications. We show that secondary content
information can often be used to overcome sparsity. Experiments on data from
the ResearchIndex library of Computer Science publications show that
appropriate mixture models incorporating secondary data produce significantly
better quality recommenders than k-nearest neighbors (k-NN). Global
probabilistic models also allow more general inferences than local methods like
k-NN.
Comment: Appears in Proceedings of the Seventeenth Conference on Uncertainty
in Artificial Intelligence (UAI2001)
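The EM fitting mentioned in the abstract above can be illustrated with a minimal sketch of the plain two-way aspect model, P(u, d) = sum over z of P(z) P(u|z) P(d|z). This is not the paper's three-way model, which would need an additional content factor. The function name `fit_aspect_model`, its parameters, and the count dictionary layout are assumptions made for illustration only.

```python
import math
import random


def fit_aspect_model(counts, n_users, n_items, n_aspects, n_iters=50, seed=0):
    """Fit a two-way aspect model with EM (illustrative sketch).

    counts: dict mapping (user_index, item_index) -> co-occurrence count.
    Returns (pz, pu_z, pd_z, history), where history holds the data
    log-likelihood evaluated at the start of each iteration.
    """
    rng = random.Random(seed)

    def random_dist(size):
        # Strictly positive random weights, normalized to sum to 1.
        w = [rng.random() + 0.1 for _ in range(size)]
        s = sum(w)
        return [x / s for x in w]

    pz = [1.0 / n_aspects] * n_aspects
    pu_z = [random_dist(n_users) for _ in range(n_aspects)]
    pd_z = [random_dist(n_items) for _ in range(n_aspects)]
    history = []

    for _ in range(n_iters):
        new_z = [0.0] * n_aspects
        new_u = [[0.0] * n_users for _ in range(n_aspects)]
        new_d = [[0.0] * n_items for _ in range(n_aspects)]
        loglik = 0.0
        for (u, d), n in counts.items():
            # E-step: posterior over aspects for this (user, item) pair.
            joint = [pz[z] * pu_z[z][u] * pd_z[z][d] for z in range(n_aspects)]
            total = sum(joint)
            loglik += n * math.log(total)
            for z in range(n_aspects):
                r = n * joint[z] / total
                new_z[z] += r
                new_u[z][u] += r
                new_d[z][d] += r
        history.append(loglik)

        # M-step: renormalize the expected counts.
        n_total = sum(new_z)
        for z in range(n_aspects):
            if new_z[z] > 0:
                pu_z[z] = [x / new_z[z] for x in new_u[z]]
                pd_z[z] = [x / new_z[z] for x in new_d[z]]
            pz[z] = new_z[z] / n_total

    return pz, pu_z, pd_z, history
```

With fitted parameters, items could be ranked for a user u by the score sum over z of P(z) P(u|z) P(d|z). On very sparse counts this unregularized fit may overfit, which is the problem the abstract says secondary content data can help with.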
Collaborative Ensemble Learning: Combining Collaborative and Content-Based Information Filtering via Hierarchical Bayes
Collaborative filtering (CF) and content-based filtering (CBF) have been widely
used in information filtering applications. Both approaches have their
strengths and weaknesses, which is why researchers have developed hybrid
systems. This paper proposes a novel approach to unify CF and CBF in a
probabilistic framework, named collaborative ensemble learning. It uses
probabilistic SVMs to model each user's profile (as CBF does). At the prediction
phase, it combines a society of users' profiles, represented by their respective
SVM models, to predict an active user's preferences (the CF idea). The combination
scheme is embedded in a probabilistic framework and retains an intuitive
explanation. Moreover, collaborative ensemble learning does not require a global
training stage and thus can incrementally incorporate new data. We report
results based on two data sets. For the Reuters-21578 text data set, we
simulate user ratings under the assumption that each user is interested in only
one category. In the second experiment, we use users' opinions on a set of 642
art images that were collected through a web-based survey. For both data sets,
collaborative ensemble achieved excellent performance in terms of
recommendation accuracy.
Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in
Artificial Intelligence (UAI2003)
A Theory of Information Matching
In this work, we propose a theory for information matching. It is motivated
by the observation that retrieval is about the relevance matching between two
sets of properties (features), namely, the information need representation and
information item representation. However, many probabilistic retrieval models
rely on fixing one representation and optimizing the other (e.g. fixing the
single information need and tuning the document) but not both. Therefore, it is
difficult to use the available related information on both the document and the
query at the same time in calculating the probability of relevance. In this
paper, we address the problem by hypothesizing the relevance as a logical
relationship between the two sets of properties; the relationship is defined on
two separate mappings between these properties. By using the hypothesis we
develop a unified probabilistic relevance model which is capable of using all
the available information. We validate the proposed theory by formulating and
developing probabilistic relevance ranking functions for both ad-hoc text
retrieval and collaborative filtering. Our derivation in text retrieval
illustrates the use of the theory in the situation where no relevance
information is available. In collaborative filtering, we show that the
resulting recommender model unifies the user and item information into a
relevance ranking function without applying any dimensionality reduction
techniques or computing explicit similarity between two different users (or
items), in contrast to the state-of-the-art recommender models.
Stochastic Block Models with Multiple Continuous Attributes
The stochastic block model (SBM) is a probabilistic model for community
structure in networks. Typically, only the adjacency matrix is used to perform
SBM parameter inference. In this paper, we consider circumstances in which
nodes have an associated vector of continuous attributes that are also used to
learn the node-to-community assignments and corresponding SBM parameters. While
this assumption is not realistic for every application, our model assumes that
the attributes associated with the nodes in a network's community can be
described by a common multivariate Gaussian model. In this augmented,
attributed SBM, the objective is to simultaneously learn the SBM connectivity
probabilities with the multivariate Gaussian parameters describing each
community. While there are recent examples in the literature that combine
connectivity and attribute information to inform community detection, our model
is the first augmented stochastic block model to handle multiple continuous
attributes. This provides the flexibility in biological data to, for example,
augment connectivity information with continuous measurements from multiple
experimental modalities. Because the lack of labeled network data often makes
community detection results difficult to validate, we highlight the usefulness
of our model for two network prediction tasks: link prediction and
collaborative filtering. As a result of fitting this attributed stochastic
block model, one can predict the attribute vector or connectivity patterns for
a new node in the event of the complementary source of information
(connectivity or attributes, respectively). We also highlight two biological
examples where the attributed stochastic block model provides satisfactory
performance in the link prediction and collaborative filtering tasks.
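As a hedged illustration of how connectivity and attribute information can be learned simultaneously, one common way to write a complete-data log-likelihood for an attributed SBM of this kind is sketched below. It assumes an undirected network and that edges and attributes are conditionally independent given the community labels. The symbols (z for one-hot community indicators, \pi for community proportions, \theta for block connection probabilities, \mu and \Sigma for the per-community Gaussian parameters) are notation chosen here, not taken from the paper.

```latex
\log p(A, X, Z \mid \pi, \theta, \mu, \Sigma)
  = \sum_{i} \sum_{q} z_{iq} \log \pi_q
  + \sum_{i<j} \sum_{q,l} z_{iq} z_{jl}
      \left[ A_{ij} \log \theta_{ql} + (1 - A_{ij}) \log (1 - \theta_{ql}) \right]
  + \sum_{i} \sum_{q} z_{iq} \log \mathcal{N}(x_i \mid \mu_q, \Sigma_q)
```

Under this factorization, a new node with only attributes contributes just the Gaussian term when inferring its community, and a new node with only connectivity contributes just the Bernoulli term, which matches the two prediction settings the abstract describes.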
Generative Interest Estimation for Document Recommendations
Learning distributed representations of documents has pushed the
state-of-the-art in several natural language processing tasks and has recently
been applied successfully to the field of recommender systems. In this
paper, we propose a novel content-based recommender system based on learned
representations and a generative model of user interest. Our method works as
follows: First, we learn representations on a corpus of text documents. Then,
we capture a user's interest as a generative model in the space of the document
representations. In particular, we model the distribution of interest for each
user as a Gaussian mixture model (GMM). Recommendations can be obtained
directly by sampling from a user's generative model. Using latent semantic
analysis (LSA) as a comparison, we compute and explore document representations
on the Delicious bookmarks dataset, a standard benchmark for recommender
systems. We then perform density estimation in both spaces and show that
learned representations outperform LSA in terms of predictive performance.
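To make the density-estimation idea above concrete, here is a minimal sketch that scores candidate documents under a user's Gaussian mixture and ranks them by log-density. It assumes the mixture parameters are already fitted and uses diagonal covariances for simplicity. The function names, the diagonal-covariance choice, and ranking by density (rather than sampling, as the abstract describes) are assumptions of this sketch, not details from the paper.

```python
import math


def gmm_log_density(x, weights, means, variances):
    """Log-density of vector x under a diagonal-covariance Gaussian mixture."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def rank_documents(doc_vectors, weights, means, variances):
    """Return document ids ordered from highest to lowest user-interest density.

    doc_vectors: dict mapping a (hypothetical) document id to its
    representation vector in the same space as the mixture.
    """
    scored = [
        (gmm_log_density(vec, weights, means, variances), doc_id)
        for doc_id, vec in doc_vectors.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored]
```

Comparing average held-out log-density across two representation spaces is one plausible way to compare their predictive performance, in the spirit of the comparison with LSA mentioned above.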
Temporal Proximity induces Attributes Similarity
Users consume their favorite content in temporal proximity of consumption
bundles according to their preferences and tastes. Thus, the underlying
attributes of items implicitly match user preferences; however, current
recommender systems largely ignore this fundamental driver in identifying
matching items. In this work, we introduce a novel temporal proximity filtering
method to enable item matching. First, we demonstrate that proximity
preferences exist. Second, we present an induced similarity metric in temporal
proximity driven by user tastes. Third, we show that this induced similarity
can be used to learn pairwise item similarity in attribute space. The proposed
model does not rely on any knowledge outside users' consumption bundles and
provides a novel way to build an item recommender driven by user preferences
and tastes.
Selective Transfer Learning for Cross Domain Recommendation
Collaborative filtering (CF) aims to predict users' ratings on items
according to historical user-item preference data. In many real-world
applications, preference data are usually sparse, which would make models
overfit and fail to give accurate predictions. Recently, several studies have
shown that transferring knowledge from manually selected source domains can
mitigate the data sparseness problem. However, in most cases, parts
of the source domain data are not consistent with the observations in the target
domain, which may misguide the target domain model building. In this paper, we
propose a novel criterion based on empirical prediction error and its variance
to better capture the consistency across domains in CF settings. Consequently,
we embed this criterion into a boosting framework to perform selective
knowledge transfer. In comparison with several state-of-the-art methods, we show
that our proposed selective transfer learning framework can significantly improve
the accuracy of rating prediction on several real-world recommendation
tasks.
Recommendation System based on Semantic Scholar Mining and Topic modeling: A behavioral analysis of researchers from six conferences
Recommendation systems play an important role in helping online users.
They are of great practical use in many kinds of Internet portals, such as
social networks and library websites. There are several approaches to
implementing recommendation systems; Latent Dirichlet Allocation (LDA) is one of
the popular techniques in topic modeling. Recently, researchers have proposed many
approaches based on recommendation systems and LDA. Given the importance of
the subject, in this paper we discover the trends of the topics and find
relationships between LDA topics and Scholar-Context-documents. Specifically, we
apply probabilistic topic modeling based on Gibbs sampling algorithms for
semantic mining of publications from six computer science conferences in the DBLP
dataset. According to our experimental results, our semantic framework can
help organizations better organize these conferences and cover
future research topics.
Collaborative Filtering for Predicting User Preferences for Organizing Objects
As service robots become more and more capable of performing useful tasks for
us, there is a growing need to teach robots how we expect them to carry out
these tasks. However, different users typically have their own preferences, for
example with respect to arranging objects on different shelves. As many of
these preferences depend on a variety of factors including personal taste,
cultural background, or common sense, it is challenging for an expert to
pre-program a robot in order to accommodate all potential users. At the same
time, it is impractical for robots to constantly query users about how they
should perform individual tasks. In this work, we present an approach to learn
patterns in user preferences for the task of tidying up objects in containers,
e.g., shelves or boxes. Our method builds upon the paradigm of collaborative
filtering for making personalized recommendations and relies on data from
different users that we gather using crowdsourcing. To deal with novel objects
for which we have no data, we propose a method that complements standard
collaborative filtering by leveraging information mined from the Web. When
solving a tidy-up task, we first predict pairwise object preferences of the
user. Then, we subdivide the objects in containers by modeling a spectral
clustering problem. Our solution is easy to update, does not require complex
modeling, and improves with the amount of user data. We evaluate our approach
using crowdsourcing data from over 1,200 users and demonstrate its
effectiveness for two tidy-up scenarios. Additionally, we show that a real
robot can reliably predict user preferences using our approach.
Comment: Submission to The International Journal of Robotics Research.
Relevant material can be found at
http://www2.informatik.uni-freiburg.de/~abdon/task_preferences.htm
Exploiting Social Tags for Cross-Domain Collaborative Filtering
One of the most challenging problems in recommender systems based on the
collaborative filtering (CF) concept is data sparseness, i.e., limited user
preference data is available for making recommendations. Cross-domain
collaborative filtering (CDCF) has been studied as an effective mechanism to
alleviate data sparseness of one domain using the knowledge about user
preferences from other domains. A key question to be answered in the context of
CDCF is what common characteristics can be deployed to link different domains
for effective knowledge transfer. In this paper, we assess the usefulness of
user-contributed (social) tags in this respect. We do so by means of the
Generalized Tag-induced Cross-domain Collaborative Filtering (GTagCDCF)
approach that we propose in this paper and that we developed based on the
general collective matrix factorization framework. Assessment is done by a
series of experiments, using publicly available CF datasets that represent
three cross-domain cases, i.e., two two-domain cases and one three-domain case.
A comparative analysis on two-domain cases involving GTagCDCF and several
state-of-the-art CDCF approaches indicates the increased benefit of using
social tags as representatives of explicit links between domains for CDCF as
compared to the implicit links deployed by the existing CDCF methods. In
addition, we show that users from different domains can already benefit from
GTagCDCF if they only share a few common tags. Finally, we use the three-domain
case to validate the robustness of GTagCDCF with respect to the scale of
datasets and the varying number of domains.
Comment: Manuscript under review