2,601 research outputs found
Scalable Recommendation with Poisson Factorization
We develop a Bayesian Poisson matrix factorization model for forming
recommendations from sparse user behavior data. These data are large user/item
matrices where each user has provided feedback on only a small subset of items,
either explicitly (e.g., through star ratings) or implicitly (e.g., through
views or purchases). In contrast to traditional matrix factorization
approaches, Poisson factorization implicitly models each user's limited
attention to consume items. Moreover, because of the mathematical form of the
Poisson likelihood, the model needs only to explicitly consider the observed
entries in the matrix, leading to both scalable computation and good predictive
performance. We develop a variational inference algorithm for approximate
posterior inference that scales up to massive data sets. This is an efficient
algorithm that iterates over the observed entries and adjusts an approximate
posterior over the user/item representations. We apply our method to large
real-world user data containing users rating movies, users listening to songs,
and users reading scientific papers. In all these settings, Bayesian Poisson
factorization outperforms state-of-the-art matrix factorization methods
Recurrent Poisson Factorization for Temporal Recommendation
Poisson factorization is a probabilistic model of users and items for
recommendation systems, where the so-called implicit consumer data is modeled
by a factorized Poisson distribution. There are many variants of Poisson
factorization methods who show state-of-the-art performance on real-world
recommendation tasks. However, most of them do not explicitly take into account
the temporal behavior and the recurrent activities of users which is essential
to recommend the right item to the right user at the right time. In this paper,
we introduce Recurrent Poisson Factorization (RPF) framework that generalizes
the classical PF methods by utilizing a Poisson process for modeling the
implicit feedback. RPF treats time as a natural constituent of the model and
brings to the table a rich family of time-sensitive factorization models. To
elaborate, we instantiate several variants of RPF who are capable of handling
dynamic user preferences and item specification (DRPF), modeling the
social-aspect of product adoption (SRPF), and capturing the consumption
heterogeneity among users and items (HRPF). We also develop a variational
algorithm for approximate posterior inference that scales up to massive data
sets. Furthermore, we demonstrate RPF's superior performance over many
state-of-the-art methods on synthetic dataset, and large scale real-world
datasets on music streaming logs, and user-item interactions in M-Commerce
platforms.Comment: Submitted to KDD 2017 | Halifax, Nova Scotia - Canada - sigkdd, Codes
are available at https://github.com/AHosseini/RP
Poisson factorization for peer-based anomaly detection
Anomaly detection systems are a promising tool to identify compromised user credentials and malicious insiders in enterprise networks. Most existing approaches for modelling user behaviour rely on either independent observations for each user or on pre-defined user peer groups. A method is proposed based on recommender system algorithms to learn overlapping user peer groups and to use this learned structure to detect anomalous activity. Results analysing the authentication and process-running activities of thousands of users show that the proposed method can detect compromised user accounts during a red team exercise
A Probabilistic Model for the Cold-Start Problem in Rating Prediction using Click Data
One of the most efficient methods in collaborative filtering is matrix
factorization, which finds the latent vector representations of users and items
based on the ratings of users to items. However, a matrix factorization based
algorithm suffers from the cold-start problem: it cannot find latent vectors
for items to which previous ratings are not available. This paper utilizes
click data, which can be collected in abundance, to address the cold-start
problem. We propose a probabilistic item embedding model that learns item
representations from click data, and a model named EMB-MF, that connects it
with a probabilistic matrix factorization for rating prediction. The
experiments on three real-world datasets demonstrate that the proposed model is
not only effective in recommending items with no previous ratings, but also
outperforms competing methods, especially when the data is very sparse.Comment: ICONIP 201
Economic Complexity Unfolded: Interpretable Model for the Productive Structure of Economies
Economic complexity reflects the amount of knowledge that is embedded in the
productive structure of an economy. It resides on the premise of hidden
capabilities - fundamental endowments underlying the productive structure. In
general, measuring the capabilities behind economic complexity directly is
difficult, and indirect measures have been suggested which exploit the fact
that the presence of the capabilities is expressed in a country's mix of
products. We complement these studies by introducing a probabilistic framework
which leverages Bayesian non-parametric techniques to extract the dominant
features behind the comparative advantage in exported products. Based on
economic evidence and trade data, we place a restricted Indian Buffet Process
on the distribution of countries' capability endowment, appealing to a culinary
metaphor to model the process of capability acquisition. The approach comes
with a unique level of interpretability, as it produces a concise and
economically plausible description of the instantiated capabilities
- …