1,969 research outputs found
Learning from Multi-View Multi-Way Data via Structural Factorization Machines
Real-world relations among entities can often be observed and determined by
different perspectives/views. For example, the decision made by a user on
whether to adopt an item relies on multiple aspects such as the contextual
information of the decision, the item's attributes, the user's profile and the
reviews given by other users. Different views may exhibit multi-way
interactions among entities and provide complementary information. In this
paper, we introduce a multi-tensor-based approach that can preserve the
underlying structure of multi-view data in a generic predictive model.
Specifically, we propose structural factorization machines (SFMs) that learn
the common latent spaces shared by multi-view tensors and automatically adjust
the importance of each view in the predictive model. Furthermore, the
complexity of SFMs is linear in the number of parameters, which make SFMs
suitable to large-scale problems. Extensive experiments on real-world datasets
demonstrate that the proposed SFMs outperform several state-of-the-art methods
in terms of prediction accuracy and computational cost.Comment: 10 page
Fast ALS-based tensor factorization for context-aware recommendation from implicit feedback
Albeit, the implicit feedback based recommendation problem - when only the
user history is available but there are no ratings - is the most typical
setting in real-world applications, it is much less researched than the
explicit feedback case. State-of-the-art algorithms that are efficient on the
explicit case cannot be straightforwardly transformed to the implicit case if
scalability should be maintained. There are few if any implicit feedback
benchmark datasets, therefore new ideas are usually experimented on explicit
benchmarks. In this paper, we propose a generic context-aware implicit feedback
recommender algorithm, coined iTALS. iTALS apply a fast, ALS-based tensor
factorization learning method that scales linearly with the number of non-zero
elements in the tensor. The method also allows us to incorporate diverse
context information into the model while maintaining its computational
efficiency. In particular, we present two such context-aware implementation
variants of iTALS. The first incorporates seasonality and enables to
distinguish user behavior in different time intervals. The other views the user
history as sequential information and has the ability to recognize usage
pattern typical to certain group of items, e.g. to automatically tell apart
product types or categories that are typically purchased repetitively
(collectibles, grocery goods) or once (household appliances). Experiments
performed on three implicit datasets (two proprietary ones and an implicit
variant of the Netflix dataset) show that by integrating context-aware
information with our factorization framework into the state-of-the-art implicit
recommender algorithm the recommendation quality improves significantly.Comment: Accepted for ECML/PKDD 2012, presented on 25th September 2012,
Bristol, U
Online Tensor Methods for Learning Latent Variable Models
We introduce an online tensor decomposition based approach for two latent
variable modeling problems namely, (1) community detection, in which we learn
the latent communities that the social actors in social networks belong to, and
(2) topic modeling, in which we infer hidden topics of text articles. We
consider decomposition of moment tensors using stochastic gradient descent. We
conduct optimization of multilinear operations in SGD and avoid directly
forming the tensors, to save computational and storage costs. We present
optimized algorithm in two platforms. Our GPU-based implementation exploits the
parallelism of SIMD architectures to allow for maximum speed-up by a careful
optimization of storage and data transfer, whereas our CPU-based implementation
uses efficient sparse matrix computations and is suitable for large sparse
datasets. For the community detection problem, we demonstrate accuracy and
computational efficiency on Facebook, Yelp and DBLP datasets, and for the topic
modeling problem, we also demonstrate good performance on the New York Times
dataset. We compare our results to the state-of-the-art algorithms such as the
variational method, and report a gain of accuracy and a gain of several orders
of magnitude in the execution time.Comment: JMLR 201
From Frequency to Meaning: Vector Space Models of Semantics
Computers understand very little of the meaning of human language. This
profoundly limits our ability to give instructions to computers, the ability of
computers to explain their actions to us, and the ability of computers to
analyse and process text. Vector space models (VSMs) of semantics are beginning
to address these limits. This paper surveys the use of VSMs for semantic
processing of text. We organize the literature on VSMs according to the
structure of the matrix in a VSM. There are currently three broad classes of
VSMs, based on term-document, word-context, and pair-pattern matrices, yielding
three classes of applications. We survey a broad range of applications in these
three categories and we take a detailed look at a specific open source project
in each category. Our goal in this survey is to show the breadth of
applications of VSMs for semantics, to provide a new perspective on VSMs for
those who are already familiar with the area, and to provide pointers into the
literature for those who are less familiar with the field
Personalized Video Recommendation Using Rich Contents from Videos
Video recommendation has become an essential way of helping people explore
the massive videos and discover the ones that may be of interest to them. In
the existing video recommender systems, the models make the recommendations
based on the user-video interactions and single specific content features. When
the specific content features are unavailable, the performance of the existing
models will seriously deteriorate. Inspired by the fact that rich contents
(e.g., text, audio, motion, and so on) exist in videos, in this paper, we
explore how to use these rich contents to overcome the limitations caused by
the unavailability of the specific ones. Specifically, we propose a novel
general framework that incorporates arbitrary single content feature with
user-video interactions, named as collaborative embedding regression (CER)
model, to make effective video recommendation in both in-matrix and
out-of-matrix scenarios. Our extensive experiments on two real-world
large-scale datasets show that CER beats the existing recommender models with
any single content feature and is more time efficient. In addition, we propose
a priority-based late fusion (PRI) method to gain the benefit brought by the
integrating the multiple content features. The corresponding experiment shows
that PRI brings real performance improvement to the baseline and outperforms
the existing fusion methods
Dictionary-based Tensor Canonical Polyadic Decomposition
To ensure interpretability of extracted sources in tensor decomposition, we
introduce in this paper a dictionary-based tensor canonical polyadic
decomposition which enforces one factor to belong exactly to a known
dictionary. A new formulation of sparse coding is proposed which enables high
dimensional tensors dictionary-based canonical polyadic decomposition. The
benefits of using a dictionary in tensor decomposition models are explored both
in terms of parameter identifiability and estimation accuracy. Performances of
the proposed algorithms are evaluated on the decomposition of simulated data
and the unmixing of hyperspectral images
- …