Kernel Methods for Collaborative Filtering
The goal of this thesis is to extend kernel methods to matrix factorization (MF) for collaborative filtering (CF). In the current literature, MF methods usually assume that the correlated data is distributed on a linear hyperplane, which is not always the case. The best-known kernel method is the support vector machine (SVM), which handles linearly non-separable data. In this thesis, we apply kernel methods to MF, embedding the data into a possibly higher-dimensional space and conducting the factorization in that space. To improve kernelized matrix factorization, we apply multi-kernel learning methods to select optimal kernel functions from the candidates and introduce L2-norm regularization on the weight learning process. In our empirical study, we conduct experiments on three real-world datasets. The results suggest that the proposed method can improve prediction accuracy, surpassing state-of-the-art CF methods.
Learning Output Kernels for Multi-Task Problems
Simultaneously solving multiple related learning tasks is beneficial under a
variety of circumstances, but the prior knowledge necessary to correctly model
task relationships is rarely available in practice. In this paper, we develop a
novel kernel-based multi-task learning technique that automatically reveals
structural inter-task relationships. Building over the framework of output
kernel learning (OKL), we introduce a method that jointly learns multiple
functions and a low-rank multi-task kernel by solving a non-convex
regularization problem. Optimization is carried out via a block coordinate
descent strategy, where each subproblem is solved using suitable conjugate
gradient (CG) type iterative methods for linear operator equations. The
effectiveness of the proposed approach is demonstrated on pharmacological and
collaborative filtering data.
Signed Distance-based Deep Memory Recommender
Personalized recommendation algorithms learn a user's preference for an item
by measuring a distance/similarity between them. However, some of the existing
recommendation models (e.g., matrix factorization) assume a linear relationship
between the user and item. This approach limits the capacity of recommender
systems, since the interactions between users and items in real-world
applications are much more complex than the linear relationship. To overcome
this limitation, in this paper, we design and propose a deep learning framework
called Signed Distance-based Deep Memory Recommender, which captures non-linear
relationships between users and items explicitly and implicitly, and works well
in both the general recommendation task and the shopping basket-based
recommendation task. Through an extensive empirical study on six real-world
datasets in the two recommendation tasks, our proposed approach achieved
significant improvement over ten state-of-the-art recommendation models.
Kernelized Sparse Self-Representation for Clustering and Recommendation
Sparse models have demonstrated substantial success in applications for data analysis such as clustering, classification and denoising. However, most of the current work is built upon the assumption that data is distributed in a union of subspaces, whereas limited work has been conducted on nonlinear datasets where data reside in a union of manifolds rather than a union of subspaces. To understand data nonlinearity using sparse models, in this paper, we propose to exploit the self-representation property of nonlinear data in an implicit feature space using kernel methods. We propose a kernelized sparse self-representation model, denoted as KSSR, and a novel Kernelized Fast Iterative Soft-Thresholding Algorithm, denoted as K-FISTA, to recover the underlying nonlinear structure among the data. We evaluate our method for clustering problems on both synthetic and real-world datasets, and demonstrate its superior performance compared to other state-of-the-art methods. We also apply our method to collaborative filtering in recommender systems, and demonstrate its great potential for novel applications beyond clustering.
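The idea of sparse self-representation in a kernel-induced feature space can be illustrated with a generic FISTA loop. The sketch below is not the paper's K-FISTA; it assumes the common objective 0.5 * ||Phi(X) - Phi(X) C||_F^2 + lam * ||C||_1 with a zero diagonal on C, written purely in terms of the Gram matrix K. The RBF kernel, the toy data, and all function names are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(Z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def objective(K, C, lam):
    """0.5 * ||Phi - Phi C||_F^2 + lam * ||C||_1, expressed through K = Phi^T Phi."""
    smooth = 0.5 * np.trace(K - 2.0 * K @ C + C.T @ K @ C)
    return smooth + lam * np.abs(C).sum()

def kernel_sparse_self_representation(K, lam=0.1, n_iter=200):
    """Generic FISTA on the kernelized objective (assumed form, not the paper's algorithm)."""
    n = K.shape[0]
    L = np.linalg.eigvalsh(K)[-1]        # Lipschitz constant of the smooth gradient
    C = np.zeros((n, n))
    Z = C.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = K @ Z - K                 # gradient of the smooth part at Z
        C_new = soft_threshold(Z - grad / L, lam / L)
        np.fill_diagonal(C_new, 0.0)     # a point may not represent itself
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        Z = C_new + ((t - 1.0) / t_new) * (C_new - C)
        C, t = C_new, t_new
    return C

# Toy data (assumption): 8 random points in 2-D with an RBF kernel.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
sq = np.sum(X ** 2, axis=1)
d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
K = np.exp(-0.5 * d2)

C = kernel_sparse_self_representation(K, lam=0.05)
```

In a clustering pipeline, a matrix such as |C| + |C|^T is often used as an affinity matrix for spectral clustering; that step is omitted here.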
A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression
Many machine learning problems can be formulated as predicting labels for a
pair of objects. Problems of that kind are often referred to as pairwise
learning, dyadic prediction or network inference problems. During the last
decade kernel methods have played a dominant role in pairwise learning. They
still obtain a state-of-the-art predictive performance, but a theoretical
analysis of their behavior has been underexplored in the machine learning
literature.
In this work we review and unify existing kernel-based algorithms that are
commonly used in different pairwise learning settings, ranging from matrix
filtering to zero-shot learning. To this end, we focus on closed-form efficient
instantiations of Kronecker kernel ridge regression. We show that independent
task kernel ridge regression, two-step kernel ridge regression and a linear
matrix filter arise naturally as special cases of Kronecker kernel ridge
regression, implying that all these methods implicitly minimize a squared loss.
In addition, we analyze universality, consistency and spectral filtering
properties. Our theoretical results provide valuable insights in assessing the
advantages and limitations of existing pairwise learning methods.
Comment: arXiv admin note: text overlap with arXiv:1606.0427
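A closed-form instantiation of Kronecker kernel ridge regression can be sketched as follows. This is a minimal illustration under the standard formulation, not code from the paper: given Gram matrices K (row objects) and G (column objects) and a complete label matrix Y, it solves (G ⊗ K + lam * I) vec(A) = vec(Y) through two small eigendecompositions instead of forming the Kronecker product. The RBF kernel, the toy data, and the function names are assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gram matrix of the RBF kernel exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def kronecker_krr_fit(K, G, Y, lam):
    """Dual coefficients A with K A G + lam * A = Y, via eigendecompositions."""
    evals_k, U = np.linalg.eigh(K)       # K = U diag(evals_k) U^T
    evals_g, V = np.linalg.eigh(G)       # G = V diag(evals_g) V^T
    # In the eigenbasis the linear system is diagonal, so it is solved elementwise.
    C = (U.T @ Y @ V) / (np.outer(evals_k, evals_g) + lam)
    return U @ C @ V.T

# Toy data (assumption): 5 row objects, 4 column objects, random labels.
rng = np.random.default_rng(0)
X_rows = rng.normal(size=(5, 3))
X_cols = rng.normal(size=(4, 2))
K = rbf_kernel(X_rows)
G = rbf_kernel(X_cols)
Y = rng.normal(size=(5, 4))

lam = 0.1
A = kronecker_krr_fit(K, G, Y, lam)
F = K @ A @ G                            # fitted values for all training pairs
```

The eigendecompositions cost O(n^3 + m^3) for n row objects and m column objects, whereas a direct solve of the Kronecker system costs O(n^3 m^3).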
Incorporating Side Information in Probabilistic Matrix Factorization with Gaussian Processes
Probabilistic matrix factorization (PMF) is a powerful method for modeling
data associated with pairwise relationships, finding use in collaborative
filtering, computational biology, and document analysis, among other areas. In
many domains, there is additional information that can assist in prediction.
For example, when modeling movie ratings, we might know when the rating
occurred, where the user lives, or what actors appear in the movie. It is
difficult, however, to incorporate this side information into the PMF model. We
propose a framework for incorporating side information by coupling together
multiple PMF problems via Gaussian process priors. We replace scalar latent
features with functions that vary over the space of side information. The GP
priors on these functions require them to vary smoothly and share information.
We successfully use this new method to predict the scores of professional
basketball games, where side information about the venue and date of the game
is relevant for the outcome.
Comment: 18 pages, 4 figures, Submitted to UAI 201
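The notion of latent features that are functions of side information, rather than scalars, can be illustrated by sampling from a Gaussian process prior. The sketch below covers only the generative side and assumes a single side-information variable (time) with a squared-exponential covariance; it performs no inference and is not the paper's model. All names, sizes, and the jitter value are assumptions.

```python
import numpy as np

def squared_exponential(t, length_scale=1.0):
    """Squared-exponential covariance between all pairs of time points."""
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def sample_latent_functions(n_entities, n_factors, cov, rng):
    """Draw one smooth latent function per (entity, factor) from GP(0, cov)."""
    T = cov.shape[0]
    chol = np.linalg.cholesky(cov + 1e-6 * np.eye(T))   # jitter for numerical stability
    white = rng.normal(size=(n_entities, n_factors, T))
    return white @ chol.T                               # each row now has covariance ~ cov

rng = np.random.default_rng(0)
times = np.linspace(0.0, 9.0, 10)                       # side information: 10 time points
cov = squared_exponential(times, length_scale=2.0)

n_users, n_items, n_factors = 4, 3, 2
U = sample_latent_functions(n_users, n_factors, cov, rng)   # shape (users, factors, time)
V = sample_latent_functions(n_items, n_factors, cov, rng)   # shape (items, factors, time)

# Mean of the observation for each (user, item) pair at each time point:
# the usual inner product of latent features, evaluated time by time.
mean_scores = np.einsum("ukt,ikt->uit", U, V)
```

Because neighbouring time points are strongly correlated under the covariance, the sampled features drift smoothly over time, which is how observations at nearby dates end up sharing information.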