Factoring nonnegative matrices with linear programs
This paper describes a new approach, based on linear programming, for
computing nonnegative matrix factorizations (NMFs). The key idea is a
data-driven model for the factorization where the most salient features in the
data are used to express the remaining features. More precisely, given a data
matrix X, the algorithm identifies a matrix C such that X approximately equals
CX and C satisfies some linear constraints. The constraints are chosen to ensure that the
matrix C selects features; these features can then be used to find a low-rank
NMF of X. A theoretical analysis demonstrates that this approach has guarantees
similar to those of the recent NMF algorithm of Arora et al. (2012). In
contrast with this earlier work, the proposed method extends to more general
noise models and leads to efficient, scalable algorithms. Experiments with
synthetic and real datasets provide evidence that the new approach is also
superior in practice. An optimized C++ implementation can factor a
multigigabyte matrix in a matter of minutes.
Comment: 17 pages, 10 figures. Modified theorem statement for robust recovery
conditions. Revised proof techniques to make arguments more elementary.
Results on robustness when rows are duplicated have been superseded by
arxiv.org/1211.668
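
The self-expression idea in this abstract (X approximately equals CX, with C selecting features) can be illustrated on a toy matrix. The sketch below does not solve the linear program and is not the authors' implementation. It hand-builds a hypothetical C for a small X whose last two rows are nonnegative combinations of the first two, and it assumes that nonzero diagonal entries of C mark the selected rows. The helper `matmul` and all variable names are illustrative.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Toy data: rows 0 and 1 are the salient features; rows 2 and 3 are
# nonnegative combinations of them.
X = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [0.5, 0.5, 1.5],
    [2.0, 1.0, 5.0],
]

# Hypothetical C: each row of X is written using rows 0 and 1 only.
C = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [2.0, 1.0, 0.0, 0.0],
]

CX = matmul(C, X)

# Assumption: a nonzero diagonal entry of C marks a selected row.
selected = [i for i in range(len(C)) if C[i][i] > 0]

# The selected rows give a low-rank nonnegative factorization X = F W.
W = [X[i] for i in selected]                  # selected rows of X
F = [[row[i] for i in selected] for row in C]  # matching columns of C
FW = matmul(F, W)
```

Here `CX` reproduces `X` exactly, and `F` and `W` form a rank-2 nonnegative factorization. With noisy data the equality would hold only approximately, and C would come from an optimization rather than being written by hand.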
Fast Matrix Factorization for Online Recommendation with Implicit Feedback
This paper contributes improvements on both the effectiveness and efficiency
of Matrix Factorization (MF) methods for implicit feedback. We highlight two
critical issues of existing works. First, due to the large space of unobserved
feedback, most existing works resort to assigning a uniform weight to the missing
data to reduce computational complexity. However, such a uniform assumption is
invalid in real-world settings. Second, most methods are also designed in an
offline setting and fail to keep up with the dynamic nature of online data. We
address the above two issues in learning MF models from implicit feedback. We
first propose to weight the missing data based on item popularity, which is
more effective and flexible than the uniform-weight assumption. However, such a
non-uniform weighting poses an efficiency challenge in learning the model. To
address this, we specifically design a new learning algorithm based on the
element-wise Alternating Least Squares (eALS) technique, for efficiently
optimizing an MF model with variably-weighted missing data. We exploit this
efficiency to then seamlessly devise an incremental update strategy that
instantly refreshes an MF model given new feedback. Through comprehensive
experiments on two public datasets in both offline and online protocols, we
show that our eALS method consistently outperforms state-of-the-art implicit MF
methods. Our implementation is available at
https://github.com/hexiangnan/sigir16-eals.
Comment: 10 pages, 8 figures
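
The popularity-based weighting of missing data can be sketched as follows. The abstract does not give the weighting formula, so this is an assumption rather than the authors' method: each item's missing-data weight is taken to be proportional to its interaction count raised to a power `alpha`, normalized so the weights sum to `c0`. The function name `popularity_weights`, the parameters `c0` and `alpha`, and the item counts are all hypothetical.

```python
def popularity_weights(item_counts, c0=1.0, alpha=0.5):
    """Missing-data weight per item.

    Assumed form: weight proportional to count**alpha, scaled so that
    all weights sum to c0. Requires at least one nonzero count.
    """
    powered = {item: count ** alpha for item, count in item_counts.items()}
    total = sum(powered.values())
    return {item: c0 * p / total for item, p in powered.items()}

# Hypothetical interaction counts per item.
counts = {"item_a": 100, "item_b": 25, "item_c": 1}

# Popular items get a larger weight on their missing entries.
weights = popularity_weights(counts, c0=1.0, alpha=0.5)

# alpha = 0 recovers the uniform-weight assumption as a special case.
uniform = popularity_weights(counts, c0=1.0, alpha=0.0)
```

Under this assumed form, a missing entry for a popular item counts more strongly as a negative signal than one for an obscure item, while setting `alpha` to zero falls back to uniform weights.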
The Incremental Multiresolution Matrix Factorization Algorithm
Multiresolution analysis and matrix factorization are foundational tools in
computer vision. In this work, we study the interface between these two
distinct topics and obtain techniques to uncover hierarchical block structure
in symmetric matrices -- an important aspect in the success of many vision
problems. Our new algorithm, the incremental multiresolution matrix
factorization, uncovers such structure one feature at a time, and hence scales
well to large matrices. We describe how this multiscale analysis goes much
farther than what a direct global factorization of the data can identify. We
evaluate the efficacy of the resulting factorizations for relative leveraging
within regression tasks using medical imaging data. We also use the
factorization on representations learned by popular deep networks, providing
evidence of their ability to infer semantic relationships even when they are
not explicitly trained to do so. We show that this algorithm can be used as an
exploratory tool to improve the network architecture, and within numerous other
settings in vision.
Comment: Computer Vision and Pattern Recognition (CVPR) 2017, 10 pages
DID: Distributed Incremental Block Coordinate Descent for Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) has attracted much attention in the
last decade as a dimension reduction method in many applications. As data
sizes have exploded, samples are naturally collected and stored in a
distributed fashion across local computational nodes. Thus, there is a growing need to
develop algorithms in a distributed memory architecture. We propose a novel
distributed algorithm, called distributed incremental block coordinate
descent (DID), to solve the problem. By adapting the block coordinate descent
framework, closed-form update rules are obtained in DID. Moreover, DID performs
updates incrementally based on the most recently updated residual matrix. As a
result, only one communication step per iteration is required. The correctness,
efficiency, and scalability of the proposed algorithm are verified in a series
of numerical experiments.
Comment: Accepted by AAAI 201
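
The two ingredients named in this abstract, closed-form block updates and a residual matrix refreshed after every update, can be sketched on a single machine. The code below is not DID itself: it omits the distributed part entirely and uses a standard rank-one block coordinate descent for NMF as a stand-in. The function name `bcd_nmf`, its parameters, and the toy matrix are illustrative assumptions.

```python
import random

def bcd_nmf(X, rank, iters=50, seed=0, eps=1e-12):
    """Single-node block coordinate descent for X ~ W H with W, H >= 0.

    The residual R = X - W H is updated in place after each change,
    so every block update uses the most recent residual.
    """
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    W = [[rng.random() for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() for _ in range(n)] for _ in range(rank)]
    R = [[X[i][j] - sum(W[i][k] * H[k][j] for k in range(rank))
          for j in range(n)] for i in range(m)]
    for _ in range(iters):
        for k in range(rank):
            # Closed-form nonnegative update of column k of W (H fixed).
            hh = sum(h * h for h in H[k]) + eps
            for i in range(m):
                grad = sum(R[i][j] * H[k][j] for j in range(n))
                new = max(0.0, W[i][k] + grad / hh)
                delta = new - W[i][k]
                for j in range(n):
                    R[i][j] -= delta * H[k][j]
                W[i][k] = new
            # Closed-form nonnegative update of row k of H (W fixed).
            ww = sum(W[i][k] ** 2 for i in range(m)) + eps
            for j in range(n):
                grad = sum(R[i][j] * W[i][k] for i in range(m))
                new = max(0.0, H[k][j] + grad / ww)
                delta = new - H[k][j]
                for i in range(m):
                    R[i][j] -= delta * W[i][k]
                H[k][j] = new
    return W, H, R

def sq_norm(M):
    """Squared Frobenius norm of a list-of-rows matrix."""
    return sum(v * v for row in M for v in row)

# Toy nonnegative matrix with exact rank 2.
X = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [0.5, 0.5, 1.5],
    [2.0, 1.0, 5.0],
]

_, _, R0 = bcd_nmf(X, rank=2, iters=0)   # residual at the random start
W, H, R = bcd_nmf(X, rank=2, iters=50)   # residual after 50 sweeps
```

Because the residual is kept current, each block update costs only a pass over one row or column of it. In a distributed setting the same structure is what would let the per-iteration work be split across nodes, though how DID arranges its single communication step is not described in the abstract.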
Algorithms and Architecture for Real-time Recommendations at News UK
Recommendation systems are recognised as being hugely important in industry,
and the area is now well understood. At News UK, there is a requirement to be
able to quickly generate recommendations for users on news items as they are
published. However, little has been published about systems that can generate
recommendations in response to changes in recommendable items and user
behaviour in a very short space of time. In this paper we describe a new
algorithm for updating collaborative filtering models incrementally, and
demonstrate its effectiveness on clickstream data from The Times. We also
describe the architecture that allows recommendations to be generated on the
fly, and how we have made each component scalable. The system is currently
being used in production at News UK.
Comment: Accepted for presentation at AI-2017 Thirty-seventh SGAI
International Conference on Artificial Intelligence. Cambridge, England 12-14
December 201