1 research outputs found
Online Convex Matrix Factorization with Representative Regions
Matrix factorization (MF) is a versatile learning method that has found wide
applications in various data-driven disciplines. Still, many MF algorithms do
not adequately scale with the size of available datasets and/or lack
interpretability. To improve the computational efficiency of the method, an
online (streaming) MF algorithm was proposed in Mairal et al. [2010]. To enable
data interpretability, a constrained version of MF, termed convex MF, was
introduced in Ding et al. [2010]. In the latter work, the basis vectors are
required to lie in the convex hull of the data samples, thereby ensuring that
every basis can be interpreted as a weighted combination of data samples. No
current algorithmic solutions for online convex MF are known as it is
challenging to find adequate convex bases without having access to the complete
dataset. We address both problems by proposing the first online convex MF
algorithm that maintains a collection of constant-size sets of representative
data samples needed for interpreting each of the basis (Ding et al. [2010]) and
has the same almost sure convergence guarantees as the online learning
algorithm of Mairal et al. [2010]. Our proof techniques combine random
coordinate descent algorithms with specialized quasi-martingale convergence
analysis. Experiments on synthetic and real world datasets show significant
computational savings of the proposed online convex MF method compared to
classical convex MF. Since the proposed method maintains small representative
sets of data samples needed for convex interpretations, it is related to a body
of work in theoretical computer science, pertaining to generating point sets
(Blum et al. [2016]), and in computer vision, pertaining to archetypal analysis
(Mei et al. [2018]). Nevertheless, it differs from these lines of work both in
terms of the objective and algorithmic implementations