QuicK-means: Acceleration of K-means by learning a fast transform
K-means -- and the celebrated Lloyd algorithm -- is more than the clustering method it was originally designed to be. It has proven pivotal in speeding up many machine learning and data analysis techniques, such as indexing, nearest-neighbor search and prediction, data compression, and Radial Basis Function networks; its benefits have been shown to carry over to the acceleration of kernel machines (when using the Nyström method). Here, we propose a fast extension of K-means, dubbed QuicK-means, that rests on the idea of expressing the matrix of the centroids as a product of sparse matrices, a feat made possible by recent results on approximating matrices as products of sparse factors. Such a decomposition reduces the complexity of the matrix-vector product between the factorized K x D centroid matrix and any vector from O(KD) to O(A log A + B), with A = min(K, D) and B = max(K, D), where D is the dimension of the training data. This drastic computational saving has a direct impact on the assignment of a point to a cluster, meaning that it is tangible not only at prediction time but also at training time, provided the factorization procedure is performed during Lloyd's algorithm. We show precisely that resorting to a factorization step at each iteration does not impair the convergence of the optimization scheme and that, depending on the context, it may reduce the training time. Finally, we provide discussions and numerical simulations that show the versatility of our computationally efficient QuicK-means algorithm.
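The speed-up mechanism can be sketched in a few lines of NumPy/SciPy. The sketch below is illustrative only, not the authors' implementation: the sparse factors are drawn at random rather than learned, and all sizes are toy values.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

K, D = 32, 32                 # toy number of clusters / data dimension
n_factors = 5

# Stand-ins for the learned sparse factors (random here, learned in QuicK-means):
# each factor is sparse, so applying all of them costs far less than K * D.
factors = [sparse.random(K, D, density=0.15, random_state=i, format="csr")
           for i in range(n_factors)]

# Dense centroid matrix reconstructed from the factors.
U_dense = np.linalg.multi_dot([f.toarray() for f in factors])

x = rng.standard_normal(D)

# Dense matrix-vector product: O(K * D) operations.
y_dense = U_dense @ x

# Factorized product: apply the sparse factors right to left.
y_fast = x.copy()
for f in reversed(factors):
    y_fast = f @ y_fast

# The centroid-vector products (used in the cluster-assignment step) agree.
print(np.allclose(y_dense, y_fast))
```

The cluster-assignment step reduces to such centroid-matrix-times-vector products, which is why the saving shows up at both training and prediction time.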
Learning from DPPs via Sampling: Beyond HKPV and symmetry
Determinantal point processes (DPPs) have become a significant tool for
recommendation systems, feature selection, or summary extraction, harnessing
the intrinsic ability of these probabilistic models to facilitate sample
diversity. The ability to sample from DPPs is paramount to the empirical
investigation of these models. Most exact samplers are variants of a spectral
meta-algorithm due to Hough, Krishnapur, Peres and Virág (henceforth HKPV),
which is in general time and resource intensive. For DPPs with symmetric
kernels, scalable HKPV samplers have been proposed that either first downsample
the ground set of items, or force the kernel to be low-rank, using e.g.
Nyström-type decompositions.
In the present work, we contribute an approach radically different from HKPV.
Exploiting the fact that many statistical and learning objectives can be
effectively accomplished by only sampling certain key observables of a DPP
(so-called linear statistics), we invoke an expression for the Laplace
transform of such an observable as a single determinant, which holds in
complete generality. Combining traditional low-rank approximation techniques
with Laplace inversion algorithms from numerical analysis, we show how to
directly approximate the distribution function of a linear statistic of a DPP.
This distribution function can then be used in hypothesis testing or to
actually sample the linear statistic, as per requirement. Our approach is
scalable and applies to very general DPPs, beyond traditional symmetric
kernels.
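The single-determinant expression for the Laplace transform of a linear statistic can be checked directly on a toy finite DPP. The sketch below is illustrative: the L-ensemble kernel, the statistic f, and the Laplace variable s are arbitrary choices, not taken from the paper.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

n = 4
A = rng.standard_normal((n, n))
L = A @ A.T + 0.1 * np.eye(n)          # L-ensemble kernel (positive definite)
K = L @ np.linalg.inv(np.eye(n) + L)   # marginal (correlation) kernel

f = rng.uniform(0.5, 2.0, size=n)      # values of the linear statistic sum_{i in X} f(i)
s = 0.7                                # Laplace variable

# Single-determinant expression:
#   E[exp(-s * sum_{i in X} f(i))] = det(I - (I - E) K),  E = diag(exp(-s f))
E = np.diag(np.exp(-s * f))
lhs = np.linalg.det(np.eye(n) - (np.eye(n) - E) @ K)

# Brute-force check: enumerate all subsets, P(S) = det(L_S) / det(I + L).
Z = np.linalg.det(np.eye(n) + L)
rhs = 0.0
for r in range(n + 1):
    for S in itertools.combinations(range(n), r):
        S = list(S)
        p_S = (np.linalg.det(L[np.ix_(S, S)]) if S else 1.0) / Z
        rhs += p_S * np.exp(-s * f[S].sum())

print(abs(lhs - rhs))   # ~0: the determinant matches the expectation
```

In practice the point is that the left-hand side needs one determinant per Laplace node, while the right-hand side would require enumerating exponentially many subsets.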
High-order, Dispersionless "Fast-Hybrid" Wave Equation Solver. Part I: Sampling Cost via Incident-Field Windowing and Recentering
This paper proposes a frequency/time hybrid integral-equation method for the
time-dependent wave equation in two- and three-dimensional spatial domains.
Relying on Fourier Transformation in time, the method utilizes a fixed
(time-independent) number of frequency-domain integral-equation solutions to
evaluate, with superalgebraically-small errors, time domain solutions for
arbitrarily long times. The approach relies on two main elements, namely, 1) A
smooth time-windowing methodology that enables accurate band-limited
representations for arbitrarily-long time signals, and 2) A novel Fourier
transform approach which, in a time-parallel manner and without causing
spurious periodicity effects, delivers numerically dispersionless
spectrally-accurate solutions. A similar hybrid technique can be obtained on
the basis of Laplace transforms instead of Fourier transforms, but we do not
consider the Laplace-based method in the present contribution. The algorithm
can handle dispersive media, it can tackle complex physical structures, it
enables parallelization in time in a straightforward manner, and it allows for
time leaping---that is, solution sampling at any given time T at
O(1)-bounded sampling cost, for arbitrarily large values of T, and without
requiring evaluation of the solution at intermediate times.
The proposed frequency-time hybridization strategy, which generalizes to any
linear partial differential equation in the time domain for which
frequency-domain solutions can be obtained (including e.g. the time-domain
Maxwell equations), and which is applicable in a wide range of scientific and
engineering contexts, provides significant advantages over other available
alternatives such as volumetric discretization, time-domain integral equations,
and convolution-quadrature approaches.
Comment: 33 pages, 8 figures, revised and extended manuscript (now including
direct comparisons to existing CQ and TDIE solver implementations)
(Part I of II)
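The first element, smooth time-windowing, can be illustrated with a toy partition of unity in NumPy. The window construction and all parameter values below are illustrative assumptions, not the paper's: overlapping smooth bump windows that sum to one, so a long signal splits exactly into smooth, compactly supported slices.

```python
import numpy as np

def smooth_step(u):
    """C-infinity step function: 0 for u <= 0, 1 for u >= 1."""
    u = np.clip(u, 0.0, 1.0)
    def bump(x):
        out = np.zeros_like(x)
        pos = x > 0
        out[pos] = np.exp(-1.0 / x[pos])
        return out
    b = bump(u)
    return b / (b + bump(1.0 - u))

T = 40.0                                    # toy signal duration
t = np.linspace(0.0, T, 4001)
signal = np.sin(1.7 * t) * np.cos(0.3 * t)  # arbitrary long-time signal

W, o = 10.0, 2.0                            # window width / overlap (toy values)
H = W - o                                   # shift between consecutive windows
n_win = 5                                   # windows cover [0, 42] here

# Window k rises smoothly on [k*H, k*H + o] and falls on [k*H + W - o, k*H + W].
windows = [smooth_step((t - k * H) / o) * smooth_step((k * H + W - t) / o)
           for k in range(n_win)]

pou = sum(windows)
mask = t >= o                               # interior region of full coverage
print(np.max(np.abs(pou[mask] - 1.0)))      # ~0: windows sum to one

# Each slice w_k * signal is smooth and compactly supported in time, hence
# well suited to accurate band-limited representation; summing the slices
# recovers the signal exactly on the covered region.
recon = sum(w * signal for w in windows)
print(np.max(np.abs(recon[mask] - signal[mask])))
```

Each compactly supported slice can then be Fourier-transformed independently, which is what enables the time-parallel, arbitrarily-long-time treatment described above.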
Block subsampled randomized Hadamard transform for low-rank approximation on distributed architectures
This article introduces a novel structured random matrix composed blockwise of subsampled randomized Hadamard transforms (SRHTs). The block SRHT is expected to outperform well-known dimension-reduction maps, including the SRHT and Gaussian matrices, on distributed architectures whose core counts are moderate relative to the dimension. We prove that a block SRHT with enough rows is an oblivious subspace embedding, i.e., an approximate isometry for an arbitrary low-dimensional subspace with high probability. Our estimate of the required number of rows is similar to that of the standard SRHT, which suggests that the two transforms should provide the same accuracy of approximation in the algorithms. The block SRHT can be readily incorporated into randomized methods, for instance to compute a low-rank approximation of a large-scale matrix. For completeness, we revisit some common randomized approaches to this problem, such as the randomized singular value decomposition and the Nyström approximation, with a discussion of their accuracy and implementation on distributed architectures.
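One plausible construction in the spirit of the block SRHT can be sketched in NumPy. The per-block scaling below is an assumption for illustration and may differ from the article's exact definition; the check at the end is empirical, not the paper's proof.

```python
import numpy as np

rng = np.random.default_rng(42)

def hadamard(m):
    """Sylvester construction of an m x m Hadamard matrix (m a power of two)."""
    H = np.ones((1, 1))
    while H.shape[0] < m:
        H = np.block([[H, H], [H, -H]])
    return H

def srht(block_dim, n_rows, rng):
    """One n_rows x block_dim SRHT: sqrt(b/l) * (row subsampling) H (signs)."""
    H = hadamard(block_dim) / np.sqrt(block_dim)          # orthonormal Hadamard
    signs = rng.choice([-1.0, 1.0], size=block_dim)       # random sign flips
    rows = rng.choice(block_dim, size=n_rows, replace=False)
    return np.sqrt(block_dim / n_rows) * H[rows] * signs

# Assemble an l x n map blockwise from p independent SRHTs (an illustrative
# construction in the spirit of the article; its exact scaling may differ).
p, b, l_rows = 4, 256, 64
n = p * b
Theta = np.hstack([srht(b, l_rows, rng) for _ in range(p)])

# Empirical check of approximate isometry on a random 8-dimensional subspace.
Q, _ = np.linalg.qr(rng.standard_normal((n, 8)))
x = Q @ rng.standard_normal(8)
ratio = np.linalg.norm(Theta @ x) / np.linalg.norm(x)
print(ratio)   # close to 1 for typical draws
```

The practical appeal on distributed architectures is that each core applies its own small SRHT to its local block of coordinates, and only the short sketched vectors need to be summed across cores.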