696 research outputs found
Kernel Spectral Clustering and applications
In this chapter we review the main literature related to kernel spectral
clustering (KSC), an approach to clustering cast within a kernel-based
optimization setting. KSC represents a least-squares support vector machine
based formulation of spectral clustering described by a weighted kernel PCA
objective. Just as in the classifier case, the binary clustering model is
expressed by a hyperplane in a high dimensional space induced by a kernel. In
addition, the multi-way clustering can be obtained by combining a set of binary
decision functions via an Error Correcting Output Codes (ECOC) encoding scheme.
Because of its model-based nature, the KSC method encompasses three main steps:
training, validation, testing. In the validation stage model selection is
performed to obtain tuning parameters, like the number of clusters present in
the data. This is a major advantage compared to classical spectral clustering
where the determination of the clustering parameters is unclear and relies on
heuristics. Once a KSC model is trained on a small subset of the entire data,
it is able to generalize well to unseen test points. Beyond the basic
formulation, sparse KSC algorithms based on the Incomplete Cholesky
Decomposition (ICD) and , , Group Lasso regularization are
reviewed. In that respect, we show how it is possible to handle large scale
data. Also, two possible ways to perform hierarchical clustering and a soft
clustering method are presented. Finally, real-world applications such as image
segmentation, power load time-series clustering, document clustering and big
data learning are considered.Comment: chapter contribution to the book "Unsupervised Learning Algorithms
The Matrix Ridge Approximation: Algorithms and Applications
We are concerned with an approximation problem for a symmetric positive
semidefinite matrix due to motivation from a class of nonlinear machine
learning methods. We discuss an approximation approach that we call {matrix
ridge approximation}. In particular, we define the matrix ridge approximation
as an incomplete matrix factorization plus a ridge term. Moreover, we present
probabilistic interpretations using a normal latent variable model and a
Wishart model for this approximation approach. The idea behind the latent
variable model in turn leads us to an efficient EM iterative method for
handling the matrix ridge approximation problem. Finally, we illustrate the
applications of the approximation approach in multivariate data analysis.
Empirical studies in spectral clustering and Gaussian process regression show
that the matrix ridge approximation with the EM iteration is potentially
useful
kernlab - An S4 Package for Kernel Methods in R
kernlab is an extensible package for kernel-based machine learning methods in R. It takes advantage of R's new S4 ob ject model and provides a framework for creating and using kernel-based algorithms. The package contains dot product primitives (kernels), implementations of support vector machines and the relevance vector machine, Gaussian processes, a ranking algorithm, kernel PCA, kernel CCA, and a spectral clustering algorithm. Moreover it provides a general purpose quadratic programming solver, and an incomplete Cholesky decomposition method.
Superfast Line Spectral Estimation
A number of recent works have proposed to solve the line spectral estimation
problem by applying off-the-grid extensions of sparse estimation techniques.
These methods are preferable over classical line spectral estimation algorithms
because they inherently estimate the model order. However, they all have
computation times which grow at least cubically in the problem size, thus
limiting their practical applicability in cases with large dimensions. To
alleviate this issue, we propose a low-complexity method for line spectral
estimation, which also draws on ideas from sparse estimation. Our method is
based on a Bayesian view of the problem. The signal covariance matrix is shown
to have Toeplitz structure, allowing superfast Toeplitz inversion to be used.
We demonstrate that our method achieves estimation accuracy at least as good as
current methods and that it does so while being orders of magnitudes faster.Comment: 16 pages, 7 figures, accepted for IEEE Transactions on Signal
Processin
- …