Fast Linear Algorithms for Machine Learning
Nowadays, linear methods such as Regression, Principal Component Analysis and Canonical Correlation Analysis are well understood and widely used by the machine learning community for predictive modeling and feature generation. Generally speaking, all of these methods aim at capturing interesting subspaces of the original high-dimensional feature space. Due to their simple linear structure, these methods all have a closed-form solution, which makes computation and theoretical analysis very easy for small datasets. However, in modern machine learning problems it is very common for a dataset to have millions or billions of features and samples. In these cases, pursuing the closed-form solution for these linear methods can be extremely slow, since it requires multiplying two huge matrices and computing the inverse, inverse square root, QR decomposition or Singular Value Decomposition (SVD) of huge matrices. In this thesis, we consider three fast algorithms for approximately computing Regression and Canonical Correlation Analysis on huge datasets.
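A minimal sketch of the trade-off the abstract describes, not the thesis's actual algorithms: the closed-form least-squares solution requires forming and factoring X^T X, which costs O(nd^2 + d^3) and becomes prohibitive for huge d, whereas an iterative method touches only matrix-vector products. All variable names and sizes here are illustrative assumptions.

```python
import numpy as np

# Illustrative example only: closed-form regression vs. an iterative solver.
rng = np.random.default_rng(0)
n, d = 1000, 50                      # small sizes for the demo
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.01 * rng.standard_normal(n)

# Closed form via the normal equations (a linear solve, not an explicit
# inverse). Forming X.T @ X is the O(n d^2) step that dominates at scale.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on 0.5 * ||Xw - y||^2 uses only matrix-vector products,
# the kind of cheap primitive that fast approximate methods exploit.
w = np.zeros(d)
step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / sigma_max(X)^2
for _ in range(500):
    w -= step * (X.T @ (X @ w - y))

# Both recover the generating weights on this well-conditioned problem.
print(np.allclose(w, w_closed, atol=1e-3))
```

The iterative route gives up the exact closed form but avoids ever materializing or factoring a d-by-d matrix, which is the crux of the speedups the abstract alludes to.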
Quadratic maximization under combinatorial constraints and related applications
Motivated primarily by restricted variants of Principal Component Analysis (PCA), we study quadratic maximization problems subject to sparsity, nonnegativity and other combinatorial constraints. Intuitively, a key technical challenge is determining the support of the optimal solution. We develop a method that, surprisingly, can solve the maximization exactly when the argument matrix of the quadratic objective is positive semidefinite and has constant rank. Our approach relies on a hyper-spherical transformation of the low-rank space and has complexity that scales exponentially in the rank of the input, but polynomially in the ambient dimension. Extending these observations, we describe a simpler approximation algorithm based on exploring the low-rank space with an ε-net, drastically improving the dependence on the ambient dimension and implying a Polynomial Time Approximation Scheme (PTAS) for inputs whose rank scales at most logarithmically in the dimension or whose spectrum decays sufficiently sharply. We discuss extensions of our approach to jointly computing multiple principal components under combinatorial constraints, such as the problem of extracting multiple orthogonal nonnegative components, or sparse components with common or disjoint supports, and related approximate matrix factorization problems. We further extend our quadratic maximization framework to bilinear optimization problems and employ it in the context of specific applications, e.g., to develop a provable approximation algorithm for the NP-hard problem of Bipartite Correlation Clustering (BCC). Real datasets will typically produce covariance matrices that have full rank, rendering our algorithms inapplicable as stated. Our approach is to first obtain a low-rank approximation of the input data and subsequently solve the low-rank problem using our framework.
Although this approach is not always suitable, from an optimization perspective it yields provable, data-dependent performance bounds that rely on the spectral decay of the input and the employed approximation technique. Interestingly, most real matrices can be well approximated by low-rank surrogates, since their eigenvalues display a significant decay. Empirical evaluation shows that our algorithms have excellent performance and in many cases outperform the previous state of the art. Finally, utilizing our framework, we develop algorithms with interesting theoretical guarantees in the context of specific applications, such as approximate Orthogonal Nonnegative Matrix Factorization and Bipartite Correlation Clustering.
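To make the "support determination" challenge concrete, here is a hedged sketch of the easiest special case, not the dissertation's general algorithm: for a rank-1 PSD input A = λ·vvᵀ, the k-sparse quadratic maximization is solved exactly by taking the k largest-magnitude coordinates of v. For higher rank this greedy choice fails, which is what the hyper-spherical and ε-net machinery addresses. All names and sizes below are illustrative assumptions.

```python
from itertools import combinations
import numpy as np

# Rank-1 special case of: maximize x^T A x  s.t.  ||x||_2 = 1, ||x||_0 <= k.
rng = np.random.default_rng(1)
d, k = 20, 5
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
A = 3.0 * np.outer(v, v)                 # rank-1 PSD matrix, A = 3 * v v^T

# For fixed support S, the best value is 3 * ||v_S||^2, so the optimal
# support is the k coordinates of v with largest magnitude.
support = np.argsort(-np.abs(v))[:k]
x = np.zeros(d)
x[support] = v[support]
x /= np.linalg.norm(x)                   # feasible k-sparse unit vector

# Brute force over all C(20, 5) = 15504 supports confirms optimality here.
best = max(np.linalg.norm(v[list(S)]) ** 2 for S in combinations(range(d), k))
assert np.isclose(x @ A @ x, 3.0 * best)
```

The brute-force check is exactly the combinatorial explosion that makes the general problem hard: once rank exceeds one, no single vector's magnitudes determine the support, and the number of candidate supports grows exponentially in k.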