Search CORE

16,558 research outputs found

High Dimensional Low Rank plus Sparse Matrix Decomposition

Author: Atia George
Rahmani Mostafa
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/03/2017
Field of study

This paper is concerned with the problem of low rank plus sparse matrix decomposition for big data. Conventional algorithms for matrix decomposition use the entire data to extract the low-rank and sparse components, and are based on optimization problems with complexity that scales with the dimension of the data, which limits their scalability. Furthermore, existing randomized approaches mostly rely on uniform random sampling, which is quite inefficient for many real world data matrices that exhibit additional structures (e.g. clustering). In this paper, a scalable subspace-pursuit approach that transforms the decomposition problem to a subspace learning problem is proposed. The decomposition is carried out using a small data sketch formed from sampled columns/rows. Even when the data is sampled uniformly at random, it is shown that the sufficient number of sampled columns/rows is roughly O(r\mu), where \mu is the coherency parameter and r the rank of the low rank component. In addition, adaptive sampling algorithms are proposed to address the problem of column/row sampling from structured data. We provide an analysis of the proposed method with adaptive sampling and show that adaptive sampling makes the required number of sampled columns/rows invariant to the distribution of the data. The proposed approach is amenable to online implementation and an online scheme is proposed.Comment: IEEE Transactions on Signal Processin

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Incremental multi-domain learning with network latent tensor factorization

Author: Bulat Adrian
Kossaifi Jean
Pantic Maja
Tzimiropoulos Georgios
Publication venue
Publication date: 22/11/2019
Field of study

The prominence of deep learning, large amount of annotated data and increasingly powerful hardware made it possible to reach remarkable performance for supervised classification tasks, in many cases saturating the training sets. However the resulting models are specialized to a single very specific task and domain. Adapting the learned classification to new domains is a hard problem due to at least three reasons: (1) the new domains and the tasks might be drastically different; (2) there might be very limited amount of annotated data on the new domain and (3) full training of a new model for each new task is prohibitive in terms of computation and memory, due to the sheer number of parameters of deep CNNs. In this paper, we present a method to learn new-domains and tasks incrementally, building on prior knowledge from already learned tasks and without catastrophic forgetting. We do so by jointly parametrizing weights across layers using low-rank Tucker structure. The core is task agnostic while a set of task specific factors are learnt on each new domain. We show that leveraging tensor structure enables better performance than simply using matrix operations. Joint tensor modelling also naturally leverages correlations across different layers. Compared with previous methods which have focused on adapting each layer separately, our approach results in more compact representations for each new task/domain. We apply the proposed method to the 10 datasets of the Visual Decathlon Challenge and show that our method offers on average about 7.5x reduction in number of parameters and competitive performance in terms of both classification accuracy and Decathlon score.Comment: AAAI2

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Randomized Robust Subspace Recovery for High Dimensional Data Matrices

Author: Atia George
Rahmani Mostafa
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/04/2016
Field of study

This paper explores and analyzes two randomized designs for robust Principal Component Analysis (PCA) employing low-dimensional data sketching. In one design, a data sketch is constructed using random column sampling followed by low dimensional embedding, while in the other, sketching is based on random column and row sampling. Both designs are shown to bring about substantial savings in complexity and memory requirements for robust subspace learning over conventional approaches that use the full scale data. A characterization of the sample and computational complexity of both designs is derived in the context of two distinct outlier models, namely, sparse and independent outlier models. The proposed randomized approach can provably recover the correct subspace with computational and sample complexity that are almost independent of the size of the data. The results of the mathematical analysis are confirmed through numerical simulations using both synthetic and real data

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

CUR Decompositions, Similarity Matrices, and Subspace Clustering

Author: Aldroubi Akram
Hamm Keaton
Koku Ahmet Bugra
Sekmen Ali
Publication venue
Publication date: 11/12/2018
Field of study

A general framework for solving the subspace clustering problem using the CUR decomposition is presented. The CUR decomposition provides a natural way to construct similarity matrices for data that come from a union of unknown subspaces

\mathscr{U}=\underset{i=1}{\overset{M}\bigcup}S_i

. The similarity matrices thus constructed give the exact clustering in the noise-free case. Additionally, this decomposition gives rise to many distinct similarity matrices from a given set of data, which allow enough flexibility to perform accurate clustering of noisy data. We also show that two known methods for subspace clustering can be derived from the CUR decomposition. An algorithm based on the theoretical construction of similarity matrices is presented, and experiments on synthetic and real data are presented to test the method. Additionally, an adaptation of our CUR based similarity matrices is utilized to provide a heuristic algorithm for subspace clustering; this algorithm yields the best overall performance to date for clustering the Hopkins155 motion segmentation dataset.Comment: Approximately 30 pages. Current version contains improved algorithm and numerical experiments from the previous versio

arXiv.org e-Print Archive

Directory of Open Access Journals

Digital Scholarship @ Tennessee State University

OpenMETU (Middle East Technical University)

Finding a low-rank basis in a matrix subspace

Author: Nakatsukasa Yuji
Soma Tasuku
Uschmajew André
Publication venue
Publication date: 01/01/2016
Field of study

For a given matrix subspace, how can we find a basis that consists of low-rank matrices? This is a generalization of the sparse vector problem. It turns out that when the subspace is spanned by rank-1 matrices, the matrices can be obtained by the tensor CP decomposition. For the higher rank case, the situation is not as straightforward. In this work we present an algorithm based on a greedy process applicable to higher rank problems. Our algorithm first estimates the minimum rank by applying soft singular value thresholding to a nuclear norm relaxation, and then computes a matrix with that rank using the method of alternating projections. We provide local convergence results, and compare our algorithm with several alternative approaches. Applications include data compression beyond the classical truncated SVD, computing accurate eigenvectors of a near-multiple eigenvalue, image separation and graph Laplacian eigenproblems

arXiv.org e-Print Archive

OPUS Augsburg

Crossref

Oxford University Research Archive