
    Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery

    PCA is one of the most widely used dimension reduction techniques. A related, easier problem is "subspace learning" or "subspace estimation". Given relatively clean data, both are easily solved via the singular value decomposition (SVD). The problem of subspace learning or PCA in the presence of outliers is called robust subspace learning or robust PCA (RPCA). For long data sequences, if one tries to use a single lower-dimensional subspace to represent the data, the required subspace dimension may end up being quite large. For such data, a better model is to assume that it lies in a low-dimensional subspace that can change over time, albeit gradually. The problem of tracking such data (and the subspaces) while being robust to outliers is called robust subspace tracking (RST). This article provides a magazine-style overview of the entire field of robust subspace learning and tracking. In particular, solutions for three problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition (S+LR), RST via S+LR, and "robust subspace recovery (RSR)". RSR assumes that an entire data vector is either an outlier or an inlier. The S+LR formulation instead assumes that outliers occur on only a few data vector indices and hence are well modeled as sparse corruptions.
    Comment: To appear, IEEE Signal Processing Magazine, July 2018
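
    As a rough illustration of the S+LR idea mentioned above (a widely used convex formulation known as principal component pursuit, not necessarily the article's own notation), the data matrix M is split into a low-rank part L and a sparse outlier part S:

        \min_{L,\,S}\ \|L\|_{*} + \lambda \|S\|_{1} \quad \text{subject to} \quad M = L + S

    Here \|L\|_{*} is the nuclear norm (sum of singular values), which promotes low rank, \|S\|_{1} is the entrywise l1 norm, which promotes sparsity, and \lambda > 0 balances the two terms.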

    Efficient piecewise linear classifiers and applications

    Supervised learning has become an essential part of data mining for industry, military, science and academia. Classification, a type of supervised learning, allows a machine to learn from data in order to predict certain behaviours, variables or outcomes. Classification can be used to solve many problems, including the detection of malignant cancers, potentially bad creditors and even enabling autonomy in robots. The ability to collect and store large amounts of data has increased significantly over the past few decades. However, the ability of classification techniques to deal with large-scale data has not kept pace. Many data transformation and reduction schemes have been tried, with mixed success. This problem is further exacerbated when dealing with real-time classification in embedded systems, where the classifier must operate using only limited processing, memory and power resources. Piecewise linear boundaries are known to provide efficient real-time classifiers: they have low memory requirements, require little processing effort, are parameterless and classify in real time. Piecewise linear functions are used to approximate non-linear decision boundaries between pattern classes. Finding these piecewise linear boundaries is a difficult optimization problem that can require a long training time. Multiple optimization approaches have been used for real-time classification, but they can lead to suboptimal piecewise linear boundaries. This thesis develops three real-time piecewise linear classifiers that deal with large-scale data. Each classifier uses a single optimization algorithm in conjunction with an incremental approach that reduces the number of points as the decision boundaries are built. Two of the classifiers further reduce complexity by augmenting the incremental approach with additional schemes. One scheme uses hyperboxes to identify points inside the so-called “indeterminate” regions. The other uses a polyhedral conic set to identify data points lying on or close to the boundary. All other points are excluded from the process of building the decision boundaries. The three classifiers are applied to real-time data classification problems, and the results of numerical experiments on real-world data sets are reported. These results demonstrate that the new classifiers require a reasonable training time and that their test set accuracy is consistently good on most data sets compared with current state-of-the-art classifiers.
    Doctor of Philosophy
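
    As a minimal sketch of the general idea (a generic max-of-affine-pieces decision rule with made-up hyperplane parameters, not the thesis's specific algorithms), a piecewise linear classifier can be written as:

        import numpy as np

        # Hypothetical piecewise linear classifier: each class is scored by the
        # maximum of a few affine pieces; the class with the highest score wins.
        # The hyperplane parameters below are illustrative, not learned.
        PIECES = {  # class label -> list of (weight vector, bias) pairs
            0: [(np.array([ 1.0,  0.5]), -0.2), (np.array([ 0.8, -0.3]), 0.1)],
            1: [(np.array([-1.0,  0.2]),  0.4), (np.array([-0.5, -0.9]), 0.0)],
        }

        def classify(x):
            # Score each class by its best (maximum) affine piece, then take the argmax.
            scores = {c: max(w @ x + b for w, b in pieces) for c, pieces in PIECES.items()}
            return max(scores, key=scores.get)

        print(classify(np.array([0.3, -0.1])))

    Because each class score is a maximum of affine functions, the resulting decision boundary between classes is piecewise linear, which is what keeps memory and per-classification cost low.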

    A Detailed Investigation into Low-Level Feature Detection in Spectrogram Images

    Being the first stage of analysis within an image, low-level feature detection is a crucial step in the image analysis process and, as such, deserves suitable attention. This paper presents a systematic investigation into low-level feature detection in spectrogram images, the result of which is the identification of frequency tracks. Analysis of the literature identifies different strategies for accomplishing low-level feature detection; however, the advantages and disadvantages of each have not been explicitly investigated. Three model-based detection strategies are outlined, each extracting an increasing amount of information from the spectrogram, and, through ROC analysis, it is shown that detection rates increase with the level of extraction. Nevertheless, further investigation suggests that model-based detection has a limitation: it is not computationally feasible to fully evaluate the model of even a simple sinusoidal track. Therefore, alternative approaches, such as dimensionality reduction, are investigated to reduce the complex search space. It is shown that, if carefully selected, these techniques can approach the detection rates of model-based strategies that perform the same level of information extraction. The implementations used to derive the results presented within this paper are available online from http://stdetect.googlecode.com
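
    For context only (a crude per-frame peak-picking baseline with illustrative signal parameters, not one of the paper's detection strategies), low-level detection of candidate frequency-track points in a spectrogram can be sketched as follows:

        import numpy as np
        from scipy.signal import spectrogram

        # Synthetic signal: a 440 Hz tone in noise, sampled at 8 kHz (illustrative values).
        fs = 8000
        t = np.arange(0, 2.0, 1.0 / fs)
        x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.random.randn(t.size)

        # Compute the spectrogram (the time-frequency image to be analysed).
        f, tt, S = spectrogram(x, fs=fs, nperseg=256, noverlap=128)

        # Crude low-level detector: in each time frame, keep the strongest frequency
        # bin if its power exceeds the frame mean by a fixed factor (a threshold test).
        detections = []
        for j in range(S.shape[1]):
            i = np.argmax(S[:, j])
            if S[i, j] > 5.0 * S[:, j].mean():
                detections.append((tt[j], f[i]))

        print(f"{len(detections)} candidate track points, e.g. {detections[:3]}")

    Linking such per-frame detections across time into continuous frequency tracks is the higher-level step that the low-level detection stage feeds into.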

    Diffeomorphic Transformations for Time Series Analysis: An Efficient Approach to Nonlinear Warping

    The proliferation and ubiquity of temporal data across many disciplines has sparked interest in similarity, classification and clustering methods specifically designed to handle time series data. A core issue when dealing with time series is determining their pairwise similarity, i.e., the degree to which a given time series resembles another. Traditional distance measures such as the Euclidean distance are not well suited due to the time-dependent nature of the data. Elastic metrics such as dynamic time warping (DTW) offer a promising approach, but are limited by their computational complexity, non-differentiability and sensitivity to noise and outliers. This thesis proposes novel elastic alignment methods that use parametric and diffeomorphic warping transformations as a means of overcoming the shortcomings of DTW-based metrics. The proposed method is differentiable and invertible, well suited for deep learning architectures, robust to noise and outliers, computationally efficient, and expressive and flexible enough to capture complex patterns. Furthermore, a closed-form solution was developed for the gradient of these diffeomorphic transformations, which allows an efficient search in the parameter space, leading to better solutions at convergence. Leveraging the benefits of these closed-form diffeomorphic transformations, this thesis proposes a suite of advancements that include: (a) an enhanced temporal transformer network for time series alignment and averaging, (b) a deep-learning-based time series classification model to simultaneously align and classify signals with high accuracy, (c) an incremental time series clustering algorithm that is warping-invariant, scalable and can operate under limited computational and time resources, and finally, (d) a normalizing flow model that enhances the flexibility of affine transformations in coupling and autoregressive layers.
    Comment: PhD Thesis, defended at the University of Navarra on July 17, 2023. 277 pages, 8 chapters, 1 appendix
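
    As background for the elastic metrics discussed above (the standard DTW dynamic programming recursion, i.e. the baseline the thesis seeks to improve on, not the proposed diffeomorphic method), a minimal sketch:

        import numpy as np

        def dtw_distance(a, b):
            """Classic dynamic time warping distance between two 1-D sequences."""
            n, m = len(a), len(b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = abs(a[i - 1] - b[j - 1])
                    # Each cell extends the cheapest of the three admissible warping steps.
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        x = np.sin(np.linspace(0, 2 * np.pi, 50))
        y = np.sin(np.linspace(0, 2 * np.pi, 70))  # same shape, different length
        print(dtw_distance(x, y))  # small despite the length mismatch

    The quadratic cost of this recursion, together with its non-differentiability, is what motivates replacing the warping path search with parametric, differentiable transformations.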

    Standard Bundle Methods: Untrusted Models and Duality

    We review the basic ideas underlying the vast family of algorithms for nonsmooth convex optimization known as "bundle methods". In a nutshell, these approaches are based on constructing models of the function, but the lack of continuity of first-order information implies that these models cannot be trusted, not even close to an optimum. Therefore, many different forms of stabilization have been proposed to try to avoid being led to areas where the model is so inaccurate as to result in almost useless steps. In the development of these methods, duality arguments are useful, if not outright necessary, to better analyze the behaviour of the algorithms. Also, in many relevant applications the function at hand is itself a dual one, so that duality allows algorithmic concepts and results to be mapped back into a "primal space" where they can be exploited; in turn, structure in that space can be exploited to improve the algorithms' behaviour, e.g., by developing better models. We present an updated picture of the many developments around the basic idea along at least three different axes: form of the stabilization, form of the model, and approximate evaluation of the function.
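
    To make the "model plus stabilization" idea above concrete (a standard proximal bundle master problem, not necessarily the exact form used in the review), the next iterate is obtained from a cutting-plane model of f built from past subgradients and stabilized around the current centre \bar{x}:

        \hat{f}_B(x) = \max_{i \in B}\ \big\{ f(x_i) + g_i^{\top}(x - x_i) \big\}, \qquad g_i \in \partial f(x_i)

        x_{+} \in \arg\min_{x}\ \hat{f}_B(x) + \frac{1}{2t}\,\|x - \bar{x}\|^{2}

    Here B indexes the collected subgradient information (the "bundle") and t > 0 controls how strongly the step is kept close to \bar{x}, i.e., how little the model is trusted.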

    Locally optimal Delaunay-refinement and optimisation-based mesh generation

    The field of mesh generation concerns the development of efficient algorithmic techniques to construct high-quality tessellations of complex geometrical objects. In this thesis, I investigate the problem of unstructured simplicial mesh generation for problems in two- and three-dimensional spaces, in which meshes consist of collections of triangular and tetrahedral elements. I focus on the development of efficient algorithms and computer programs to produce high-quality meshes for planar, surface and volumetric objects of arbitrary complexity. I develop and implement a number of new algorithms for mesh construction based on the Frontal-Delaunay paradigm - a hybridisation of conventional Delaunay-refinement and advancing-front techniques. I show that the proposed algorithms are a significant improvement on existing approaches, typically outperforming the Delaunay-refinement technique in terms of both element shape- and size-quality, while offering significantly improved theoretical robustness compared to advancing-front techniques. I verify experimentally that the proposed methods achieve the same element shape- and size-guarantees that are typically associated with conventional Delaunay-refinement techniques. In addition to mesh construction, methods for mesh improvement are also investigated. I develop and implement a family of techniques designed to improve the element shape quality of existing simplicial meshes, using a combination of optimisation-based vertex smoothing, local topological transformation and vertex insertion techniques. These operations are interleaved according to a new priority-based schedule, and I show that the resulting algorithms are competitive with existing state-of-the-art approaches in terms of mesh quality, while offering significant improvements in computational efficiency. Optimised C++ implementations for the proposed mesh generation and mesh optimisation algorithms are provided in the JIGSAW and JITTERBUG software libraries
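
    As a small, generic illustration of the element shape-quality notion mentioned above (using SciPy's Delaunay triangulation on random points, not the JIGSAW or JITTERBUG libraries), one can compute the radius-edge ratio of each triangle, the quantity Delaunay-refinement methods typically bound:

        import numpy as np
        from scipy.spatial import Delaunay

        rng = np.random.default_rng(0)
        pts = rng.random((200, 2))          # random points in the unit square
        tri = Delaunay(pts)                 # Delaunay triangulation of the points

        def radius_edge_ratio(p):
            # Edge lengths of the triangle with vertices p[0], p[1], p[2].
            a = np.linalg.norm(p[1] - p[2])
            b = np.linalg.norm(p[0] - p[2])
            c = np.linalg.norm(p[0] - p[1])
            area = 0.5 * abs((p[1][0] - p[0][0]) * (p[2][1] - p[0][1])
                             - (p[2][0] - p[0][0]) * (p[1][1] - p[0][1]))
            R = a * b * c / (4.0 * area)    # circumradius
            # Ratio of circumradius to shortest edge: 1/sqrt(3) ~ 0.577 for an
            # equilateral triangle; large values indicate poorly shaped elements.
            return R / min(a, b, c)

        ratios = [radius_edge_ratio(pts[s]) for s in tri.simplices]
        print(f"worst radius-edge ratio: {max(ratios):.2f}")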

    On discriminative semi-supervised incremental learning with a multi-view perspective for image concept modeling

    This dissertation presents the development of a semi-supervised incremental learning framework with a multi-view perspective for image concept modeling. For reliable image concept characterization, having a large number of labeled images is crucial. However, the size of the training set is often limited due to the cost of generating concept labels associated with objects in a large quantity of images. To address this issue, in this research we propose to incrementally incorporate unlabeled samples into the learning process to enhance concept models originally learned with a small number of labeled samples. To tackle the sub-optimality problem of conventional techniques, the proposed incremental learning framework selects unlabeled samples using an expected error reduction function that measures each sample's contribution in terms of its ability to increase the modeling accuracy. To improve the convergence property of the proposed incremental learning framework, we further propose a multi-view learning approach that makes use of multiple image features, such as color and texture, when including unlabeled samples. For robustness to mismatches between training and testing conditions, a discriminative learning algorithm, namely a kernelized maximal-figure-of-merit (kMFoM) learning approach, is also developed. Combining the individual techniques, we conduct a set of experiments on various image concept modeling problems, such as handwritten digit recognition, object recognition, and image spam detection, to highlight the effectiveness of the proposed framework.
    PhD
    Committee Chair: Lee, Chin-Hui; Committee Members: Clements, Mark; Lee, Hsien-Hsin; McClellan, James; Yuan, Min
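
    As a generic sketch of the sample-selection idea described above (a greedy validation-error proxy for an expected-error-reduction criterion, using a scikit-learn logistic regression as a stand-in base model; this is not the dissertation's actual selection function, kMFoM algorithm, or multi-view scheme):

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # Hypothetical split: a few labeled samples, an unlabeled pool, and a
        # held-out validation set used as a stand-in for the expected error.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(300, 5))
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
        X_lab, y_lab = X[:20], y[:20]
        X_pool = X[20:200]
        X_val, y_val = X[200:], y[200:]

        def val_error(Xtr, ytr):
            model = LogisticRegression().fit(Xtr, ytr)
            return 1.0 - model.score(X_val, y_val)

        base = LogisticRegression().fit(X_lab, y_lab)
        pseudo = base.predict(X_pool)       # pseudo-labels for the unlabeled pool
        err0 = val_error(X_lab, y_lab)

        # Greedy selection: pick the pool sample whose inclusion (with its
        # pseudo-label) reduces the validation error the most.
        gains = []
        for i in range(len(X_pool)):
            Xtr = np.vstack([X_lab, X_pool[i:i + 1]])
            ytr = np.append(y_lab, pseudo[i])
            gains.append(err0 - val_error(Xtr, ytr))

        best = int(np.argmax(gains))
        print(f"select pool sample {best}, estimated error reduction {gains[best]:.3f}")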