Non-linear and Sparse Discriminant Analysis with Data Compression

Author: Lapanowski Alexander Frank
Publication date: 22/02/2021

Large-sample data have become prevalent as data acquisition has become cheaper and easier. While a large sample size has theoretical advantages for many statistical methods, it presents computational challenges either in the form of a large number of features or a large number of training samples. We consider the two-group classification problem and adapt Linear Discriminant Analysis to the problems above. Linear Discriminant Analysis is a linear classifier and will under-fit when the true decision boundary is non-linear. To address non-linearity and sparse feature selection, we propose a kernel classifier based on the optimal scoring framework which trains a non-linear classifier. Unlike previous approaches, we provide theoretical guarantees on the expected risk consistency of the method. We also allow for feature selection by imposing structured sparsity using weighted kernels. We propose fully automated methods for selection of all tuning parameters, and in particular adapt kernel shrinkage ideas for ridge parameter selection. Numerical studies demonstrate the superior classification performance of the proposed approach compared to existing nonparametric classifiers. We also propose automatic methods for ridge parameter selection and Gaussian kernel parameter selection. To address the computational challenges of a large sample size, we adapt compression to the classification setting. Sketching, or compression, is a well-studied approach to address sample reduction in regression settings, but considerably less is known about its performance in classification settings. Here we consider the computational issues due to large sample size within the discriminant analysis framework. We propose a new compression approach for reducing the number of training samples for linear and quadratic discriminant analysis, in contrast to existing compression methods which focus on reducing the number of features. We support our approach with a theoretical bound on the misclassification error rate compared to the Bayes classifier. Empirical studies confirm the significant computational gains of the proposed method and its superior predictive ability compared to random sub-sampling.

Rhythmic Representations: Learning Periodic Patterns for Scalable Place Recognition at a Sub-Linear Storage Cost

Author: Jacobson Adam
Milford Michael
Yu Litao
Publication date: 21/12/2017

Robotic and animal mapping systems share many challenges and characteristics: they must function in a wide variety of environmental conditions, enable the robot or animal to navigate effectively to find food or shelter, and be computationally tractable from both a speed and storage perspective. With regard to map storage, the mammalian brain appears to take a diametrically opposed approach to all current robotic mapping systems. Where robotic mapping systems attempt to solve the data association problem to minimise representational aliasing, neurons in the brain intentionally break data association by encoding large (potentially unlimited) numbers of places with a single neuron. In this paper, we propose a novel method based on supervised learning techniques that seeks out regularly repeating visual patterns in the environment with mutually complementary co-prime frequencies, and an encoding scheme that enables storage requirements to grow sub-linearly with the size of the environment being mapped. To improve robustness in challenging real-world environments while maintaining storage growth sub-linearity, we incorporate both multi-exemplar learning and data augmentation techniques. Using large benchmark robotic mapping datasets, we demonstrate the combined system achieving high-performance place recognition with sub-linear storage requirements, and characterize the performance-storage growth trade-off curve. The work serves as the first robotic mapping system with sub-linear storage scaling properties, as well as the first large-scale demonstration in real-world environments of one of the proposed memory benefits of these neurons.

Comment: Pre-print of an article that will appear in IEEE Robotics and Automation Letters.

arXiv.org e-Print Archive
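
The storage argument in the abstract above can be illustrated with plain modular arithmetic. The sketch below is not the paper's learned method: it only shows why co-prime periods allow sub-linear storage, assuming hypothetical periods 7, 11 and 13 and hypothetical helper names `encode` and `decode`. A place index is represented by its phase within each periodic pattern, and the Chinese remainder theorem guarantees that the combination of phases is unique up to the product of the periods.

```python
from itertools import combinations
from math import gcd, prod

def encode(place, periods):
    """Represent a place index by its phase within each periodic pattern."""
    return tuple(place % p for p in periods)

def decode(code, periods):
    """Recover the place index by brute-force search (fine for a small demo)."""
    for place in range(prod(periods)):
        if encode(place, periods) == code:
            return place
    raise ValueError("code does not correspond to any place")

# Hypothetical periods, chosen only because they are pairwise co-prime.
periods = (7, 11, 13)
assert all(gcd(a, b) == 1 for a, b in combinations(periods, 2))

capacity = prod(periods)  # number of distinguishable places
storage = sum(periods)    # number of phase templates that must be stored

codes = {encode(place, periods) for place in range(capacity)}
recovered = decode(encode(500, periods), periods)
```

Here the number of distinguishable places grows with the product of the periods while the number of stored templates grows only with their sum, which is the sense in which storage can scale sub-linearly.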

Dimension-adaptive bounds on compressive FLD Classification

Author: G. Biau
Geoffrey J. McLachlan
N. Halko
R. Vershynin
R.J. Durrant
S. Dasgupta
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Efficient dimensionality reduction by random projections (RP) is gaining popularity, hence the learning guarantees achievable in RP spaces are of great interest. In the finite-dimensional setting, it has been shown for the compressive Fisher Linear Discriminant (FLD) classifier that for good generalisation the required target dimension grows only as the log of the number of classes and is not adversely affected by the number of projected data points. However, these bounds depend on the dimensionality d of the original data space. In this paper we give further guarantees that remove d from the bounds under certain conditions of regularity on the data density structure. In particular, if the data density does not fill the ambient space then the error of compressive FLD is independent of the ambient dimension and depends only on a notion of 'intrinsic dimension'.

CiteSeerX
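
As a rough illustration of the compressive FLD setting described in the entry above, the following sketch projects synthetic two-class Gaussian data through a random Gaussian matrix and fits a Fisher Linear Discriminant in the projected space. Everything here is an assumption made for the example: the data, the dimensions `d`, `k` and `n`, the small ridge term, and the helper names `fit_fld` and `predict_fld`. It is not the authors' experimental setup and it reports training accuracy only.

```python
import numpy as np

def fit_fld(X, y):
    """Fit a two-class Fisher Linear Discriminant; returns weights and bias."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance, with a small ridge for numerical stability.
    centered = np.vstack([X0 - mu0, X1 - mu1])
    sigma = centered.T @ centered / (len(X) - 2)
    sigma += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(sigma, mu1 - mu0)
    b = -0.5 * w @ (mu0 + mu1)
    return w, b

def predict_fld(X, w, b):
    """Assign class 1 where the discriminant score is positive."""
    return (X @ w + b > 0).astype(int)

rng = np.random.default_rng(0)
d, k, n = 200, 10, 400  # ambient dimension, projected dimension, samples per class

# Synthetic data: two unit-covariance Gaussians whose means differ by 1 per coordinate.
X = np.vstack([rng.normal(size=(n, d)), rng.normal(size=(n, d)) + 1.0])
y = np.repeat([0, 1], n)

# Random projection: a d x k Gaussian matrix, scaled by 1/sqrt(k).
R = rng.normal(size=(d, k)) / np.sqrt(k)
Z = X @ R

w, b = fit_fld(Z, y)
accuracy = (predict_fld(Z, w, b) == y).mean()
print(f"training accuracy in the {k}-dimensional projected space: {accuracy:.3f}")
```

The classifier only ever inverts a k-by-k covariance matrix, which is where the computational saving of working in the projected space comes from.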

Detection of internal quality in kiwi with time-domain diffuse reflectance spectroscopy

Author: Cubeddu Rinaldo
Dover Colin
Johnson David
Pifferi Antonio
Ruiz-Altisent Margarita
Taroni Paola
Torricelli Alessandro
Valentini Gianluca
Valero Ubierna Constantino
Publication venue: E.T.S.I. Agrónomos (UPM)
Publication date: 01/01/2004

Time-domain diffuse reflectance spectroscopy (TRS), a medical sensing technique, was used to evaluate internal kiwi fruit quality. The application of this pulsed laser spectroscopic technique was studied as a new, potentially non-destructive method to detect different quality parameters optically: firmness, sugar content, and acidity. The main difference from other spectroscopic techniques is that TRS estimates light absorption and scattering inside the sample separately and simultaneously at each wavelength, allowing simultaneous estimation of firmness and chemical contents. Standard tests (flesh puncture, compression with ball, °Brix, total acidity, skin color) were used as references to build estimative models, using a multivariate statistical approach. Classification functions of the fruits into three groups achieved a performance of 75% correctly classified fruits for firmness, 60% for sugar content, and 97% for acidity. Results demonstrate good potential for this technique to be used in the development of new sensors for non-destructive quality assessment.

Archivio istituzionale della ricerca - Politecnico di Milano

Abeliants and their application to an elementary construction of Jacobians

Author: Anderson Greg W.
Publication date: 04/12/2001

The abeliant is a polynomial rule for producing an $n$ by $n$ matrix with entries in a given ring from an $n$ by $n$ by $n+2$ array of elements of that ring. The theory of abeliants, first introduced in an earlier paper of the author, is redeveloped here in a simpler way. Then this theory is exploited to give an explicit elementary construction of the Jacobian of a nonsingular projective algebraic curve defined over an algebraically closed field. The standard of usefulness and aptness we strive toward is that set by Mumford's elementary construction of the Jacobian of a hyperelliptic curve. This paper has appeared as Advances in Math 172 (2002) 169-205.

arXiv.org e-Print Archive

CiteSeerX

Radar data processing and analysis

Author: Ausherman D.
Larson R.
Liskow C.

Digitized four-channel radar images corresponding to particular areas from the Phoenix and Huntington test sites were generated in conjunction with prior experiments performed to collect X- and L-band synthetic aperture radar imagery of these two areas. The methods for generating this imagery are documented. A secondary objective was the investigation of digital processing techniques for extraction of information from the multiband radar image data. Following the digitization, the remaining resources permitted a preliminary machine analysis to be performed on portions of the radar image data. The results, although necessarily limited, are reported.