8 research outputs found

    Adaptive Kernel Matching Pursuit for Pattern Classification

    A sparse classifier is guaranteed to generalize better than a denser one, given that they perform identically on the training set. However, methods like the Support Vector Machine, even if they produce relatively sparse models, are known to scale linearly as the number of training examples increases. A recently proposed method, the Kernel Matching Pursuit, presents a number of advantages over th

    Some greedy learning algorithms for sparse regression and classification with Mercer kernels

    We present greedy learning algorithms for building sparse nonlinear regression and classification models from observational data using Mercer kernels. Our objective is to develop efficient numerical schemes for reducing the training and runtime complexities of kernel-based algorithms applied to large datasets. In the spirit of Natarajan's greedy algorithm (Natarajan, 1995), we iteratively minimize the L2 loss function subject to a specified constraint on the degree of sparsity required of the final model or until a specified stopping criterion is reached. We discuss various greedy criteria for basis selection and numerical schemes for improving robustness and computational efficiency. Subsequently, algorithms based on residual minimization and thin QR factorization are presented for constructing sparse regression and classification models. During incremental model construction, the algorithms are terminated using model selection principles such as the minimum description length (MDL) and Akaike's information criterion (AIC). Finally, experimental results on benchmark data are presented to demonstrate the competitiveness of the algorithms developed in this paper.
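    The greedy idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it uses a plain matching-pursuit update (no thin QR factorization, no MDL or AIC stopping) and stops on a sparsity limit or a small gain threshold. The names gaussian_kernel and greedy_kernel_fit, the parameters width and tol, and the toy data are all assumptions made for illustration.

```python
import math

def gaussian_kernel(a, b, width=1.0):
    """Gaussian (RBF) kernel between two scalars; one example of a Mercer kernel."""
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))

def greedy_kernel_fit(xs, ys, max_terms, tol=1e-8, width=1.0):
    """Greedily add kernel centres that most reduce the squared (L2) residual.

    Basic matching-pursuit update: each step adds one centre and one weight
    and does not refit the weights chosen earlier.
    Returns (model, loss), where model is a list of (centre, weight) pairs
    and loss is the final sum of squared residuals.
    """
    # Column j holds the kernel centred at xs[j], evaluated at every sample.
    columns = [[gaussian_kernel(x, c, width) for x in xs] for c in xs]
    residual = list(ys)
    model = []
    for _ in range(max_terms):  # sparsity constraint
        best_j, best_w, best_gain = None, 0.0, 0.0
        for j, col in enumerate(columns):
            norm_sq = sum(v * v for v in col)  # >= 1, since k(x, x) = 1
            dot = sum(r * v for r, v in zip(residual, col))
            gain = dot * dot / norm_sq  # drop in squared loss if j is added
            if gain > best_gain:
                best_j, best_w, best_gain = j, dot / norm_sq, gain
        if best_j is None or best_gain < tol:  # stopping criterion
            break
        model.append((xs[best_j], best_w))
        residual = [r - best_w * v for r, v in zip(residual, columns[best_j])]
    return model, sum(r * r for r in residual)

# Toy data generated by a single kernel centred at 1.0 with weight 2.0.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * gaussian_kernel(x, 1.0) for x in xs]
model, loss = greedy_kernel_fit(xs, ys, max_terms=3)
```

    On this toy data a single selected term should explain the targets, so the loop stops before reaching max_terms. A practical variant would refit all selected weights at each step, which is where a QR factorization becomes useful.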

    Some Greedy Learning Algorithms for Sparse Regression and Classification with Mercer Kernels

    We present some greedy learning algorithms for building sparse nonlinear regression and classification models from observational data using Mercer kernels. Our objective is to develop efficient numerical schemes for reducing the training and runtime complexities of kernel-based algorithms applied to large datasets. In the spirit of Natarajan's greedy algorithm (Natarajan, 1995), we iteratively minimize the L2 loss function subject to a specified constraint on the degree of sparsity required of the final model or until a specified stopping criterion is reached. We discuss various greedy criteria for basis selection and numerical schemes for improving robustness and computational efficiency. Subsequently, algorithms based on residual minimization and thin QR factorization are presented for constructing sparse regression and classification models. During incremental model construction, the algorithms are terminated using model selection principles such as the minimum description length (MDL) and Akaike's information criterion (AIC). Finally, experimental results on benchmark data are presented to demonstrate the competitiveness of the algorithms developed in this paper.

    Calibration of Flush Air Data Sensing Systems Using Surrogate Modeling Techniques

    This work addresses the problem of calibrating Flush Air Data Sensing (FADS) systems by solving the inverse problem of extracting freestream wind speed and angle of attack from pressure measurements. The aim was to develop machine learning and statistical tools to optimize the design and calibration of FADS systems. Experimental and Computational Fluid Dynamics (EFD and CFD) solve the forward problem of determining the pressure distribution given the wind velocity profile and bluff body geometry. Three ways are presented in which machine learning techniques can improve the calibration of FADS systems. First, a scattered data approximation scheme called Sequential Function Approximation (SFA) was developed that successfully solved the inverse problem. The proposed scheme is a greedy, self-adaptive technique that constructs reliable and robust estimates without any user interaction. Wind speed and direction prediction algorithms were developed for two FADS problems: one in which pressure sensors are installed on a surface vessel, and another in which sensors are installed on the Runway Assisted Landing Site (RALS) control tower. Second, a Tikhonov regularization based data-model fusion technique with SFA was developed to fuse low-fidelity CFD solutions with noisy and sparse wind tunnel data. The purpose of this data-model fusion approach was to obtain high-fidelity, smooth, and noiseless flow field solutions using only a few discrete experimental measurements and a low-fidelity numerical solution. This physics-based regularization technique gave better flow field solutions than smoothness-based solutions when wind tunnel data were sparse and incomplete. Third, a sequential design strategy was developed with SFA, using Active Learning techniques from machine learning theory and Optimal Design of Experiments from statistics, for regression and classification problems. Uncertainty Sampling was used with SFA to demonstrate the effectiveness of active learning versus passive learning on a cavity flow classification problem. A sequential G-optimal design procedure was also developed with SFA for regression problems. The effectiveness of this approach was demonstrated on a simulated problem and the above-mentioned FADS problem.
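    The uncertainty sampling step mentioned in this abstract can be sketched in a few lines of Python. This is a generic illustration for a binary classifier, not the SFA-based procedure of the thesis: the logistic model predict_proba, its weight and bias, and the candidate pool are hypothetical stand-ins.

```python
import math

def predict_proba(x, weight=2.0, bias=-1.0):
    """Hypothetical binary classifier: logistic model of one scalar feature."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

def most_uncertain(pool, proba=predict_proba):
    """Return the index of the pool point whose predicted class probability
    is closest to 0.5, i.e. the point the classifier is least sure about."""
    return min(range(len(pool)), key=lambda i: abs(proba(pool[i]) - 0.5))

# Unlabeled candidate points (assumed values); the next label is requested
# for the selected one, the model is retrained, and the loop repeats.
pool = [-2.0, 0.0, 0.4, 1.5, 3.0]
query_index = most_uncertain(pool)
```

    With the assumed weight and bias, the decision boundary lies at x = 0.5, so the candidate nearest to it is selected. Passive learning would instead pick the next point at random or in a fixed order.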

    Sparse image approximation with application to flexible image coding

    Natural images are often modeled as piecewise-smooth regions. Region edges, which correspond to the contours of objects, become, in this model, the main information in the signal. Contours are smooth along the direction of the edge and irregular in the perpendicular direction. Modeling edges with the minimum possible number of terms is of key importance for numerous applications, such as image coding, segmentation, and denoising. Standard separable bases fail to provide sufficiently sparse representations of contours because bases of this kind do not see the regularity of edges. To detect this regularity, a new method is needed, based on (possibly redundant) sets of basis functions able to capture the geometry of images. This thesis first presents a study of the features that basis functions should have in order to provide sparse representations of a piecewise-smooth image. This study emphasizes the need for edge-adapted basis functions, capable of accurately capturing the local orientation and anisotropic scaling of image structures. The need for different degrees of anisotropy and orientations in the basis function set leads to the use of redundant dictionaries. However, redundant dictionaries have the drawback that sparse image decompositions are not unique, and of all the possible decompositions of a signal in a redundant dictionary, only the sparsest is needed. Several algorithms can find sparse decompositions over redundant dictionaries, but most of them do not guarantee that the optimal approximation has been recovered. To cope with this problem, a mathematical study of the properties of sparse approximations is performed. From this, a test to check whether a given sparse approximation is the sparsest is provided.
    The second part of this thesis presents a novel image approximation scheme based on a redundant dictionary. This scheme yields a good approximation of an image with a number of terms much smaller than the dimension of the signal. The dictionary is formed by a combination of anisotropically refined and rotated wavelet-like mother functions and Gaussians. An efficient Full Search Matching Pursuit algorithm is designed to perform the image decomposition in such a dictionary. Finally, a geometric image coding scheme is designed, based on the image approximated over the anisotropic and rotated dictionary of basis functions, and the coding performance of this dictionary is studied. Coefficient quantization appears to be of crucial importance in the design of a Matching Pursuit (MP) based coding scheme. Thus, a quantization scheme for the MP coefficients has been designed, based on the theoretical energy upper bound of the MP algorithm and on empirical observations of the coefficient distribution and evolution. Thanks to this quantization, our image coder provides low to medium bit-rate image approximations, while allowing on-the-fly resolution switching and several other affine image transformations to be performed directly in the transformed domain.