Sparse Linear Discriminant Analysis with more Variables than Observations
It is known that classical linear discriminant analysis (LDA) performs classification well when the number of observations is much larger than the number of variables. However, when the number of variables exceeds the number of observations, classical LDA cannot be performed because the within-group covariance matrix is singular. Recently proposed LDA methods that can handle a singular within-group covariance matrix are reviewed. Most of these methods focus on regularizing the within-class covariance matrix, but give less attention to sparsity (variable selection), interpretation and computational cost, which are important in high-dimensional problems. The fact that most of the original variables may be irrelevant or redundant suggests looking for sparse solutions that involve only a small portion of the variables. In the present work, new sparse LDA methods are proposed that are suited to high-dimensional data. The first two methods assume the groups share a common within-group covariance matrix and approximate this matrix by a diagonal matrix; one is a variant of the other that sacrifices some accuracy for greater computational speed. Both methods obtain sparsity by minimizing an l1 norm and maximizing discrimination power under a common loss function with a tuning parameter. The third method assumes that the groups share common eigenvectors in the eigenvector-eigenvalue decompositions of their within-group covariance matrices, while their eigenvalues may differ. The fourth method assumes the within-group covariance matrices are proportional to each other. The fifth method is derived from the Dantzig selector and uses optimal scoring to construct the discriminant functions. The third and fourth methods achieve sparsity by imposing a cardinality constraint, with the cardinality level determined by cross-validation. All the new methods reduce their computation time by sequentially determining individual discriminant functions.
The methods are applied to six real data sets and perform well when compared with two existing methods.
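The diagonal-covariance idea behind the first two methods can be illustrated with a small sketch. This is not the thesis's algorithm: the function names are invented, and the l1 penalty is approximated here by soft-thresholding the standardized class-centroid deviations (a nearest-shrunken-centroids-style shortcut), which is only one simple way to combine a diagonal within-group covariance with sparsity.

```python
import numpy as np

def diagonal_sparse_lda_fit(X, y, shrink=0.3):
    """Toy diagonal-covariance sparse discriminant (illustrative only).
    Class-centroid deviations from the grand mean are standardized and
    soft-thresholded, so irrelevant variables drop out of the rule."""
    classes = np.unique(y)
    grand = X.mean(axis=0)
    var = X.var(axis=0) + 1e-8            # diagonal within-group covariance proxy
    centroids = {}
    for c in classes:
        d = (X[y == c].mean(axis=0) - grand) / np.sqrt(var)
        d = np.sign(d) * np.maximum(np.abs(d) - shrink, 0.0)  # soft threshold
        centroids[c] = grand + d * np.sqrt(var)
    return classes, centroids, var

def diagonal_sparse_lda_predict(X, model):
    """Assign each row to the class with the nearest shrunken centroid
    under the diagonal (variance-scaled) metric."""
    classes, centroids, var = model
    scores = np.stack([((X - centroids[c]) ** 2 / var).sum(axis=1)
                       for c in classes], axis=1)
    return classes[np.argmin(scores, axis=1)]
```

Because the covariance is diagonal, fitting and prediction cost O(np) rather than requiring a p x p matrix inverse, which is the computational appeal in the p >> n setting.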
A VISION-BASED QUALITY INSPECTION SYSTEM FOR FABRIC DEFECT DETECTION AND CLASSIFICATION
Published thesis. Quality inspection of textile products is an important issue for fabric manufacturers. It is desirable to produce the highest quality goods in the shortest amount of time possible. Fabric faults or defects are responsible for nearly 85% of the defects found by the garment industry. Manufacturers recover only 45 to 65% of their profits from second or off-quality goods. There is a need for reliable automated woven fabric inspection methods in the textile industry.
Numerous methods have been proposed for detecting defects in textiles. These methods are generally grouped into three main categories according to the techniques they use for texture feature extraction, namely statistical approaches, spectral approaches and model-based approaches.
In this thesis, we study one method from each category and propose their combinations in order to get improved fabric defect detection and classification accuracy. The three chosen methods are the grey level co-occurrence matrix (GLCM) from the statistical category, the wavelet transform from the spectral category and the Markov random field (MRF) from the model-based category. We identify the most effective texture features for each of those methods and for different fabric types in order to combine them.
Using GLCM, we identify the optimal number of features, the optimal quantisation level of the original image and the optimal intersample distance to use. We identify the optimal GLCM features for different types of fabrics and for three different classifiers.
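The GLCM parameters mentioned above (quantisation level, intersample distance) can be made concrete with a minimal from-scratch implementation. This is a generic textbook GLCM, not the thesis's tuned configuration; the feature subset shown (contrast, energy, homogeneity) is just a common choice.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Grey level co-occurrence matrix for one offset (dx, dy).
    `levels` is the quantisation level; (dx, dy) the intersample distance.
    Assumes a non-negative integer-valued image."""
    q = np.floor(img.astype(float) * levels / (img.max() + 1)).astype(int)
    P = np.zeros((levels, levels))
    h, w = q.shape
    for yy in range(h - dy):
        for xx in range(w - dx):
            P[q[yy, xx], q[yy + dy, xx + dx]] += 1
    return P / P.sum()

def glcm_features(P):
    """Three classic Haralick-style features computed from a GLCM."""
    i, j = np.indices(P.shape)
    return {
        "contrast": ((i - j) ** 2 * P).sum(),
        "energy": (P ** 2).sum(),
        "homogeneity": (P / (1.0 + np.abs(i - j))).sum(),
    }
```

On a binary checkerboard with a horizontal offset of one pixel, every neighbour pair differs by one grey level, so the contrast is exactly 1 and the energy 0.5; that kind of sanity check is useful when tuning the quantisation level.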
Using the wavelet transform, we compare the defect detection and classification performance of features derived from the undecimated discrete wavelet transform and those derived from the dual-tree complex wavelet transform. We identify the best features for different types of fabrics.
Using the Markov random field, we study the fabric defect detection and classification performance of features derived from Gaussian Markov random field models of orders 1 through 9. For each fabric type we identify the best model order.
Finally, we propose three combination schemes for the best features identified from the three methods and study their fabric defect detection and classification performance. They generally lead to improved performance compared to the individual methods, but two of them need further improvement.
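The abstract does not specify the combination schemes; the simplest plausible one is feature-level concatenation with per-feature normalization so that no method's feature scale dominates. The sketch below uses that scheme with placeholder extractors and a k-nearest-neighbour classifier; all function names are invented for illustration.

```python
import numpy as np

def combine_features(patch, extractors):
    """Concatenate feature vectors from several texture methods
    (stand-ins for GLCM, wavelet and MRF features) into one descriptor."""
    return np.concatenate([np.atleast_1d(f(patch)) for f in extractors])

def fit_combined_classifier(patches, labels, extractors, k=3):
    """Fit a k-NN defect classifier on z-scored combined features.
    Z-scoring keeps large-scale features from swamping small-scale ones."""
    F = np.stack([combine_features(p, extractors) for p in patches])
    mu, sd = F.mean(axis=0), F.std(axis=0) + 1e-8
    Fz = (F - mu) / sd
    labels = np.asarray(labels)

    def predict(patch):
        z = (combine_features(patch, extractors) - mu) / sd
        d = np.linalg.norm(Fz - z, axis=1)
        nearest = labels[np.argsort(d)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        return vals[np.argmax(counts)]      # majority vote of k neighbours

    return predict
```

In practice the extractors would return GLCM, wavelet-subband and GMRF feature vectors respectively; any classifier could replace the k-NN.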
Penalized Orthogonal Iteration for Sparse Estimation of Generalized Eigenvalue Problem
We propose a new algorithm for sparse estimation of eigenvectors in generalized eigenvalue problems (GEP). The GEP arises in a number of modern data-analytic situations and statistical methods, including principal component analysis (PCA), multiclass linear discriminant analysis (LDA), canonical correlation analysis (CCA), sufficient dimension reduction (SDR) and invariant coordinate selection. We propose to modify the standard generalized orthogonal iteration with a sparsity-inducing penalty for the eigenvectors. To achieve this goal, we generalize the equation-solving step of orthogonal iteration to a penalized convex optimization problem. The resulting algorithm, called penalized orthogonal iteration, provides accurate estimation of the true eigenspace when it is sparse. Also proposed is a computationally more efficient alternative, which works well for PCA and LDA problems. Numerical studies reveal that the proposed algorithms are competitive and that our tuning procedure works well. We demonstrate applications of the proposed algorithm to obtain sparse estimates for PCA, multiclass LDA, CCA and SDR. Supplementary materials are available online.
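The overall shape of the iteration can be sketched as follows. This is a simplification, not the paper's algorithm: where the paper solves a penalized convex program at each step, this sketch approximates that step by a plain linear solve followed by soft-thresholding, for the GEP A v = lambda B v.

```python
import numpy as np

def penalized_orthogonal_iteration(A, B, k, lam=0.1, iters=50):
    """Sketch of sparse generalized eigenvector estimation for A v = lam B v.
    Each sweep does a generalized power step, a crude l1-style shrinkage
    standing in for the penalized convex solve, and re-orthonormalization."""
    p = A.shape[0]
    rng = np.random.default_rng(0)
    V = np.linalg.qr(rng.standard_normal((p, k)))[0]   # random orthonormal start
    for _ in range(iters):
        W = np.linalg.solve(B, A @ V)                  # power step: B^{-1} A V
        W = np.sign(W) * np.maximum(np.abs(W) - lam, 0.0)  # sparsity shrinkage
        V, _ = np.linalg.qr(W)                         # re-orthonormalize
    return V
```

When the leading eigenvectors are supported on a few coordinates, the shrinkage step zeroes the remaining rows and the iteration recovers a sparse basis of the leading eigenspace.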
Interactive game for the training of Portuguese vowels
Integrated master's thesis. Electrical and Computer Engineering. Faculdade de Engenharia, Universidade do Porto. 200
Neighborhood analysis methods in acoustic modeling for automatic speech recognition
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 121-134). This thesis investigates the problem of using nearest-neighbor based non-parametric methods for multi-class class-conditional probability estimation. The methods developed are applied to the problem of acoustic modeling for speech recognition. Neighborhood components analysis (NCA) (Goldberger et al. [2005]) serves as the departure point for this study. NCA is a non-parametric method that can be seen as providing two things: (1) low-dimensional linear projections of the feature space that allow nearest-neighbor algorithms to perform well, and (2) nearest-neighbor based class-conditional probability estimates. First, NCA is used to perform dimensionality reduction on acoustic vectors, a commonly addressed problem in speech recognition. NCA is shown to perform competitively with another commonly employed dimensionality reduction technique in speech known as heteroscedastic linear discriminant analysis (HLDA) (Kumar [1997]). Second, a nearest-neighbor based model related to NCA is created to provide a class-conditional estimate that is sensitive to the possible underlying relationship between the acoustic-phonetic labels. An embedding of the labels is learned that can be used to estimate the similarity or confusability between labels. This embedding is related to the concept of error-correcting output codes (ECOC), and the proposed model is therefore referred to as NCA-ECOC. The estimates provided by this method, along with nearest-neighbor information, are shown to provide improvements in speech recognition performance (2.5% relative reduction in word error rate). Third, a model for calculating class-conditional probability estimates is proposed that generalizes GMM, NCA, and kernel density approaches.
This model, called locally-adaptive neighborhood components analysis (LA-NCA), learns different low-dimensional projections for different parts of the space. The model exploits the fact that in different parts of the space different directions may be important for discrimination between the classes. This model is computationally intensive and prone to over-fitting, so methods for sub-selecting the neighbors used to provide the class-conditional estimates are explored. The estimates provided by LA-NCA are shown to give significant gains in speech recognition performance (7-8% relative reduction in word error rate) as well as in phonetic classification. By Natasha Singh-Miller. Ph.D.
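The kind of nearest-neighbor class-conditional estimate that NCA builds on can be written in a few lines. This sketch uses a fixed Euclidean metric with soft (exponentially weighted) neighbours; NCA's contribution, not shown here, is learning the linear projection that makes this estimate accurate.

```python
import numpy as np

def knn_class_posterior(x, X, y, classes, tau=1.0):
    """Kernel-weighted nearest-neighbour estimate of p(class | x).
    Each training point votes for its class with weight exp(-d^2 / tau),
    the soft-neighbour form used in NCA (here with an unlearned metric)."""
    d2 = ((X - x) ** 2).sum(axis=1)        # squared distances to all points
    w = np.exp(-d2 / tau)                  # soft neighbour weights
    probs = np.array([w[y == c].sum() for c in classes])
    return probs / probs.sum()             # normalize to a distribution
```

Replacing `x` and `X` by `A @ x` and `X @ A.T` for a learned projection matrix `A`, and optimizing `A` to maximize these posteriors on held-out labels, is essentially the NCA training objective.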
DYNAMIC SIMULATION AND ANALYSIS OF GAIT MODIFICATION FOR TREATING KNEE OSTEOARTHRITIS
Roughly 47.5 million people in the US have a disability, with 8.6 million reporting arthritis as their main cause of disability, making arthritis the leading cause of physical disability. With decreased mortality rates and a large, aging baby boomer generation, there will be more adults living with chronic musculoskeletal conditions causing disabilities that limit walking. Since walking ability is directly related to an individual's independence at home and in the community, losing this ability is a major setback for patients with arthritis. Knee osteoarthritis (OA) is the most prevalent form of arthritis, affecting approximately 27 million adults and accounting for over 55% of all arthritis-related hospital admissions. OA is a highly painful disease with treatments limited to pain management. However, gait modification has recently shown promise as an early intervention treatment strategy to slow disease progression. Thus, the objective of this dissertation is to investigate subject-specific gait modifications to minimize joint loads for treating patients with knee OA.
The first study in this dissertation relies heavily on the development of subject-specific musculoskeletal models to analyze muscle forces and joint contact loads during toe-in gait modification for subjects with knee OA. This study will generate muscle-actuated, dynamic simulations to estimate muscle forces and internal joint contact loads during gait. The results of this study will aid in the advancement of gait modification as a treatment strategy for knee OA. The last two studies will employ machine learning and optimization techniques, specifically forward sequential feature selection and surrogate-based optimization, to evaluate toe-in gait modification and improve its efficacy for use as a treatment strategy for knee OA. The goal will be to develop testable subject-specific gait modification patterns that reduce joint loads.
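Forward sequential feature selection, one of the two techniques named above, is a simple greedy loop. The sketch below is a generic version, not the dissertation's implementation; the scoring function is left pluggable (in the dissertation's setting it would be the predictive quality of a surrogate model of joint load).

```python
import numpy as np

def forward_feature_selection(X, y, score_fn, max_features=5):
    """Greedy forward selection: repeatedly add the feature whose
    inclusion most improves score_fn(X[:, subset], y); stop when no
    candidate improves the score or max_features is reached."""
    selected, remaining = [], list(range(X.shape[1]))
    best_score = -np.inf
    while remaining and len(selected) < max_features:
        scores = [(score_fn(X[:, selected + [j]], y), j) for j in remaining]
        s, j = max(scores)                 # best candidate this round
        if s <= best_score:
            break                          # no feature improves the score
        best_score = s
        selected.append(j)
        remaining.remove(j)
    return selected
```

With d features and a budget of m, this evaluates O(d * m) candidate subsets, versus 2^d for exhaustive search, which is what makes it practical for gait-parameter screening.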
The use of both dynamic simulations and data mining techniques provides a unique approach to investigating the relationship between joint biomechanics, muscle function and joint contact loads with respect to gait modification. This approach has the potential to provide much-needed insight into the underlying mechanism of gait modification and help advance research to create subject-specific gait modification patterns for treating knee OA in the future.
Context-dependent fusion with application to landmine detection.
Traditional machine learning and pattern recognition systems use a feature descriptor to describe the sensor data and a particular classifier (also called an expert or learner) to determine the true class of a given pattern. However, for complex detection and classification problems involving data with large intra-class variations and noisy inputs, no single source of information can provide a satisfactory solution. As a result, the combination of multiple classifiers is playing an increasing role in solving these complex pattern recognition problems and has proven to be a viable alternative to using a single classifier. In this thesis we introduce a new Context-Dependent Fusion (CDF) approach. We use this method to fuse multiple algorithms that use different types of features and different classification methods on multiple sensor data. The proposed approach is motivated by the observation that no single algorithm can consistently outperform all others. In fact, the relative performance of different algorithms can vary significantly depending on several factors, such as the extracted features and the characteristics of the target class. The CDF method is a local approach that adapts the fusion method to different regions of the feature space. The goal is to take advantage of the strengths of a few algorithms in different regions of the feature space without being affected by the weaknesses of the other algorithms, while also avoiding the loss of potentially valuable information provided by a few weak classifiers by considering their output as well. The proposed fusion has three main interacting components. The first component, called Context Extraction, partitions the composite feature space into groups of similar signatures, or contexts. The second component then assigns an aggregation weight to each detector's decision in each context based on its relative performance within the context.
The third component combines the multiple decisions, using the learned weights, to make a final decision. For the Context Extraction component, a novel algorithm that performs simultaneous clustering and feature discrimination is used to cluster the composite feature space and identify the relevant features for each cluster. For the fusion component, six different methods were proposed and investigated. The proposed approaches were applied to the problem of landmine detection. Detection and removal of landmines is a serious problem affecting civilians and soldiers worldwide. Several landmine detection algorithms have been proposed. Extensive testing of these methods has shown that the relative performance of different detectors can vary significantly depending on the mine type, geographical site, soil and weather conditions, burial depth, etc. Therefore, multi-algorithm and multi-sensor fusion is a critical component in landmine detection. Results on large and diverse real data collections show that the proposed method can identify meaningful and coherent clusters and that different expert algorithms can be identified for the different contexts. Our experiments have also indicated that the context-dependent fusion outperforms all individual detectors and several global fusion methods.
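The three CDF components can be sketched end to end. This is a deliberately reduced version: plain k-means stands in for the thesis's joint clustering-and-feature-discrimination algorithm, and accuracy-proportional weighting stands in for the six fusion methods actually investigated; all names are illustrative.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means; a stand-in for the Context Extraction clustering."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(lab == j):
                C[j] = X[lab == j].mean(axis=0)
    return C

def fit_cdf_weights(X, y, detections, k=2):
    """Per-context aggregation weights: each detector's weight in a context
    is proportional to its accuracy on the training points in that context.
    `detections` is (n_samples, n_detectors) of binary decisions."""
    C = kmeans(X, k)
    lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
    W = np.zeros((k, detections.shape[1]))
    for j in range(k):
        acc = (detections[lab == j] == y[lab == j, None]).mean(axis=0)
        W[j] = acc / (acc.sum() + 1e-12)   # normalize weights per context
    return C, W

def cdf_predict(x, det_out, C, W):
    """Final decision: weighted vote of the detectors' binary outputs,
    using the weights of the context nearest to x."""
    j = np.argmin(((C - x) ** 2).sum(axis=1))
    return int((W[j] * det_out).sum() >= 0.5)
```

The point of the local weighting is visible in the structure: a detector that is strong in one context (e.g. one soil type or burial depth) but weak elsewhere keeps a high weight only where it earns it.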