88 research outputs found

    Mental state estimation for brain-computer interfaces

    Mental state estimation is potentially useful for the development of asynchronous brain-computer interfaces. In this study, four mental states were identified and decoded from the electrocorticograms (ECoGs) of six epileptic patients engaged in a memory reach task. A novel signal analysis technique was applied to the high-dimensional, statistically sparse ECoGs recorded by a large number of electrodes. The strength of the proposed technique lies in its ability to jointly extract the spatial and temporal patterns that encode differences between mental states. As such, the technique offers a systematic way of analyzing the spatiotemporal aspects of brain information processing and may be applicable to a wide range of spatiotemporal neurophysiological signals.
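
    The abstract does not name the analysis technique, so the following is a purely illustrative sketch rather than the authors' method: one generic way to jointly extract spatial and temporal patterns from a multichannel recording is a singular value decomposition of the channel-by-time data matrix. All shapes and the random stand-in data below are assumptions.

```python
import numpy as np

# Illustration only: the paper's technique is unspecified. An SVD of the
# channel x time matrix is one common way to obtain coupled spatial and
# temporal patterns from multichannel neural data.
rng = np.random.default_rng(0)
n_channels, n_samples = 64, 500                   # hypothetical ECoG grid and window
X = rng.standard_normal((n_channels, n_samples))  # stand-in for one condition's data

U, s, Vt = np.linalg.svd(X - X.mean(axis=1, keepdims=True), full_matrices=False)
spatial_patterns = U[:, :3]    # leading spatial weight maps over electrodes
temporal_patterns = Vt[:3, :]  # matching temporal courses
print(spatial_patterns.shape, temporal_patterns.shape)  # (64, 3) (3, 500)
```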

    PARAMETRIC LINK MODELS FOR KNOWLEDGE TRANSFER IN STATISTICAL LEARNING

    When a statistical model is designed for prediction, a major assumption is the absence of evolution in the modeled phenomenon between the training and prediction stages. Thus, training and future data must lie in the same feature space and have the same distribution. Unfortunately, this assumption often turns out to be false in real-world applications. For instance, biological motivations could lead to classifying individuals from a given species when only individuals from another species are available for training. In regression, we may sometimes use a predictive model on data whose distribution is not exactly that of the training data used to estimate the model. This chapter presents techniques for transferring a statistical model estimated from a source population to a target population. Three statistical learning tasks are considered: probabilistic classification (parametric and semi-parametric), linear regression (including mixtures of regressions), and model-based clustering (Gaussian and Student). In each situation, the knowledge transfer is carried out by introducing parametric links between the two populations. The use of such transfer techniques can improve learning performance by avoiding expensive data-labeling efforts.
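
    As a minimal sketch of the parametric-link idea for linear regression: the toy link below (a single scalar relating target to source regression coefficients) is our invention for illustration and not necessarily one of the chapter's link families.

```python
import numpy as np

# Toy parametric-link transfer for linear regression: assume (for
# illustration only) beta_tgt = lam * beta_src, so a single link
# parameter lam is estimated from a handful of labeled target samples.
rng = np.random.default_rng(1)
beta_src = np.array([2.0, -1.0, 0.5])

X_src = rng.standard_normal((500, 3))              # plentiful source data
y_src = X_src @ beta_src + 0.1 * rng.standard_normal(500)
beta_hat_src = np.linalg.lstsq(X_src, y_src, rcond=None)[0]

X_tgt = rng.standard_normal((10, 3))               # scarce labeled target data
y_tgt = X_tgt @ (1.7 * beta_src) + 0.1 * rng.standard_normal(10)

# One-dimensional least squares for the link parameter lam.
z = X_tgt @ beta_hat_src
lam = (z @ y_tgt) / (z @ z)
beta_hat_tgt = lam * beta_hat_src
print(f"estimated link lam = {lam:.2f}")           # close to the true 1.7
```

    With abundant source data and only ten labeled target points, estimating the single link parameter is far cheaper than re-estimating all regression coefficients from scratch, which is the economy of labeling effort the chapter argues for.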

    The effectiveness of features in pattern recognition


    Sparse Modeling for Image and Vision Processing

    In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. The corresponding tools have subsequently been widely adopted by several scientific communities, such as neuroscience, bioinformatics, and computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. (205 pages; to appear in Foundations and Trends in Computer Graphics and Vision.)
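
    A small sketch of the sparse-coding pipeline described above, with random data standing in for image patches; scikit-learn and every parameter value here are our assumptions, not tools named by the monograph.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

# Random vectors stand in for 8x8 image patches.
rng = np.random.default_rng(0)
patches = rng.standard_normal((1000, 64))

# Learn a dictionary adapted to the data ...
dico = MiniBatchDictionaryLearning(n_components=32, alpha=1.0, random_state=0)
dico.fit(patches)

# ... then code each patch as a linear combination of a few dictionary atoms.
codes = sparse_encode(patches, dico.components_,
                      algorithm="omp", n_nonzero_coefs=5)
print(codes.shape, (codes != 0).sum(axis=1).mean())  # at most 5 active atoms each
```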

    Maximum Likelihood Pitch Estimation Using Sinusoidal Modeling

    The aim of the work presented in this thesis is to automatically extract the fundamental frequency of a periodic signal from noisy observations, a task commonly referred to as pitch estimation. An algorithm for optimal pitch estimation using a maximum likelihood formulation is presented. The speech waveform is modeled using sinusoidal basis functions that are harmonically tied together to explicitly capture the periodic structure of voiced speech. The problem of pitch estimation is cast as a model selection problem, and the Akaike Information Criterion is used to estimate the pitch. The algorithm is compared with several existing pitch detection algorithms (PDAs) on a reference pitch database, and the results indicate its superior performance in comparison with most of the PDAs. The application of parametric modeling to single-channel speech segregation and the use of mel-frequency cepstral coefficients for sequential grouping are analyzed on the speech separation challenge database.
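
    A toy rendering of this approach, under assumed details that the abstract does not specify (a grid search over candidate pitches, a fixed number of harmonics, and a white-noise residual): for each candidate f0, a harmonic sum of sinusoids is fitted by least squares, and AIC trades residual fit against the number of parameters.

```python
import numpy as np

# Synthetic "voiced" frame: three harmonics of 210 Hz plus noise.
fs, n, f0_true = 8000, 400, 210.0
t = np.arange(n) / fs
rng = np.random.default_rng(0)
x = sum(np.cos(2 * np.pi * k * f0_true * t) / k for k in (1, 2, 3))
x = x + 0.1 * rng.standard_normal(n)

def aic_for_pitch(f0, n_harm=3):
    # Harmonic basis: cos/sin at k*f0 for k = 1..n_harm, fitted by least squares.
    cols = [f(2 * np.pi * k * f0 * t) for k in range(1, n_harm + 1)
            for f in (np.cos, np.sin)]
    H = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(H, x, rcond=None)
    rss = np.sum((x - H @ coef) ** 2)
    k_params = H.shape[1] + 1                  # sinusoid weights + noise variance
    return n * np.log(rss / n) + 2 * k_params  # AIC up to an additive constant

candidates = np.arange(80.0, 400.0, 1.0)
f0_hat = candidates[np.argmin([aic_for_pitch(f) for f in candidates])]
print(f0_hat)  # close to 210 Hz
```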

    Explainable AI for Machine Fault Diagnosis: Understanding Features' Contribution in Machine Learning Models for Industrial Condition Monitoring

    Although the effectiveness of machine learning (ML) for machine diagnosis has been widely established, the interpretation of the diagnosis outcomes is still an open issue. Machine learning models behave as black boxes; therefore, the contribution of each selected feature to the diagnosis is not transparent to the user. This work investigates the capability of SHapley Additive exPlanations (SHAP) to identify the most important features for fault detection and classification in condition monitoring programs for rotating machinery. The authors analyse the case of medium-sized bearings of industrial interest: vibration data were collected for different health states from the test rig for industrial bearings available at the Mechanical Engineering Laboratory of Politecnico di Torino. Support Vector Machine (SVM) and k-Nearest Neighbour (kNN) diagnosis models are explained by means of SHAP. Accuracies higher than 98.5% are achieved for both models when SHAP is used as a criterion for feature selection. It is found that the skewness and the shape factor of the vibration signal have the greatest impact on the models' outcomes.
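
    A hedged sketch of this workflow with synthetic data standing in for the bearing vibration features; the feature names, dataset, and model settings below are assumptions chosen to mirror the abstract, not the paper's setup.

```python
import numpy as np
import shap
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Hypothetical statistical indicators of the vibration signal.
feature_names = ["rms", "kurtosis", "skewness", "crest_factor", "shape_factor"]
X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svc = SVC(kernel="rbf").fit(X_tr, y_tr)  # binary healthy/faulty stand-in

# Model-agnostic Shapley estimates of each feature's contribution to the
# SVM decision function.
explainer = shap.KernelExplainer(svc.decision_function, shap.sample(X_tr, 50))
shap_values = explainer.shap_values(X_te[:20])

importance = np.abs(shap_values).mean(axis=0)  # mean |SHAP| per feature
for name, imp in sorted(zip(feature_names, importance), key=lambda p: -p[1]):
    print(f"{name:14s} {imp:.3f}")
```

    Ranking features by mean absolute SHAP value and retraining on the top-ranked ones mirrors the paper's use of SHAP as a feature selection criterion.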

    Instance selection with threshold clustering for support vector machines

    Doctor of Philosophy, Department of Statistics, Michael J. Higgins.

    Tremendous advances in computing power have allowed the size of datasets to grow massively, and many machine learning approaches have been developed to deal with massive data by reducing the number of features, observations, or both. Instance selection (IS) is a data mining process that scales down the number of observations in a dataset. In this research, we focus on IS methods that rely on clustering algorithms, particularly threshold clustering (TC). TC is a recent, efficient clustering method: given a fixed size threshold t*, it forms clusters of t* or more units while ensuring that the maximum within-cluster dissimilarity is small. Unlike most traditional clustering methods, TC is designed to form many small clusters of units, making it ideal for IS.

    Support vector machines (SVM) are a powerful method for classification. However, training an SVM may be computationally infeasible for large datasets, since it requires O(N^3) runtime, where N is the size of the training data. In this dissertation, we propose an IS method for training SVM in big data settings, called support vector machines with threshold clustering (SVMTC). Our method begins by clustering each class in the training set separately using TC; the centroids of all clusters then form the reduced set. If the data reduction is insufficient, TC may be repeated. SVM is then applied to the reduced dataset. In this way, the proposed method can shrink the training set for SVM by a factor of (t*)^r or more, where r is the number of iterations of TC, dramatically reducing the runtime required to train SVM. Furthermore, we prove, under the Gaussian radial basis kernel, that the maximum distance between the Gram matrix for the original data, which is used to find support vectors, and the Gram matrix for the reduced data is bounded by a function of the maximum within-cluster distance for TC. We then show, via simulation and application to datasets, that SVMTC efficiently reduces the size of training sets without sacrificing the prediction accuracy of SVM. Moreover, it often outperforms competing IS methods in terms of runtime, memory usage, and prediction accuracy.

    Next, we explore best practices for applying feature reduction methods within SVMTC when the number of features is large. We investigate the usefulness of various feature selection and feature extraction methods, including principal component analysis (PCA), linear discriminant analysis (LDA), the LASSO, and Fisher scores, as an initial step of SVMTC. For feature reduction methods that select a linear combination of the original features, for example PCA, we also investigate forming prototypes using either the original features or the transformed features. Comparing the performance of SVMTC under these feature reduction methods on several datasets, we find that the LASSO tends to be an effective feature selection method and, overall, that SVMTC improves significantly under the proposed methods.

    Finally, we perform a comparative study of iterative threshold instance selection (ITIS), a recent extension of TC used for IS, against competing IS methods. Simulations illustrate that ITIS is effective in massive data settings when compared against other instance selection methods such as k-means and its variations. In addition, we demonstrate the efficacy of hybrid clustering algorithms that use ITIS as an initial step, and show via simulation that these methods outperform other hybrid clustering methods in terms of runtime and memory without sacrificing performance.
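
    A rough sketch of the SVMTC idea follows. Note the substitution: threshold clustering is approximated here by k-means with k chosen so clusters average about t* points, which does not reproduce TC's guarantee on the maximum within-cluster dissimilarity.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

def reduce_by_clustering(X, y, t_star=8):
    # Cluster each class separately; cluster centroids form the reduced set.
    X_red, y_red = [], []
    for label in np.unique(y):
        Xc = X[y == label]
        k = max(1, len(Xc) // t_star)            # roughly t_star points per cluster
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xc)
        X_red.append(km.cluster_centers_)
        y_red.append(np.full(k, label))
    return np.vstack(X_red), np.concatenate(y_red)

X, y = make_classification(n_samples=4000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

X_red, y_red = reduce_by_clustering(X_tr, y_tr, t_star=8)  # ~8x fewer points
svm = SVC(kernel="rbf").fit(X_red, y_red)                  # train on the reduced set
print(len(X_tr), "->", len(X_red), "accuracy:", svm.score(X_te, y_te))
```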

    Visual scene recognition with biologically relevant generative models

    This research focuses on developing visual object categorization methodologies based on machine learning techniques and biologically inspired generative models of visual scene recognition. Modelling the statistical variability of visual patterns, in the space of features extracted from them by an appropriate low-level signal processing technique, is an important subject of investigation for both humans and machines. To study this problem, we examine in detail two recent probabilistic models of vision: the simple multivariate Gaussian model suggested by Karklin and Lewicki (2009) and the restricted Boltzmann machine (RBM) proposed by Hinton (2002). Both models have been widely used for visual object classification and scene analysis tasks. This research highlights that these models on their own are not adequate for the classification task, and proposes the Fisher kernel as a means of endowing them with discriminative power. Our empirical results on standard benchmark datasets reveal that the classification performance of these generative models can be boosted close to state-of-the-art performance by drawing a Fisher kernel from compact generative models, computing the data labels in a fraction of the total computation time. We compare the proposed technique with other distance-based and kernel-based classifiers to show how computationally efficient the Fisher kernels are. To the best of our knowledge, a Fisher kernel has not previously been drawn from the RBM, so the work presented in this thesis is novel in both its idea and its application to vision problems.
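
    A minimal illustration of the Fisher-kernel recipe for the simpler of the two generative models, a diagonal Gaussian; the RBM case, which is the thesis's novelty, requires gradients of the RBM log-likelihood and is not reproduced here. The dataset and classifier are stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

def fisher_scores(X, mu, var):
    # Gradient of log N(x; mu, diag(var)) with respect to mu: (x - mu) / var.
    return (X - mu) / var

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit one generative (Gaussian) model to the training pool, ignoring labels.
mu, var = X_tr.mean(axis=0), X_tr.var(axis=0) + 1e-6

# Fisher scores become the feature vectors of a discriminative classifier.
phi_tr, phi_te = fisher_scores(X_tr, mu, var), fisher_scores(X_te, mu, var)
clf = LogisticRegression(max_iter=1000).fit(phi_tr, y_tr)
print("accuracy:", clf.score(phi_te, y_te))
```

    A linear classifier on Fisher scores corresponds to a Fisher kernel up to normalization by the Fisher information matrix, which this sketch omits for brevity.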

    Data-independent vs. data-dependent dimension reduction for pattern recognition in high dimensional spaces

    There has been a rapid emergence of new pattern recognition/classification techniques in a variety of real-world applications over the last few decades. In most pattern recognition/classification applications, the pattern of interest is modelled by a data vector/array of very high dimension. The main challenges in such applications relate to the efficiency of retrieving, analyzing, and verifying/classifying the pattern/object of interest. The "curse of dimensionality" refers to these challenges, which are commonly addressed by Dimension Reduction (DR) techniques. Several DR techniques have been developed and implemented in a variety of applications. The most common DR schemes depend on a dataset of "typical samples" (e.g. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA)). However, data-independent DR schemes (e.g. the Discrete Wavelet Transform (DWT) and Random Projections (RP)) are becoming more desirable when the ratio of samples to dimension is low. In this thesis, we critically review both types of techniques and highlight their advantages and disadvantages in terms of efficiency and impact on recognition accuracy. We study the theoretical justification for the existence of DR transforms that preserve, within a tolerable error, the distances between feature vectors modelling objects of interest. We observe that data-dependent DRs do not specifically attempt to preserve distances, and that the problems of overfitting and bias are consequences of a low ratio of samples to dimension. Accordingly, our investigations focus on data-independent DR schemes, and in particular on the different ways of generating RPs as an efficient DR tool. RPs suitable for pattern recognition applications are restricted only by a lower bound on the reduced dimension that depends on the tolerable error. Besides the known RPs generated according to certain probability distributions, we investigate and test the performance of differently constructed over-complete m x n (m << n) Hadamard submatrices, using the inductive Sylvester and Walsh-Paley methods. Our experimental work on two case studies, Speech Emotion Recognition (SER) and Gait-based Gender Classification (GBGC), demonstrates that these matrices perform as well as, if not better than, data-dependent DR schemes. Moreover, dictionaries obtained by sampling the top rows of Walsh-Paley matrices outperform more randomly constructed matrices, although this may be influenced by the type of biometric and/or recognition scheme. We also propose feature-block (FB) based DR as an innovative way to overcome the problem of low sample-to-dimension ratios and demonstrate its success on the SER case study.
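
    A small sketch of data-independent dimension reduction with a Hadamard submatrix built by the inductive Sylvester method; the row-selection and scaling conventions below are illustrative choices, not the thesis's exact recipe.

```python
import numpy as np

def sylvester_hadamard(n):
    # Sylvester construction (n a power of two): H_{2m} = [[H, H], [H, -H]].
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n, m = 256, 32                     # original and reduced dimensions (m << n)
H = sylvester_hadamard(n)
P = H[:m] / np.sqrt(m)             # m x n projection from the top rows

# Pairwise distances are roughly preserved for typical high-dimensional data.
rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
print(np.linalg.norm(x1 - x2))           # original distance
print(np.linalg.norm(P @ x1 - P @ x2))   # approximately the same after projection
```

    Because the rows of a Hadamard matrix are orthogonal with entries of equal magnitude, such deterministic submatrices can serve the same distance-preserving role as randomly drawn projections, subject to the lower bound on m discussed above.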