44 research outputs found

    Experimental Evaluation of Latent Variable Models for Dimensionality Reduction

    We use electropalatographic (EPG) data as a test bed for dimensionality reduction methods based on latent variable modelling, in which an underlying lower-dimensional representation is inferred directly from the data. Several models (and mixtures of them) are investigated, including factor analysis and the generative topographic mapping. Experiments indicate that nonlinear latent variable modelling reveals a low-dimensional structure in the data that is inaccessible to the linear model investigated.

    Feature subset selection and ranking for data dimensionality reduction

    A new unsupervised forward orthogonal search (FOS) algorithm is introduced for feature selection and ranking. In the new algorithm, features are selected stepwise, one at a time, by estimating the capability of each candidate feature subset to represent the overall features in the measurement space. A squared correlation function is employed as the criterion for measuring the dependency between features, which makes the new algorithm easy to implement. The forward orthogonalization strategy, which combines good effectiveness with high efficiency, enables the new algorithm to produce efficient feature subsets with a clear physical interpretation.
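
    The stepwise selection described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the mean-centring step, the average squared correlation as the selection score, and the numerical tolerances are all assumptions made for the example.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def center(col):
    """Subtract the mean so squared correlation matches the Pearson form."""
    mean = sum(col) / len(col)
    return [x - mean for x in col]


def squared_correlation(u, v):
    """(u.v)^2 / ((u.u)(v.v)); returns 0.0 if either vector is (near) zero."""
    denom = dot(u, u) * dot(v, v)
    return dot(u, v) ** 2 / denom if denom > 1e-12 else 0.0


def forward_orthogonal_search(columns, n_select):
    """Greedy forward selection (assumed reading of the abstract).

    columns: list of feature columns, each a list of numbers.
    Returns the indices of the selected features, in selection order.
    """
    feats = [center(c) for c in columns]
    selected, basis = [], []  # basis holds orthogonalised selected features
    candidates = list(range(len(feats)))
    for _ in range(n_select):
        best = None  # (score, index, orthogonalised vector)
        for j in candidates:
            # Gram-Schmidt: remove what the selected features already explain.
            q = feats[j]
            for b in basis:
                coef = dot(q, b) / dot(b, b)
                q = [qi - coef * bi for qi, bi in zip(q, b)]
            if dot(q, q) < 1e-10:  # candidate is redundant, skip it
                continue
            # Score: average squared correlation with every original feature.
            score = sum(squared_correlation(f, q) for f in feats) / len(feats)
            if best is None or score > best[0]:
                best = (score, j, q)
        if best is None:  # nothing left that adds information
            break
        selected.append(best[1])
        basis.append(best[2])
        candidates.remove(best[1])
    return selected
```

    For example, given two collinear columns and a third column uncorrelated with them, `forward_orthogonal_search([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [1, -1, 1, -1, 1]], 3)` stops after two selections, because the second collinear column carries no new information once the first has been chosen.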

    A multiple sequential orthogonal least squares algorithm for feature ranking and subset selection

    High-dimensional data analysis involving a large number of variables or features is commonly encountered in multiple regression and multivariate pattern recognition. It has been noted that in many cases not all the original variables are necessary for characterizing the overall features; more often, only a small subset of significant variables is required. Detecting the significant variables in a library consisting of all the original variables is therefore a key and challenging step in dimensionality reduction. Principal component analysis is a useful tool for dimensionality reduction. Principal components, however, suffer from two main deficiencies: they always involve all the original variables, and they are usually difficult to interpret physically. This study introduces a new multiple sequential orthogonal least squares algorithm for feature ranking and subset selection. The new method assesses, step by step, the capability of each candidate feature to recover the first few principal components. At each step, only the significant variable with the strongest capability to represent the first few principal components is selected. Unlike principal components, which carry no clear physical meaning, the features selected by the new method preserve their original measurement meaning.
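
    The idea of scoring each candidate feature by how well it recovers the leading principal components can be illustrated with a much simplified sketch. It covers only a first ranking pass against the first principal component and is not the multiple sequential orthogonal least squares algorithm itself. The function names, the power-iteration PCA, the all-ones starting vector, and the use of squared correlation as the score are assumptions made for the example.

```python
from math import sqrt


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def center(col):
    """Subtract the column mean."""
    mean = sum(col) / len(col)
    return [x - mean for x in col]


def first_pc_scores(columns, n_iter=200):
    """Scores of the first principal component, found by power iteration.

    Assumes at least one non-constant column and that the all-ones start
    vector is not orthogonal to the leading eigenvector.
    """
    feats = [center(c) for c in columns]
    p = len(feats)
    # Scatter matrix (covariance up to a constant factor).
    scatter = [[dot(feats[i], feats[j]) for j in range(p)] for i in range(p)]
    w = [1.0] * p
    for _ in range(n_iter):
        w = [dot(row, w) for row in scatter]
        norm = sqrt(dot(w, w))
        w = [x / norm for x in w]
    n_samples = len(feats[0])
    return [dot(w, [f[k] for f in feats]) for k in range(n_samples)]


def rank_by_first_pc(columns):
    """Rank feature indices by squared correlation with the first PC scores."""
    scores = first_pc_scores(columns)
    feats = [center(c) for c in columns]

    def r2(u, v):
        denom = dot(u, u) * dot(v, v)
        return dot(u, v) ** 2 / denom if denom > 1e-12 else 0.0

    strength = [r2(f, scores) for f in feats]
    return sorted(range(len(feats)), key=lambda j: strength[j], reverse=True)
```

    A fuller version would repeat the scoring for several principal components and orthogonalize the remaining candidates after each selection, as the abstract describes; this sketch stops at the first ranking.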

    Selección efectiva de características para bioseñales utilizando el análisis de componentes principales [Effective feature selection for biosignals using principal component analysis]

    This article presents partial results from a recent investigation that compared several linear and nonlinear multivariate data analysis techniques for effectively selecting and extracting a set of features from electrocardiographic signals, with the aim of identifying acute myocardial infarction. Specifically, it reports the results of applying the linear method of principal component analysis to generate a feature subspace of lower dimension than the original. It also reports the accuracy obtained when classifying normal and pathological functional states of the myocardium with a Bayesian classifier. The associated computational cost was also estimated.

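    Modelo de variables latentes para la identificación del infarto agudo del miocardio análisis de componentes independientes [Latent variable model for the identification of acute myocardial infarction: independent component analysis]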

    This article presents partial results from a recent investigation [1] that compared several linear and nonlinear multivariate data analysis techniques for effectively selecting and extracting a set of features from electrocardiographic signals, with the aim of identifying acute myocardial infarction. Specifically, it reports the results of applying independent component analysis (ICA) to generate a feature subspace of lower dimension than the original. It also reports the accuracy obtained when classifying normal and pathological functional states of the myocardium with a Bayesian classifier. The associated computational cost was also estimated.

    Neural network methods for one-to-many multi-valued mapping problems

    We investigate the applicability of neural network-based methods to predicting the values of multiple parameters, given the value of a single parameter, within a particular problem domain. In this context, the input parameter may be an important source of variation that is related through a complex mapping function to the remaining sources of variation within a multivariate distribution. Defining the relationship between the variables of a multivariate distribution and a single source of variation allows the values of multiple variables to be estimated from the value of the single variable, thereby addressing an ill-conditioned one-to-many mapping problem. Two problem domains are considered: predicting the values of individual stock shares, given the value of the general index, and predicting the grades received by high school pupils, given the grade for a single course or the average grade. We compare the performance of standard neural network-based methods, in particular multilayer perceptrons (MLPs), radial basis functions (RBFs), and mixture density networks (MDNs), and of a latent variable method, the generative topographic mapping (GTM). According to the results, MLPs and RBFs outperform MDNs and the GTM on these one-to-many mapping problems.