
    Coherent averaging estimation autoencoders applied to evoked potentials processing

    The success of machine learning algorithms strongly depends on the feature extraction and data representation stages. Classification and estimation of small repetitive signals masked by relatively large noise usually require recording and processing several different realizations of the signal of interest. This is one of the main signal processing problems to solve when estimating or classifying P300 evoked potentials in brain-computer interfaces. To cope with this issue, we propose a novel autoencoder variation, called the Coherent Averaging Estimation Autoencoder, with a new multiobjective cost function. We illustrate its use and analyze its performance in the problem of event-related potentials processing. Finally, experimental results showing the advantages of the proposed approach are presented.
    Fil: Gareis, Iván Emilio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina. Universidad Nacional de Entre Ríos; Argentina
    Fil: Vignolo, Leandro Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Spies, Ruben Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Matemática Aplicada del Litoral. Universidad Nacional del Litoral. Instituto de Matemática Aplicada del Litoral; Argentina
    Fil: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina. Universidad Nacional de Entre Ríos; Argentina

    Evolutionary optimization of signal representations for automatic speech recognition

    The difficulty of speech recognition stems from the characteristics of the signals involved: they are governed by complex probability density functions, are non-stationary, and are generally contaminated with noise of diverse nature and intensity. This is why automatic recognition systems need a processing stage that brings out the distinctive features of each phoneme, improving their performance. The goal of this thesis is to develop a methodology for optimizing the speech signal processing stage in order to improve the results of an automatic speech recognition system. The methodology uses evolutionary algorithms to optimize the feature vector used to represent speech signals. The hypothesis is that the better the analysis or process used to generate the patterns to be classified, the more separated the classes will be in the feature space and, therefore, the simpler the classification task. More precisely, two evolutionary alternatives are proposed for finding a robust set of features. The first proposal continues the search for an optimal representation based on cepstral coefficients by optimizing the filterbank involved in this feature extraction procedure. The second proposal optimizes a decomposition that is unconventional for speech recognition, known as wavelet packets. Wavelets have characteristics that are useful for the analysis of non-stationary signals, and the resulting features carry discriminative information; however, the large number of coefficients makes the task of the classifier more difficult. Because of this, an evolutionary algorithm is proposed to search for a subset of coefficients that maximizes the discrimination capability.
    Universidad Nacional del Litoral. Consejo Nacional de Investigaciones Científicas y Técnicas

    Feature selection for face recognition based on multi-objective evolutionary wrappers

    Feature selection is a key issue in pattern recognition, especially when prior knowledge of the most discriminant features is not available. Moreover, in order to perform the classification task with reduced complexity and acceptable performance, features that are irrelevant, redundant, or noisy are usually excluded from the problem representation. This work presents a multi-objective wrapper, based on genetic algorithms, to select the most relevant set of features for face recognition tasks. The proposed strategy explores the space of multiple feasible selections in order to minimize the cardinality of the feature subset and, at the same time, to maximize its discriminative capacity. Experimental results show that, in comparison with other state-of-the-art approaches, the proposed approach improves the classification performance while reducing the representation dimensionality.
    Fil: Vignolo, Leandro Daniel. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Centro Cientifico Tecnológico Santa Fe. Instituto de Investigacion en Señales, Sistemas e Inteligencia Computacional; Argentina; Argentina
    Fil: Milone, Diego Humberto. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Centro Cientifico Tecnológico Santa Fe. Instituto de Investigacion en Señales, Sistemas e Inteligencia Computacional; Argentina; Argentina
    Fil: Scharcanski, Jacob. Universidade Federal do Rio Grande do Sul. Instituto de Informatica and Dept. de Engenharia Eletrica; Brasil

    Exploring feature extraction methods for infant mood classification

    Speaker state recognition is important for understanding human behaviour and for achieving more comprehensive interactive speech systems, and has therefore received much attention in recent years. This work addresses the automatic classification of three types of child emotions in vocalisations: neutral mood, fussing (negative mood) and crying (negative mood). Speech, in a broad sense, contains a lot of paralinguistic information that can be revealed by means of different feature extraction methods, which in this case are useful for mood detection. Here, several sets of features are proposed, combined and compared with state-of-the-art features used for speech-related tasks. These are based on spectral information, a bio-inspired ear model, auditory sparse representations with dictionaries, optimised wavelet coefficients and an optimised filter bank for cepstral representation. All the experiments were performed using Extreme Learning Machines as the classifier, because it is a state-of-the-art classifier and in order to achieve comparable results. The results show that the proposed feature extraction methods improve on the performance provided by the baseline features. Also, different combinations of the developed feature sets were studied in order to further exploit their properties.
    Fil: Vignolo, Leandro Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Albornoz, Enrique Marcelo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Martínez, César Ernesto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina

    Genetic wavelet packets for speech recognition

    The most widely used speech representation is based on the mel-frequency cepstral coefficients, which incorporates biologically inspired characteristics into artificial recognizers. However, the recognition performance with these features can still be enhanced, especially in adverse conditions. Recent advances have been made with the introduction of wavelet-based representations for different kinds of signals, which have been shown to improve the classification performance. However, finding an adequate wavelet-based representation for a particular problem is still an important challenge. In this work we propose a genetic algorithm to evolve a speech representation, based on a non-orthogonal wavelet decomposition, for phoneme classification. The results, obtained for a set of Spanish phonemes, show that the proposed genetic algorithm is able to find a representation that improves speech recognition results. Moreover, the optimized representation was evaluated in noise conditions.
    Fil: Vignolo, Leandro Daniel. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Centro Cientifico Tecnológico Santa Fe. Instituto de Investigacion en Señales, Sistemas e Inteligencia Computacional; Argentina; Argentina. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas; Argentina
    Fil: Milone, Diego Humberto. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Centro Cientifico Tecnológico Santa Fe. Instituto de Investigacion en Señales, Sistemas e Inteligencia Computacional; Argentina; Argentina. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas; Argentina
    Fil: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Cientificas y Tecnicas. Centro Cientifico Tecnológico Santa Fe. Instituto de Investigacion en Señales, Sistemas e Inteligencia Computacional; Argentina; Argentina. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas; Argentina

    Automatic classification of Furnariidae species from the Paranaense Littoral region using speech-related features and machine learning

    Over the last years, researchers have addressed the automatic classification of calling bird species. This is important for achieving more exhaustive environmental monitoring and for managing natural resources. Vocalisations help to identify new species, their natural history and macro-systematic relations, while computer systems allow the bird recognition process to be sped up and improved. In this study, an approach that uses state-of-the-art features designed for speech and speaker state recognition is presented. A method for voice activity detection was employed prior to feature extraction. Our analysis includes several classification techniques (multilayer perceptrons, support vector machines and random forest) and compares their performance using different configurations to define the best classification method. The experimental results were validated in a cross-validation scheme, using 25 species of the family Furnariidae that inhabit the Paranaense Littoral region of Argentina (South America). The results show that a high classification rate, close to 90%, is obtained for this Furnariidae group using the proposed features and classifiers.
    Fil: Albornoz, Enrique Marcelo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Vignolo, Leandro Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Sarquis, Juan Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto Nacional de Limnología. Universidad Nacional del Litoral. Instituto Nacional de Limnología; Argentina
    Fil: Leon, Evelina Jesica. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto Nacional de Limnología. Universidad Nacional del Litoral. Instituto Nacional de Limnología; Argentina

    Evolutionary cepstral coefficients

    Evolutionary algorithms provide the flexibility and robustness required to find satisfactory solutions in complex search spaces. This is why they are successfully applied to solving real engineering problems. In this work we propose an algorithm to evolve a robust speech representation, using a dynamic data selection method to reduce the computational cost of the fitness computation while improving the generalisation capabilities. The most commonly used speech representation is the mel-frequency cepstral coefficients, which incorporate biologically inspired characteristics into artificial recognizers. Recent advances have been made with the introduction of alternatives to the classic mel-scaled filterbank, improving the phoneme recognition performance in adverse conditions. In order to find an optimal filterbank, filter parameters such as the central and side frequencies are optimised. A hidden Markov model is used as the classifier to evaluate the fitness of each individual. Experiments were conducted using real and synthetic phoneme databases, considering different additive noise levels. Classification results show that the method accomplishes the task of finding an optimised filterbank for phoneme recognition, which provides robustness in adverse conditions.
    Fil: Vignolo, Leandro Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Goddard, John C.. Universidad Autónoma Metropolitana; México

    Feature optimisation for stress recognition in speech

    Mel-frequency cepstral coefficients introduced biologically-inspired features into speech technology, becoming the most commonly used representation for speech, speaker and emotion recognition, and even for applications in music. While this representation is quite popular, it is ambitious to assume that it would provide the best results for every application, as it is not designed for each specific objective. This work proposes a methodology to learn a speech representation from data by optimising a filter bank, in order to improve results in the classification of stressed speech. Since population-based metaheuristics have proved successful in related applications, an evolutionary algorithm is designed to search for a filter bank that maximises the classification accuracy. For the codification, spline functions are used to shape the filter banks, which reduces the number of parameters to optimise. The filter banks obtained with the proposed methodology improve the results in stressed and emotional speech classification.
    Fil: Vignolo, Leandro Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Prasanna, S.R. Mahadeva. Indian Institute of Technology Guwahati; India
    Fil: Dandapat, Samarendra. Indian Institute of Technology Guwahati; India
    Fil: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina

    Empirical Mode Decomposition for adaptive AM-FM analysis of Speech: A Review

    This work reviews the advancements in the non-conventional analysis of speech signals, particularly from an AM-FM analysis point of view. The benefits of such an analysis, as opposed to the traditional short-time analysis of speech, are illustrated in this work. The inherent non-linearity of the speech production system is discussed. The limitations of Fourier analysis, Linear Prediction (LP) analysis, and the Mel Filterbank Cepstral Coefficients (MFCCs) are presented, thus providing the motivation for the AM-FM representation of speech. The principle and methodology of traditional AM-FM analysis are discussed, as a method of capturing the non-linear dynamics of the speech signal. The technique of Empirical Mode Decomposition (EMD) is then introduced as a means of performing adaptive AM-FM analysis of speech, alleviating the limitations of the fixed analysis provided by the traditional AM-FM methodology. The merits and demerits of EMD with respect to traditional AM-FM analysis are discussed. The developments of EMD to counter its demerits are presented. Selected applications of EMD in speech processing are briefly reviewed. The paper concludes by pointing out some aspects of speech processing where EMD might be explored.
    Fil: Sharma, Rajib. Indian Institute Of Technology Guwahati; India
    Fil: Vignolo, Leandro Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Schlotthauer, Gaston. Universidad Nacional de Entre Ríos. Facultad de Ingeniería. Departamento de Matemática e Informática. Laboratorio de Señales y Dinámicas no Lineales; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina
    Fil: Colominas, Marcelo Alejandro. Universidad Nacional de Entre Ríos. Facultad de Ingeniería. Departamento de Matemática e Informática. Laboratorio de Señales y Dinámicas no Lineales; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina
    Fil: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Fil: Prasanna, S. R. M.. Indian Institute Of Technology Guwahati; India