
    Learning error-correcting representations for multi-class problems

    [eng] Real life is full of multi-class decision tasks. In the Pattern Recognition field, several methodologies have been proposed to deal with binary problems, obtaining satisfactory results in terms of performance. However, the extension of very powerful binary classifiers to the multi-class case is a complex task. The Error-Correcting Output Codes (ECOC) framework has proven to be a very powerful tool for combining binary classifiers to tackle multi-class problems. However, most combinations of binary classifiers in the ECOC framework overlook the underlying structure of the multi-class problem. In addition, it is still unclear how the error-correction of an ECOC design is distributed among the different classes. In this dissertation, we are interested in tackling critical problems of the ECOC framework, such as the definition of the number of classifiers needed to tackle a multi-class problem, how to adapt the ECOC coding to multi-class data, and how to distribute error-correction among different pairs of categories. In order to deal with these issues, this dissertation describes several proposals. 1) We define a new representation for ECOC coding matrices that expresses the pair-wise codeword separability and allows for a deeper understanding of how error-correction is distributed among classes. 2) We study the effect of using a logarithmic number of binary classifiers to treat the multi-class problem in order to obtain very efficient models. 3) In order to search for very compact ECOC coding matrices that take into account the distribution of multi-class data, we use Genetic Algorithms that respect the constraints of the ECOC framework. 4) We propose a discrete factorization algorithm that finds an ECOC configuration that allocates the error-correcting capabilities to those classes that are more prone to errors.
The proposed methodologies are evaluated on different real and synthetic data sets: the UCI Machine Learning Repository, handwritten symbols, traffic signs from a Mobile Mapping System, and Human Pose Recovery. The results of this thesis show that significant performance improvements are obtained over traditional ECOC coding designs when the proposed coding designs are taken into account.
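The decoding idea at the heart of the ECOC framework can be sketched in a few lines. This is a toy illustration with a hand-picked three-class coding matrix, not any of the designs proposed in the thesis: each class is assigned a codeword of binary classifier outputs, and a test sample is decoded to the class whose codeword is nearest in Hamming distance, so a minimum pairwise codeword distance of 3 lets the ensemble absorb one erroneous binary classifier.

```python
# Toy ECOC decoding sketch (hand-picked codewords, not a thesis design).
# Rows are classes, columns are binary classifiers; the minimum pairwise
# Hamming distance here is 3, so any single classifier error is corrected.
CODING = {
    "A": (+1, +1, +1, +1, +1),
    "B": (-1, -1, -1, +1, +1),
    "C": (+1, -1, +1, -1, -1),
}

def hamming(u, v):
    """Number of positions where two codewords disagree."""
    return sum(a != b for a, b in zip(u, v))

def decode(bits):
    """Assign the class whose codeword is nearest to the predicted bits."""
    return min(CODING, key=lambda c: hamming(CODING[c], bits))

# The first classifier misfires on a class-A sample, yet decoding still
# recovers "A" because the remaining bits outvote the error.
prediction = decode((-1, +1, +1, +1, +1))
```

The pairwise Hamming distances between rows are exactly the pair-wise codeword separabilities that the first contribution above makes explicit.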

    Generalized Stacked Sequential Learning

    [eng] Over the past few decades, machine learning (ML) algorithms have become a very useful tool for tasks where designing and programming explicit, rule-based algorithms is infeasible. Some examples of applications where machine learning has been applied successfully are spam filtering, optical character recognition (OCR), search engines and computer vision. One of the most common tasks in ML is supervised learning, where the goal is to learn a general model able to predict the correct label of unseen examples from a set of known labeled input data. In supervised learning it is often assumed that data is independent and identically distributed (i.i.d.). This means that each sample in the data set has the same probability distribution as the others and all are mutually independent. However, classification problems in real-world databases can break this i.i.d. assumption. For example, consider the case of object recognition in image understanding. In this case, if one pixel belongs to a certain object category, it is very likely that neighboring pixels also belong to the same object, with the exception of the borders. Another example is the case of a laughter detection application working on voice recordings. A laugh has a clear pattern alternating voice and non-voice segments. Thus, discriminant information comes from the alternating pattern, and not just from the samples on their own. Another example can be found in the case of signature section recognition in an e-mail. In this case, the signature is usually found at the end of the mail, so important discriminant information is found in the context. Another case is part-of-speech tagging, in which each example describes a word that is categorized as noun, verb, adjective, etc. In this case it is very unlikely that patterns such as [verb, verb, adjective, verb] occur. All these applications present a common feature: the sequence/context of the labels matters. Sequential learning (25) breaks the i.i.d.
assumption and assumes that samples are not independently drawn from a joint distribution of the data samples X and their labels Y. In sequential learning the training data actually consists of sequences of pairs (x, y), so that neighboring examples exhibit some kind of correlation. Sequential learning applications usually consider one-dimensional relationship support, but these types of relationships appear very frequently in other domains, such as images or video. Sequential learning should not be confused with time series prediction. The main difference between the two problems lies in the fact that sequential learning has access to the whole data set before any prediction is made, and the full set of labels is to be provided at the same time, whereas time series prediction has access to real labels only up to the current time t, and the goal is to predict the label at t + 1. Another related but different problem is sequence classification. In this case, the problem is to predict a single label for an input sequence. If we consider the image domain, the sequential learning goal is to classify the pixels of the image taking into account their context, while sequence classification is equivalent to classifying one full image as one class. Sequential learning has been addressed from different perspectives: from the point of view of meta-learning, by means of sliding window techniques, recurrent sliding windows or stacked sequential learning, where the method is formulated as a combination of classifiers; or from the point of view of graphical models, using for example Hidden Markov Models or Conditional Random Fields. In this thesis, we are concerned with meta-learning strategies. Cohen et al. (17) showed that stacked sequential learning (SSL from now on) performed better than CRFs and HMMs on a subset of problems called “sequential partitioning problems”. These problems are characterized by long runs of identical labels.
Moreover, SSL is computationally very efficient, since it only needs to train two classifiers a constant number of times. Considering these benefits, we decided to explore sequential learning in depth using SSL and to generalize Cohen's architecture to deal with a wider variety of problems.
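The two-layer idea behind SSL can be sketched as follows, with hand-set toy classifiers standing in for trained ones (in the real scheme both stages are learned, and the base predictions on the training set come from cross-validation): a base classifier labels each sample, and a second classifier sees the original feature extended with a window of neighboring base predictions, which lets it exploit the long runs of identical labels typical of sequential partitioning problems.

```python
# Toy sketch of the two-layer SSL pipeline (hand-set classifiers
# standing in for trained ones; in the real scheme both stages are
# learned, and the base predictions on the training set come from
# cross-validation to avoid optimistic labels).

def base_predict(x):
    """First stage: a toy classifier that thresholds a scalar feature."""
    return 1 if x > 0.5 else 0

def windowed(preds, i, w=1):
    """Base predictions in a window around position i, zero-padded."""
    return [preds[j] if 0 <= j < len(preds) else 0
            for j in range(i - w, i + w + 1)]

def stacked_predict(xs, w=1):
    """Second stage: here a majority vote over the prediction window,
    standing in for a classifier trained on the extended features."""
    base = [base_predict(x) for x in xs]
    return [1 if sum(windowed(base, i, w)) > w else 0
            for i in range(len(xs))]

# A long run of positive labels with one noisy sample: the base stage
# mislabels position 2, and the stacked stage smooths the isolated error.
xs = [0.9, 0.8, 0.2, 0.9, 0.7, 0.1, 0.0, 0.1]
smoothed = stacked_predict(xs)
```

The window width plays the same role as the neighborhood size in the meta-learning architectures discussed above; widening it trades responsiveness to label changes for robustness to isolated base errors.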

    Contributions to High-Dimensional Pattern Recognition

    This thesis gathers some contributions to statistical pattern recognition, particularly targeted at problems in which the feature vectors are high-dimensional. Three pattern recognition scenarios are addressed, namely pattern classification, regression analysis and score fusion. For each of these, an algorithm for learning a statistical model is presented. In order to address the difficulty that is encountered when the feature vectors are high-dimensional, adequate models and objective functions are defined. The strategy of simultaneously learning a dimensionality reduction function and the pattern recognition model parameters is shown to be quite effective, making it possible to learn the model without discarding any discriminative information. Another topic addressed in the thesis is the use of tangent vectors as a way to take better advantage of the available training data. Using this idea, two popular discriminative dimensionality reduction techniques are shown to be effectively improved. For each of the algorithms proposed throughout the thesis, several data sets are used to illustrate the properties and the performance of the approaches. The empirical results show that the proposed techniques perform considerably well, and furthermore the models learned tend to be very computationally efficient. Villegas Santamaría, M. (2011). Contributions to High-Dimensional Pattern Recognition [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/10939
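The joint-learning strategy can be illustrated with a generic gradient sketch on synthetic data with a plain logistic loss (the sizes, seed and learning rate are illustrative choices, not the thesis's models or objectives): rather than fixing a projection beforehand and then training a classifier on the reduced features, the projection matrix W and the classifier weights v are updated together on a single objective, so the reduction is steered by discriminative information.

```python
import numpy as np

# Generic gradient sketch of joint learning (synthetic data, plain
# logistic loss; all sizes and hyperparameters are illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                  # high-dimensional inputs
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)

d = 3                                           # target dimensionality
W = rng.normal(scale=0.1, size=(50, d))         # reduction: R^50 -> R^3
v = np.zeros(d)                                 # linear classifier in R^3

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for _ in range(500):
    Z = X @ W                                   # reduced representation
    p = sigmoid(Z @ v)                          # class-1 probability
    g = p - y                                   # logistic-loss gradient
    v -= lr * (Z.T @ g) / len(y)                # update the classifier
    W -= lr * (X.T @ np.outer(g, v)) / len(y)   # ... and the projection

train_acc = np.mean((sigmoid(X @ W @ v) > 0.5) == y)
```

Because W receives gradients through the classification loss itself, discriminative directions are kept in the reduced space instead of being discarded by an unsupervised criterion chosen in advance.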

    A speaker classification framework for non-intrusive user modeling : speech-based personalization of in-car services

    Speaker Classification, i.e. the automatic detection of certain characteristics of a person based on his or her voice, has a variety of applications in modern computer technology and artificial intelligence: as a non-intrusive source for user modeling, it can be employed for the personalization of human-machine interfaces in numerous domains. This dissertation presents a principled approach to the design of a novel Speaker Classification system for automatic age and gender recognition which meets these demands. Based on literature studies, methods and concepts dealing with the underlying pattern recognition task are developed. The final system consists of an incremental GMM-SVM supervector architecture with several optimizations. An extensive data-driven experiment series explores the parameter space and serves as an evaluation of the component. Further experiments investigate the language-independence of the approach. As an essential part of this thesis, a framework is developed that implements all tasks associated with the design and evaluation of Speaker Classification in an integrated development environment that is able to generate efficient runtime modules for multiple platforms. Applications from the automotive field and other domains demonstrate the practical benefit of the technology for personalization, e.g. by increasing the lead time of local danger warnings for elderly drivers.
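The supervector idea at the core of a GMM-SVM architecture can be sketched in miniature. This sketch uses one-dimensional features, hard assignments, mean-only relevance-MAP-style adaptation and no SVM on top, and the background-model means and frame values are invented for illustration: a background model's component means are pulled toward each utterance's frames, and the stacked adapted means form a fixed-length vector that a discriminative classifier can consume regardless of utterance length.

```python
# Miniature supervector sketch (1-D features, hard assignments,
# mean-only relevance-MAP-style adaptation, no SVM on top; the UBM
# means and frame values are invented for illustration).
UBM_MEANS = [0.0, 1.0]          # two-component "universal background model"

def adapt(frames, r=4.0):
    """Pull each UBM mean toward the frames assigned to it; r is the
    relevance factor limiting how far sparse data can move a mean."""
    sums, counts = [0.0, 0.0], [0.0, 0.0]
    for f in frames:
        k = min((0, 1), key=lambda i: abs(f - UBM_MEANS[i]))
        sums[k] += f
        counts[k] += 1.0
    return [(sums[k] + r * UBM_MEANS[k]) / (counts[k] + r) for k in (0, 1)]

def supervector(frames):
    """Stack the adapted means into one fixed-length vector, whatever
    the utterance length; a discriminative classifier (an SVM in a
    GMM-SVM architecture) then operates on these vectors."""
    return adapt(frames)

# Two hypothetical utterances of different lengths yield comparable,
# fixed-length supervectors.
sv_a = supervector([0.1, 0.2, 0.9, 1.1, 1.0])
sv_b = supervector([-0.4, -0.5, 0.6, 0.7])
```

The fixed length of the supervector is what makes variable-length speech usable as input to a standard kernel classifier.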

    Classification with class-independent quality information for biometric verification

    Biometric identity verification systems frequently face the challenges of non-controlled conditions of data acquisition. Under such conditions, biometric signals may suffer from quality degradation due to extraneous, identity-independent factors. It has been demonstrated in numerous reports that a degradation of biometric signal quality is a frequent cause of significant deterioration in classification performance, also in multiple-classifier, multimodal systems, which systematically outperform their single-classifier counterparts. Seeking to improve the robustness of classifiers to degraded data quality, researchers started to introduce measures of signal quality into the classification process. In the existing approaches, the role of class-independent quality information is governed by intuitive rather than mathematical notions, resulting in a clearly drawn distinction between the single-classifier, multiple-classifier and multimodal approaches. The application of quality measures in multiple-classifier systems has received far more attention, with the dominant intuitive notion that a classifier that has data of higher quality at its disposal ought to be more credible than a classifier that operates on noisy signals. In the case of single-classifier systems, a quality-based selection of models, classifiers or thresholds has been proposed. In both cases, quality measures have the function of meta-information which supervises but does not intervene in the actual classifier or classifiers employed to assign class labels to modality-specific and class-selective features. In this thesis we argue that in fact the very same mechanism governs the use of quality measures in single- and multi-classifier systems alike, and we present a quantitative rather than intuitive perspective on the role of quality measures in classification.
We note that for a given set of classification features and their fixed marginal distributions, the class separation in the joint feature space changes with the statistical dependencies observed between the individual features. The same effect applies to a feature space in which some of the features are class-independent. Consequently, we demonstrate that the class separation can be improved by augmenting the feature space with class-independent quality information, provided that it exhibits statistical dependencies with the class-selective features. We discuss how to construct classifier-quality measure ensembles in which the dependence between classification scores and the quality features helps decrease classification errors below those obtained using the classification scores alone. We propose Q-stack, a novel theoretical framework for improving classification with class-independent quality measures, based on the concept of classifier stacking. In the scheme of Q-stack, a classifier ensemble is used in which the first classifier layer is made of the baseline unimodal classifiers, and the second, stacked classifier operates on features composed of the normalized similarity scores and the relevant quality measures. We present Q-stack as a generalized framework for classification with quality information, and we argue that previously proposed methods of classification with quality measures are its special cases. Further in this thesis we address the problem of estimating the probability of single classification errors. We propose to employ the subjective Bayesian interpretation of single-event probability as credence in the correctness of single classification decisions. We propose to apply the credence-based error predictor as a functional extension of the proposed Q-stack framework, in which a Bayesian stacked classifier is employed.
As such, the proposed method of credence estimation and error prediction inherits the benefit of seamless incorporation of quality information in the process of credence estimation. We propose a set of objective evaluation criteria for credence estimates, and we discuss how the proposed method can be applied together with an appropriate repair strategy to reduce classification errors to a desired target level. Finally, we demonstrate the application of Q-stack and its functional extension to single-error prediction on the task of biometric identity verification using face and fingerprint modalities, and their multimodal combinations, on a real biometric database. We show that the use of the classification and error prediction methods proposed in this thesis allows for a systematic reduction of the error rates below those of the baseline classifiers.
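The core claim that class-independent quality helps precisely when it is statistically dependent on the class-selective scores can be made concrete with a tiny synthetic example. The score and quality distributions below are invented, and the second layer is a hand-set linear rule rather than a trained stacked classifier: low quality biases every score downwards, which a score-only threshold cannot compensate for, while a second-layer decision on the (score, quality) pair can.

```python
import random

# Synthetic illustration of the Q-stack layering (invented score and
# quality distributions, hand-set linear second layer instead of a
# trained stacked classifier).
random.seed(1)

def sample(genuine):
    """Baseline similarity score plus a quality measure in [0, 1]:
    degraded (low-quality) signals bias every score downwards."""
    q = random.random()
    centre = (0.7 if genuine else 0.3) - 0.3 * (1.0 - q)
    return centre + random.gauss(0.0, 0.05), q, genuine

data = [sample(g) for g in (True, False) * 200]

def error_rate(decide):
    """Fraction of (score, quality, label) triples decided wrongly."""
    return sum(decide(s, q) != g for s, q, g in data) / len(data)

# The score-only threshold cannot undo the quality-dependent bias; a
# stacked rule on the (score, quality) pair can.
err_score = error_rate(lambda s, q: s > 0.5)
err_qstack = error_rate(lambda s, q: s > 0.2 + 0.3 * q)
```

The improvement comes entirely from the dependence between quality and score: if the quality measure were independent of the scores, the augmented feature would change nothing, which is exactly the quantitative condition stated above.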

    Creating Persian-like music using computational intelligence

    Dastgāh are modal systems in traditional Persian music. Each Dastgāh consists of a group of melodies called Gushé, classified into twelve groups about a century ago (Farhat, 1990). Prior to that time, musical pieces were transferred through oral tradition. Traditional music production revolves around the existing Dastgāh and Gushé pieces. In this thesis, computational intelligence tools are employed in creating novel Dastgāh-like music. There are three types of creativity: combinational, exploratory, and transformational (Boden, 2000). In exploratory creativity, a conceptual space is navigated to discover new forms. Sometimes the exploration results in transformational creativity; this is due to meaningful alterations happening on one or more of the governing dimensions of an item. In combinational creativity, new links are established between items not previously connected. Boden stated that all these types of creativity can be implemented using artificial intelligence. Various tools and techniques are employed, in the research reported in this thesis, for generating Dastgāh-like music. Evolutionary algorithms are responsible for navigating the space of sequences of musical motives. Aesthetic critics are employed to constrain the search space in the exploratory (and hopefully transformational) type of creativity. Boltzmann machine models are applied to assimilate some of the mechanisms involved in combinational creativity. The creative processes involved are guided by aesthetic critics, some of which are derived from a traditional Persian music database. In this project, Cellular Automata (CA) are the main pattern generators employed to produce raw creative materials. Various methodologies are suggested for extracting features from CA progressions, mapping them to musical space, and feeding them to audio synthesizers. The evaluation of the results of this thesis is assisted by publishing surveys that targeted both public and professional audiences.
The generated audio samples are evaluated regarding their Dastgāh-likeness and the level of creativity of the systems involved.
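The role of CA as raw pattern generators can be sketched concretely. This toy uses an elementary rule-110 automaton, and the seven-degree scale below is an invented stand-in, not an actual Dastgāh interval structure: successive CA rows are generated from a single seed, and one simple feature mapping turns each row's live cells into scale degrees.

```python
# Elementary cellular automaton as a raw pattern generator (rule 110
# with wrap-around; the seven-degree scale is an invented stand-in,
# not an actual Dastgah interval structure).
RULE = 110
WIDTH = 16

def step(cells):
    """One update: each cell looks at (left, self, right) and reads the
    matching bit of the rule number (standard Wolfram encoding)."""
    return [(RULE >> (cells[(i - 1) % WIDTH] * 4
                      + cells[i] * 2
                      + cells[(i + 1) % WIDTH])) & 1
            for i in range(WIDTH)]

def to_notes(cells, scale=(0, 2, 3, 5, 7, 8, 10)):
    """Map live cells to scale degrees: one simple feature-extraction
    choice among the many CA-to-music mappings one could define."""
    return [scale[i % len(scale)] for i, c in enumerate(cells) if c]

row = [0] * WIDTH
row[WIDTH // 2] = 1            # single seed cell
phrase = []
for _ in range(8):             # eight CA generations -> eight "bars"
    row = step(row)
    phrase.append(to_notes(row))
```

Aesthetic critics would then score such phrases, and an evolutionary search over seeds, rules and mappings would keep the highest-scoring material, mirroring the exploratory pipeline described above.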

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms is emerging from the interdisciplinary area between technologies for effective visual features and the human-brain cognition process. Effective visual features are made possible through rapid developments in appropriate sensor equipment, novel filter designs, and viable information processing architectures, while the understanding of the human-brain cognition process broadens the way in which computers can perform pattern recognition tasks. The present book is intended to collect representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology and applications of pattern recognition.

    Towards a Linear Combination of Dichotomizers by Margin Maximization

    When dealing with two-class problems, the combination of several dichotomizers is an established technique for improving classification performance. In this context the margin is considered a central concept, since several theoretical results show that improving the margin on the training set is beneficial for the generalization error of a classifier. In particular, this has been analyzed with reference to learning algorithms based on boosting, which aim to build strong classifiers through the combination of many weak classifiers. In this paper we try to verify experimentally whether margin maximization can also be beneficial when combining already trained classifiers. We have employed an algorithm for evaluating the weights of a linear convex combination of dichotomizers so as to maximize the margin of the combination on the training set. Several experiments performed on publicly available data sets have shown that a combination based on margin maximization can be particularly effective when compared with other established fusion methods.
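The weighting problem the paper addresses can be shown in miniature, with two fixed dichotomizers whose ±1 outputs on five training examples are invented, and a brute-force grid over the convex weights standing in for the paper's algorithm: the margin of example i under weights w is y_i·Σ_j w_j h_j(x_i), and the combination is chosen to maximize the smallest training margin.

```python
# Brute-force sketch of margin-maximizing fusion (invented +/-1 outputs
# of two already-trained dichotomizers; the paper's algorithm is not a
# grid search, this only illustrates the objective).
H1 = [+1, +1, -1, +1, -1]      # outputs of trained dichotomizer 1
H2 = [+1, -1, +1, +1, +1]      # outputs of trained dichotomizer 2
y  = [+1, +1, +1, +1, -1]      # true labels

def min_margin(w):
    """Smallest training margin y_i * (w*h1_i + (1-w)*h2_i) of the
    convex combination with weights (w, 1 - w)."""
    return min(yi * (w * h1 + (1.0 - w) * h2)
               for yi, h1, h2 in zip(y, H1, H2))

# One-dimensional search, since the second weight is 1 - w.
best_w = max((i / 100.0 for i in range(101)), key=min_margin)

# Each dichotomizer alone has a worst-case margin of -1 (it errs on
# some example); the balanced combination raises the worst case to 0.
```

Even in this tiny example the maximum-margin weights dominate either classifier alone on the worst training example, which is the effect the paper studies empirically on real data sets.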