12 research outputs found

    Solving Large-Margin Hidden Markov Model Estimation via Semidefinite Programming

    Soft margin estimation for automatic speech recognition

    In this study, a new discriminative learning framework, called soft margin estimation (SME), is proposed for estimating the parameters of continuous-density hidden Markov models (HMMs). The method combines two established ideas: the margin from support vector machines, which improves generalization, and decision feedback learning from discriminative training, which enhances model separation in classifier design. SME directly maximizes the separation of competing models so that test samples still lead to a correct decision as long as their deviation from the training samples stays within a safe margin. Frame and utterance selection are integrated into a unified framework that picks the training utterances and frames critical for discriminating between competing models. SME offers a flexible and rigorous framework for incorporating new margin-based optimization criteria into HMM training. The choice of loss function is illustrated, and different separation measures are defined under the unified SME framework. SME is also shown to be able to jointly optimize feature extraction and HMMs. Both the generalized probabilistic descent algorithm and the Extended Baum-Welch algorithm are applied to solve SME. SME shows a clear advantage over other discriminative training methods on several speech recognition tasks. On the TIDIGITS digit recognition task, the proposed SME approach achieves a string accuracy of 99.61%, the best result ever reported in the literature. On the 5k-word Wall Street Journal task, SME reduces the word error rate (WER) from 5.06% for MLE models to 3.81%, a relative WER reduction of about 25%. This is the first attempt to show the effectiveness of margin-based acoustic modeling for large-vocabulary continuous speech recognition in an HMM framework. The generalization of SME is also demonstrated on the Aurora 2 robust speech recognition task, with around 30% relative WER reduction from the clean-trained baseline.
    Ph.D. Committee Chair: Dr. Chin-Hui Lee; Committee Member: Dr. Anthony Joseph Yezzi; Committee Member: Dr. Biing-Hwang (Fred) Juang; Committee Member: Dr. Mark Clements; Committee Member: Dr. Ming Yua
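The abstract above does not state the SME objective itself, so the following is only a minimal sketch of what a soft-margin criterion of this kind could look like: a hinge loss on per-utterance separation scores plus a term that rewards a larger margin. The function name, the `reg_weight / margin` term, and the sample values are assumptions made for illustration and are not taken from the thesis.

```python
def soft_margin_objective(separations, margin, reg_weight):
    """Hinge-style soft-margin objective (illustrative assumption).

    separations: one separation score per training utterance; larger means
                 the correct model is better separated from its competitors.
    margin:      the soft margin (must be positive).
    reg_weight:  weight of the margin-maximizing term.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if not separations:
        raise ValueError("need at least one separation score")
    # Only utterances whose separation falls short of the margin contribute.
    hinge_losses = [max(0.0, margin - d) for d in separations]
    empirical_risk = sum(hinge_losses) / len(separations)
    # A larger margin shrinks the first term but tends to grow the risk term.
    return reg_weight / margin + empirical_risk


# Hypothetical separation scores for three utterances.
example_value = soft_margin_objective([3.0, 0.5, -1.0], margin=2.0, reg_weight=1.0)
```

In this toy call, the first utterance already clears the margin and contributes nothing, while the other two are penalized in proportion to their shortfall.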

    Optimización de ecuaciones con restricciones no lineales: comparativo entre técnicas heurística y convexa

    This article explores several optimization techniques through different methodologies. Optimization problems arise in a large number of academic disciplines, and the approaches proposed to solve them fall into two groups. The first consists of the so-called strong mathematical techniques, which reach a global optimum through existence and uniqueness theorems. The second consists of the so-called heuristic or metaheuristic techniques, inspired mostly by biological, social, and cultural processes, which make it possible to expand the search space for solutions or to relax the functions to be optimized, as well as the constraints, from continuous to non-continuous. The metaheuristic studied is particle swarm optimization (PSO) in its complete model (cognitive and social components), a biologically inspired technique. It is compared with a convex mathematical technique that uses the behavior of positive semidefinite matrices to formulate and model problems with convex objective functions and convex feasible regions. The problem solved with both methods consists of determining the resource values of two variables within an objective function. Finally, the answers obtained are evaluated under the assumption that local minima are global minima within the neighborhood.
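As a rough illustration of the complete PSO model mentioned in the abstract (inertia plus cognitive and social components), here is a minimal sketch in plain Python. The article's actual objective and constraints are not given in this listing, so the two-variable objective, the nonlinear constraint, the penalty weight, and all hyperparameter values below are hypothetical.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize `objective` inside box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])  # cognitive pull
                             + social * r2 * (gbest_pos[d] - pos[i][d]))      # social pull
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


def penalized_objective(p):
    """Hypothetical problem: minimize (x-1)^2 + (y-2)^2 subject to x^2 + y^2 <= 4."""
    x, y = p
    violation = max(0.0, x * x + y * y - 4.0)
    # Quadratic penalty turns the nonlinear constraint into an objective term.
    return (x - 1.0) ** 2 + (y - 2.0) ** 2 + 1000.0 * violation ** 2


solution, value = pso_minimize(penalized_objective, [(-3.0, 3.0), (-3.0, 3.0)])
```

The penalty approach is just one common way to hand constraints to a swarm; a convex solver would instead treat the constraint explicitly, which is the comparison the article draws.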

    Graph Inference with Applications to Low-Resource Audio Search and Indexing

    The task of query-by-example search is to retrieve, from a collection of data, the observations most similar to a given query. A common approach views the data as vertices in a graph whose edge weights reflect similarities between observations. Errors arise in this graph-based framework both from errors in measuring these similarities and from the approximations required for fast retrieval. In this thesis, we use tools from graph inference to analyze and control the sources of these errors. We establish novel theoretical results related to representation learning and to vertex nomination, and use these results to control the effects of model misspecification, noisy similarity measurement, and approximation error on search accuracy. We present a state-of-the-art system for query-by-example audio search in the context of low-resource speech recognition, which also serves as an illustrative example and testbed for applying our theoretical results.
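The basic retrieval step described above can be sketched as an exact similarity ranking. This is only an illustration under stated assumptions: the thesis's actual representations, similarity measure, and fast approximate search are not described in this listing, so cosine similarity over small made-up feature vectors stands in for them, and all names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def query_by_example(query, collection, k=2):
    """Return the names of the k observations most similar to `query`.

    `collection` maps an observation name to its feature vector. The scores
    computed here play the role of edge weights between the query vertex
    and every other vertex in a similarity graph.
    """
    scored = [(cosine_similarity(query, vec), name)
              for name, vec in collection.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Made-up two-dimensional features for three observations.
observations = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
top_matches = query_by_example([1.0, 0.0], observations, k=2)
```

An exhaustive ranking like this is the error-free baseline; the approximation error discussed in the abstract comes from replacing it with faster, inexact search.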

    Pattern Recognition

    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications, and others. The signals processed are commonly one-, two-, or three-dimensional; the processing may run in real time or take hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms, and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

    A compact semidefinite programming (SDP) formulation for large margin estimation of HMMs in speech recognition
