
    Soft margin estimation for automatic speech recognition

    In this study, a new discriminative learning framework, called soft margin estimation (SME), is proposed for estimating the parameters of continuous-density hidden Markov models (HMMs). The proposed method makes direct use of two successful ideas: the margin concept of support vector machines, to improve generalization capability, and decision feedback learning from discriminative training, to enhance model separation in classifier design. SME directly maximizes the separation between competing models, so that test samples still reach a correct decision as long as their deviation from the training samples stays within a safe margin. Frame and utterance selection are integrated into a unified framework to select the training utterances and frames critical for discriminating competing models. SME offers a flexible and rigorous framework that facilitates the incorporation of new margin-based optimization criteria into HMM training. The choice of various loss functions is illustrated, and different kinds of separation measures are defined under the unified SME framework. SME is also shown to be able to jointly optimize feature extraction and HMMs. Both the generalized probabilistic descent algorithm and the extended Baum-Welch algorithm are applied to solve SME. SME has demonstrated clear advantages over other discriminative training methods on several speech recognition tasks. Tested on the TIDIGITS digit recognition task, the proposed SME approach achieves a string accuracy of 99.61%, the best result reported in the literature. On the 5k-word Wall Street Journal task, SME reduced the word error rate (WER) from 5.06% with MLE models to 3.81%, a 25% relative WER reduction. This is the first work to show the effectiveness of margin-based acoustic modeling for large-vocabulary continuous speech recognition in an HMM framework.
The generalization capability of SME was also well demonstrated on the Aurora 2 robust speech recognition task, with around 30% relative WER reduction from the clean-trained baseline.
Ph.D. Committee Chair: Dr. Chin-Hui Lee; Committee Member: Dr. Anthony Joseph Yezzi; Committee Member: Dr. Biing-Hwang (Fred) Juang; Committee Member: Dr. Mark Clements; Committee Member: Dr. Ming Yua
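The margin idea underlying SME can be illustrated with a minimal sketch. The separation scores and the margin value below are invented for illustration; the thesis defines several separation measures and loss functions within its unified framework.

```python
# Hinge-style soft margin loss over per-sample separation scores.
# A sample whose separation d from the competing models falls short of
# the margin rho contributes rho - d; well-separated samples cost 0.
def soft_margin_loss(separations, rho):
    return sum(max(0.0, rho - d) for d in separations)

# Illustrative separations for three utterances (e.g., a log-likelihood
# gap between the correct and the best competing model):
seps = [0.8, 0.2, -0.1]
loss = soft_margin_loss(seps, rho=0.5)  # ~0.9: only the last two samples contribute
```

Frame and utterance selection in SME corresponds to keeping only the samples that contribute a nonzero term to such a sum.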

    Model for self-consistent analysis of arbitrary MQW structures

    Self-consistent computation of the potential profile in complex semiconductor heterostructures can be applied successfully to comprehensive simulation of gain and absorption spectra and to the analysis of capture, escape, tunneling, recombination, and relaxation phenomena; consequently, it can be used to study the dynamical behavior of semiconductor lasers and amplifiers. However, many authors apply the method in ways that are not entirely correct. In this paper, a versatile model is proposed for the investigation, optimization, and control of the parameters of semiconductor lasers and optical amplifiers, which may be employed in the creation of new generations of high-density photonic systems for information processing and data transfer, tracking, and security applications. The model is based on the coupled Schrödinger, Poisson, and drift-diffusion equations, which allow one to determine the energy quantization levels and wave functions of charge carriers, to take built-in fields into account, and to investigate doped MQW structures as well as structures under external electric fields. The methodology of a computer implementation based on our model is described. Boundary conditions for each equation and an analysis of the method's convergence are included. Approaches frequently encountered in practice, along with common errors of self-consistent computations, are described, and the domains of applicability of the main approaches are estimated. Application examples of the method are given, and some regularities in the results discovered using the self-consistent method are discussed. Design recommendations are given for structure optimization with respect to managing certain parameters of AMQW structures.
Comment: 12 pages, 2 tables, 4 figures, Optics East Symposium, Conference on Physics and Applications of Optoelectronic Devices, October 25-28, 2004, Philadelphia, Pennsylvania, US
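The self-consistent loop at the heart of such computations can be sketched in a few lines: solve the Schrödinger equation for a given potential, build a carrier density from the wave functions, solve the Poisson equation for the induced potential, and iterate with damping until convergence. Everything below (grid size, coupling constant, ground-state-only occupation, zero boundary conditions) is a toy assumption, not the paper's full model, which also couples in drift-diffusion equations.

```python
import numpy as np

# Toy 1D self-consistent Schrodinger-Poisson iteration (dimensionless units).
N, L = 100, 1.0
x = np.linspace(0, L, N)
dx = x[1] - x[0]

def schrodinger(V):
    """Finite-difference Hamiltonian with hard-wall boundaries; returns
    ascending eigenenergies and wave functions (columns), normalized so
    that sum(|psi|^2) * dx = 1."""
    H = (np.diag(np.full(N, 2.0)) - np.diag(np.ones(N - 1), 1)
         - np.diag(np.ones(N - 1), -1)) / dx**2 + np.diag(V)
    E, psi = np.linalg.eigh(H)
    return E, psi / np.sqrt(dx)

def poisson(rho):
    """Solve V'' = -rho with V = 0 at both ends (constants absorbed)."""
    A = (np.diag(np.full(N, -2.0)) + np.diag(np.ones(N - 1), 1)
         + np.diag(np.ones(N - 1), -1)) / dx**2
    return np.linalg.solve(A, -rho)

V = np.zeros(N)
for it in range(50):
    E, psi = schrodinger(V)
    rho = psi[:, 0]**2            # occupy only the ground state (toy model)
    V_new = poisson(0.1 * rho)    # weak, illustrative coupling constant
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = 0.5 * V + 0.5 * V_new     # damped update aids convergence
```

The damping in the last line is one of the practical convergence measures the paper's methodology discussion is concerned with; an undamped update can oscillate or diverge for stronger coupling.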

    Manhattan Cutset Sampling and Sensor Networks.

    Cutset sampling is a new approach to acquiring two-dimensional data, i.e., images, in which values are recorded densely along straight lines. This type of sampling is motivated by physical scenarios where data must be taken along straight paths, such as a boat taking water samples. Additionally, it may be possible to better reconstruct image edges using the dense data collected along lines. Finally, an advantage of cutset sampling lies in the design of wireless sensor networks: if battery-powered sensors are placed densely along straight lines, the transmission energy required for communication between sensors can be reduced, thereby extending the network lifetime. A special case of cutset sampling is Manhattan sampling, where data is recorded along evenly-spaced rows and columns. This thesis examines Manhattan sampling in three contexts. First, we prove a sampling theorem demonstrating that an image can be perfectly reconstructed from Manhattan samples when its spectrum is bandlimited to the union of two Nyquist regions corresponding to the two lattices forming the Manhattan grid. An efficient "onion peeling" reconstruction method is provided, and we show that the Landau bound is achieved. This theorem is generalized to dimensions higher than two, where again signals are reconstructable from a Manhattan set if they are bandlimited to a union of Nyquist regions. Second, for non-bandlimited images, we present several algorithms for reconstructing natural images from Manhattan samples. The Locally Orthogonal Orientation Penalization (LOOP) algorithm is the best of the proposed algorithms in both subjective quality and mean-squared error; it reconstructs images well in general and outperforms competing algorithms for reconstruction from non-lattice samples. Finally, we study cutset networks, which are new placement topologies for wireless sensor networks.
Assuming a power-law model for communication energy, we show that cutset networks offer reduced communication energy costs over lattice and random topologies. Additionally, when solving centralized and decentralized source localization problems, cutset networks offer reduced energy costs over other topologies for fixed sensor densities and localization accuracies. Finally, with the eventual goal of analyzing different cutset topologies, we analyze the energy per distance required for efficient long-distance communication in lattice networks.
PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/120876/1/mprelee_1.pd
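The Manhattan sampling pattern itself is easy to state concretely. The sketch below (array sizes and spacings are arbitrary choices, not the thesis's parameters) builds the boolean sampling mask of evenly-spaced rows and columns from which the reconstruction algorithms would start.

```python
import numpy as np

def manhattan_mask(shape, row_step, col_step):
    """Boolean mask keeping every row_step-th row and every col_step-th
    column in full: the 'Manhattan grid' of evenly-spaced lines."""
    mask = np.zeros(shape, dtype=bool)
    mask[::row_step, :] = True   # dense samples along selected rows
    mask[:, ::col_step] = True   # dense samples along selected columns
    return mask

# Example: on a 6x6 image with spacing 3, the cutset covers two full
# rows and two full columns, sharing 4 intersection pixels.
m = manhattan_mask((6, 6), 3, 3)
count = m.sum()  # 12 + 12 - 4 = 20 sampled pixels
```

Sampling an image is then just `image[m]`; the fraction of pixels kept shrinks as the spacing grows, which is what drives the energy savings in the sensor-network setting.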

    Thermomechanical Characterization and Modeling of Superelastic Shape Memory Alloy Beams and Frames.

    Among existing applications, the majority of shape memory alloy (SMA) devices consist of beams (orthodontic wires, eyeglass frames, catheter guide wires) and framed structures (cardiovascular stents, vena cava filters). Although uniaxial tension data is often sufficient to model basic beam behavior (which has been the main focus of the research community), the tension-compression asymmetry and complex phase transformation behavior of SMAs suggest that more information is necessary to properly model more complex states of loading. In this work, SMA beams are experimentally characterized under general loading conditions (including tension, compression, pure bending, and buckling); furthermore, a model of general beam deformation is developed based on the relevant phenomena observed in the experimental characterization. Stress-induced phase transformation within superelastic SMA beams is shown to depend not only on the loading mode but also on kinematic constraints imposed by beam geometry (such as beam cross-section and length). In the cases of tension and pure bending, the structural behavior is unstable and corresponds to phase transformation localization and propagation. This unstable behavior is the result of a local up-down-up stress/strain response in tension, which is measured here using a novel composite-based experimental technique. In addition to unstable phase transformation, an intriguing post-buckling straightening (termed unbuckling here) is observed in short SMA columns during monotonic loading. Based on this phenomenological understanding of SMA beam behavior, a trilinear material law is developed in the context of a Shanley column model and is found to capture many of the relevant features of column buckling, including the experimentally observed unbuckling behavior.
Due to the success of this model, it is generalized within the context of beam theory and, in conjunction with Bloch wave stability analysis, is used to model and design SMA honeycombs.
PhD, Aerospace Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/113455/1/watkinrt_1.pd
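A trilinear material law of the kind used in the Shanley column model can be sketched as a piecewise-linear stress-strain map with a stiff elastic branch, a low-slope transformation branch, and a stiffer post-transformation branch. The moduli and transition strains below are illustrative placeholders, not fitted values from the thesis.

```python
def trilinear_stress(strain, E1=70e3, E2=1e3, E3=30e3, e1=0.01, e2=0.06):
    """Trilinear stress (MPa) vs. strain: austenite elastic branch up to
    e1, low-slope transformation branch up to e2, then a stiffer branch
    for the transformed material. Continuous at both breakpoints."""
    if strain <= e1:
        return E1 * strain
    if strain <= e2:
        return E1 * e1 + E2 * (strain - e1)
    return E1 * e1 + E2 * (e2 - e1) + E3 * (strain - e2)

sigma = trilinear_stress(0.03)  # ~720 MPa: 700 on the elastic branch + 20 on the plateau
```

The low middle slope is what produces nearly load-independent transformation and, in a Shanley-type two-spring column, admits the straightening (unbuckling) response described above.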

    A Formal Model of Ambiguity and its Applications in Machine Translation

    Systems that process natural language must cope with and resolve ambiguity. In this dissertation, a model of language processing is advocated in which multiple inputs and multiple analyses of inputs are considered concurrently, and a single analysis is only a last resort. Compared to conventional models, this approach can be understood as replacing single-element inputs and outputs with weighted sets of inputs and outputs. Although processing components must deal with sets (rather than individual elements), constraints are imposed on the elements of these sets, and the representations from existing models may be reused. However, to deal efficiently with large (or infinite) sets, compact representations that share structure between elements, such as weighted finite-state transducers and synchronous context-free grammars, are necessary. These representations, and algorithms for manipulating them, are discussed in depth. To establish the effectiveness and tractability of the proposed processing model, it is applied to several problems in machine translation. Starting with spoken language translation, it is shown that translating a set of transcription hypotheses yields better translations than a baseline in which a single (1-best) transcription hypothesis is selected and then translated, independent of the translation model formalism used. More subtle forms of ambiguity that arise even in text-only translation (such as decisions conventionally made during system development about how to preprocess text) are then discussed, and it is shown that the ambiguity-preserving paradigm can be employed in these cases as well, again leading to improved translation quality.
Finally, a model for supervised learning is introduced that learns from training data in which sets (rather than single elements) of correct labels are provided for each training instance; it is used to learn a model of compound word segmentation, which serves as a preprocessing step in machine translation.
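The ambiguity-preserving idea for spoken language translation can be sketched as decoding over a weighted set of transcription hypotheses rather than the single 1-best hypothesis. The hypotheses, translation table, and scores below are invented for illustration; the dissertation works with compact set representations such as weighted finite-state transducers rather than explicit lists.

```python
# Joint decoding over a weighted set of input hypotheses: pick the
# output maximizing (input weight) * (translation score), instead of
# first committing to the 1-best input.
def translate_weighted_set(hypotheses, translate):
    """hypotheses: list of (input, weight); translate: maps an input to
    a list of (output, score). Returns the jointly best output."""
    best = max(
        ((out, w * s) for inp, w in hypotheses for out, s in translate(inp)),
        key=lambda pair: pair[1],
    )
    return best[0]

# Toy transcription set with invented weights and translation scores:
def toy_translate(inp):
    table = {
        "recognize speech": [("reconocer el habla", 0.3)],
        "wreck a nice beach": [("destrozar una playa bonita", 0.9)],
    }
    return table[inp]

hyps = [("recognize speech", 0.6), ("wreck a nice beach", 0.5)]
out = translate_weighted_set(hyps, toy_translate)
```

Here the 1-best transcription ("recognize speech", weight 0.6) does not yield the jointly best translation, since 0.5 * 0.9 > 0.6 * 0.3; these are exactly the cases where translating the whole set helps.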

    Acoustic-phonetic constraints in continuous speech recognition: a case study using the digit vocabulary.

    Thesis (Ph.D.)—Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1985. Includes bibliographical references (leaves 155-159). This electronic version was scanned from a copy of the thesis on file at the Speech Communication Group. The certified thesis is available in the Institute Archives and Special Collections. Vinton-Hayes Fellowship; DARPA, monitored through the Office of Naval Research; System Development Foundation. Ph.D