    Sparse and Low-rank Modeling for Automatic Speech Recognition

    This thesis exploits the low-dimensional multi-subspace structure of speech to improve acoustic modeling for automatic speech recognition (ASR). Leveraging the parsimonious hierarchical nature of speech, we hypothesize that whenever a speech signal is measured in a high-dimensional feature space, the true class information is embedded in low-dimensional subspaces, whereas noise is scattered as random high-dimensional erroneous estimations in the features. In this context, the contribution of this thesis is twofold: (i) identify sparse and low-rank modeling approaches as excellent tools for extracting the class-specific low-dimensional subspaces in speech features, and (ii) employ these tools under novel ASR frameworks to enrich the acoustic information present in the speech features and thereby improve ASR. The techniques developed in this thesis focus on deep neural network (DNN) based posterior features which, under sparse and low-rank modeling, reveal the underlying class-specific low-dimensional subspaces. We tackle ASR tasks of varying difficulty, ranging from isolated word recognition (IWR) and connected digit recognition (CDR) to large-vocabulary continuous speech recognition (LVCSR). For IWR and CDR, we propose a novel Compressive Sensing (CS) perspective on ASR, in which exemplar-based speech recognition is posed as a problem of recovering sparse high-dimensional word representations from compressed low-dimensional phonetic representations. In the context of LVCSR, this thesis argues that, despite their power in representation learning, DNN-based acoustic models still have room for improvement in exploiting the union-of-low-dimensional-subspaces structure of speech data. Therefore, this thesis proposes to enhance DNN posteriors by projecting them onto the manifolds of the underlying classes using principal component analysis (PCA) or compressive sensing based dictionaries. Projected posteriors are shown to be more accurate training targets for learning better acoustic models, resulting in improved ASR performance. The proposed approach is evaluated in both close-talk and far-field conditions, confirming the importance of sparse and low-rank modeling of speech in building a robust ASR framework. Finally, the conclusions of this thesis are further consolidated by an information-theoretic analysis that explicitly quantifies the contribution of the proposed techniques to improving ASR.
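The idea of projecting DNN posteriors onto class-specific low-dimensional subspaces with PCA can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `project_posteriors`, the use of log-posteriors, the grouping of frames by their arg-max class, the number of retained components, and the final softmax renormalization are all assumptions made for illustration, and the input is random toy data.

```python
import numpy as np

def project_posteriors(post, n_components=2, eps=1e-10):
    """Class-wise low-rank (PCA) reconstruction of posterior vectors.

    Hypothetical helper: frames are grouped by their arg-max class, each
    group's log-posteriors are projected onto that group's top principal
    directions, and the result is renormalized with a softmax.
    """
    post = np.asarray(post, dtype=float)
    logp = np.log(post + eps)            # assumption: work in the log domain
    labels = post.argmax(axis=1)         # assumption: arg-max class as label
    out = np.empty_like(logp)
    for c in np.unique(labels):
        idx = labels == c
        x = logp[idx]
        mean = x.mean(axis=0)
        centered = x - mean
        # Rows of vt are the principal directions of this class's frames.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        basis = vt[:n_components]
        # Project onto the low-dimensional subspace and reconstruct.
        out[idx] = centered @ basis.T @ basis + mean
    # Softmax so that every row is again a probability vector.
    out -= out.max(axis=1, keepdims=True)
    enhanced = np.exp(out)
    return enhanced / enhanced.sum(axis=1, keepdims=True)

# Toy input: 40 random "posterior" vectors over 5 classes.
rng = np.random.default_rng(0)
noisy = rng.dirichlet(np.ones(5), size=40)
enhanced = project_posteriors(noisy, n_components=2)
```

In a setup like the one the abstract describes, the enhanced rows would serve as soft training targets for a second acoustic model; the sketch only covers the projection step.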

    OPTIMIZATION MODELS AND METHODOLOGIES TO SUPPORT EMERGENCY PREPAREDNESS AND POST-DISASTER RESPONSE

    This dissertation addresses three important optimization problems arising during the phases of pre-disaster emergency preparedness and post-disaster response in time-dependent, stochastic and dynamic environments. The first problem studied is the building evacuation problem with shared information (BEPSI), which seeks a set of evacuation routes and an assignment of evacuees to these routes that minimize the total evacuation time. The BEPSI incorporates the constraints of shared information in providing on-line instructions to evacuees and ensures that evacuees departing from the same intermediate or source location at the same point in time receive common instructions. A mixed-integer linear program is formulated for the BEPSI, and an exact technique based on Benders decomposition is proposed for its solution. Numerical experiments conducted on a mid-sized real-world example demonstrate the effectiveness of the proposed algorithm. The second problem addressed is the network resilience problem (NRP), which involves an indicator of network resilience proposed to quantify the ability of a network to recover from randomly arising disruptions resulting from a disaster event. A stochastic mixed-integer program is proposed for quantifying network resilience and identifying the optimal post-event course of action. A solution technique based on concepts of Benders decomposition, column generation and Monte Carlo simulation is proposed. Experiments were conducted to illustrate the resilience concept and the procedure for its measurement, and to assess the role of network topology in its magnitude. The last problem addressed is the urban search and rescue team deployment problem (USAR-TDP). The USAR-TDP seeks an optimal deployment of USAR teams to disaster sites, including the order of site visits, with the ultimate goal of maximizing the expected number of saved lives over the search and rescue period. A multistage stochastic program is proposed to capture problem uncertainty and dynamics. The solution technique involves solving a sequence of interrelated two-stage stochastic programs with recourse. A column generation-based technique is proposed for the solution of each problem instance arising at the start of each decision epoch over a time horizon. Numerical experiments conducted on an example based on the 2010 Haiti earthquake are presented to illustrate the effectiveness of the proposed approach.
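The role Monte Carlo simulation can play when studying how topology affects a network's response to random disruptions can be sketched with a deliberately simplified stand-in. The abstract does not define the dissertation's resilience indicator, so the code below swaps in a much cruder quantity: the estimated probability that a source and a target stay connected when each link fails independently. The function names, the independent-failure model, and the toy path and ring networks are all assumptions for illustration.

```python
import random
from collections import deque

def connected(nodes, edges, source, target):
    """Breadth-first search: is target reachable from source?"""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def estimate_connectivity(nodes, edges, source, target,
                          fail_prob, n_samples=2000, seed=0):
    """Monte Carlo estimate of P(source and target remain connected).

    Assumption: every link fails independently with probability fail_prob.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Sample one disruption scenario by dropping failed links.
        surviving = [e for e in edges if rng.random() >= fail_prob]
        hits += connected(nodes, surviving, source, target)
    return hits / n_samples

# Toy comparison of two topologies on the same four nodes.
nodes = [0, 1, 2, 3]
path_edges = [(0, 1), (1, 2), (2, 3)]
ring_edges = path_edges + [(3, 0)]
path_score = estimate_connectivity(nodes, path_edges, 0, 3, fail_prob=0.3)
ring_score = estimate_connectivity(nodes, ring_edges, 0, 3, fail_prob=0.3)
```

The ring offers a second route between nodes 0 and 3, so its estimate comes out higher than the path's. A fuller treatment along the lines of the abstract would replace the connectivity check with an optimization over post-event recovery actions in each sampled scenario.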