827 research outputs found

    A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition

    This article provides a unifying Bayesian network view on various approaches to acoustic model adaptation, missing-feature compensation, and uncertainty decoding that are well known in the literature on robust automatic speech recognition. The representatives of these classes can often be deduced from a Bayesian network that extends the conventional hidden Markov models used in speech recognition. These extensions, in turn, can in many cases be motivated by an underlying observation model that relates clean and distorted feature vectors. By converting the observation models into a Bayesian network representation, we formulate the corresponding compensation rules, leading to a unified view of known derivations as well as to new formulations for certain approaches. The generic Bayesian perspective provided in this contribution thus highlights structural differences and similarities between the analyzed approaches.
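
    The compensation rules the article refers to can be made concrete with a small example. The sketch below is ours, not the article's; it assumes an additive observation model y = x + e with diagonal Gaussian uncertainty, and shows the standard uncertainty-decoding rule in which the HMM state variance is inflated by the feature uncertainty:

        import numpy as np

        def uncertainty_decoding_loglik(y, mu_q, var_q, var_u):
            """Log-likelihood of a distorted feature vector y for one HMM state.

            Assumes y = x + e with e ~ N(0, diag(var_u)) and a clean-feature
            state model x ~ N(mu_q, diag(var_q)); integrating out x gives
            y ~ N(mu_q, diag(var_q + var_u)), i.e. the state variance is
            simply inflated by the feature uncertainty.
            """
            var = var_q + var_u                  # compensated variance
            diff = y - mu_q
            return -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var)

        # With zero uncertainty the score reduces to the ordinary Gaussian one.
        y = np.array([0.3, -1.2]); mu = np.zeros(2); var = np.ones(2)
        print(uncertainty_decoding_loglik(y, mu, var, np.zeros(2)))
        print(uncertainty_decoding_loglik(y, mu, var, 0.5 * np.ones(2)))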

    Research on Effective Designs and Evaluation for Speech Interface Systems

    Degree system: new; Report number: Otsu No. 2305; Degree type: Doctor of Engineering; Date conferred: 2011/2/25; Waseda University diploma number: Shin 564

    Voice Biometrics under Mismatched Noise Conditions

    This thesis describes research into effective voice biometrics (speaker recognition) under mismatched noise conditions. Over the last two decades, this class of biometrics has been the subject of considerable research due to its various applications in such areas as telephone banking, remote access control and surveillance. One of the main challenges associated with the deployment of voice biometrics in practice is that of undesired variations in speech characteristics caused by environmental noise. Such variations can in turn lead to a mismatch between the corresponding test and reference material from the same speaker, which is found to adversely affect speaker recognition accuracy. To address this problem, a novel approach is introduced and investigated. The proposed method is based on minimising the noise mismatch between reference speaker models and the given test utterance, and involves a new form of Test-Normalisation (T-Norm) for further enhancing matching scores under the aforementioned adverse operating conditions. Through experimental investigations based on the two main classes of speaker recognition (i.e. verification and open-set identification), it is shown that the proposed approach can significantly improve recognition accuracy under mismatched noise conditions. To further improve accuracy in severe mismatch conditions, an enhancement of the above method is proposed; this enhancement, which adjusts the reference speaker models more closely to the noise condition of the test utterance, is shown to considerably increase accuracy in extreme cases of noisy test data. Moreover, to tackle the computational burden associated with the use of the enhanced approach in open-set identification, an efficient algorithm for its realisation in this context is introduced and evaluated. The thesis presents a detailed description of the research undertaken, describes the experimental investigations and provides a thorough analysis of the outcomes.
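
    As a concrete reference point for the score normalisation discussed above, the sketch below (ours, not the thesis code) shows the standard T-Norm computation, in which the raw score of a test utterance against the claimed speaker model is normalised by the statistics of the same utterance scored against a set of cohort models:

        import numpy as np

        def t_norm(target_score, cohort_scores):
            """Test-Normalisation (T-Norm) of a raw speaker-verification score.

            The raw target score is shifted and scaled by the mean and standard
            deviation of the same test utterance scored against cohort
            (impostor) models, which stabilises the decision threshold across
            differing noise conditions.
            """
            cohort_scores = np.asarray(cohort_scores, dtype=float)
            mu, sigma = cohort_scores.mean(), cohort_scores.std()
            return (target_score - mu) / max(sigma, 1e-12)

        # Toy usage with hypothetical log-likelihood scores.
        print(t_norm(-1.8, [-3.1, -2.7, -3.4, -2.9, -3.0]))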

    Multi-Factor Authentication: A Survey

    Today, digitalization decisively penetrates all sides of modern society. One of the key enablers for keeping this process secure is authentication. It covers many different areas of a hyper-connected world, including online payments, communications, access-rights management, etc. This work sheds light on the evolution of authentication systems towards Multi-Factor Authentication (MFA), starting from Single-Factor Authentication (SFA) and passing through Two-Factor Authentication (2FA). In particular, MFA is expected to be utilized for human-to-everything interactions by enabling fast, user-friendly, and reliable authentication when accessing a service. This paper surveys the already available and emerging sensors (factor providers) that allow a user to be authenticated with the system directly or by involving the cloud. The corresponding challenges from the user as well as the service provider perspective are also reviewed. An MFA system based on a reversed Lagrange polynomial within Shamir’s Secret Sharing (SSS) scheme is further proposed to enable more flexible authentication. This solution covers the cases of authenticating the user even if some of the factors are mismatched or absent. Our framework allows for qualifying the missing factors by authenticating the user without disclosing sensitive biometric data to the verification entity. Finally, a vision of the future trends in MFA is discussed.
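
    To make the secret-sharing idea concrete, here is a minimal sketch (ours, not the paper's construction) of plain Shamir's Secret Sharing over a prime field, where each authentication factor holds one share and any k of the n shares recover the secret by Lagrange interpolation; the paper's "reversed" polynomial variant is not reproduced:

        import random

        P = 2 ** 127 - 1  # a Mersenne prime used as the field modulus

        def split(secret, n, k):
            """Split a secret into n shares, any k of which reconstruct it."""
            coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
            f = lambda x: sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
            return [(x, f(x)) for x in range(1, n + 1)]

        def reconstruct(shares):
            """Lagrange interpolation at x = 0 over the prime field P."""
            secret = 0
            for i, (xi, yi) in enumerate(shares):
                num, den = 1, 1
                for j, (xj, _) in enumerate(shares):
                    if i != j:
                        num = num * (-xj) % P
                        den = den * (xi - xj) % P
                secret = (secret + yi * num * pow(den, P - 2, P)) % P
            return secret

        shares = split(secret=123456789, n=5, k=3)   # five factors, any three suffice
        print(reconstruct(shares[:3]) == 123456789)  # True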

    Support Vector Machines for Speech Recognition

    Hidden Markov models (HMM) with Gaussian mixture observation densities are the dominant approach in speech recognition. These systems typically use a representational model for acoustic modeling which can often be prone to overfitting and does not translate to improved discrimination. We propose a new paradigm centered on principles of structural risk minimization using a discriminative framework for speech recognition based on support vector machines (SVMs). SVMs have the ability to simultaneously optimize the representational and discriminative ability of the acoustic classifiers. We have developed the first SVM-based large vocabulary speech recognition system that improves performance over traditional HMM-based systems. This hybrid system achieves a state-of-the-art word error rate of 10.6% on a continuous alphadigit task, a 10% improvement relative to an HMM system. On SWITCHBOARD, a large vocabulary task, the system improves performance over a traditional HMM system from 41.6% word error rate to 40.6%. This dissertation discusses several practical issues that arise when SVMs are incorporated into the hybrid system.
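
    As a rough illustration of the discriminative component of such a hybrid system, the sketch below (ours, not the dissertation's recogniser; scikit-learn is assumed to be available) trains an SVM on hypothetical fixed-length segment features and maps its outputs to posterior-like scores that a hybrid system could use for rescoring:

        import numpy as np
        from sklearn.svm import SVC

        # Hypothetical 39-dimensional segment-level features for two phone classes.
        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0.0, 1.0, (200, 39)),
                       rng.normal(0.8, 1.0, (200, 39))])
        y = np.array([0] * 200 + [1] * 200)

        # RBF-kernel SVM; probability=True yields posterior-like scores for rescoring.
        clf = SVC(kernel="rbf", C=10.0, gamma="scale", probability=True)
        clf.fit(X, y)
        print(clf.predict_proba(X[:2]))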

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and the town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low-dimensional subspaces; Beyond linear and convex inverse problems; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference. Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1

    Image Restoration

    This book presents a sample of recent contributions by researchers from around the world in the field of image restoration. The book consists of 15 chapters organized in three main sections (Theory, Applications, Interdisciplinarity). Topics cover different aspects of the theory of image restoration, but this book is also an occasion to highlight some new topics of research related to the emergence of original imaging devices. From these devices arise some truly challenging problems of image reconstruction/restoration that open the way to new fundamental scientific questions closely related to the world we interact with.

    On parameterized deformations and unsupervised learning


    Looking beyond Pixels: Theory, Algorithms and Applications of Continuous Sparse Recovery

    Sparse recovery is a powerful tool that plays a central role in many applications, including source estimation in radio astronomy, direction of arrival estimation in acoustics or radar, super-resolution microscopy, and X-ray crystallography. Conventional approaches usually resort to discretization, where the sparse signals are estimated on a pre-defined grid. However, sparse signals do not line up conveniently on any grid in reality. While the discrete setup usually leads to a simple optimization problem that can be solved with standard tools, there are two noticeable drawbacks: (i) because of the model mismatch, the effective noise level is increased; (ii) the minimum reachable resolution is limited by the grid step-size. Because of these limitations, it is essential to develop a technique that estimates sparse signals in the continuous domain, in essence seeing beyond pixels. The aims of this thesis are (i) to further develop a continuous-domain sparse recovery framework based on finite rate of innovation (FRI) sampling, on both theoretical and algorithmic aspects; (ii) to adapt the proposed technique to several applications, namely radio astronomy point source estimation, direction of arrival estimation in acoustics, and single image up-sampling; and (iii) to show that the continuous-domain sparse recovery approach can surpass the instrument resolution limit and achieve super-resolution. We propose a continuous-domain sparse recovery technique by generalizing the FRI sampling framework to cases with non-uniform measurements. We achieve this by identifying a set of unknown uniform sinusoidal samples and the linear transformation that links the uniform samples of sinusoids to the measurements. The continuous-domain sparsity constraint can be equivalently enforced with a discrete convolution equation of these sinusoidal samples. The sparse signal is reconstructed by minimizing the fitting error between the given and the re-synthesized measurements subject to the sparsity constraint. Further, we develop a multi-dimensional sampling framework for Diracs in two or higher dimensions with linear sample complexity. This is a significant improvement over previous methods, whose complexity increases exponentially with dimension. An efficient algorithm is proposed to find a valid solution to the continuous-domain sparse recovery problem such that the reconstruction (i) satisfies the sparsity constraint and (ii) fits the measurements (up to the noise level). We validate the flexibility and robustness of the FRI-based continuous-domain sparse recovery in both simulations and experiments with real data. We show that the proposed method surpasses the diffraction limit of radio telescopes with both realistic simulations and real data from the LOFAR radio telescope. In addition, FRI-based sparse reconstruction requires fewer measurements and smaller baselines to reach a reconstruction quality similar to that of conventional methods. Next, we apply the proposed approach to direction of arrival estimation in acoustics and show that accurate off-grid source locations can be reliably estimated from microphone measurements with arbitrary array geometries. Finally, we demonstrate the effectiveness of the continuous-domain sparsity constraint in regularizing an otherwise ill-posed inverse problem, namely single-image super-resolution. By incorporating image edge models, the up-sampled image retains sharp edges and is free from ringing artifacts.
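
    The core annihilating-filter step behind FRI recovery can be illustrated in a few lines. The sketch below is ours, restricted to the noiseless case with uniform Fourier samples (the thesis generalises this to non-uniform measurements and higher dimensions); it recovers the locations of K Diracs from 2K+1 Fourier-series coefficients:

        import numpy as np

        tau = 1.0                                # period of the signal
        t_true = np.array([0.12, 0.47, 0.80])    # hypothetical Dirac locations
        a_true = np.array([1.0, -0.5, 2.0])      # hypothetical Dirac amplitudes
        K = len(t_true)
        M = 2 * K + 1                            # number of Fourier samples

        m = np.arange(M)
        X = (np.exp(-2j * np.pi * np.outer(m, t_true) / tau) @ a_true) / tau

        # Build the Toeplitz system sum_l h[l] X[m - l] = 0 and take its null space.
        A = np.array([[X[K + i - l] for l in range(K + 1)] for i in range(M - K)])
        h = np.linalg.svd(A)[2].conj()[-1]       # right singular vector of the smallest singular value

        # The Dirac locations are the arguments of the annihilating filter's roots.
        u = np.roots(h)
        t_est = np.sort(np.mod(-np.angle(u) * tau / (2 * np.pi), tau))
        print(t_est)                             # ~ [0.12, 0.47, 0.80]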

    Recent Application in Biometrics

    In recent years, a number of recognition and authentication systems based on biometric measurements have been proposed. Algorithms and sensors have been developed to acquire and process many different biometric traits. Moreover, biometric technology is being used in novel ways, with potential commercial and practical implications for our daily activities. The key objective of the book is to provide a collection of comprehensive references on recent theoretical developments as well as novel applications in biometrics. The topics covered in this book reflect both aspects of this development well. They include biometric sample quality, privacy-preserving and cancellable biometrics, contactless biometrics, novel and unconventional biometrics, and the technical challenges of implementing the technology in portable devices. The book consists of 15 chapters. It is divided into four sections, namely biometric applications on mobile platforms, cancelable biometrics, biometric encryption, and other applications. The book was reviewed by editors Dr. Jucheng Yang and Dr. Norman Poh. We deeply appreciate the efforts of our guest editors: Dr. Girija Chetty, Dr. Loris Nanni, Dr. Jianjiang Feng, Dr. Dongsun Park and Dr. Sook Yoon, as well as a number of anonymous reviewers.