3 research outputs found

    Speaker adaptation of an acoustic-to-articulatory inversion model using cascaded Gaussian mixture regressions

    No full text
    International audienceThe article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation training. This system provides a speaker with visual information about his/her own articulation, via a 3D orofacial clone. In previous work, we proposed to use GMM-based voice conversion for speaker adaptation. Acoustic-articulatory mapping was achieved in 2 consecutive steps: 1) converting the spectral trajectories of the target speaker (i.e. the system user) into spectral trajectories of the reference speaker (voice conversion), and 2) estimating the most likely articulatory trajectories of the reference speaker from the converted spectral features (acoustic-articulatory inversion). In this work, we propose to combine these two steps into the same statistical mapping framework, by fusing multiple regressions based on trajectory GMM and maximum likelihood criterion (MLE). The proposed technique is compared to two standard speaker adaptation techniques based respectively on MAP and MLLR

    Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping

    Get PDF
    International audienceThis article addresses the adaptation of an acoustic-articulatory inversion model of a reference speaker to the voice of another source speaker, using a limited amount of audio-only data. In this study, the articulatory-acoustic relationship of the reference speaker is modeled by a Gaussian mixture model and inference of articulatory data from acoustic data is made by the associated Gaussian mixture regression (GMR). To address speaker adaptation, we previously proposed a general framework called Cascaded-GMR (C-GMR) which decomposes the adaptation process into two consecutive steps: spectral conversion between source and reference speaker and acoustic-articulatory inversion of converted spectral trajectories. In particular, we proposed the Integrated C-GMR technique (IC-GMR) in which both steps are tied together in the same probabilistic model. In this article, we extend the C-GMR framework with another model called Joint-GMR (J-GMR). Contrary to the IC-GMR, this model aims at exploiting all potential acoustic-articulatory relationships, including those between the source speaker's acoustics and the reference speaker's articulation. We present the full derivation of the exact Expectation-Maximization (EM) training algorithm for the J-GMR. It exploits the missing data methodology of machine learning to deal with limited adaptation data. We provide an extensive evaluation of the J-GMR on both synthetic acoustic-articulatory data and on the multi-speaker MOCHA EMA database. We compare the J-GMR performance to other models of the C-GMR framework, notably the IC-GMR, and discuss their respective merits

    A Novel Approach To Intelligent Navigation Of A Mobile Robot In A Dynamic And Cluttered Indoor Environment

    Get PDF
    The need and rationale for improved solutions to indoor robot navigation is increasingly driven by the influx of domestic and industrial mobile robots into the market. This research has developed and implemented a novel navigation technique for a mobile robot operating in a cluttered and dynamic indoor environment. It divides the indoor navigation problem into three distinct but interrelated parts, namely, localization, mapping and path planning. The localization part has been addressed using dead-reckoning (odometry). A least squares numerical approach has been used to calibrate the odometer parameters to minimize the effect of systematic errors on the performance, and an intermittent resetting technique, which employs RFID tags placed at known locations in the indoor environment in conjunction with door-markers, has been developed and implemented to mitigate the errors remaining after the calibration. A mapping technique that employs a laser measurement sensor as the main exteroceptive sensor has been developed and implemented for building a binary occupancy grid map of the environment. A-r-Star pathfinder, a new path planning algorithm that is capable of high performance both in cluttered and sparse environments, has been developed and implemented. Its properties, challenges, and solutions to those challenges have also been highlighted in this research. An incremental version of the A-r-Star has been developed to handle dynamic environments. Simulation experiments highlighting properties and performance of the individual components have been developed and executed using MATLAB. A prototype world has been built using the WebotsTM robotic prototyping and 3-D simulation software. An integrated version of the system comprising the localization, mapping and path planning techniques has been executed in this prototype workspace to produce validation results
    corecore