
    Tracking Articulator Movements Using Orientation Measurements

    This paper introduces a new method to track articulator movements, specifically jaw position and angle, using 5 degree of freedom (5 DOF) orientation data. The approach uses a quaternion rotation method to accomplish this jaw tracking during speech using a single sensor on the mandibular incisor. Data were collected using the NDI Wave Speech Research System for one pilot subject with various speech tasks. The degree of jaw rotation from the proposed approach is compared with traditional geometric calculation. Results show that the quaternion-based method is able to describe the jaw angle trajectory and gives a more accurate and smoother estimate of jaw kinematics
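
The abstract above does not give the quaternion formulation itself. As a rough illustration only, the sketch below assumes the sensor orientation is available as a 3D direction vector and computes the shortest-arc quaternion between a reference orientation and the current one, then reads off the rotation angle. The function names and the choice of a shortest-arc rotation are assumptions made for this example, not the authors' method.

```python
import math


def _normalize(vec):
    """Return vec scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in vec))
    if norm == 0.0:
        raise ValueError("zero-length vector has no direction")
    return tuple(c / norm for c in vec)


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def shortest_arc_quaternion(ref_dir, cur_dir):
    """Unit quaternion (w, x, y, z) rotating ref_dir onto cur_dir.

    Hypothetical helper: both inputs are 3D orientation vectors.
    """
    u = _normalize(ref_dir)
    v = _normalize(cur_dir)
    dot = sum(a * b for a, b in zip(u, v))
    w = 1.0 + dot
    if w < 1e-12:
        # Antiparallel vectors: any axis perpendicular to u gives a 180 degree turn.
        basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
        least_aligned = min(basis, key=lambda e: abs(sum(a * b for a, b in zip(u, e))))
        axis = _normalize(_cross(u, least_aligned))
        return (0.0,) + axis
    return _normalize((w,) + _cross(u, v))


def rotation_angle(quat):
    """Rotation angle in radians encoded by a unit quaternion."""
    w, x, y, z = quat
    return 2.0 * math.atan2(math.sqrt(x * x + y * y + z * z), w)


# Example: orientation vector tilted 30 degrees from the reference.
reference = (1.0, 0.0, 0.0)
tilted = (math.cos(math.radians(30)), 0.0, math.sin(math.radians(30)))
jaw_angle_deg = math.degrees(rotation_angle(shortest_arc_quaternion(reference, tilted)))
```

Using `atan2` on the vector and scalar parts keeps the angle estimate well conditioned for small rotations, where an `acos` of the dot product loses precision.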

    Vowel Production in Mandarin Accented English and American English: Kinematic and Acoustic Data from the Marquette University Mandarin Accented English Corpus

    Few electromagnetic articulography (EMA) datasets are publicly available, and none have focused systematically on non-native accented speech. We introduce a kinematic-acoustic database of speech from 40 (gender and dialect balanced) participants producing upper-Midwestern American English (AE) L1 or Mandarin Accented English (MAE) L2 (Beijing or Shanghai dialect base). The Marquette University EMA-MAE corpus will be released publicly to help advance research in areas such as pronunciation modeling, acoustic-articulatory inversion, L1-L2 comparisons, pronunciation error detection, and accent modification training. EMA data were collected at a 400 Hz sampling rate with synchronous audio using the NDI Wave System. Articulatory sensors were placed on the midsagittal lips, lower incisors, and tongue blade and dorsum, as well as on the lip corner and lateral tongue body. Sensors provide five degree-of-freedom measurements including three-dimensional sensor position and two-dimensional orientation (pitch and roll). In the current work we analyze kinematic and acoustic variability between L1 and L2 vowels. We address the hypothesis that MAE is characterized by larger differences in the articulation of back vowels than front vowels and smaller vowel spaces compared to AE. The current results provide a seminal comparison of the kinematics and acoustics of vowel production between MAE and AE speakers

    Parallel Reference Speaker Weighting for Kinematic-Independent Acoustic-to-Articulatory Inversion

    Acoustic-to-articulatory inversion, the estimation of articulatory kinematics from an acoustic waveform, is a challenging but important problem. Accurate estimation of articulatory movements has the potential for significant impact on our understanding of speech production, on our capacity to assess and treat pathologies in a clinical setting, and on speech technologies such as computer aided pronunciation assessment and audio-video synthesis. However, because of the complex and speaker-specific relationship between articulation and acoustics, existing approaches for inversion do not generalize well across speakers. As acquiring speaker-specific kinematic data for training is not feasible in many practical applications, this remains an important and open problem. This paper proposes a novel approach to acoustic-to-articulatory inversion, Parallel Reference Speaker Weighting (PRSW), which requires no kinematic data for the target speaker and a small amount of acoustic adaptation data. PRSW hypothesizes that acoustic and kinematic similarities are correlated and uses speaker-adapted articulatory models derived from acoustically derived weights. The system was assessed using a 20-speaker data set of synchronous acoustic and Electromagnetic Articulography (EMA) kinematic data. Results demonstrate that by restricting the reference group to a subset consisting of speakers with strong individual speaker-dependent inversion performance, the PRSW method is able to attain kinematic-independent acoustic-to-articulatory inversion performance nearly matching that of the speaker-dependent model, with an average correlation of 0.62 versus 0.63. This indicates that given a sufficiently complete and appropriately selected reference speaker set for adaptation, it is possible to create effective articulatory models without kinematic training data
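
The abstract describes applying acoustically derived weights to reference speakers' articulatory models but does not state how the weights are estimated. The sketch below illustrates only the general idea. The distance-based softmax weighting, the per-speaker mean vectors, and all function names are assumptions introduced for this example and stand in for the paper's actual estimator.

```python
import math


def acoustic_weights(target_acoustic, reference_acoustics, temperature=1.0):
    """Weights over reference speakers from acoustic similarity.

    Assumption: a softmax over negative Euclidean distances, used here as a
    simple stand-in for a proper reference-speaker-weighting estimate.
    """
    dists = [math.dist(target_acoustic, ref) for ref in reference_acoustics]
    closest = min(dists)  # shift for numerical stability
    scores = [math.exp(-(d - closest) / temperature) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]


def weighted_articulatory_mean(weights, reference_articulatory):
    """Combine reference speakers' articulatory vectors with the given weights."""
    dim = len(reference_articulatory[0])
    return [sum(w * vec[i] for w, vec in zip(weights, reference_articulatory))
            for i in range(dim)]


# Toy data: two reference speakers, 2-D acoustic and articulatory vectors.
ref_acoustic = [[0.0, 0.0], [2.0, 0.0]]
ref_articulatory = [[10.0, 0.0], [20.0, 4.0]]
target_acoustic = [1.0, 0.0]  # adaptation data only; no kinematics for the target

weights = acoustic_weights(target_acoustic, ref_acoustic)
target_articulatory = weighted_articulatory_mean(weights, ref_articulatory)
```

The target speaker contributes only acoustic data, and the same weights are reused on the articulatory side. That reuse is the "parallel" part of the idea as the abstract states it.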

    The Electromagnetic Articulography Mandarin Accented English (EMA-MAE) Corpus of Acoustic and 3D Articulatory Kinematic Data

    There is a significant need for more comprehensive electromagnetic articulography (EMA) datasets that can provide matched acoustics and articulatory kinematic data with good spatial and temporal resolution. The Marquette University Electromagnetic Articulography Mandarin Accented English (EMA-MAE) corpus provides kinematic and acoustic data from 40 gender and dialect balanced speakers: 20 Midwestern standard American English L1 speakers and 20 Mandarin Accented English (MAE) L2 speakers, half with a Beijing region dialect and half with a Shanghai region dialect. Three-dimensional EMA data were collected at a 400 Hz sampling rate using the NDI Wave system, with articulatory sensors on the midsagittal lips, lower incisors, tongue blade and dorsum, plus lateral lip corner and tongue body. Sensors provide three-dimensional position data as well as two-dimensional orientation data representing the orientation of the sensor plane. Data have been corrected for head movement relative to a fixed reference sensor and also adjusted using a biteplate calibration system to place the data in an articulatory working space relative to each subject's individual midsagittal and maxillary occlusal planes. Speech materials include isolated words chosen to focus on specific contrasts between the English and Mandarin languages, as well as sentences and paragraphs for continuous speech, totaling approximately 45 minutes of data per subject. A beta version of the EMA-MAE corpus is now available, and the full corpus is in preparation for public release to help advance research in areas such as pronunciation modeling, acoustic-articulatory inversion, L1-L2 comparisons, pronunciation error detection, and accent modification training

    Palate-referenced Articulatory Features for Acoustic-to-Articulator Inversion

    The selection of effective articulatory features is an important component of tasks such as acoustic-to-articulator inversion and articulatory synthesis. Although it is common to use direct articulatory sensor measurements as feature variables, this approach fails to incorporate important physiological information such as palate height and shape and thus is not as representative of vocal tract cross section as desired. We introduce a set of articulator feature variables that are palate referenced and normalized with respect to the articulatory working space in order to improve the quality of the vocal tract representation. These features include normalized horizontal positions plus the normalized palatal height of two midsagittal tongue sensors and one lateral tongue sensor, as well as normalized lip separation and lip protrusion. The quality of the feature representation is evaluated subjectively by comparing the variances and vowel separation in the working space and quantitatively through measurement of acoustic-to-articulator inversion error. Results indicate that the palate-referenced features have reduced variance, increased separation between vowel spaces, and substantially lower inversion error than direct sensor measures

    Discrimination of Individual Tigers (Panthera tigris) from Long Distance Roars

    This paper investigates the extent of tiger (Panthera tigris) vocal individuality through both qualitative and quantitative approaches using long distance roars from six individual tigers at Omaha's Henry Doorly Zoo in Omaha, NE. The framework for comparison across individuals includes statistical and discriminant function analysis across whole vocalization measures and statistical pattern classification using a hidden Markov model (HMM) with frame-based spectral features composed of Greenwood frequency cepstral coefficients. Individual discrimination accuracy is evaluated as a function of spectral model complexity, represented by the number of mixtures in the underlying Gaussian mixture model (GMM), and temporal model complexity, represented by the number of sequential states in the HMM. Results indicate that the temporal pattern of the vocalization is the most significant factor in accurate discrimination. Overall baseline discrimination accuracy for this data set is about 70% using high level features without complex spectral or temporal models. Accuracy increases to about 80% when more complex spectral models (multiple mixture GMMs) are incorporated, and increases to a final accuracy of 90% when more detailed temporal models (10-state HMMs) are used. Classification accuracy is stable across a relatively wide range of configurations in terms of spectral and temporal model resolution

    The subcritical baroclinic instability in local accretion disc models

    (abridged) Aims: We present new results exhibiting a subcritical baroclinic instability (SBI) in local shearing box models. We describe the 2D and 3D behaviour of this instability using numerical simulations and we present a simple analytical model describing the underlying physical process. Results: Using local simulations, a subcritical baroclinic instability is observed in flows that are stable according to the Solberg-Høiland criterion. This instability is found to be a nonlinear (or subcritical) instability, which cannot be described by ordinary linear approaches. It requires a radial entropy gradient weakly unstable for the Schwarzschild criterion and a strong thermal diffusivity (or equivalently a short cooling time). In compressible simulations, the instability produces density waves which transport angular momentum outward with typically alpha<3e-3, the exact value depending on the background temperature profile. Finally, the instability survives in 3D, vortex cores becoming turbulent due to parametric instabilities. Conclusions: The subcritical baroclinic instability is a robust phenomenon, which can be captured using local simulations. The instability survives in 3D thanks to a balance between the 2D SBI and 3D parametric instabilities. Finally, this instability can lead to a weak outward transport of angular momentum, due to the generation of density waves by the vortices. Comment: 12 pages, 17 figures, Accepted in A&

    On the Fundamental Mass-Period Functions of Extrasolar Planets

    Employing a catalog of 175 extrasolar planets (exoplanets) detected by the Doppler-shift method, we constructed the independent and coupled mass-period functions. This is the first time in this field that the selection effect has been considered in the coupled mass-period functions. Our results are consistent with those in Tabachnik and Tremaine (2002), with the major differences being that we obtain a flatter mass function but a steeper period function. Moreover, our coupled mass-period functions show that about 2.5 percent of stars would have a planet with mass between Earth Mass and Neptune Mass, and about 3 percent of stars would have a planet with mass between Neptune Mass and Jupiter Mass. Comment: Accepted by ApJ Supplement Series in Nov. 2009, Acknowledgment added in Dec. 2009, a Reference-Based Catalog of Exoplanets can be obtained electronically from Appendix A of the latex file or from the authors for further studies