25 research outputs found

    Thickening of galactic disks through clustered star formation

    (Abridged) The building blocks of galaxies are star clusters. These form with low star-formation efficiencies and consequently lose a large fraction of their stars, which expand outwards once the residual gas is expelled by the action of the massive stars. Massive star clusters may thus add kinematically hot components to galactic field populations. This kinematical imprint on the stellar distribution function is estimated here by calculating the velocity distribution function for ensembles of star clusters distributed according to power-law or log-normal initial cluster mass functions (ICMFs). The resulting stellar velocity distribution function is non-Gaussian and may be interpreted as being composed of multiple kinematical sub-populations. The notion that the formation of star clusters may add hot kinematical components to a galaxy is applied to the age--velocity-dispersion relation of the Milky Way disk to study the implied history of clustered star formation, with an emphasis on the possible origin of the thick disk. Comment: MNRAS, accepted, 27 pages, 9 figures.
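    The calculation described above can be illustrated with a toy version: draw cluster masses from a power-law ICMF, give each cluster a Gaussian velocity kernel whose dispersion grows with cluster mass, and inspect the shape of the mass-weighted composite. This is a minimal sketch, not the paper's model; the sigma-mass scaling, the mass limits, and the slope are assumptions chosen only to show why the summed distribution comes out non-Gaussian (leptokurtic).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_powerlaw_masses(n, m_min=1e2, m_max=1e6, beta=2.0):
    """Draw cluster masses from dN/dM ~ M**-beta by inverse-transform sampling."""
    u = rng.uniform(size=n)
    a = 1.0 - beta
    return (m_min**a + u * (m_max**a - m_min**a)) ** (1.0 / a)

# Assumed scaling: the velocity dispersion of a cluster grows as sqrt(mass),
# normalised to 1 km/s at 10^4 solar masses (illustrative only).
masses = sample_powerlaw_masses(10_000)
sigmas = 1.0 * np.sqrt(masses / 1e4)

# Mass-weighted mixture of Gaussians: more massive clusters release more stars.
weights = masses / masses.sum()
idx = rng.choice(masses.size, size=200_000, p=weights)
velocities = rng.normal(0.0, sigmas[idx])

# Positive excess kurtosis flags the non-Gaussian, heavy-tailed composite shape.
v = velocities - velocities.mean()
excess_kurtosis = np.mean(v**4) / np.mean(v**2) ** 2 - 3.0
print(f"dispersion = {velocities.std():.2f} km/s, excess kurtosis = {excess_kurtosis:.2f}")
```

    Changing the ICMF slope or upper mass limit changes the relative weight of the broad, high-dispersion components, which is what ties the shape of the composite velocity distribution back to the cluster mass function.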

    Kalman tracking of linear predictor and harmonic noise models for noisy speech enhancement

    This paper presents a speech enhancement method based on the tracking and denoising of the formants of a linear prediction (LP) model of the spectral envelope of speech and the parameters of a harmonic noise model (HNM) of its excitation. The main advantages of tracking and denoising the prominent energy contours of speech are the efficient use of the spectral and temporal structures of successive speech frames and the mitigation of the processing artefacts known as ‘musical noise’ or ‘musical tones’. The formant-tracking linear prediction (FTLP) model estimation consists of three stages: (a) speech pre-cleaning based on spectral amplitude estimation, (b) formant tracking across successive speech frames using the Viterbi method, and (c) Kalman filtering of the formant trajectories across successive speech frames. The HNM parameters for the excitation signal comprise the voiced/unvoiced decision, the fundamental frequency, the harmonics’ amplitudes, and the variance of the noise component of the excitation. A frequency-domain pitch extraction method is proposed that searches for the peak signal-to-noise ratios (SNRs) at the harmonics. For each speech frame, several pitch candidates are calculated, and an estimate of the pitch trajectory across successive frames is obtained using a Viterbi decoder. The trajectories of the noisy excitation harmonics across successive speech frames are modelled and denoised using Kalman filters. The proposed method is used to deconstruct noisy speech, denoise its model parameters, and then reconstitute speech from its cleaned parts. Experimental evaluations show the performance gains of the formant tracking, pitch extraction, and noise reduction stages.
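    As a concrete illustration of stage (c), the sketch below runs a textbook Kalman filter over a single noisy formant-frequency track using a constant-velocity state model. The frame period and the process- and measurement-noise values are assumptions made for illustration, not the settings used in the paper.

```python
import numpy as np

def kalman_track(measured_hz, dt=0.01, q=2e3, r=50.0**2):
    """Filter noisy per-frame formant frequency estimates (Hz)."""
    F = np.array([[1.0, dt], [0.0, 1.0]])        # state transition: [freq, freq rate]
    H = np.array([[1.0, 0.0]])                   # only the frequency is observed
    Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                      [dt**2 / 2, dt]])          # process noise covariance
    R = np.array([[r]])                          # measurement noise variance
    x = np.array([measured_hz[0], 0.0])
    P = np.diag([r, 1e4])
    smoothed = []
    for z in measured_hz:
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the new frame's measurement
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.array([z]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        smoothed.append(x[0])
    return np.array(smoothed)

# Example: a slowly rising second-formant track corrupted by measurement noise.
rng = np.random.default_rng(1)
true_f2 = 1500.0 + 5.0 * np.arange(100)          # Hz, one value per frame
noisy_f2 = true_f2 + rng.normal(0.0, 80.0, size=100)
print(np.abs(kalman_track(noisy_f2) - true_f2).mean())   # mean error after filtering
```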

    Body-composition reference data for simple and reference techniques and a 4-component model: A new UK reference child

    Background: A routine pediatric clinical assessment of body composition is increasingly recommended but has long been hampered by the following 2 factors: a lack of appropriate techniques and a lack of reference data with which to interpret individual measurements. Several techniques have become available, but reference data are needed. Objective: We aimed to provide body-composition reference data for use in clinical practice and research. Design: Body composition was measured by using a gold standard 4-component model, along with various widely used reference and bedside methods, in a large, representative sample of British children aged 4 to ≥20 y. Measurements were made of anthropometric variables (weight, height, 4 skinfold thicknesses, and waist girth), dual-energy X-ray absorptiometry, body density, bioelectrical impedance, and total body water, and 4-component fat and fat-free masses were calculated. Reference charts and SD scores (SDSs) were constructed for each outcome by using the lambda-mu-sigma (LMS) method. The same outcomes were generated for the fat-free mass index and fat mass index. Results: Body-composition growth charts and SDSs for ages 5-20 y were based on a final sample of 533 individuals. Correlations between SDSs by using different techniques were ≥0.68 for adiposity outcomes and ≥0.80 for fat-free mass outcomes. Conclusions: These comprehensive reference data for pediatric body composition can be used across a variety of techniques. Together with advances in measurement technologies, the data should greatly enhance the ability of clinicians to assess and monitor body composition in routine clinical practice and should facilitate the use of body-composition measurements in research studies. © 2012 American Society for Nutrition.
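    For readers applying the charts, an SD score is obtained from a measurement via Cole's LMS transformation. A minimal sketch follows; the L, M, and S values shown are hypothetical placeholders, not values from the published reference data.

```python
import numpy as np

def lms_sds(x, L, M, S):
    """Cole's LMS transformation: z = ((x/M)**L - 1) / (L*S), or ln(x/M)/S when L == 0."""
    if np.isclose(L, 0.0):
        return np.log(x / M) / S
    return ((x / M) ** L - 1.0) / (L * S)

# e.g. a fat mass index of 5.2 kg/m^2 against hypothetical L, M, S for one age/sex group:
print(lms_sds(5.2, L=-1.3, M=4.0, S=0.35))
```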

    Robust Acoustic Speech Feature Prediction from Noisy Mel-Frequency Cepstral Coefficients

    This paper examines the effect of applying noise compensation to acoustic speech feature prediction from noisy mel-frequency cepstral coefficient (MFCC) vectors within a distributed speech recognition architecture. An acoustic speech feature vector (comprising fundamental frequency, formant frequencies, speech/nonspeech classification, and voicing classification) is predicted from an MFCC vector in a maximum a posteriori (MAP) framework using phoneme-specific or global models of speech. The effect of noise is considered, and three different noise compensation methods that have been successful in robust speech recognition are integrated within the MAP framework. Experiments show that noise compensation can be applied successfully to prediction, with the best performance given by a model adaptation method that performs only slightly worse than matched training and testing. Further experiments consider application of the predicted acoustic features to speech reconstruction. A series of human listening tests shows that the predicted features are sufficient for speech reconstruction and that noise compensation improves speech quality in noisy conditions.
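    Purely as an illustration of front-end compensation applied directly to MFCC vectors, the sketch below performs a spectral-subtraction-style clean-up in the mel-energy domain. It is not necessarily one of the three methods evaluated in the paper, and the filterbank size, over-subtraction factor, floor, and noise estimate are all assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

def compensate_mfcc(noisy_mfcc, noise_mfcc, n_mel=23, alpha=1.0, floor=0.05):
    """Map one noisy MFCC frame back to linear mel energies, subtract an
    estimate of the noise energies, floor the result, and re-encode as MFCCs."""
    log_mel = idct(noisy_mfcc, n=n_mel, norm='ortho')        # approx. log mel energies
    noise_log_mel = idct(noise_mfcc, n=n_mel, norm='ortho')  # noise estimate, same domain
    energies = np.exp(log_mel) - alpha * np.exp(noise_log_mel)
    energies = np.maximum(energies, floor * np.exp(log_mel))  # spectral floor
    return dct(np.log(energies), norm='ortho')[: len(noisy_mfcc)]

# Example with random stand-in vectors (13 coefficients per frame).
rng = np.random.default_rng(2)
print(compensate_mfcc(rng.normal(size=13), rng.normal(scale=0.1, size=13)))
```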

    Robust acoustic speech feature prediction from Mel frequency cepstral coefficients

    EThOS - Electronic Theses Online Service, GB, United Kingdom.

    HMM-based MAP Prediction of Voiced and Unvoiced Formant Frequencies from Noisy MFCC Vectors

    This paper describes how formant frequencies of voiced and unvoiced speech can be predicted from mel-frequency cepstral coefficient (MFCC) vectors using maximum a posteriori (MAP) estimation within a hidden Markov model (HMM) framework. Gaussian mixture models (GMMs) are used to model the local joint density of MFCCs and formant frequencies. More localised prediction is achieved by modelling speech using voiced, unvoiced, and non-speech GMMs for every state of each model in a set of HMMs. To predict formant frequencies from an MFCC vector, a prediction of the speech class (voiced, unvoiced, or non-speech) is made first. Formant frequencies are then predicted from voiced and unvoiced speech using MAP estimation with the state-specific GMMs. This 'HMM-GMM' prediction of speech class and formant frequencies was evaluated on a 5000-word unconstrained large-vocabulary speaker-independent database of male speech.
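    A minimal sketch of the class-decision step in isolation (without the HMM framework described above): each class is modelled by its own GMM over MFCC vectors and the most likely class is selected. The synthetic training data, mixture sizes, and equal class priors are assumptions made for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Stand-in "MFCC" data for each class; real systems would train on labelled frames.
train = {"voiced": rng.normal(0.0, 1.0, (500, 13)),
         "unvoiced": rng.normal(2.0, 1.0, (500, 13)),
         "nonspeech": rng.normal(-2.0, 1.0, (500, 13))}
gmms = {c: GaussianMixture(n_components=4, covariance_type="diag",
                           random_state=0).fit(x) for c, x in train.items()}

def predict_class(mfcc_vec, priors=None):
    """Return the most likely speech class for one 13-dimensional MFCC frame."""
    priors = priors or {c: 1.0 / len(gmms) for c in gmms}
    return max(gmms, key=lambda c: gmms[c].score_samples(mfcc_vec[None])[0]
                                   + np.log(priors[c]))

print(predict_class(rng.normal(0.0, 1.0, 13)))   # most likely "voiced" for this draw
```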

    Reconstructing clean speech from noisy MFCC vectors

    The aim of this work is to reconstruct clean speech solely from a stream of noise-contaminated MFCC vectors, as may be encountered in distributed speech recognition systems. Speech reconstruction is performed using the ETSI Aurora back-end speech reconstruction standard, which requires MFCC vectors, fundamental frequency, and voicing information. In this work, fundamental frequency and voicing are obtained using maximum a posteriori prediction from the input MFCC vectors, thereby allowing speech reconstruction solely from a stream of MFCC vectors. Two different methods to improve prediction accuracy in noisy conditions are then developed. Experimental results first establish that improved fundamental frequency and voicing prediction is obtained when noise compensation is applied. A series of human listening tests is then used to analyse the reconstructed speech quality and to determine the effectiveness of noise compensation in terms of mean opinion scores.

    Formant prediction from MFCC vectors

    This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficient (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCC vectors and formant vectors using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method predicts formants from the cluster closest, in some sense, to the input MFCC vector, while the second method takes a weighted contribution of formants from all clusters. Experimental results are presented using the ETSI Aurora connected-digit database and show that the predicted formant frequency is within 3.25% of the reference formant frequency, as measured from hand-corrected formant tracks.
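    The two predictors can be made concrete with a joint GMM over stacked [MFCC, formant] vectors: the conditional mean of the formant block is computed per mixture component (for a Gaussian conditional this mean equals its mode, i.e. the per-component MAP estimate), and the result is taken either from the single most responsible component ('closest cluster') or as a responsibility-weighted sum over all components (an MMSE-style variant of the weighted method). The sketch below fits such a model to synthetic data; the data, dimensionalities, and mixture size are assumptions, not the Aurora setup.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
D_X, D_Y = 13, 3                                   # MFCC dimension, number of formants
x_train = rng.normal(size=(2000, D_X))             # stand-in MFCC vectors
y_train = x_train[:, :D_Y] * 200.0 + 1000.0 + rng.normal(scale=30.0, size=(2000, D_Y))
gmm = GaussianMixture(n_components=8, covariance_type="full",
                      random_state=0).fit(np.hstack([x_train, y_train]))

def predict_formants(x, weighted=True):
    """Predict the formant block from one MFCC vector using the joint GMM."""
    post = np.empty(gmm.n_components)
    cond_means = np.empty((gmm.n_components, D_Y))
    for k in range(gmm.n_components):
        mu_x, mu_y = gmm.means_[k, :D_X], gmm.means_[k, D_X:]
        S = gmm.covariances_[k]
        S_xx, S_yx = S[:D_X, :D_X], S[D_X:, :D_X]
        # Responsibility of component k given x (marginal over the MFCC block).
        post[k] = gmm.weights_[k] * multivariate_normal.pdf(x, mu_x, S_xx)
        # Conditional mean of the formant block given x for component k.
        cond_means[k] = mu_y + S_yx @ np.linalg.solve(S_xx, x - mu_x)
    post /= post.sum()
    if weighted:
        return post @ cond_means                   # contribution from all clusters
    return cond_means[np.argmax(post)]             # most responsible cluster only

x = rng.normal(size=D_X)
print(predict_formants(x, weighted=True), predict_formants(x, weighted=False))
```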

    MAP Prediction of Formant Frequencies and Voicing Class from MFCC Vectors in Noise

    Novel methods are presented for predicting formant frequencies and voicing class from mel-frequency cepstral coefficients (MFCCs). It is shown how Gaussian mixture models (GMMs) can be used to model the relationship between formant frequencies and MFCCs. Using such models and an input MFCC vector, a maximum a posteriori (MAP) prediction of formant frequencies can be made. The specific relationship each speech sound has between MFCCs and formant frequencies is exploited by using state-specific GMMs within a framework of a set of hidden Markov models (HMMs). Formant prediction accuracy and voicing prediction of speaker-independent male speech are evaluated on both a constrained-vocabulary connected-digits database and a large-vocabulary database. Experimental results show that for HMM-GMM prediction on the connected-digits database, the voicing class prediction error is less than 3.5%. Fewer than 1.8% of frames have formant frequency percentage errors greater than 20%, and the mean percentage error of the remaining frames is less than 3.7%. Further experiments show prediction accuracy under noisy conditions. For example, at a signal-to-noise ratio (SNR) of 0 dB, the voicing class prediction error increases to 9.4%, fewer than 4.3% of frames have formant frequency percentage errors over 20%, and the formant frequency percentage error for the remaining frames is less than 5.7%.
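    The error measures quoted above (voicing class error, the fraction of frames whose formant error exceeds 20%, and the mean percentage error of the remaining frames) can be computed as in the short sketch below; the arrays are placeholder values, not results from the paper.

```python
import numpy as np

def formant_error_stats(pred_hz, ref_hz, gross_threshold=0.20):
    """Return (fraction of gross-error frames, mean % error of the remaining frames)."""
    pct_err = np.abs(pred_hz - ref_hz) / ref_hz
    gross = pct_err > gross_threshold
    return gross.mean(), 100.0 * pct_err[~gross].mean()

def voicing_error(pred_class, ref_class):
    """Fraction of frames whose predicted voicing class differs from the reference."""
    return np.mean(np.asarray(pred_class) != np.asarray(ref_class))

ref = np.array([520.0, 1480.0, 2510.0, 650.0])
pred = np.array([505.0, 1900.0, 2490.0, 660.0])
gross_rate, mean_pct = formant_error_stats(pred, ref)
print(f"gross errors: {gross_rate:.0%}, mean error of remaining frames: {mean_pct:.1f}%")
print(f"voicing error: {voicing_error(['v', 'uv', 'v'], ['v', 'v', 'v']):.0%}")
```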