84 research outputs found

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies.

    Voicing quantification is more relevant than period perturbation in substitution voices: an advanced acoustical study

    The quality of substitution voicing, i.e., phonation with a voice that is not generated by the vibration of two vocal folds, cannot be adequately evaluated with routinely used software for acoustic voice analysis, which is aimed at ‘common’ dysphonias and nearly periodic voice signals. The AMPEX analysis program (Van Immerseel and Martens) has previously been shown to detect periodicity in irregular signals with background noise, and to be suited for running speech. The validity of this analysis program is first tested using realistic synthesized voice signals with known levels of cycle-to-cycle perturbations and additive noise. Second, an exhaustive acoustic analysis is performed on the voices of 116 patients surgically treated for advanced laryngeal cancer and recorded in seven European academic centers. All of them read out a short phonetically balanced passage. Patients were divided into six groups according to the oscillating structures they used to phonate. Results show that features related to the quantification of voicing enable a distinction between the different groups, while the features reporting F0 instability fail to do so. Acoustic evaluation of voice quality in substitution voices thus best relies upon voicing quantification.
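To make the contrast between the two feature families concrete, here is a minimal sketch. It is not the AMPEX program and does not reproduce the study's features: `voiced_fraction` and `local_jitter` are hypothetical helper names, the per-frame F0 track (with 0 marking unvoiced frames) is an assumed input format, and the jitter formula is the common "local jitter" definition (mean absolute difference of consecutive periods divided by the mean period).

```python
from typing import Sequence


def voiced_fraction(f0_track: Sequence[float]) -> float:
    """Share of analysis frames with a detected pitch (f0 > 0)."""
    if not f0_track:
        return 0.0
    return sum(1 for f0 in f0_track if f0 > 0) / len(f0_track)


def local_jitter(periods: Sequence[float]) -> float:
    """Mean absolute difference of consecutive periods, relative to the mean period."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_period = sum(periods) / len(periods)
    return (sum(diffs) / len(diffs)) / mean_period


# Hypothetical toy data, for illustration only.
f0_track = [0.0, 120.0, 118.0, 0.0]   # Hz per frame, 0 = unvoiced
periods = [9.0, 11.0, 9.0, 11.0]      # ms per glottal cycle

share_voiced = voiced_fraction(f0_track)
jitter = local_jitter(periods)
```

A voicing measure of this kind remains defined even when few cycles can be tracked, whereas a perturbation measure needs a sequence of reliably detected periods, which is one plausible reading of why the latter is harder to apply to highly irregular voices.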

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by Ente Cassa Risparmio di Firenze, COST Action 2103, the Biomedical Signal Processing and Control journal (Elsevier Eds.), and the IEEE Biomedical Engineering Soc. Special issues of international journals collecting selected papers from the conference have been, and will be, published.

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 4th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2005, held 29-31 October 2005 in Firenze, Italy. The workshop is organised every two years and aims to stimulate contacts between specialists active in research and industrial development in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies.

    Unveiling the impact of neuromotor disorders on speech: a structured approach combining biomechanical fundamentals and statistical machine learning

    Speech has been shown to convey clinically useful information in the study of Neurodegenerative Disorders (NDs), such as Parkinson’s Disease (PD). Traditionally, the use of speech as an exploratory tool in People with Parkinson’s (PwP) has focused on estimating acoustic characteristics and studying them at face value, analysing the physio-acoustical markers and using them as features to differentiate between Healthy Controls (HC) and PwP. The present work goes a step further. Given the intricate interoperation between neuromotor activity, which is responsible for both planning and driving the system, and the production of the acoustic speech signal, the study of speech allows this relationship to be properly exploited and analysed, providing a non-invasive method for the diagnosis, analysis, and observation of NDs. This work aims to introduce a working model that is capable of linking both domains and serves as a projection tool to provide insights about a speaker’s neuromotor state. It is based on a review of the neurophysiological background of the structure and function of the nervous system, and a review of the main nervous system dysfunctions involved in PD and other related neuromotor disorders. The role of the respiratory, phonatory, and articulatory systems in the production of voice and speech under normal and pathological circumstances is also reviewed. This setting might allow speech to be considered a useful trait within the precision medicine framework, as it provides a personal biometric marker that is innate and easy to elicit, can be recorded remotely with inexpensive equipment, and is non-invasive, cost-effective, and easy to process.
    The problem can be divided into two main tasks: first, a binary detection task that distinguishes healthy controls from individuals with NDs based on the projection model and phonatory estimates; second, a progression and tracking task that provides a set of quantitative indices enabling clinically interpretable scores. This study aims to define a set of features and models that help to characterise hypokinetic dysarthria (HD). These incorporate neuroscientific know-how, semantically and quantitatively, for use in clinical decision support tools that provide mechanistic insight into the processes involved in speech production. Building neuromotor considerations into the algorithms improves interpretability and, consequently, clinical decisions and diagnosis. An overview is provided of the acoustic signal processing algorithms used for inverting the speech articulation and phonation systems in neuromotor disorder assessment. An algorithmic methodology for model inversion and exploration is proposed for the functional characterisation and system inversion of each subsystem involved, under the neuro-biomechanical foundations described above. A description of vocal fold biomechanics using the glottal source, together with formant dynamics, provides the basis for a specific mapping to articulation kinematics. The statistical methods used in performance evaluation are based on three-way comparisons and on transversal and longitudinal assessment by classical hypothesis testing. Three related experimental studies empirically illustrate the potential of phonation and articulation analysis: the characterisation of PD from glottal biomechanics based on the amplitude distributions of the glottal flow; the use of vocal fold body stiffness to assess the efficiency of transcranial magnetic stimulation; and the description of PD dysarthria through an articulation projection model.
    The biomechanical analysis of phonation, using three-way comparisons and hierarchical clustering, showed that glottal source amplitude distributions from PD participants and healthy controls were essentially distinguishable from those of normative young participants, with the best accuracy scores, produced by SVM classifiers, reaching 94.8% (males) and 92.2% (females). Nevertheless, PD participants were barely separable from age-matched controls, possibly pointing to confounding factors due to age. Using vocal fold stiffness to assess the efficiency of transcranial magnetic stimulation gave mixed results: some PD participants showed clear improvements in phonation stability after stimulation, whereas others did not. Some sham controls also experienced minor improvements of unknown origin, possibly reflecting a placebo effect. The overall accuracy score for the efficiency of stimulation was 67% over the 18 cases studied. The results from articulation projection modelling showed that it is possible to formulate personalised models for PD and control participants that transform acoustic formant dynamics into articulation kinematics. This might open the possibility of characterising PD dysarthria from speech audio recordings.
    The most remarkable findings of the study include: the determination of the glottal source amplitude distribution behaviour of normative and PD participants; the impact of age effects in phonation as a confounding factor in neuromotor disorder characterisation; the importance of ensuring that the classification of speech dysarthria is based on principles that can be explained and interpreted; the need to take into account the effects of medication when framing new classification experiments; the potential of using EEG-band decomposition to analyse vocal fold stiffness correlates, as well as the possibility of using these descriptions in longitudinal monitoring of treatment efficiency; the feasibility of establishing a relationship between acoustic and kinematic variables by projection model inversion; and the potential of these descriptions for estimating neuromotor activity in the midbrain related to phonation and articulation. The most important outcome of the thesis is that the methodology used throughout the project follows a bottom-up approach based on speech model inversion at the acoustical, biomechanical, and neuromotor levels, which allows glottal signals, biomechanical correlates, and neuromotor activity to be estimated from speech alone and establishes a common neuromechanical characterisation framework in its own right.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial topics have grown and spread into other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.

    Glottal-synchronous speech processing

    Glottal-synchronous speech processing is a field of speech science in which the pseudoperiodicity of voiced speech is exploited. Traditionally, speech processing involves segmenting and processing short speech frames of predefined length; this may fail to exploit the inherent periodic structure of voiced speech, which glottal-synchronous speech frames have the potential to harness. Glottal-synchronous frames are often derived from the glottal closure instants (GCIs) and glottal opening instants (GOIs). The SIGMA algorithm was developed for the detection of GCIs and GOIs from the electroglottograph signal, with a measured accuracy of up to 99.59%. For GCI and GOI detection from speech signals, the YAGA algorithm provides a measured accuracy of up to 99.84%. Multichannel speech-based approaches are shown to be more robust to reverberation than single-channel algorithms. The GCIs are used in real-world applications including speech dereverberation, where SNR is improved by up to 5 dB, and prosodic manipulation, where the importance of voicing detection in glottal-synchronous algorithms is demonstrated by subjective testing. The GCIs are further exploited in a new area of data-driven speech modelling, providing new insights into speech production and a set of tools to aid deployment in real-world applications. The technique is shown to be applicable in the areas of speech coding, identification and artificial bandwidth extension of telephone speech.
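As a rough illustration of how frames can be derived from GCIs instead of a fixed frame length, the sketch below cuts one frame per interior GCI, running from the previous GCI to the next one. This is an assumed, simplified framing scheme, not SIGMA, YAGA, or the method of the thesis; the function name `glottal_synchronous_frames` and the toy data are hypothetical, and the GCIs are assumed to be given already as sorted sample indices.

```python
from typing import List, Sequence


def glottal_synchronous_frames(
    signal: Sequence[float], gcis: Sequence[int]
) -> List[List[float]]:
    """Cut one frame per interior GCI, from the previous GCI up to the next one.

    Each frame covers two consecutive glottal cycles, so its length follows
    the local pitch period instead of a predefined constant.
    """
    frames = []
    for prev_gci, next_gci in zip(gcis[:-2], gcis[2:]):
        frames.append(list(signal[prev_gci:next_gci]))
    return frames


# Hypothetical toy data: 20 samples with a GCI every 5 samples.
signal = [float(n) for n in range(20)]
gcis = [0, 5, 10, 15]

frames = glottal_synchronous_frames(signal, gcis)
frame_lengths = [len(frame) for frame in frames]
```

With irregularly spaced GCIs the frame lengths vary from frame to frame, which is the property that distinguishes this segmentation from fixed-length framing.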

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the newborn to the adult and elderly. Over the years the initial topics have grown and spread into other fields of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis.