130 research outputs found

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from a strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread into other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.

    On the quality of synthetic speech : evaluation and improvements


    Exploring the contribution of voice quality to the perception of gender in Scottish English

    This study investigates how voice quality, here phonation, affects listener perception of speaker gender, and how voice quality interacts with pitch, a major cue to speaker gender, when cueing gender perceptions. Gender differences in voice quality have been identified in both Scottish (Beck and Schaeffler 2015; Stuart-Smith 1999) and American English (Abdelli-Beruh et al. 2014; D. Klatt and L. Klatt 1990; Podesva 2013; Syrdal 1996; Wolk et al. 2012; Yuasa 2010). There is evidence from previous research suggesting that gender differences in voice quality may also influence listener perception of speaker gender, with breathy voice being perceived by listeners as feminine or characteristic of female speakers (Addington 1968; Andrews and Schmidt 1997; Bishop and Keating 2012; Holmberg et al. 2010; Porter 2012; Skuk and Schweinberger 2014; Van Borsel et al. 2009) and creaky voice being perceived as characteristically masculine (Greer 2015; Lee 2016). However, some studies have found that voice quality has little effect (Booz and Ferguson 2016; King et al. 2012; Owen and Hancock 2010). The present study seeks to investigate the contribution of voice quality, taking into account the various methods of producing voice quality differences in stimuli, cultural differences in gendered meanings of voice quality, and different methods of quantifying 'perceived gender', which may contribute to the conflicting results of previous studies. To investigate the contribution of voice quality to perceptions of speaker gender, a perception experiment was carried out in which 32 Scottish listeners and 40 North American listeners heard stimuli with different voice qualities (modal, breathy, creaky) and at different pitch levels (120 Hz, 165 Hz, 210 Hz), and were asked to make judgements about the gender of the speaker.
    Differences in voice quality were produced by a speaker with the ability to create voice quality distinctions, as well as created through copy synthesis from the speaker's voice. Listeners were asked to indicate whether they thought the voice belonged to a man or a woman and to rate how masculine and feminine the voice sounded. Relative to modal voice, I predicted that listeners would be more likely to categorise breathy voices as women, and would rate them as more feminine and less masculine, and that listeners would be less likely to categorise creaky voices as women, and would rate them as more masculine and less feminine. I also predicted that there might be differences in how Scottish listeners and North American listeners perceived voice quality, given that the gender differences in voice quality in these two varieties of English have been found to differ in previous research. Consistent with my predictions, I found that relative to modal voice, listeners were more likely to categorise breathy voice stimuli as women, and rated breathy voice stimuli as more feminine and less masculine. However, in contrast with my predictions, I found that relative to modal voice, listeners were more likely to categorise creaky voice stimuli as women, and rated them as less masculine, but not more feminine. Furthermore, contrary to predictions, I did not identify differences between Scottish and North American listeners in terms of voice quality perception. Differences were also found in how breathy and creaky voice influence gender perception at different pitch levels. Overall, these results show that voice quality has an important influence on listener perception of speaker gender, and that the gendered meanings of creaky voice are changing and have disassociated from its low pitch. Future research should consider whether this evaluation among Scottish listeners may reflect a wider change in the gender differences in production

    Automatic voice disorder recognition using acoustic amplitude modulation features

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 114-117). An automatic dysphonia recognition system is designed that exploits amplitude modulations (AM) in voice using biologically-inspired models. This system recognizes general dysphonia and four subclasses: hyperfunction, A-P squeezing, paralysis, and vocal fold lesions. The models developed represent processing in the auditory system at the level of the cochlea, auditory nerve, and inferior colliculus. Recognition experiments using dysphonic sentence data obtained from the Kay Elemetrics Disordered Voice Database suggest that our system provides complementary information to state-of-the-art mel-cepstral features. A model for analyzing AM in dysphonic speech is also developed from a traditional communications engineering perspective. Through a case study of seven disordered voices, we show that different AM patterns occur in different frequency bands. This perspective challenges current dysphonia analysis methods that analyze AM in the time-domain signal. By Nicolas Malyska. S.M.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from a strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the newborn to the adult and elderly. Over the years the initial issues have grown and spread into other fields of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis.

    The Sound Manifesto

    Computing practice today depends on visual output to drive almost all user interaction. Other senses, such as audition, may be totally neglected, or used tangentially, or used in highly restricted specialized ways. We have excellent audio rendering through D-A conversion, but we lack rich general facilities for modeling and manipulating sound comparable in quality and flexibility to graphics. We need co-ordinated research in several disciplines to improve the use of sound as an interactive information channel. Incremental and separate improvements in synthesis, analysis, speech processing, audiology, acoustics, music, etc. will not alone produce the radical progress that we seek in sonic practice. We also need to create a new central topic of study in digital audio research. The new topic will assimilate the contributions of different disciplines on a common foundation. The key central concept that we lack is sound as a general-purpose information channel. We must investigate the structure of this information channel, which is driven by the co-operative development of auditory perception and physical sound production. Particular audible encodings, such as speech and music, illuminate sonic information by example, but they are no more sufficient for a characterization than typography is sufficient for a characterization of visual information. Comment: To appear in the conference on Critical Technologies for the Future of Computing, part of SPIE's International Symposium on Optical Science and Technology, 30 July to 4 August 2000, San Diego, C

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images as a support to clinical diagnosis and classification of vocal pathologies.

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003 in Firenze, Italy. The workshop is organised every two years and aims to stimulate contacts between specialists active in research and industrial developments in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies.