
    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the biennial MAVEBA Workshop collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are: the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies.

    Reactions of adult listeners to infant speech-like vocalizations and cry


    A machine learning approach to infant distress calls and maternal behaviour of wild chimpanzees

    We are grateful to the Royal Zoological Society of Scotland for providing core funding to the Budongo Conservation Field Station. This research was supported by funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration (grant agreement no. 283871), a Fyssen Foundation post-doctoral fellowship awarded to GD, the Swiss National Science Foundation (PZ00P3_154741), and start-up funding from Taipei Medical University (108-6402-004-112) awarded to CDD.

    Distress calls are an acoustically variable group of vocalizations ubiquitous in mammals and other animals. Their presumed function is to recruit help, but there has been much debate on whether the nature of the disturbance can be inferred from the acoustics of distress calls. We used machine learning to analyse episodes of distress calls of wild infant chimpanzees. We extracted exemplars from those distress-call episodes and examined them in relation to the external event triggering them and the distance to the mother. In further steps, we tested whether the acoustic variants were associated with particular maternal responses. Our results suggest that, although infant chimpanzee distress calls are highly graded, they can convey information about discrete problems experienced by the infant and about distance to the mother, which in turn may help guide maternal parenting decisions. The extent to which mothers rely on acoustic cues alone (versus integrating other contextual-visual information) to decide upon intervening should be the focus of future research.
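The kind of pipeline described above — acoustic features extracted from call exemplars, then related to the triggering context — can be sketched in a minimal form. The features, the two hypothetical context labels, and the nearest-centroid classifier below are illustrative assumptions, not the study's actual method:

```python
import numpy as np

rng = np.random.default_rng(1)

def acoustic_features(waveform, sr=16000):
    """Crude exemplar features: RMS energy and spectral centroid.
    Real bioacoustic studies use much richer feature sets."""
    spectrum = np.abs(np.fft.rfft(waveform))
    freqs = np.fft.rfftfreq(waveform.size, 1 / sr)
    centroid = (freqs * spectrum).sum() / spectrum.sum()
    rms = np.sqrt(np.mean(waveform ** 2))
    return np.array([rms, centroid])

def fake_call(f0, amp):
    """Toy stand-in for a call exemplar: a noisy tone whose pitch and
    loudness depend on the (hypothetical) triggering context."""
    t = np.arange(8000) / 16000
    return amp * np.sin(2 * np.pi * f0 * t) + 0.01 * rng.standard_normal(8000)

# Simulated labelled exemplars from two hypothetical contexts.
X = np.array([acoustic_features(fake_call(400, 0.3)) for _ in range(20)]
             + [acoustic_features(fake_call(800, 0.8)) for _ in range(20)])
y = np.array([0] * 20 + [1] * 20)

# Nearest-centroid classifier: assign a new exemplar to the context
# whose mean feature vector is closest, after z-scoring the features.
mu, sigma = X.mean(axis=0), X.std(axis=0)
Xz = (X - mu) / sigma
centroids = np.array([Xz[y == c].mean(axis=0) for c in (0, 1)])

def predict(waveform):
    fz = (acoustic_features(waveform) - mu) / sigma
    return int(np.argmin(np.linalg.norm(centroids - fz, axis=1)))
```

Relating the predicted context back to observed maternal responses would then be a separate association analysis on the classifier's outputs.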

    Automatic Recognition of Non-Verbal Acoustic Communication Events With Neural Networks

    Non-verbal acoustic communication is of high importance to humans and animals: infants use the voice as a primary communication tool, and animals of all kinds employ acoustic communication, such as chimpanzees, which use pant-hoot vocalizations for long-distance communication. Many applications require the assessment of such communication for a variety of analysis goals. Computational systems can support these areas by automating the assessment process. This is of particular importance in monitoring scenarios over large spatial and temporal scales, which are infeasible to cover manually. Algorithms for sound recognition have traditionally been based on conventional machine learning approaches. In recent years, so-called representation learning approaches have gained popularity, in particular deep learning approaches that feed raw data to deep neural networks. However, open challenges remain in applying these approaches to the automatic recognition of non-verbal acoustic communication events, such as compensating for small data set sizes. The leading question of this thesis is: how can we apply deep learning more effectively to the automatic recognition of non-verbal acoustic communication events? The target communication types were (1) infant vocalizations and (2) chimpanzee long-distance calls. This thesis comprises four studies that investigated aspects of this question.

    Study (A) investigated the assessment of infant vocalizations by laypersons. The central goal was to derive an infant vocalization classification scheme based on the laypersons' perception. The study method was based on the Nijmegen Protocol, in which participants rated vocalization recordings on various items, such as affective ratings and class labels. Results showed a strong association between valence ratings and class labels, which was used to derive a classification scheme.

    Study (B) was a comparative study of various neural network types for the automatic classification of infant vocalizations. The goal was to determine the best-performing network type among the currently most prevalent ones, while considering the influence of their architectural configuration. Results showed that convolutional neural networks outperformed recurrent neural networks and that the choice of the frequency and time aggregation layer inside the network is the most important architectural choice.

    Study (C) was a detailed investigation of computer-vision-like convolutional neural networks for infant vocalization classification. The goal was to determine the most important architectural properties for increasing classification performance. Results confirmed the importance of the aggregation layer and additionally identified the input size of the fully connected layers and the accumulated receptive field as being of major importance.

    Study (D) was an investigation of compensating class imbalance for chimpanzee call detection in naturalistic long-term recordings. The goal was to determine which compensation method among a selected group improved performance the most for a deep learning system. Results showed that spectrogram denoising was most effective, while methods for compensating relative imbalance either retained or decreased performance.

    Contents: 1. Introduction; 2. Foundations in Automatic Recognition of Acoustic Communication; 3. State of Research; 4. Study (A): Investigation of the Assessment of Infant Vocalizations by Laypersons; 5. Study (B): Comparison of Neural Network Types for Automatic Classification of Infant Vocalizations; 6. Study (C): Detailed Investigation of CNNs for Automatic Classification of Infant Vocalizations; 7. Study (D): Compensating Class Imbalance for Acoustic Chimpanzee Detection With Convolutional Recurrent Neural Networks; 8. Conclusion and Collected Discussion; 9. Appendix
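The frequency and time aggregation layer highlighted as the key architectural choice in Studies (B) and (C) can be illustrated with a minimal numpy sketch. The tensor shapes and the specific pooling operators below are illustrative assumptions, not the thesis's exact architecture:

```python
import numpy as np

# Toy convolutional feature map for one audio clip:
# (channels, frequency bins, time frames), as produced by conv layers
# applied to a log-mel spectrogram. Values are random placeholders.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((32, 40, 200))

# Frequency-time aggregation: collapse the frequency and time axes so
# the classifier sees a fixed-size vector regardless of clip length.
# Two common choices are global average and global max pooling.
avg_pooled = feature_map.mean(axis=(1, 2))   # shape: (32,)
max_pooled = feature_map.max(axis=(1, 2))    # shape: (32,)

# The fixed-size embedding can then feed fully connected layers.
embedding = np.concatenate([avg_pooled, max_pooled])  # shape: (64,)
```

Because everything after this layer sees only the pooled vector, the choice of aggregation decides which spectro-temporal information survives into the classifier, which is one plausible reading of why the studies found it so influential.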

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the biennial MAVEBA Workshop collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are: the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies. The Workshop has the sponsorship of: Ente Cassa Risparmio di Firenze, COST Action 2103, the Biomedical Signal Processing and Control Journal (Elsevier), and the IEEE Biomedical Engineering Soc. Special issues of international journals have been, and will be, published collecting selected papers from the conference.

    Models and analysis of vocal emissions for biomedical applications

    This book of proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003 in Firenze, Italy. The workshop is organised every two years and aims to stimulate contacts between specialists active in research and industrial development in the area of voice analysis for biomedical applications. The scope of the workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies.

    Sex stereotypes influence adults' perception of babies' cries

    Background: Despite widespread evidence that gender stereotypes influence human parental behavior, their potential effects on adults’ perception of babies’ cries have been overlooked. In particular, whether adult listeners overgeneralize the sex dimorphism that characterizes the voice of adult speakers (men are lower-pitched than women) to their perception of babies’ cries has not been investigated. Methods: We used playback experiments combining natural and re-synthesised cries of 3-month-old babies to investigate whether interindividual variation in the fundamental frequency (pitch) of cries affected adult listeners’ identification of the baby’s sex and their perception of the baby’s femininity and masculinity, and whether these biases interacted with their perception of the level of discomfort expressed by the cry. Results: We show that low-pitched cries are more likely to be attributed to boys and high-pitched cries to girls, despite the absence of sex differences in pitch. Moreover, boys with low-pitched cries are perceived as more masculine and girls with high-pitched cries as more feminine. Finally, adult men rate relatively low-pitched cries as expressing more discomfort when presented as belonging to boys than to girls. Conclusion: Such biases in caregivers’ responses to babies’ cries may have implications for children’s immediate welfare and for the development of their gender identity.
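The pitch manipulation at the heart of such playback designs can be sketched in a simplified form. Real studies resynthesize recorded cries while preserving their other properties; the harmonic toy signal, F0 values, and parameters below are illustrative assumptions only:

```python
import numpy as np

def synth_cry(f0_hz, duration_s=0.5, sr=16000, n_harmonics=10):
    """Generate a crude harmonic 'cry-like' tone at a chosen fundamental
    frequency (F0), with a 1/k harmonic roll-off and a Hanning envelope.
    Only the pitch parameter matters for this illustration."""
    t = np.arange(int(duration_s * sr)) / sr
    signal = sum(np.sin(2 * np.pi * k * f0_hz * t) / k
                 for k in range(1, n_harmonics + 1))
    envelope = np.hanning(t.size)
    signal = signal * envelope
    return (signal / np.abs(signal).max()).astype(np.float32)

# Generate a low- and a high-pitched variant of the same stimulus,
# mimicking the F0 variation presented to listeners.
low = synth_cry(350.0)
high = synth_cry(550.0)
```

Pairs of stimuli like `low` and `high` would then be presented with counterbalanced sex labels, so that any difference in listener ratings can be attributed to pitch or to the label rather than to the recording itself.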

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 out of the keenly felt need to share know-how, objectives and results among areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial topics have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Magel2 and Hypothalamic POMC Neuron Modulation of Infant Mice Isolation-Induced Vocalizations

    The proper development of infant mammals depends on infant vocalization. Infants vocalize (i.e., cry) when isolated from their caregivers, attracting their attention to receive nurture. Impaired vocal behavior can lead to maternal neglect and even death in some species. Similar to humans and other mammals, infant mice vocalize upon isolation from their nest and decrease vocalizations when reunited with their mother or littermates. Mouse pups vocalize above the human audible range, emitting ultrasonic vocalizations (USVs). My thesis investigated the effects of the imprinted gene Magel2 on mouse vocal behavior (Chapter 2; published in Genes, Brain, and Behavior) and also identified a population of neurons in the hypothalamus that modulates vocal behavior (Chapter 3; unpublished). Magel2 (MAGEL2 in humans) is a paternally imprinted gene, and its loss of function is associated with atypical behaviors seen in autism spectrum disorders and in Prader-Willi Syndrome. In Chapter 2, I report the study of the emission of ultrasonic vocalizations by Magel2-deficient pups during their early postnatal development. I recorded and analyzed vocalizations from Magel2-deficient pups and their wildtype littermates during isolation from the home nest at postnatal days 6-12. My findings show that Magel2-deficient pups present a lower rate of vocalizations and an altered vocal repertoire compared to wildtype littermates. Moreover, these results correlate with altered behavior of the dam towards her own pups: dams prefer to retrieve their wildtype offspring over their Magel2-deficient offspring. These results suggest that Magel2 affects the expression of infant vocalizations and also modulates the expression of maternal behaviors.

    In Chapter 3, I describe my discovery of a population of neurons in the mammalian hypothalamus that modulates the emission of ultrasonic vocalizations in mouse pups. The brain opioid theory of social attachment postulates that pups release opioids in the brain during caretaking behaviors, which reinforces the attachment bond between pups and caretakers. Of the three main receptors known to bind the different types of endogenous opioids, μ-opioid receptors (ORPM1) are thought to be important in the modulation of attachment behaviors and, consequently, the emission of vocalizations. Whether endogenous opioids act on ORPM1-expressing cells to modulate vocalizations is unknown. Since the opioid with the highest affinity for ORPM1 is β-endorphin, I determined the contribution of the neurons that produce β-endorphin, POMC neurons, to infant vocalizations. Using genetic, chemogenetic, and pharmacological approaches, my results show that mice deficient for β-endorphin vocalize more than controls, an effect that is mimicked by naloxone, a pharmacological blocker of opioid receptors. Importantly, naloxone fails to increase vocalizations in β-endorphin-deficient pups. Moreover, chemogenetic activation of POMC neurons in the hypothalamus suppresses the emission of vocalizations, while ablation of these neurons increases the number of vocalizations. Finally, I show that activation of POMC neurons in Orpm1-deficient mice does not suppress the emission of vocalizations. Together, the results in Chapter 3 suggest that the emission of infant vocalizations is modulated by POMC neurons in the hypothalamus via the release of β-endorphin signalling through downstream μ-opioid receptors.

    In sum, this dissertation reports novel findings on the effects of the Magel2 gene and of hypothalamic POMC neurons on the modulation of infant vocalization. As we learn more about the physiological and neuronal responses to distress in infants, we will more accurately understand the mechanisms involved in the affective emotional states that contribute to the normal and pathological development of infants.