4 research outputs found

    A study of tones and tempo in continuous Mandarin digit strings and their application in telephone quality speech recognition

    Prosodic cues (namely, fundamental frequency, energy and duration) provide important information for speech. For a tonal language such as Chinese, fundamental frequency (F0) also plays a critical role in characterizing tone, which is an essential phonemic feature. In this paper, we describe our work on duration and tone modeling for telephone-quality continuous Mandarin digits, and the application of these models to improve recognition. The duration modeling includes a speaking-rate normalization scheme. A novel F0 extraction algorithm is developed, and parameters based on orthonormal decomposition of the F0 contour are extracted for tone recognition. Context dependency is expressed by “tri-tone” models clustered into broad classes. A 20.0% error rate is achieved for four-tone classification. Over a baseline recognition performance of 5.1% word error rate, we achieve 31.4% error reduction with duration models, 23.5% error reduction with tone models, and 39.2% error reduction with duration and tone models combined.
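    The abstract above gives error reductions but not the resulting word error rates. The sketch below works them out, assuming the reductions are relative to the 5.1% baseline (the abstract does not say so explicitly). The function and variable names are illustrative only and do not come from the paper.

```python
# Assumption: each "error reduction" is relative to the baseline word error rate.
def wer_after_reduction(baseline_wer, relative_reduction):
    """Return the word error rate left after a relative error reduction."""
    return baseline_wer * (1.0 - relative_reduction)


BASELINE_WER = 5.1  # percent, as stated in the abstract

# Relative error reductions stated in the abstract, as fractions.
reductions = {
    "duration": 0.314,
    "tone": 0.235,
    "duration+tone": 0.392,
}

# Implied word error rates, rounded to one decimal place.
implied_wer = {
    name: round(wer_after_reduction(BASELINE_WER, r), 1)
    for name, r in reductions.items()
}

for name, wer in implied_wer.items():
    print(f"{name}: about {wer}% word error rate")
```

    Under this assumption the implied word error rates are roughly 3.5% with duration models, 3.9% with tone models, and 3.1% with both combined. These figures are derived here, not quoted from the paper.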

    Orchestration : the movement and vocal behavior of free-ranging Norwegian killer whales (Orcinus orca)

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, June 2008.
    Studying the social and cultural transmission of behavior among animals helps to identify patterns of interaction and information content flowing between individuals. Killer whales are likely to acquire traits culturally based on their population-specific feeding behaviors and group-distinctive vocal repertoires. I used digital tags to explore the contributions of individual Norwegian killer whales to group carousel feeding and the relationships between vocal and non-vocal activity. Periods of tail slapping to incapacitate herring during feeding were characterized by elevated movement variability, heightened vocal activity and call types containing additional orientation cues. Tail slaps produced by tagged animals were identified using a rapid pitch change and occurred primarily within 20 m of the surface. Two simultaneously tagged animals maneuvered similarly when tail slapping within 60 s of one another, indicating that the position and composition of the herring ball influenced their behavior. Two types of behavioral sequence preceding the tight circling of carousel feeding were apparent. First, the animals engaged in periods of directional swimming. They were silent in 2 of 3 instances, suggesting they may have located other foraging groups by eavesdropping. Second, tagged animals made broad horizontal loops as they dove in a manner consistent with corralling. All 4 of these occasions were accompanied by vocal activity, indicating that this and tail slapping may benefit from social communication. No significant relationship between the call types and the actual movement measurements was found. Killer whale vocalizations traditionally have been classified into discrete call types. Using human speech processing techniques, I considered that calls are alternatively comprised of shared segments that can be recombined to form the stereotyped and variable repertoire. In a classification experiment, the characterization of calls using the whole call, a set of unshared segments, or a set of shared segments yielded equivalent performance. The shared segments required less information to parse the same vocalizations, suggesting a more parsimonious system of representation. This closer examination of the movements and vocalizations of Norwegian killer whales, combined with future work on ontogeny and transmission, will inform our understanding of whether and how culture plays a role in achieving population-specific behaviors in this species.
    Funding sources: The Ocean Life Institute at WHOI and the National Geographic Society, the National Defense Science and Engineering Graduate Fellowship, a National Science Foundation Graduate Fellowship, the Academic Programs Office at WHOI and Dennis McLaughlin at MIT.

    Movement and vocal behavior of free-ranging Norwegian killer whales (Orcinus orca)

    Thesis (Ph. D.)--Joint Program in Oceanography/Applied Ocean Science and Engineering (Massachusetts Institute of Technology, Dept. of Biology; and the Woods Hole Oceanographic Institution), 2008. Includes bibliographical references.
    Studying the social and cultural transmission of behavior among animals helps to identify patterns of interaction and information content flowing between individuals. Killer whales are likely to acquire traits culturally based on their population-specific feeding behaviors and group-distinctive vocal repertoires. I used digital tags to explore the contributions of individual Norwegian killer whales to group carousel feeding and the relationships between vocal and non-vocal activity. Periods of tail slapping to incapacitate herring during feeding were characterized by elevated movement variability, heightened vocal activity and call types containing additional orientation cues. Tail slaps produced by tagged animals were identified using a rapid pitch change and occurred primarily within 20 m of the surface. Two simultaneously tagged animals maneuvered similarly when tail slapping within 60 s of one another, indicating that the position and composition of the herring ball influenced their behavior. Two types of behavioral sequence preceding the tight circling of carousel feeding were apparent. First, the animals engaged in periods of directional swimming. They were silent in 2 of 3 instances, suggesting they may have located other foraging groups by eavesdropping. Second, tagged animals made broad horizontal loops as they dove in a manner consistent with corralling. All 4 of these occasions were accompanied by vocal activity, indicating that this and tail slapping may benefit from social communication. No significant relationship between the call types and the actual movement measurements was found. Killer whale vocalizations traditionally have been classified into discrete call types. Using human speech processing techniques, I considered that calls are alternatively comprised of shared segments that can be recombined to form the stereotyped and variable repertoire. In a classification experiment, the characterization of calls using the whole call, a set of unshared segments, or a set of shared segments yielded equivalent performance. The shared segments required less information to parse the same vocalizations, suggesting a more parsimonious system of representation. This closer examination of the movements and vocalizations of Norwegian killer whales, combined with future work on ontogeny and transmission, will inform our understanding of whether and how culture plays a role in achieving population-specific behaviors in this species.
    by Ari Daniel Shapiro. Ph.D.