7,540 research outputs found

    Speech Recognition by Composition of Weighted Finite Automata

    We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can be used for combining information sources in actual recognizers and for optimizing their application. In particular, a single composition algorithm is used both to combine in advance information sources such as language models and dictionaries, and to combine acoustic observations and information sources dynamically during recognition. Comment: 24 pages, uses psfig.st

    A toolbox for animal call recognition

    Monitoring the natural environment is increasingly important as habitat degradation and climate change reduce the world’s biodiversity. We have developed software tools and applications to assist ecologists with the collection and analysis of acoustic data at large spatial and temporal scales. One of our key objectives is automated animal call recognition, and our approach has three novel attributes. First, we work with raw environmental audio, contaminated by noise and artefacts and containing calls that vary greatly in volume depending on the animal’s proximity to the microphone. Second, initial experimentation suggested that no single recognizer could deal with the enormous variety of calls. Therefore, we developed a toolbox of generic recognizers to extract invariant features for each call type. Third, many species are cryptic and offer little data with which to train a recognizer. Many popular machine learning methods require large volumes of training and validation data and considerable time and expertise to prepare. Consequently, we adopt bootstrap techniques that can be initiated with little data and refined subsequently. In this paper, we describe our recognition tools and present results for real ecological problems.

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Longitudinal predictors of Chinese word reading and spelling among elementary grade students.


    From phonetics to phonology: The emergence of first words in Italian

    This study assesses the extent of phonetic continuity between babble and words in four Italian children followed longitudinally from 0;9 or 0;10 to 2;0, two with relatively rapid and two with slower lexical growth. Prelinguistic phonetic characteristics, including both (a) consistent use of specific consonants and (b) age of onset and extent of consonant variegation in babble, are found to predict rate of lexical advance and to relate to the form of the early words. In addition, each child's lexical profile is analyzed to test the hypothesis of non-linearity in phonological development. All of the children show the expected pattern of phonological advance: relatively accurate first word production is followed by lexical expansion, characterized by a decrease in accuracy and an increase of similarity between word forms. We interpret such a profile as reflecting the emergence of word templates, a first step in phonological organization.