222 research outputs found

    Fast vocabulary acquisition in an NMF-based self-learning vocal user interface

    In command-and-control applications, a vocal user interface (VUI) is useful for hands-free control of various devices, especially for people with a physical disability. The spoken utterances are usually restricted to a predefined list of phrases or to a restricted grammar, and the acoustic models work well for normal speech. While some state-of-the-art methods allow for user adaptation of the predefined acoustic models and lexicons, we pursue a fully adaptive VUI by learning both vocabulary and acoustics directly from interaction examples. A learning curve usually has a steep rise in the beginning and an asymptotic ceiling at the end. To limit tutoring time and to guarantee good performance in the long run, the word learning rate of the VUI should be fast and the learning curve should level off at a high accuracy. In order to deal with these performance indicators, we propose a multi-level VUI architecture and we investigate the effectiveness of alternative processing schemes. In the low-level layer, we explore the use of MIDA features (Mutual Information Discrimination Analysis) against conventional MFCC features. In the mid-level layer, we enhance the acoustic representation by means of phone posteriorgrams and clustering procedures. In the high-level layer, we use the NMF (Non-negative Matrix Factorization) procedure which has been demonstrated to be an effective approach for word learning. We evaluate and discuss the performance and the feasibility of our approach in a realistic experimental setting of the VUI-user learning context.

    Look before you speak: Children’s integration of visual information into informative referring expressions

    Children's ability to refer is underpinned by their developing cognitive skills. Using a production task (n = 57), we examined pre-articulatory visual fixations to contrast objects (e.g., to a large apple when the target was a small one) to investigate how visual scanning drives informativeness across development. Eye-movements reveal that although four-year-olds fixate contrast objects to a similar extent as seven-year-olds and adults, this does not result in explicit referential informativeness. Instead, four-year-olds frequently omit distinguishing information from their referring expressions regardless of the comprehensiveness of their visual scan. In contrast, older children make greater use of information gleaned from their visual inspections, like adults. Thus, we find a barrier not to the incidence of contrast fixations by younger children, but to their use of them in referential informativeness. We recommend that follow-up work investigates whether younger children's immature executive skills prevent them from describing referents in relation to contrast objects

    Towards natural speech acquisition: incremental word learning with limited data

    Ayllon Clemente I. Towards natural speech acquisition: incremental word learning with limited data. Bielefeld: Bielefeld University; 2013

    Speech segmentation and clustering methods for a new speech recognition architecture

    To reduce the gap between the performance of traditional speech recognition systems and human speech recognition skills, a new architecture is required. A system capable of incremental learning offers one such solution. This thesis introduces a bottom-up approach for such a speech processing system, consisting of a novel blind speech segmentation algorithm, a segmental feature extraction methodology, and data classification by incremental clustering. All methods were evaluated in extensive experiments with a broad range of test material, and the evaluation methodology itself was also scrutinized. The segmentation algorithm achieved results above the standard reported in the current literature on blind segmentation. Possibilities for follow-up research on memory structures and intelligent top-down feedback in speech processing are also outlined.

    Biomass Wastes for Energy Production

    Environmental problems are forcing a rethinking of the world’s energy supply system. In parallel, there is an increasing amount of global solid waste production. A fundamental shift toward greater reliance on biomass wastes in the world’s energy system is plausible because of ongoing major technological advances that hold the promise of making the conversion of biomass into high-quality energy carriers, like electricity and gaseous or liquid fuels, economically competitive with fossil fuels. Therefore, waste-to-energy systems have become a paramount topic for both industry and researchers due to interest in energy production from waste and improved chemical and thermal efficiencies with more cost-effective designs. This biomass shift is also important for industries to become more efficient by using their own wastes to produce their own energy in the light of the circular economy concept. This book on “Biomass Wastes for Energy Production” brings novel advances on waste-to-energy technologies, life cycle assessment, and computational models, and contributes to promoting rethinking of the world’s energy supply systems

    At the interface: Dynamic interactions of explicit and implicit language knowledge.

    Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/139748/1/AttheInterface.pd