667 research outputs found

    Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation

    End-to-end automatic speech recognition (E2E-ASR) has the potential to improve performance, but one issue that needs to be addressed is its difficulty in handling enharmonic words: named entities (NEs) with the same pronunciation and part of speech that are spelled differently. This often occurs with Japanese personal names that have the same pronunciation but different Kanji characters. Since such NE words tend to be important keywords, ASR easily loses user trust if it misrecognizes them. To solve this problem, this paper proposes a novel retraining-free customization method for E2E-ASR based on a named-entity-aware E2E-ASR model and phoneme similarity estimation. Experimental results show that the proposed method improves the target NE character error rate by 35.7% on average relative to the conventional E2E-ASR model when personal names are selected as the target NE. Comment: accepted by INTERSPEECH202

    Real-Time Expression Control System for Wearable Animatronics

    Animatronics is used to give expression to characters in many visual works, including movies. With an animatronic mask of a character that is worn by a person, truly lifelike expression is possible because the actor's performance is reflected directly. In this research, I propose using an animatronic mask so that the actor wearing it can reflect the character's feelings and expressions in real time. Conventionally, the actor must become familiar with the movements and facial expressions beforehand in order to determine whether the character can be played intuitively; I believe that with this system an actor can play a character intuitively. Art and Design Research for Sustainable Development; September 22, 2018. Conference: Tsukuba Global Science Week 2018. Date: September 20-22, 2018. Venue: Tsukuba International Congress Center. Sponsored: University of Tsukub

    Robust sound source mapping using three-layered selective audio rays for mobile robots

    © 2016 IEEE. This paper investigates sound source mapping in a real environment using a mobile robot. Our approach is based on audio ray tracing, which integrates occupancy grids and sound source localization using a laser range finder and a microphone array. Previous audio ray tracing approaches rely on all observed rays and grids. As such, observation errors caused by sound reflection, sound occlusion, wall occlusion, sounds at misdetected grids, etc. can significantly degrade the ability to locate sound sources in a map. A three-layered selective audio ray tracing mechanism is proposed in this work. The first layer conducts frame-based rejection of unreliable rays (sensory rejection), considering sound reflection and wall occlusion. The second layer introduces triangulation and audio tracing to identify falsely detected sound sources, rejecting the audio rays associated with these misdetected sound sources (short-term rejection). The third layer rejects rays using the whole history (long-term rejection) to disambiguate sound occlusion. Experimental results under various situations are presented, which demonstrate the effectiveness of our method.

    Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection

    This paper describes an improvement in Direction of Arrival (DOA) estimation performance using quaternion output in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 Task 3. DCASE 2019 Task 3 focuses on sound event localization and detection (SELD), a task that estimates the sound source direction in addition to performing conventional sound event detection (SED). In the baseline method, the sound source direction angle is regressed directly. However, the angle is periodic and has discontinuities, which may make learning unstable. Specifically, even though -180 deg and 180 deg are the same direction, a large loss is calculated. Estimating DOA angles with a classification approach instead of regression can avoid the instability caused by these discontinuities, but it limits the resolution. In this paper, we propose introducing a quaternion, which is continuous, into the output layer of the neural network instead of directly estimating the sound source direction angle. This method can be implemented easily by changing only the output of the existing neural network, and thus does not significantly increase the number of parameters in the middle layers. Experimental results show that the proposed method improves DOA estimation without significantly increasing the number of parameters.
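    The discontinuity described in this abstract can be illustrated with a small sketch. It is not the paper's quaternion formulation: as a simpler stand-in, it encodes an azimuth as a 2-D unit vector (cos, sin), which is likewise continuous across the ±180 deg boundary. The function names `angle_loss`, `unit_vector`, and `vector_loss` are hypothetical and chosen only for this illustration.

```python
import math


def angle_loss(pred_deg, target_deg):
    """Squared error computed directly on raw angles in degrees."""
    return (pred_deg - target_deg) ** 2


def unit_vector(deg):
    """Encode an azimuth as a point on the unit circle (assumed stand-in
    for a continuous output representation, not the paper's quaternion)."""
    rad = math.radians(deg)
    return (math.cos(rad), math.sin(rad))


def vector_loss(pred_deg, target_deg):
    """Squared Euclidean distance between the two unit-vector encodings."""
    px, py = unit_vector(pred_deg)
    tx, ty = unit_vector(target_deg)
    return (px - tx) ** 2 + (py - ty) ** 2


if __name__ == "__main__":
    # -180 deg and 180 deg point in the same direction.
    print("raw angle loss:  ", angle_loss(-180.0, 180.0))
    print("unit vector loss:", vector_loss(-180.0, 180.0))
```

    On raw angles, the two equivalent directions are penalized as if they were 360 deg apart, whereas the continuous encoding yields a loss that is zero up to floating-point error.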

    Self-assertion, self-inhibition, and problem behavior in preschool children

    The present study examined the relation between self-regulation (self-assertion and self-inhibition) and problem behavior (antisocial and asocial behavior) in preschool children. A total of 332 children were rated by their teachers on self-regulation and problem behavior. The results of an analysis of variance showed the following: (1) Children with low self-inhibition showed higher antisocial behavior scores than children with high self-inhibition, regardless of gender or age. (2) Among 3- and 4-year-old girls, children with high self-assertion showed higher antisocial behavior scores than children with low self-assertion. (3) Among 3- and 5-year-old boys and 5-year-old girls, children with low self-assertion showed higher asocial behavior scores than children with high self-assertion. (4) Among 5-year-old boys, children with low self-assertion showed higher antisocial behavior scores than children with high self-assertion, and children with low self-inhibition showed higher asocial behavior scores than children with high self-inhibition. These results are discussed in view of gender and developmental differences.