217 research outputs found

    DDSupport: Language Learning Support System that Displays Differences and Distances from Model Speech

    When beginners learn to speak a non-native language, it is difficult for them to judge for themselves whether they are speaking well. Therefore, computer-assisted pronunciation training systems are used to detect learner mispronunciations. These systems typically compare the user's speech with that of a specific native speaker, taken as a model, in units of rhythm, phonemes, or words, and calculate the differences. However, they require extensive speech data with detailed annotations, or can compare only against one specific native speaker. To overcome these problems, we propose a new language learning support system that calculates speech scores and detects mispronunciations by beginners based on a small amount of unannotated speech data, without comparison to a specific person. The proposed system uses deep-learning-based speech processing to display the pronunciation score of the learner's speech and the difference/distance between the learner's pronunciation and that of a group of models in an intuitive, visual manner. Learners can gradually improve their pronunciation by eliminating differences and shortening the distance from the models until they become sufficiently proficient. Furthermore, since the pronunciation score and difference/distance are not calculated against specific sentences of a particular model, users are free to study the sentences they wish to study. We also built an application to help non-native speakers learn English and confirmed that it can improve users' speech intelligibility.
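The difference/distance idea above can be pictured as measuring how far the learner's speech embedding lies from the centre of a group of model-speaker embeddings. The sketch below is a minimal illustration of that idea, not the authors' implementation; the embeddings and the `centroid` and `distance_to_models` names are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a group of model-speaker embeddings."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def distance_to_models(learner, model_embeddings):
    """Euclidean distance from the learner's embedding to the model-group
    centroid; a smaller distance means pronunciation closer to the models."""
    return math.dist(learner, centroid(model_embeddings))

# Hypothetical 2-D embeddings: two model speakers and one learner
models = [[0.0, 0.0], [2.0, 0.0]]              # centroid is [1.0, 0.0]
print(distance_to_models([1.0, 3.0], models))  # → 3.0
```

Because the distance is taken to the group centroid rather than to one speaker, no single "correct" reference voice is needed, matching the abstract's claim of scoring without a specific model person.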

    Three-dimensional Mid-air Acoustic Manipulation by Ultrasonic Phased Arrays

    The essence of levitation technology is the countervailing of gravity. It is known that an ultrasound standing wave is capable of suspending small particles at its sound pressure nodes. In conventional studies, the acoustic axis of the ultrasound beam was parallel to the gravitational force, and the levitated objects were manipulated along that fixed axis (i.e. one-dimensionally) by controlling the phases or frequencies of bolted Langevin-type transducers. In the present study, we considered extended acoustic manipulation whereby millimetre-sized particles were levitated and moved three-dimensionally by localised ultrasonic standing waves, which were generated by ultrasonic phased arrays. Our manipulation system has two original features. One is the direction of the ultrasound beam, which is arbitrary because the force acting toward its centre is also utilised. The other is the manipulation principle, by which a localised standing wave is generated at an arbitrary position and moved three-dimensionally by opposed ultrasonic phased arrays. We experimentally confirmed that expanded-polystyrene particles of 0.6 mm and 2 mm in diameter could be manipulated by our proposed method.
    Comment: 5 pages, 4 figures
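As a rough illustration of how a phased array places acoustic energy at an arbitrary point (the localised standing wave described above would use two such opposed, focused arrays), each transducer can be driven with a phase delay that compensates for its path length to the focal point, so all waves arrive in phase there. This is the generic delay-law sketch under assumed values (40 kHz in air), not the authors' control method; `focus_phases` and the array geometry are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, air at ~20 °C (assumed)
FREQ = 40_000.0         # Hz, a common airborne-ultrasound frequency (assumed)

def focus_phases(transducers, focal_point, freq=FREQ, c=SPEED_OF_SOUND):
    """Phase (radians) to drive each transducer so that all emitted waves
    arrive in phase at focal_point, creating a focal spot of high pressure."""
    k = 2 * math.pi * freq / c  # wavenumber
    phases = []
    for pos in transducers:
        d = math.dist(pos, focal_point)        # path length to the focus
        phases.append((-k * d) % (2 * math.pi))  # compensate the extra path
    return phases

# Hypothetical 2x2 array in the z=0 plane, focused 0.1 m above its centre
array = [(x, y, 0.0) for x in (-0.005, 0.005) for y in (-0.005, 0.005)]
phases = focus_phases(array, (0.0, 0.0, 0.1))
```

Since all four elements here are equidistant from the on-axis focus, they receive identical phases; steering the focus off-axis would yield unequal path lengths and therefore unequal phases.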

    LipLearner: Customizable Silent Speech Interactions on Mobile Devices

    Silent speech interfaces are a promising technology that enables private communication in natural language. However, previous approaches support only a small and inflexible vocabulary, which limits expressiveness. We leverage contrastive learning to learn efficient lipreading representations, enabling few-shot command customization with minimal user effort. Our model exhibits high robustness to different lighting, posture, and gesture conditions on an in-the-wild dataset. For 25-command classification, an F1-score of 0.8947 is achievable using only one shot, and performance can be further boosted by adaptively learning from more data. This generalizability allowed us to develop a mobile silent speech interface empowered with on-device fine-tuning and visual keyword spotting. A user study demonstrated that with LipLearner, users could define their own commands with high reliability, guaranteed by an online incremental learning scheme. Subjective feedback indicated that our system provides essential functionalities for customizable silent speech interactions with high usability and learnability.
    Comment: Conditionally accepted to the ACM CHI Conference on Human Factors in Computing Systems 2023 (CHI '23)
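Few-shot command customization on top of contrastive representations is commonly realised as nearest-prototype classification: each enrolled shot's embedding becomes (or is averaged into) a class prototype, and a new utterance is assigned the label of the most cosine-similar prototype. The sketch below shows that general pattern only; the toy embeddings and names are hypothetical, not LipLearner's actual model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def classify(embedding, prototypes):
    """Return the command label whose prototype is most similar to embedding."""
    return max(prototypes, key=lambda label: cosine(embedding, prototypes[label]))

# Hypothetical 3-D lipreading embeddings: one enrolled shot per command
prototypes = {
    "play":  [0.9, 0.1, 0.0],
    "pause": [0.1, 0.9, 0.0],
}
print(classify([0.8, 0.2, 0.1], prototypes))  # → play
```

Adding a new command in this scheme only requires embedding one more enrolled example, which is why a single shot can already support classification and why more shots incrementally improve it.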

    Interaction techniques for mobile collocation

    Research on mobile collocated interactions has explored situations where collocated users engage in collaborative activities using their personal mobile devices (e.g., smartphones and tablets), thus moving from personal/individual toward shared/multiuser experiences and interactions. The proliferation of ever-smaller computers that can be worn on our wrists (e.g., Apple Watch) and other parts of the body (e.g., Google Glass) has expanded the possibilities and increased the complexity of interaction in what we term “mobile collocated” situations. The focus of this workshop is to bring together a community of researchers, designers, and practitioners to explore novel interaction techniques for mobile collocated interactions.