DDSupport: Language Learning Support System that Displays Differences and Distances from Model Speech
When beginners learn to speak a non-native language, it is difficult for them
to judge for themselves whether they are speaking well. Therefore,
computer-assisted pronunciation training systems are used to detect learner
mispronunciations. These systems typically compare the user's speech with that
of a specific native speaker as a model in units of rhythm, phonemes, or words
and calculate the differences. However, they require extensive speech data with
detailed annotations or can only compare with one specific native speaker. To
overcome these problems, we propose a new language learning support system that
calculates speech scores and detects mispronunciations by beginners based on a
small amount of unannotated speech data without comparison to a specific
person. The proposed system uses deep learning-based speech processing to
display the pronunciation score of the learner's speech and the
difference/distance between the learner's and a group of models' pronunciation
in an intuitively visual manner. Learners can gradually improve their
pronunciation by eliminating differences and shortening the distance from the
model until they become sufficiently proficient. Furthermore, since the
pronunciation score and difference/distance are not computed against specific
sentences from a particular model, users are free to study whichever sentences
they wish. We also built an application to help non-native speakers
learn English and confirmed that it can improve users' speech intelligibility.
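The group-based scoring idea described above can be sketched as follows: rather than comparing with one specific native speaker, the learner's utterance embedding is measured against the centroid of a group of model-speaker embeddings. This is an illustrative sketch only; the function name and the embeddings are hypothetical, and in practice the vectors would come from a pretrained deep speech model.

```python
import numpy as np

def pronunciation_score(learner_emb, model_embs):
    """Score a learner utterance against a *group* of model speakers.

    The distance is measured to the centroid of several model
    embeddings, not to one specific native speaker, so any sentence
    can be scored without per-sentence annotations.
    (Hypothetical sketch; embeddings would come from a pretrained
    speech model in a real system.)
    """
    centroid = np.mean(model_embs, axis=0)
    # Cosine similarity in [-1, 1], rescaled to a [0, 100] score.
    cos = np.dot(learner_emb, centroid) / (
        np.linalg.norm(learner_emb) * np.linalg.norm(centroid))
    return 50.0 * (cos + 1.0)
```

Shrinking the distance to the centroid then corresponds to the learner's score rising toward 100.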
Three-dimensional Mid-air Acoustic Manipulation by Ultrasonic Phased Arrays
The essence of levitation technology is the countervailing of gravity. It is
known that an ultrasound standing wave is capable of suspending small particles
at its sound pressure nodes. The acoustic axis of the ultrasound beam in
conventional studies was parallel to the gravitational force, and the levitated
objects were manipulated along the fixed axis (i.e. one-dimensionally) by
controlling the phases or frequencies of bolted Langevin-type transducers. In
the present study, we considered extended acoustic manipulation whereby
millimetre-sized particles were levitated and moved three-dimensionally by
localised ultrasonic standing waves, which were generated by ultrasonic phased
arrays. Our manipulation system has two original features. One is the direction
of the ultrasound beam, which is arbitrary because the force acting toward its
centre is also utilised. The other is the manipulation principle by which a
localised standing wave is generated at an arbitrary position and moved
three-dimensionally by opposed ultrasonic phased arrays. We experimentally
confirmed that expanded-polystyrene particles of 0.6 mm and 2 mm in diameter
could be manipulated by our proposed method.
Comment: 5 pages, 4 figures
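The core of driving a phased array to a focal point is assigning each transducer a phase that compensates its path length to the focus, so all waves arrive in phase there; moving the focus of opposed arrays moves the localized standing wave, and with it the levitated particle. The sketch below is illustrative, not the authors' exact controller; the 40 kHz frequency is a typical value for such transducers, assumed here.

```python
import numpy as np

SPEED_OF_SOUND = 346.0  # m/s in air at ~25 degrees C
FREQ = 40e3             # 40 kHz, typical for ultrasonic transducers (assumed)

def focus_phases(transducer_xyz, focus_xyz, freq=FREQ, c=SPEED_OF_SOUND):
    """Phase command for each transducer so that emitted waves arrive
    in phase at focus_xyz, forming a focal point.

    transducer_xyz: (N, 3) array of element positions in metres.
    focus_xyz: (3,) target focal point.
    Returns phases in [0, 2*pi).  (Illustrative sketch only.)
    """
    d = np.linalg.norm(transducer_xyz - focus_xyz, axis=1)  # path lengths
    k = 2.0 * np.pi * freq / c                              # wavenumber
    # Advance each element by its propagation delay k*d (mod 2*pi).
    return (-k * d) % (2.0 * np.pi)
```

Elements equidistant from the focus receive identical phases, which is the sanity check one would run before sweeping the focus along a trajectory.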
LipLearner: Customizable Silent Speech Interactions on Mobile Devices
A silent speech interface is a promising technology that enables private
communications in natural language. However, previous approaches only support a
small and inflexible vocabulary, which leads to limited expressiveness. We
leverage contrastive learning to learn efficient lipreading representations,
enabling few-shot command customization with minimal user effort. Our model
exhibits high robustness to different lighting, posture, and gesture conditions
on an in-the-wild dataset. For 25-command classification, an F1-score of 0.8947
is achievable using only one shot, and its performance can be further boosted
by adaptively learning from more data. This generalizability allowed us to
develop a mobile silent speech interface empowered with on-device fine-tuning
and visual keyword spotting. A user study demonstrated that with LipLearner,
users could define their own commands with high reliability guaranteed by an
online incremental learning scheme. Subjective feedback indicated that our
system provides essential functionalities for customizable silent speech
interactions with high usability and learnability.
Comment: Conditionally accepted to the ACM CHI Conference on Human Factors in
Computing Systems 2023 (CHI '23).
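The few-shot customization and online incremental learning described above can be sketched as a nearest-centroid classifier over contrastive embeddings: a new command is registered from as little as one shot, later shots refine its prototype incrementally, and utterances are matched by cosine similarity. The class name and the toy embeddings are hypothetical; a real system would plug in the trained lipreading encoder.

```python
import numpy as np

class FewShotCommandClassifier:
    """Nearest-centroid classifier over lipreading embeddings.

    Registering a command from one shot gives few-shot customization;
    adding further shots updates the prototype incrementally, mirroring
    an online incremental learning scheme.  (Hypothetical sketch; the
    embeddings would come from a contrastive lipreading encoder.)
    """

    def __init__(self):
        self.protos = {}  # command name -> running centroid
        self.counts = {}  # command name -> number of shots seen

    def register(self, name, emb):
        emb = emb / np.linalg.norm(emb)  # unit-normalize the shot
        n = self.counts.get(name, 0)
        old = self.protos.get(name, np.zeros_like(emb))
        self.protos[name] = (old * n + emb) / (n + 1)  # incremental mean
        self.counts[name] = n + 1

    def classify(self, emb):
        emb = emb / np.linalg.norm(emb)
        # Highest cosine similarity to a command prototype wins.
        return max(self.protos,
                   key=lambda k: float(np.dot(self.protos[k], emb)))
```

Because classification is just a similarity lookup, the same structure supports on-device fine-tuning: new shots only touch one prototype.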
Interaction techniques for mobile collocation
Research on mobile collocated interactions has been exploring situations where collocated users engage in collaborative activities using their personal mobile devices (e.g., smartphones and tablets), thus going from personal/individual toward shared/multiuser experiences and interactions. The proliferation of ever-smaller computers that can be worn on our wrists (e.g., Apple Watch) and other parts of the body (e.g., Google Glass) has expanded the possibilities and increased the complexity of interaction in what we term "mobile collocated" situations. The focus of this workshop is to bring together a community of researchers, designers, and practitioners to explore novel interaction techniques for mobile collocated interactions.