71 research outputs found
Articulatory and bottleneck features for speaker-independent ASR of dysarthric speech
The rapid population aging has stimulated the development of assistive
devices that provide personalized medical support to the needies suffering from
various etiologies. One prominent clinical application is a computer-assisted
speech training system which enables personalized speech therapy to patients
impaired by communicative disorders in the patient's home environment. Such a
system relies on the robust automatic speech recognition (ASR) technology to be
able to provide accurate articulation feedback. With the long-term aim of
developing off-the-shelf ASR systems that can be incorporated in clinical
context without prior speaker information, we compare the ASR performance of
speaker-independent bottleneck and articulatory features on dysarthric speech
used in conjunction with dedicated neural network-based acoustic models that
have been shown to be robust against spectrotemporal deviations. We report ASR
performance of these systems on two dysarthric speech datasets of different
characteristics to quantify the achieved performance gains. Despite the
remaining performance gap between the dysarthric and normal speech, significant
improvements have been reported on both datasets using speaker-independent ASR
architectures.Comment: to appear in Computer Speech & Language -
https://doi.org/10.1016/j.csl.2019.05.002 - arXiv admin note: substantial
text overlap with arXiv:1807.1094
Models and analysis of vocal emissions for biomedical applications
This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies
Perceptual models in speech quality assessment and coding
The ever-increasing demand for good communications/toll
quality speech has created a renewed interest into the
perceptual impact of rate compression. Two general areas are
investigated in this work, namely speech quality assessment
and speech coding.
In the field of speech quality assessment, a model is
developed which simulates the processing stages of the
peripheral auditory system. At the output of the model a
"running" auditory spectrum is obtained. This represents
the auditory (spectral) equivalent of any acoustic sound such
as speech. Auditory spectra from coded speech segments serve
as inputs to a second model. This model simulates the
information centre in the brain which performs the speech
quality assessment. [Continues.
Uncovering the Deceptions: An Analysis on Audio Spoofing Detection and Future Prospects
Audio has become an increasingly crucial biometric modality due to its
ability to provide an intuitive way for humans to interact with machines. It is
currently being used for a range of applications, including person
authentication to banking to virtual assistants. Research has shown that these
systems are also susceptible to spoofing and attacks. Therefore, protecting
audio processing systems against fraudulent activities, such as identity theft,
financial fraud, and spreading misinformation, is of paramount importance. This
paper reviews the current state-of-the-art techniques for detecting audio
spoofing and discusses the current challenges along with open research
problems. The paper further highlights the importance of considering the
ethical and privacy implications of audio spoofing detection systems. Lastly,
the work aims to accentuate the need for building more robust and generalizable
methods, the integration of automatic speaker verification and countermeasure
systems, and better evaluation protocols.Comment: Accepted in IJCAI 202
- …