Semantic Processing of Out-Of-Vocabulary Words in a Spoken Dialogue System
One of the most important causes of failure in spoken dialogue systems is
usually neglected: the problem of words that are not covered by the system's
vocabulary (out-of-vocabulary or OOV words). In this paper a methodology is
described for the detection, classification and processing of OOV words in an
automatic train timetable information system. We report the extensions that
had to be made to the various modules of the system, which led to the design
of appropriate dialogue strategies, and present encouraging evaluation
results for the new versions of the word recogniser and the linguistic
processor.
Comment: 4 pages, 2 eps figures, requires LaTeX2e, uses eurospeech.sty and
epsfi
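The detection step described above can be illustrated with a minimal sketch. The vocabulary, word list, confidence threshold, and function name below are illustrative assumptions, not taken from the paper; the idea is simply to flag words that fall outside the closed vocabulary or were recognised with low confidence.

```python
# Hypothetical closed vocabulary of a train timetable system (illustrative).
VOCABULARY = {"when", "does", "the", "train", "to", "leave", "at", "ten"}

def flag_oov(hypotheses, vocabulary=VOCABULARY, min_confidence=0.5):
    """Mark each recognised word as OOV if it is outside the vocabulary
    or was recognised with low acoustic confidence."""
    return [
        (word, word not in vocabulary or score < min_confidence)
        for word, score in hypotheses
    ]

hyps = [("when", 0.9), ("does", 0.8), ("the", 0.9),
        ("intercity", 0.3), ("leave", 0.7)]
flags = flag_oov(hyps)
# "intercity" is flagged: it is not in the (toy) vocabulary.
```

Once flagged, such positions can be routed to dedicated classification and dialogue-recovery strategies, as the paper proposes.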
Integrating Syntactic and Prosodic Information for the Efficient Detection of Empty Categories
We describe a number of experiments that demonstrate the usefulness of
prosodic information for a processing module which parses spoken utterances
with a feature-based grammar employing empty categories. We show that by
requiring certain prosodic properties from those positions in the input where
the presence of an empty category has to be hypothesized, a derivation can be
accomplished more efficiently. The approach has been implemented in the machine
translation project VERBMOBIL and results in a significant reduction of the
work-load for the parser.
Comment: To appear in the Proceedings of Coling 1996, Copenhagen. 6 pages
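The pruning idea can be sketched in a few lines. The position indices and the set-based boundary representation below are illustrative assumptions; the point is only that requiring a prosodic boundary at each hypothesized gap position shrinks the parser's search space.

```python
def prune_empty_categories(candidate_positions, prosodic_boundaries):
    """Keep only candidate empty-category (gap) positions that coincide
    with a prosodic boundary, discarding the rest before parsing."""
    return [p for p in candidate_positions if p in prosodic_boundaries]

# Toy utterance: gaps hypothesized at word positions 2, 5, and 7,
# but prosodic boundaries detected only at positions 5 and 9.
kept = prune_empty_categories([2, 5, 7], prosodic_boundaries={5, 9})
# Only position 5 survives the prosodic filter.
```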
Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment
Speech intelligibility assessment plays an important role in the therapy of
patients suffering from pathological speech disorders. Automatic and objective
measures are desirable to assist therapists in their traditionally subjective
and labor-intensive assessments. In this work, we investigate a novel approach
for obtaining such a measure using the divergence in disentangled latent speech
representations of a parallel utterance pair, obtained from a healthy reference
and a pathological speaker. Experiments on an English database of Cerebral
Palsy patients, using all available utterances per speaker, show high and
significant correlation values (R = -0.9) with subjective intelligibility
measures, with only minimal deviation (±0.01) across four different
reference speaker pairs. We also demonstrate the robustness of the proposed
method (R = -0.89, deviating ±0.02 over 1000 iterations) when considering a
significantly smaller number of utterances per speaker. Our results are among
the first to show that disentangled speech representations can be used for
automatic pathological speech intelligibility assessment, resulting in a
reference speaker pair invariant method, applicable in scenarios with only few
utterances available.
Comment: Submitted to INTERSPEECH202
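The core measure can be sketched as follows. The choice of mean frame-wise Euclidean distance, the latent dimensionality, and the function name are illustrative assumptions (the paper's encoder and divergence are not specified here); the sketch only shows how a divergence between time-aligned latent sequences of a parallel utterance pair could be scored.

```python
import numpy as np

def latent_divergence(ref_latents, pat_latents):
    """Mean Euclidean distance between time-aligned latent frames of a
    parallel utterance pair (illustrative divergence measure)."""
    ref = np.asarray(ref_latents, dtype=float)
    pat = np.asarray(pat_latents, dtype=float)
    assert ref.shape == pat.shape, "parallel utterances must be time-aligned"
    return float(np.linalg.norm(ref - pat, axis=-1).mean())

# Toy check: identical latents give zero divergence; perturbed latents
# give a positive score that grows with the mismatch.
rng = np.random.default_rng(0)
ref = rng.normal(size=(50, 16))   # 50 frames, 16-dim latent space
d_same = latent_divergence(ref, ref)
d_pert = latent_divergence(ref, ref + 0.5)
```

A healthy/pathological pair would then be scored the same way, with the resulting divergence correlated against subjective intelligibility ratings.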
Federated learning for secure development of AI models for Parkinson's disease detection using speech from different languages
Parkinson's disease (PD) is a neurological disorder impacting a person's
speech. Among automatic PD assessment methods, deep learning models have gained
particular interest. Recently, the community has explored cross-pathology and
cross-language models which can improve diagnostic accuracy even further.
However, strict patient data privacy regulations largely prevent institutions
from sharing patient speech data with each other. In this paper, we employ
federated learning (FL) for PD detection using speech signals from 3 real-world
language corpora of German, Spanish, and Czech, each from a separate
institution. Our results indicate that the FL model outperforms all the local
models in terms of diagnostic accuracy and performs comparably to a model
trained on the centrally combined training sets, with the advantage of
not requiring any data sharing among collaborators. This simplifies
inter-institutional collaboration and can ultimately improve patient
outcomes.
Comment: Accepted for INTERSPEECH 202
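The aggregation at the heart of such a setup can be sketched with the standard FedAvg rule: each institution trains locally and only model parameters, never raw speech, reach the server. The client sizes and parameter vectors below are toy values for illustration.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client parameter vectors, with coefficients
    proportional to each institution's number of training samples."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    agg = np.zeros_like(np.asarray(client_weights[0], dtype=float))
    for c, w in zip(coeffs, client_weights):
        agg += c * np.asarray(w, dtype=float)
    return agg

# One toy round with three hypothetical sites (e.g. German, Spanish, Czech):
w = fedavg([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], client_sizes=[100, 100, 200])
# 0.25*[1,0] + 0.25*[0,1] + 0.5*[1,1] = [0.75, 0.75]
```

In a full system this averaging step would be repeated over many communication rounds, with each site re-training on its private corpus between rounds.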
A survey on perceived speaker traits: personality, likability, pathology, and the first challenge
The INTERSPEECH 2012 Speaker Trait Challenge aimed at a unified test-bed for perceived speaker traits – the first challenge of this kind: personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state-of-the-art in these three fields of research and describe the three sub-challenges in terms of the challenge conditions, the baseline results provided by the organisers, and a new openSMILE feature set, which has been used for computing the baselines and which has been provided to the participants. Furthermore, we summarise the approaches and the results presented by the participants to show the various techniques that are currently applied to solve these classification tasks.