Combining phonological and acoustic ASR-free features for pathological speech intelligibility assessment
Intelligibility is widely used to measure the severity of articulatory problems in pathological speech. Recently, a number of automatic intelligibility assessment tools have been developed. Most of them use automatic speech recognizers (ASR) to compare the patient's utterance with the target text. These methods are bound to one language and tend to be less accurate when speakers hesitate or make reading errors. To circumvent these problems, two different ASR-free methods were developed over the last few years, making use only of the acoustic or phonological properties of the utterance. In this paper, we demonstrate that these ASR-free techniques are also able to predict intelligibility in other languages. Moreover, they prove to be complementary, resulting in even better intelligibility predictions when both methods are combined.
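The abstract does not specify how the two subsystems are combined, but a late-fusion scheme is a natural reading. Below is a minimal sketch, assuming each ASR-free subsystem already yields one score per utterance; the names (`acoustic_scores`, `phonological_scores`) and the numbers are hypothetical and synthetic, not taken from the paper.

```python
# Hypothetical late-fusion sketch: combine per-utterance scores from two
# ASR-free subsystems with a simple linear regressor fitted to ratings.
import numpy as np
from sklearn.linear_model import LinearRegression

def fuse_predictions(acoustic_scores, phonological_scores, targets):
    """Fit a linear fusion of two subsystem scores to perceptual ratings.

    All arguments are 1-D arrays with one entry per utterance; targets
    are listener-assigned intelligibility ratings.
    """
    X = np.column_stack([acoustic_scores, phonological_scores])
    return LinearRegression().fit(X, targets)

# Synthetic scores for five utterances, purely for illustration.
acoustic = np.array([0.7, 0.4, 0.9, 0.3, 0.6])
phonological = np.array([0.6, 0.5, 0.8, 0.2, 0.7])
ratings = np.array([72, 45, 88, 30, 65])  # e.g., 0-100 intelligibility
model = fuse_predictions(acoustic, phonological, ratings)
print(model.predict(np.column_stack([acoustic, phonological])))
```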
Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications
In an era when the market segment of the Internet of Things (IoT) tops the charts in business reports, the field of medicine is widely envisioned to gain a large benefit from the explosion of wearables and internet-connected sensors that surround us, acquiring and communicating unprecedented data on symptoms, medication, food intake, and daily-life activities impacting one's health and wellness. However, IoT-driven healthcare must overcome several barriers: 1) there is an increasing demand for data storage on cloud servers, where the analysis of medical big data becomes increasingly complex; 2) the data, when communicated, are vulnerable to security and privacy issues; 3) the communication of continuously collected data is not only costly but also energy hungry; and 4) operating and maintaining the sensors directly from the cloud servers is a non-trivial task. This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing interfaces between the sensors and cloud servers to facilitate connectivity, data transfer, and a queryable local database. The centerpiece of Fog Computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors, and offers an efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and the Raspberry Pi that allows acquisition, computing, storage, and communication of various medical data, such as pathological speech data of individuals with speech disorders, phonocardiogram (PCG) signals for heart rate estimation, and electrocardiogram (ECG)-based Q, R, S detection. (Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare, Springer, 2017.)
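To make the fog node's role concrete, here is a minimal sketch of the pattern the chapter describes: buffer raw sensor samples in a local, queryable database and push only compact summaries upstream. The schema, the sensor values, and the upload step are all hypothetical illustrations, not the chapter's implementation.

```python
# Minimal fog-node sketch: local queryable storage plus summarization,
# so only a compact payload (not the raw stream) goes to the cloud.
import json
import sqlite3
import statistics

db = sqlite3.connect(":memory:")  # a file path on a real embedded node
db.execute("CREATE TABLE samples (ts REAL, value REAL)")

def ingest(samples):
    """Store raw (timestamp, value) sensor samples locally."""
    db.executemany("INSERT INTO samples VALUES (?, ?)", samples)
    db.commit()

def summarize():
    """Reduce the raw stream to a compact summary for the cloud,
    saving bandwidth and transmission energy."""
    values = [v for (v,) in db.execute("SELECT value FROM samples")]
    return {"n": len(values),
            "mean": statistics.fmean(values),
            "max": max(values)}

ingest([(0.0, 71.0), (1.0, 74.0), (2.0, 78.0)])  # e.g., heart-rate values
payload = json.dumps(summarize())
print(payload)  # a real node would POST this to a telehealth endpoint
```

Keeping the database local also serves the chapter's privacy and cost arguments: raw physiological data never leave the node unless explicitly queried.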
Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification
The majority of speech signal analysis procedures for the automatic detection of laryngeal pathologies rely mainly on parameters extracted from time-domain processing. Moreover, calculation of these parameters often requires prior pitch period estimation; their validity therefore depends heavily on the robustness of pitch detection. In this paper, an alternative approach based on cepstral-domain processing is presented, which has the advantage of not requiring pitch estimation, thus providing a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters already present in the literature, it has a more straightforward physical interpretation while achieving similar performance.
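As a rough illustration of such a pitch-free, cepstral-domain front end, the sketch below computes the short-term real cepstrum of a sustained vowel, assuming NumPy. The frame and hop lengths are illustrative defaults, not the paper's settings.

```python
# Short-term real cepstrum per frame; no pitch tracking is needed.
import numpy as np

def short_term_cepstra(signal, fs, frame_ms=40, hop_ms=20):
    """Return the real cepstrum of each Hamming-windowed frame.

    frame_ms/hop_ms are illustrative choices, not the paper's values.
    """
    n = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    win = np.hamming(n)
    frames = np.stack([signal[i:i + n] * win
                       for i in range(0, len(signal) - n, hop)])
    log_mag = np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-10)
    return np.fft.irfft(log_mag, axis=1)  # one cepstrum per frame

# Synthetic sustained-vowel stand-in: 200 Hz harmonic complex at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
vowel = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 6))
print(short_term_cepstra(vowel, fs).shape)  # (num_frames, frame_length)
```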
I hear you eat and speak: automatic recognition of eating condition and food type, use-cases, and impact on ASR performance
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i.e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech; the database is made publicly available for research purposes. We start by demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We then propose automatic classification based both on brute-forced low-level acoustic features and on higher-level features related to intelligibility, obtained from an automatic speech recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of the eating condition (i.e., eating or not eating) can be solved easily, independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, reaching up to 62.3% average recall for multi-way classification of the eating condition, i.e., discriminating the six types of food as well as not eating. Early fusion of the intelligibility-related features with the brute-forced acoustic feature set improves performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with a determination coefficient of up to 56.2%.
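The evaluation protocol named in the abstract (SVM classifier, leave-one-speaker-out cross-validation, average recall) maps directly onto standard tooling. A minimal sketch with synthetic data standing in for the iHEARu-EAT features:

```python
# Leave-one-speaker-out SVM evaluation with unweighted average recall
# (UAR). Features and labels below are synthetic placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))            # stand-in acoustic features
y = rng.integers(0, 2, size=300)          # eating vs. not eating
speakers = rng.integers(0, 30, size=300)  # 30 speakers, as in iHEARu-EAT

# Each fold holds out all utterances of one speaker, so the classifier
# is never tested on a voice it has seen during training.
pred = cross_val_predict(SVC(kernel="linear"), X, y,
                         cv=LeaveOneGroupOut(), groups=speakers)
print("UAR:", recall_score(y, pred, average="macro"))
```

Macro-averaged recall equals the unweighted average recall reported above, which is the usual metric in computational paralinguistics because it is robust to class imbalance.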
Cepstral analysis of speech signals in the process of automatic pathological voice assessment
The paper describes the problem of cepstral speech analysis in the process of automated voice-disorder probability estimation. The author proposes to derive two of the most diagnostically significant voice features, the quality of the harmonic structure and the degree of subharmonics, from the cepstrum of the speech signal. Traditionally, these attributes are estimated by ear or by observation of the spectrum (or spectrogram), so this analysis often lacks accuracy and objectivity. The introduced parameters were calculated for recordings from the Disordered Voice Database (Kay, model 4337, version 2.7.0), which consists of 710 voice samples (657 pathological, 53 healthy) recorded in a laboratory environment and annotated with a diagnosis and a number of additional attributes (such as age, sex, and nationality). The proposed cepstral voice features were compared to similar voice parameters derived from the Multidimensional Voice Program (Kay, model 5105, version 2.7.0) with respect to their diagnostic significance, and the comparison is presented graphically. The results show that the cepstral features are more correlated with the decision and better discriminate the clusters of healthy and disordered voices. Additionally, both parameters are obtained with a single cepstral transform and do not require prior F0 tracking, as F0 is derived simultaneously.
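A rough sketch of that last point, that one cepstral transform yields both a harmonic-structure measure and F0: the dominant peak in the voice quefrency range gives F0, and its height above the local cepstral background reflects how strong the harmonic structure is (in the spirit of cepstral peak prominence). The exact feature definitions in the paper are not reproduced here, so this is an assumption-laden illustration, not the author's method.

```python
# One cepstral transform, two outputs: F0 and a peak-strength measure.
import numpy as np

def cepstral_peak_and_f0(frame, fs, f0_lo=60.0, f0_hi=400.0):
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
    ceps = np.fft.irfft(np.log(spec + 1e-10))
    q_lo, q_hi = int(fs / f0_hi), int(fs / f0_lo)  # voice quefrency range
    region = ceps[q_lo:q_hi]
    peak = int(np.argmax(region)) + q_lo
    prominence = float(ceps[peak] - region.mean())  # crude background
    return fs / peak, prominence  # (F0 in Hz, harmonic-structure strength)

# 150 Hz harmonic complex: expect F0 near 150 Hz and a clear peak.
fs = 16000
t = np.arange(2048) / fs
frame = sum(np.sin(2 * np.pi * 150 * k * t) for k in range(1, 8))
print(cepstral_peak_and_f0(frame, fs))
```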
Computational Approaches to Modeling Speaker State in the Medical Domain
Recently, researchers in computer science and engineering have begun to explore the possibility of finding speech-based correlates of various medical conditions using automatic, computational methods. If such language cues can be identified and quantified automatically, this information can be used to support the diagnosis and treatment of medical conditions in clinical settings and to further fundamental research in understanding cognition. This chapter reviews computational approaches that explore communicative patterns of patients who suffer from medical conditions such as depression, autism spectrum disorders, schizophrenia, and cancer. Two main approaches are discussed: research that explores features extracted from the acoustic signal, and research that focuses on lexical and semantic features. We also present some applied research that uses computational methods to develop assistive technologies. In the final sections, we discuss issues related to, and the future of, this emerging field of research.