
    Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection

    Background: Voice disorders affect patients profoundly, and acoustic tools can potentially measure voice function objectively. Disordered sustained vowels exhibit wide-ranging phenomena, from nearly periodic to highly complex, aperiodic vibrations, and increased "breathiness". Modelling and surrogate data studies have shown significant nonlinear and non-Gaussian random properties in these sounds. Nonetheless, existing tools are limited to analysing voices displaying near periodicity, and do not account for this inherent biophysical nonlinearity and non-Gaussian randomness, often using linear signal processing methods insensitive to these properties. They do not directly measure the two main biophysical symptoms of disorder: complex nonlinear aperiodicity, and turbulent, aeroacoustic, non-Gaussian randomness. Often these tools cannot be applied to more severe disordered voices, limiting their clinical usefulness.

Methods: This paper introduces two new tools to speech analysis: recurrence and fractal scaling, which overcome the range limitations of existing tools by addressing directly these two symptoms of disorder, together reproducing a "hoarseness" diagram. A simple bootstrapped classifier then uses these two features to distinguish normal from disordered voices.

Results: On a large database of subjects with a wide variety of voice disorders, these new techniques can distinguish normal from disordered cases, using quadratic discriminant analysis, to overall correct classification performance of 91.8% plus or minus 2.0%. The true positive classification performance is 95.4% plus or minus 3.2%, and the true negative performance is 91.5% plus or minus 2.3% (95% confidence). This is shown to outperform all combinations of the most popular classical tools.

Conclusions: Given the very large number of arbitrary parameters and computational complexity of existing techniques, these new techniques are far simpler and yet achieve clinically useful classification performance using only a basic classification technique. They do so by exploiting the inherent nonlinearity and turbulent randomness in disordered voice signals. They are widely applicable to the whole range of disordered voice phenomena by design. These new measures could therefore be used for a variety of practical clinical purposes.
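The abstract names fractal scaling as one of its two features but gives no algorithmic detail. The following is a minimal sketch of detrended fluctuation analysis (DFA), one standard estimator of a fractal scaling exponent; that it matches the paper's exact procedure is an assumption. The function name `dfa_exponent`, the window sizes, and the synthetic white-noise input are illustrative choices, not taken from the source.

```python
import math
import random


def dfa_exponent(signal, scales):
    """Estimate a DFA scaling exponent using linear detrending per window."""
    mean = sum(signal) / len(signal)
    # Step 1: integrate the mean-removed signal to obtain the "profile".
    profile, running = [], 0.0
    for x in signal:
        running += x - mean
        profile.append(running)

    log_n, log_f = [], []
    for n in scales:
        t_mean = (n - 1) / 2.0
        sxx = sum((t - t_mean) ** 2 for t in range(n))
        sq_sum, count = 0.0, 0
        # Step 2: split the profile into non-overlapping windows of length n.
        for start in range(0, len(profile) - n + 1, n):
            window = profile[start:start + n]
            y_mean = sum(window) / n
            # Step 3: fit a least-squares line and accumulate squared residuals.
            slope = sum((t - t_mean) * (y - y_mean)
                        for t, y in enumerate(window)) / sxx
            for t, y in enumerate(window):
                resid = y - (y_mean + slope * (t - t_mean))
                sq_sum += resid * resid
            count += n
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq_sum / count))  # log of RMS fluctuation

    # Step 4: the exponent is the slope of log F(n) against log n.
    x_mean = sum(log_n) / len(log_n)
    y_mean = sum(log_f) / len(log_f)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(log_n, log_f))
    den = sum((x - x_mean) ** 2 for x in log_n)
    return num / den


# Synthetic input (an assumption for illustration): uncorrelated Gaussian noise.
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha_white = dfa_exponent(white, scales=[16, 32, 64, 128, 256])
print(f"white-noise exponent: {alpha_white:.2f}")
```

For uncorrelated noise the exponent is expected to lie near 0.5, and for integrated (Brownian) noise near 1.5, which gives a quick sanity check before applying the function to real recordings.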

    Objective dysphonia quantification in vocal fold paralysis: comparing nonlinear with classical measures

    Clinical acoustic voice recording analysis is usually performed using classical perturbation measures including jitter, shimmer and noise-to-harmonic ratios. However, restrictive mathematical limitations of these measures prevent analysis for severely dysphonic voices. Previous studies of alternative nonlinear random measures addressed wide varieties of vocal pathologies. Here, we analyze a single vocal pathology cohort, testing the performance of these alternative measures alongside classical measures.

We present voice analysis of healthy controls and of unilateral vocal fold paralysis (UVFP) patients recorded pre- and post-operatively, the patients undergoing standard medialisation thyroplasty surgery, using jitter, shimmer and noise-to-harmonic ratio (NHR), and the nonlinear measures recurrence period density entropy (RPDE), detrended fluctuation analysis (DFA) and correlation dimension. Systematizing the preparative editing of the recordings, we found that the novel measures were more stable, and hence more reliable, than the classical measures on healthy controls.

RPDE and jitter are sensitive to improvements pre- to post-operation. Shimmer, NHR and DFA showed no significant change (p > 0.05). All measures detect statistically significant and clinically important differences between controls and patients, both treated and untreated (p < 0.001, AUC > 0.7). Pre- to post-operation, GRBAS ratings show statistically significant and clinically important improvement in overall dysphonia grade (G) (AUC = 0.946, p < 0.001).

Re-calculating AUCs from other study data, we compare these results in terms of clinical importance. We conclude that, when preparative editing is systematized, nonlinear random measures may be useful UVFP treatment effectiveness monitoring tools, and there may be applications for other forms of dysphonia.
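The AUC values quoted in this abstract can be read as the probability that a randomly chosen patient scores higher on a measure than a randomly chosen control. The sketch below computes that quantity directly by pairwise comparison (equivalent to the normalised Mann-Whitney U statistic). The function name `auc` and the scores in the usage lines are hypothetical and are not data from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve by pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical measure values, for illustration only.
patients = [0.62, 0.71, 0.55, 0.80]
controls = [0.40, 0.58, 0.35, 0.50]
print(f"AUC = {auc(patients, controls):.3f}")
```

An AUC of 0.5 means the measure does not separate the groups at all, and 1.0 means perfect separation, so the abstract's threshold of AUC > 0.7 sits between the two.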

    Effects of Waveform PMF on Anti-Spoofing Detection

    In the context of detecting identity impersonation in speaker recognition, we observed that the waveform probability mass function (PMF) of genuine speech differs significantly from the PMF of identity theft extracts. This is true for synthesized or converted speech as well as for replayed speech. In this work, we mainly ask whether this observation has a significant impact on spoofing detection performance. In a second step, we want to reduce the distribution gap of waveforms between authentic speech and spoofing speech. We propose a genuinization of the spoofing speech (by analogy with Gaussianisation), i.e. to obtain spoofing speech with a PMF close to the PMF of genuine speech. Our genuinization is evaluated on ASVspoof 2019 challenge datasets, using the baseline system provided by the challenge organization. In the case of constant Q cepstral coefficients (CQCC) features, the genuinization leads to a degradation of the baseline system performance by a factor of 10, which shows a potentially large impact of the distribution of waveforms on spoofing detection performance. However, by "playing" with all configurations, we also observed different behaviors, including performance improvements in specific cases. This leads us to conclude that waveform distribution plays an important role and must be taken into account by anti-spoofing systems.
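The abstract does not say how the genuinization is computed. As a rough illustration of the idea only, the sketch below estimates a waveform PMF from integer sample values and then reshapes one signal's distribution toward another's using rank-based quantile mapping. Quantile mapping is a stand-in technique chosen here, not the authors' stated method, and the function names and sample values are hypothetical.

```python
from collections import Counter


def waveform_pmf(samples):
    """Empirical probability mass function of quantised sample values."""
    counts = Counter(samples)
    total = len(samples)
    return {value: count / total for value, count in sorted(counts.items())}


def match_pmf(source, reference):
    """Map `source` samples so their value distribution follows `reference`.

    Rank-based quantile mapping: the k-th smallest source sample is replaced
    by the reference value at the same relative rank. The temporal order of
    the source is preserved, only amplitudes change.
    """
    ref_sorted = sorted(reference)
    order = sorted(range(len(source)), key=lambda i: source[i])
    mapped = [0] * len(source)
    for rank, idx in enumerate(order):
        ref_pos = rank * len(ref_sorted) // len(source)
        mapped[idx] = ref_sorted[ref_pos]
    return mapped


# Hypothetical integer sample values, for illustration only.
spoofed = [5, 0, 1, 5, 0, 5]
genuine = [-2, -1, 0, 0, 1, 2]
reshaped = match_pmf(spoofed, genuine)
print(waveform_pmf(reshaped))
```

When both signals have the same length, the mapped signal has exactly the reference PMF; with different lengths it only approximates it.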

    Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

    In an era when the Internet of Things (IoT) market segment tops the chart in various business reports, the field of medicine is expected to gain a large benefit from the explosion of wearables and internet-connected sensors that surround us to acquire and communicate unprecedented data on symptoms, medication, food intake, and daily-life activities impacting one's health and wellness. However, IoT-driven healthcare would have to overcome many barriers, such as: 1) There is an increasing demand for data storage on cloud servers where the analysis of the medical big data becomes increasingly complex, 2) The data, when communicated, are vulnerable to security and privacy issues, 3) The communication of the continuously collected data is not only costly but also energy hungry, 4) Operating and maintaining the sensors directly from the cloud servers are non-trivial tasks. This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing the interfaces between the sensors and cloud servers for facilitating connectivity, data transfer, and a queryable local database. The centerpiece of Fog Computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage and communication of various medical data such as pathological speech data of individuals with speech disorders, Phonocardiogram (PCG) signals for heart rate estimation, and Electrocardiogram (ECG)-based Q, R, S detection. Comment: 29 pages, 30 figures, 5 tables. 
Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices, Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springe

    Analysis and Detection of Pathological Voice using Glottal Source Features

    Automatic detection of voice pathology enables objective assessment and earlier intervention for the diagnosis. This study provides a systematic analysis of glottal source features and investigates their effectiveness in voice pathology detection. Glottal source features are extracted using glottal flows estimated with the quasi-closed phase (QCP) glottal inverse filtering method, using approximate glottal source signals computed with the zero frequency filtering (ZFF) method, and using acoustic voice signals directly. In addition, we propose to derive mel-frequency cepstral coefficients (MFCCs) from the glottal source waveforms computed by QCP and ZFF to effectively capture the variations in glottal source spectra of pathological voice. Experiments were carried out using two databases, the Hospital Universitario Principe de Asturias (HUPA) database and the Saarbrucken Voice Disorders (SVD) database. Analysis of features revealed that the glottal source contains information that discriminates normal and pathological voice. Pathology detection experiments were carried out using a support vector machine (SVM). From the detection experiments it was observed that the performance achieved with the studied glottal source features is comparable to or better than that of conventional MFCCs and perceptual linear prediction (PLP) features. The best detection performance was achieved when the glottal source features were combined with the conventional MFCCs and PLP features, which indicates the complementary nature of the features.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread to other aspects of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.

    Improving automatic detection of obstructive sleep apnea through nonlinear analysis of sustained speech

    We present a novel approach for the detection of severe obstructive sleep apnea (OSA) based on patients' voices, introducing nonlinear measures to describe sustained speech dynamics. Nonlinear features were combined with state-of-the-art speech recognition systems using statistical modeling techniques (Gaussian mixture models, GMMs) over cepstral parameterization (MFCC) for both continuous and sustained speech. Tests were performed on a database including speech records from both severe OSA and control speakers. A 10 % relative reduction in classification error was obtained for sustained speech when combining MFCC-GMM and nonlinear features, and 33 % when fusing nonlinear features with both sustained and continuous MFCC-GMM. Accuracy reached 88.5 %, allowing the system to be used in OSA early detection. Tests showed that nonlinear features and MFCCs are weakly correlated on sustained speech, but uncorrelated on continuous speech. Results also suggest the existence of nonlinear effects in OSA patients' voices, which should be found in continuous speech.

    Advanced bioimpedance signal processing techniques for hemodynamic monitoring during anesthesia

    Embargo applied from the date of defense until May 2020. Cardiac output (CO) defines the blood flow arriving from the heart to the different organs in the body and it is thus a primary determinant of global O2 transport. Cardiac output has traditionally been measured using invasive methods, whose risk sometimes exceeds the advantages of cardiac output monitoring. In this context, the minimization of risk in new noninvasive technologies for CO monitoring could translate into major advantages for clinicians, hospitals and patients: ease of use and availability, reduced recovery time, and improved patient outcome. Impedance Cardiography (ICG) is a promising noninvasive technology for cardiac output monitoring, but available information on ICG signals is scarcer than for other physiological signals such as the electrocardiogram (ECG). The present Doctoral Thesis contributes to the development of signal treatment techniques for the ICG in order to create an innovative hemodynamic monitor. First, an extensive literature review is provided regarding the basics of the clinical background in which cardiac output monitoring is used and concerning the state of the art of cardiac output monitors on the market. This Doctoral Thesis has produced a considerable amount of clinical data which is also explained in detail. These clinical data are also useful to complement the theoretical explanation of patient indices such as heart rate variability, blood flow and blood pressure. In addition, a new method to create synthetic biomedical signals with known time-frequency characteristics is introduced. One of the first analyses in this Doctoral Thesis studies the time difference between peak points of the heart beats in the ECG and the ICG: the RC segment. This RC segment is a measure of the time delay between electrical and mechanical activity of the heart. The relationship of the RC segment with blood pressure and heart interval is analyzed.
The concordance of beat durations of both the electrocardiogram and the impedance cardiogram is one of the key results to develop new artefact detection algorithms, and the RC segment could also have an impact in describing the hemodynamics of a patient. Time-frequency distributions (TFDs) are also used to characterize how the frequency content in impedance cardiography signals changes with time. Since TFDs are calculated using concrete kernels, a new method to select the best kernel by using synthetic signals is presented. Optimized TFDs of ICG signals are then calculated to extract several features which are used to discriminate between different anesthesia states in patients undergoing surgery. TFD-derived features are also used to describe whole surgical operations. Relationships between TFD-derived features are analyzed and prediction models for cardiac output are designed. These prediction models show that the TFD-derived features are related to the patients' cardiac output. Finally, a validation study for the qCO monitor is presented. The qCO monitor has been designed using some of the techniques that resulted from this Doctoral Thesis. The main outputs of this work have been protected with a patent which has already been filed. In conclusion, this Doctoral Thesis has produced a considerable amount of clinical data and a variety of analysis and processing techniques for impedance cardiography signals which have been included in commercial medical devices already available on the market.
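The RC segment described in this abstract is the delay between the ECG peak and the ICG peak of the same heart beat. A minimal sketch of that pairing, assuming peak times have already been detected and are given in seconds, might look as follows. The function name `rc_segments`, the pairing rule (first ICG peak after each ECG peak and before the next one), and the example times are assumptions made for illustration and are not taken from the thesis.

```python
from bisect import bisect_right


def rc_segments(r_peaks, c_peaks):
    """Delay from each ECG R-peak to the ICG C-peak of the same beat.

    A beat contributes a delay only if a C-peak occurs after its R-peak and
    before the next R-peak; beats without such a C-peak are skipped.
    """
    r_sorted = sorted(r_peaks)
    c_sorted = sorted(c_peaks)
    delays = []
    for i, r_time in enumerate(r_sorted):
        j = bisect_right(c_sorted, r_time)  # index of first C-peak after r_time
        if j == len(c_sorted):
            continue
        next_r = r_sorted[i + 1] if i + 1 < len(r_sorted) else float("inf")
        if c_sorted[j] < next_r:
            delays.append(c_sorted[j] - r_time)
    return delays


# Hypothetical peak times in seconds, for illustration only.
r_times = [0.0, 1.0, 2.0]
c_times = [0.25, 1.25]
print(rc_segments(r_times, c_times))
```

Skipping beats that lack a matching ICG peak keeps a missed detection from being paired with the wrong beat, which matters when the delays are later related to blood pressure or heart interval.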