Analysis of Vocal Disorders in a Feature Space
This paper provides a way to classify vocal disorders for clinical
applications. This goal is achieved by means of geometric signal separation in
a feature space. Typical quantities from chaos theory (like entropy,
correlation dimension and first Lyapunov exponent) and some conventional ones
(like autocorrelation and spectral factor) are analysed and evaluated, in order
to provide entries for the feature vectors. A way of quantifying the amount of
disorder is proposed by means of a healthy index that measures the distance of
a voice sample from the centre of mass of both healthy and sick clusters in the
feature space. A successful application of the geometrical signal separation is
reported, concerning the distinction between normal and disordered phonation.
Comment: 12 pages, 3 figures, accepted for publication in Medical Engineering
& Physics
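The abstract above describes the healthy index only as a distance from the centres of mass of the healthy and sick clusters. The following is a minimal sketch of one way such an index could be built, not the authors' formula: the normalisation `d_sick / (d_healthy + d_sick)`, the function names, and the two-dimensional toy clusters are all assumptions made for illustration.

```python
import math

def centroid(points):
    """Centre of mass of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def healthy_index(sample, healthy_cluster, sick_cluster):
    """Hypothetical index in [0, 1]: 1 at the healthy centroid, 0 at the sick one.

    Assumes the two centroids differ, otherwise the denominator can be zero.
    """
    d_healthy = math.dist(sample, centroid(healthy_cluster))
    d_sick = math.dist(sample, centroid(sick_cluster))
    return d_sick / (d_healthy + d_sick)

# Toy feature vectors (invented values, not data from the paper).
healthy = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]   # centroid (1, 1)
sick = [[4.0, 5.0], [6.0, 5.0], [5.0, 2.0]]      # centroid (5, 4)

print(healthy_index([3.0, 2.5], healthy, sick))  # sample halfway between centroids
```

A sample equidistant from both centroids scores 0.5 under this normalisation; any monotone function of the two distances would serve the same illustrative purpose.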
Characterization of Healthy and Pathological Voice Through Measures Based on Nonlinear Dynamics
In this paper, we propose to quantify the quality of the recorded voice through objective nonlinear measures. Quantification of speech signal quality has been traditionally carried out with linear techniques since the classical model of voice production is a linear approximation. Nevertheless, nonlinear behaviors in the voice production process have been shown. This paper studies the usefulness of six nonlinear chaotic measures based on nonlinear dynamics theory in the discrimination between two levels of voice quality: healthy and pathological. The studied measures are first- and second-order Rényi entropies, the correlation entropy and the correlation dimension. These measures were obtained from the speech signal in the phase-space domain. The values of the first minimum of mutual information function and Shannon entropy were also studied. Two databases were used to assess the usefulness of the measures: a multiquality database composed of four levels of voice quality (healthy voice and three levels of pathological voice); and a commercial database (MEEI Voice Disorders) composed of two levels of voice quality (healthy and pathological voices). A classifier based on standard neural networks was implemented in order to evaluate the measures proposed. Global success rates of 82.47% (multiquality database) and 99.69% (commercial database) were obtained.
Analysis of complexity and modulation spectra parameterizations to characterize voice roughness
Disordered voices are frequently assessed by speech pathologists using acoustic perceptual evaluations. This might lead to problems due to the subjective nature of the process and due to the influence of external factors which compromise the quality of the assessment. In order to increase the reliability of the evaluations, the design of new indicator parameters obtained from voice signal processing is desirable. With that in mind, this paper presents an automatic evaluation system which emulates perceptual assessments of the roughness level in human voice. Two parameterization methods are used: complexity, which has already been used successfully in previous works, and modulation spectra. For the latter, a new group of parameters has been proposed: Low Modulation Ratio (LMR), Contrast (MSW) and Homogeneity (MSH). The tested methodology also employs PCA and LDA to reduce the dimensionality of the feature space, and GMM classifiers to evaluate the ability of the proposed features to distinguish the different roughness levels. An efficiency of 82% and a Cohen's Kappa Index of 0.73 are obtained using the modulation spectra parameters, while the complexity parameters achieved 73% and 0.58 respectively. The obtained results indicate the usefulness of the proposed modulation spectra features for the automatic evaluation of voice roughness, which could lead to new parameters useful to clinicians
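The abstract above reports both an efficiency (accuracy) and a Cohen's Kappa Index. As a minimal sketch of how the two relate, kappa corrects the observed agreement p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The confusion matrix below is invented for illustration and is not taken from the paper.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    p_o = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of the row and column marginals, summed over classes.
    p_e = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / total ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical two-class result: 35 of 50 samples correct (accuracy 0.70).
example = [[20, 5],
           [10, 15]]
print(cohens_kappa(example))
```

Here an accuracy of 0.70 corresponds to a kappa of about 0.4, which shows why the kappa values quoted in the abstract are lower than the accuracies.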
Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications
In the era when the market segment of Internet of Things (IoT) tops the chart
in various business reports, it is apparently envisioned that the field of
medicine expects to gain a large benefit from the explosion of wearables and
internet-connected sensors that surround us to acquire and communicate
unprecedented data on symptoms, medication, food intake, and daily-life
activities impacting one's health and wellness. However, IoT-driven healthcare
would have to overcome many barriers, such as: 1) There is an increasing demand
for data storage on cloud servers where the analysis of the medical big data
becomes increasingly complex, 2) The data, when communicated, are vulnerable to
security and privacy issues, 3) The communication of the continuously collected
data is not only costly but also energy hungry, 4) Operating and maintaining
the sensors directly from the cloud servers are non-trivial tasks. This book
chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog
Computing is a service-oriented intermediate layer in IoT, providing the
interfaces between the sensors and cloud servers for facilitating connectivity,
data transfer, and a queryable local database. The centerpiece of Fog computing
is a low-power, intelligent, wireless, embedded computing node that carries out
signal conditioning and data analytics on raw data collected from wearables or
other medical sensors and offers efficient means to serve telehealth
interventions. We implemented and tested a fog computing system using the
Intel Edison and Raspberry Pi that allows acquisition, computing, storage and
communication of the various medical data such as pathological speech data of
individuals with speech disorders, Phonocardiogram (PCG) signal for heart rate
estimation, and Electrocardiogram (ECG)-based Q, R, S detection.
Comment: 29 pages, 30 figures, 5 tables. Keywords: Big Data, Body Area
Network, Body Sensor Network, Edge Computing, Fog Computing, Medical
Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment,
Wearable Devices, Chapter in Handbook of Large-Scale Distributed Computing in
Smart Healthcare (2017), Springer
Non uniform embedding based on relevance analysis with reduced computational complexity: application to the detection of pathologies from biosignal recordings
Nonlinear analysis tools for studying and characterizing the dynamics of physiological signals have gained popularity, mainly because tracking sudden alterations of the inherent complexity of biological processes might be an indicator of altered physiological states.
Typically, in order to perform an analysis with such tools, the physiological variables that describe the biological process under study are used to reconstruct the underlying dynamics of the biological processes. For that goal, a procedure called time-delay or uniform embedding is usually employed.
Nonetheless, there is evidence that it cannot deal with non-stationary signals, such as those recorded from many physiological processes.
To address this drawback, this paper evaluates the utility of non-conventional time series reconstruction procedures based on non uniform embedding, applying them to automatic pattern recognition tasks. The paper compares a state of the art non uniform approach with a novel scheme which fuses embedding and feature selection at once, searching for better reconstructions of the dynamics of the system. Moreover, results are also compared with two classic uniform embedding techniques. Thus, the goal is to compare uniform and non uniform reconstruction techniques, including the one proposed in this work, for pattern recognition in biomedical signal processing tasks. Once the state space is reconstructed, the scheme characterizes it with three classic nonlinear dynamic features (Largest Lyapunov Exponent, Correlation Dimension and Recurrence Period Density Entropy), while classification is carried out by means of a simple k-NN classifier. In order to test its generalization capabilities, the approach was tested with three different physiological databases (Speech Pathologies, Epilepsy and Heart Murmurs).
In terms of the accuracy obtained in automatically detecting the presence of pathologies, and for the three types of biosignals analyzed, the non uniform techniques used in this work slightly outperformed the uniform methods, suggesting their usefulness for characterizing non-stationary biomedical signals in pattern recognition applications. Moreover, in view of the results obtained and its low computational load, the proposed technique appears applicable to the tasks under study
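To make the distinction in the abstract above concrete, the sketch below contrasts uniform (time-delay) embedding, where every coordinate is separated by the same lag, with a non uniform embedding that uses an arbitrary set of lags. It illustrates the general idea only: the function names and the hand-picked lag set are assumptions, and the relevance-based lag selection proposed in the paper is not reproduced here.

```python
def uniform_embed(series, dim, tau):
    """Time-delay embedding: vectors [x(t), x(t + tau), ..., x(t + (dim - 1) * tau)]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[t + k * tau] for k in range(dim)] for t in range(n_vectors)]

def nonuniform_embed(series, lags):
    """Embedding with an arbitrary list of non-negative lags, e.g. [0, 1, 4]."""
    n_vectors = len(series) - max(lags)
    if n_vectors <= 0:
        raise ValueError("series too short for these lags")
    return [[series[t + lag] for lag in lags] for t in range(n_vectors)]

signal = [0, 1, 2, 3, 4, 5]                 # toy series, not real biosignal data
print(uniform_embed(signal, dim=3, tau=2))  # equally spaced coordinates
print(nonuniform_embed(signal, [0, 1, 4]))  # hand-picked, unequal lags
```

Features such as the correlation dimension would then be computed on the reconstructed vectors; how the lags are chosen is what distinguishes one non uniform scheme from another.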
Fault Analysis of Electromechanical Systems using Information Entropy Concepts
Fault analysis of mechanical and electromechanical systems has been a subject of considerable interest in the systems and control research community. Entropy, under its various formulations, is an important variable, which is unrivaled when it comes to measuring order (or organization) and/or disorder (or disorganization). Researchers have successfully used entropy-based concepts to solve various challenging problems in engineering, mathematics, meteorology, biotechnology, medicine, statistics etc. This research tries to analyze faults in electromechanical systems using information entropy concepts. The objectives of this research are to develop a method to evaluate signal entropy of a dynamical system using only input/output measurements, and to use this entropy measure to analyze faults within a dynamical system. Given discrete-time signals corresponding to the three-phase voltages and currents of an electromechanical system being monitored, the problem is to analyze whether or not this system is healthy.
The concepts of Shannon entropy and relative entropy come from the field of Information Theory. They measure the degree of uncertainty that exists in a system. The main idea behind this approach is that the system's dynamics may have regularities hidden in the measurements that are not obvious. The Shannon entropy and relative entropy measures are calculated using probability distribution functions (PDFs) that are formed by sampling the time series of the currents and voltages of a system. The system's health is monitored by, first, sampling the currents and voltages at certain time intervals, then generating the corresponding PDFs and, finally, calculating the information entropy measures. If the system dynamics are unchanged, or in other words, the system continues to be healthy, then the relative entropy measures will be consistently low or constant. But if the system dynamics change due to damage, then the corresponding relative entropy and Shannon entropy measures will increase compared to the entropy of the system with less damage.
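The monitoring procedure described above (sample the signal, form a PDF, compute entropy measures) can be sketched as follows. This is an illustration under stated assumptions rather than the method of the thesis: the PDF is estimated with a fixed-range histogram, the bin count and sample values are invented, and a small `eps` guards the relative entropy against empty reference bins.

```python
import math

def histogram_pdf(samples, n_bins, lo, hi):
    """Estimate a discrete PDF by counting samples in n_bins equal bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        idx = int((x - lo) / width)
        idx = max(0, min(idx, n_bins - 1))  # clamp out-of-range samples to the edge bins
        counts[idx] += 1
    return [c / len(samples) for c in counts]

def shannon_entropy(p):
    """H(p) = -sum p_i * log2(p_i), in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def relative_entropy(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) in bits; eps avoids division by zero."""
    return sum(pi * math.log2(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Invented measurements: a reference window and a later window with changed dynamics.
baseline = histogram_pdf([0.1, 0.3, 0.6, 0.9], n_bins=4, lo=0.0, hi=1.0)
changed = histogram_pdf([0.6, 0.7, 0.8, 0.9], n_bins=4, lo=0.0, hi=1.0)

print(shannon_entropy(baseline), shannon_entropy(changed))
print(relative_entropy(baseline, baseline), relative_entropy(changed, baseline))
```

The relative entropy of a window against itself is zero, and it grows as the later window's distribution drifts away from the reference, which is the behaviour the abstract relies on for fault detection.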
Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy
The MAVEBA Workshop proceedings, held on a biannual basis, collect the scientific papers presented both as oral and poster contributions, during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of main phonatory dysfunctions, as well as the biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies. The Workshop has the sponsorship of: Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.), IEEE Biomedical Engineering Soc. Special Issues of International Journals have been, and will be, published, collecting selected papers from the conference
Automatic voice disorder detection using self-supervised representations
Many speech features and models, including Deep Neural Networks (DNN), are used for classification tasks between healthy and pathological speech with the Saarbruecken Voice Database (SVD). However, accuracy values of 80.71% for phrases or 82.8% for vowels /aiu/ are the highest reported for audio samples in SVD when the evaluation includes the wide range of pathologies in the database, instead of a selection of some pathologies. This paper targets this top performance in the state-of-the-art Automatic Voice Disorder Detection (AVDD) systems. In the framework of a DNN-based AVDD system we study the capability of Self-Supervised (SS) representation learning for describing discriminative cues between healthy and pathological speech. The system processes the SS temporal sequence of features with a single feed-forward layer and Class-Token (CT) Transformer for obtaining the classification between healthy and pathological speech. Furthermore, a suitable extension of the training set with out-of-domain data is also evaluated to deal with the low availability of data for using DNN-based models in voice pathology detection. Experimental results using audio samples corresponding to phrases in the SVD dataset, including all pathologies available, show classification accuracy values of up to 93.36%. This means that the proposed AVDD system achieved accuracy improvements of 4.1% without the training data extension, and 15.62% after the training data extension compared to the baseline system. Beyond the novelty of using SS representations for AVDD, the fact of obtaining accuracies over 90% in these conditions and using the whole set of pathologies in the SVD is a milestone for voice disorder-related research. Furthermore, the study of how the amount of in-domain data in the training set relates to system performance provides guidance for the data preparation stage.
Lessons learned in this work suggest guidelines for taking advantage of DNN, to boost the performance in developing automatic systems for diagnosis, treatment, and monitoring of voice pathologies
COVID-19 activity screening by a smart-data-driven multi-band voice analysis
COVID-19 is a disease caused by the new coronavirus SARS-CoV-2 which can lead to severe respiratory infections. Since its first detection it has caused more than six million deaths worldwide. Non-invasive, low-cost COVID-19 diagnostic methods with fast and accurate results are still needed for fast disease control. In this research, 3 different signal analyses have been applied (per broadband, per sub-bands and per broadband & sub-bands) to Cough, Breathing & Speech signals of the Coswara dataset to extract non-linear patterns (Energy, Entropies, Correlation Dimension, Detrended Fluctuation Analysis, Lyapunov Exponent & Fractal Dimensions) for feeding an XGBoost classifier to discriminate COVID-19 activity at its different stages. Classification accuracies ranging between 83.33% and 98.46% have been achieved, surpassing the state-of-the-art methods in some comparisons. It should be emphasized that an accuracy of 98.46% was reached on the pair Healthy Controls vs all COVID-19 stages. The results show that the method may be adequate for COVID-19 diagnosis screening assistance.
Big Data analytics to assess personality based on voice analysis
Bachelor's Thesis in Telecommunication Technologies and Services Engineering
When humans speak, the produced series of acoustic signs do not encode only the
linguistic message they wish to communicate, but also several other types of information
about themselves and their states that show glimpses of their personalities and can be
apprehended by judges. As there is nowadays a trend to film job candidates' interviews, the
aim of this Thesis is to explore possible correlations between speech features extracted from
interviews and personality characteristics established by experts, and to try to predict in a
candidate the Big Five personality traits: Conscientiousness, Agreeableness, Neuroticism,
Openness to Experience and Extraversion. The features were extracted from a genuine
database of video recordings of 44 women acquired in 2020, plus 78 recordings from 2019 and
earlier taken from a previous study.
Even though many significant correlations were found for each year's dataset, many of
them proved to be inconsistent across the two studies. Only extraversion, and openness
in a more limited way, showed a good number of clear correlations. Essentially, extraversion
has been found to be related to the variation in the slope of the pitch (usually at the end of
sentences), which indicates that a more "singing" voice could be associated with a higher
score. In addition, spectral entropy and roll-off measurements have also been found to
indicate that larger changes in the spectrum (which may also be related to more "singing"
voices) could be associated with greater extraversion too.
Regarding the predictive modelling algorithms, aimed at estimating personality traits from the
speech features obtained for the study, results were very limited in terms of
accuracy and RMSE, as also seen in scatter plots for regression models and confusion
matrices for classification evaluation. Nevertheless, various results suggest that
there is some predictive capability, and extraversion and openness also ended up being
the most predictable personality traits. Better outcomes were achieved when predictions
were based on one specific feature instead of all of them or a reduced group, as
was the case for openness when estimated through linear and logistic regression based on
time over 90% of the variation range of the deltas from the entropy of the spectrum module.
The same held for extraversion, as it correlates well with features relating to variation in the
F0 decreasing slope and variations in the spectrum. For the predictions, several machine learning
algorithms were used, such as linear regression, logistic regression and random forests