4,933 research outputs found

    Analysis of Vocal Disorders in a Feature Space

    This paper provides a way to classify vocal disorders for clinical applications. This goal is achieved by means of geometric signal separation in a feature space. Typical quantities from chaos theory (such as entropy, correlation dimension and first Lyapunov exponent) and some conventional ones (such as autocorrelation and spectral factor) are analysed and evaluated in order to provide entries for the feature vectors. A way of quantifying the amount of disorder is proposed by means of a healthy index that measures the distance of a voice sample from the centres of mass of both the healthy and sick clusters in the feature space. A successful application of the geometrical signal separation is reported, concerning the distinction between normal and disordered phonation. Comment: 12 pages, 3 figures, accepted for publication in Medical Engineering & Physics
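The abstract does not state the formula for the healthy index, so the following is only a minimal sketch of how such an index could be built from centroid distances. The normalisation `d_sick / (d_healthy + d_sick)`, the function names and the toy two-dimensional feature vectors are all assumptions made for illustration, not the authors' definition.

```python
import math


def centroid(points):
    """Centre of mass of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def healthy_index(sample, healthy, sick):
    """Hypothetical index in [0, 1] (assumed normalisation, not from the paper).

    1.0 means the sample sits on the healthy centroid,
    0.0 means it sits on the sick centroid.
    """
    d_healthy = math.dist(sample, centroid(healthy))
    d_sick = math.dist(sample, centroid(sick))
    total = d_healthy + d_sick
    if total == 0.0:
        # Both centroids coincide with the sample: the index is undefined.
        raise ValueError("healthy and sick centroids coincide with the sample")
    return d_sick / total


# Toy feature vectors, e.g. (entropy, correlation dimension); values are made up.
healthy_cluster = [[0.0, 0.0], [0.0, 2.0]]
sick_cluster = [[4.0, 0.0], [4.0, 2.0]]
print(healthy_index([1.0, 1.0], healthy_cluster, sick_cluster))
```

Under this assumed normalisation, a sample equidistant from both centroids scores 0.5, and the score rises towards 1 as the sample approaches the healthy cluster.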

    Characterization of Healthy and Pathological Voice Through Measures Based on Nonlinear Dynamics

    In this paper, we propose to quantify the quality of the recorded voice through objective nonlinear measures. Quantification of speech signal quality has traditionally been carried out with linear techniques, since the classical model of voice production is a linear approximation. Nevertheless, nonlinear behaviors in the voice production process have been shown. This paper studies the usefulness of six nonlinear chaotic measures based on nonlinear dynamics theory in the discrimination between two levels of voice quality: healthy and pathological. The studied measures are the first- and second-order Rényi entropies, the correlation entropy and the correlation dimension. These measures were obtained from the speech signal in the phase-space domain. The values of the first minimum of the mutual information function and the Shannon entropy were also studied. Two databases were used to assess the usefulness of the measures: a multiquality database composed of four levels of voice quality (healthy voice and three levels of pathological voice), and a commercial database (MEEI Voice Disorders) composed of two levels of voice quality (healthy and pathological voices). A classifier based on standard neural networks was implemented in order to evaluate the proposed measures. Global success rates of 82.47% (multiquality database) and 99.69% (commercial database) were obtained.
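As a reminder of what the entropy measures named above compute, here is a minimal sketch of the Shannon and Rényi entropies of a discrete probability distribution. The paper estimates its measures from the reconstructed phase space of the speech signal; this sketch assumes the distribution is already available as a list of probabilities, and the function names and the use of natural logarithms are illustrative choices.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (in nats) of a discrete distribution p."""
    return -sum(x * math.log(x) for x in p if x > 0)


def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (in nats).

    The order-1 case is defined as the limit alpha -> 1,
    which equals the Shannon entropy.
    """
    if alpha == 1:
        return shannon_entropy(p)
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1 - alpha)


# Made-up distribution over three phase-space cells.
p = [0.5, 0.25, 0.25]
print(renyi_entropy(p, 1), renyi_entropy(p, 2))
```

For a uniform distribution all orders coincide; for any other distribution the second-order value is smaller than the first-order one, which is why the two orders carry different information about how concentrated the distribution is.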

    Analysis of complexity and modulation spectra parameterizations to characterize voice roughness

    Disordered voices are frequently assessed by speech pathologists using acoustic perceptual evaluations. This can lead to problems because of the subjective nature of the process and the influence of external factors that compromise the quality of the assessment. To increase the reliability of the evaluations, the design of new indicator parameters obtained from voice signal processing is desirable. With that in mind, this paper presents an automatic evaluation system which emulates perceptual assessments of the roughness level in the human voice. Two parameterization methods are used: complexity, which has already been used successfully in previous works, and modulation spectra. For the latter, a new group of parameters is proposed: Low Modulation Ratio (LMR), Contrast (MSW) and Homogeneity (MSH). The tested methodology also employs PCA and LDA to reduce the dimensionality of the feature space, and GMM classifiers to evaluate the ability of the proposed features to distinguish the different roughness levels. An efficiency of 82% and a Cohen's Kappa Index of 0.73 are obtained using the modulation spectra parameters, while the complexity parameters achieved 73% and 0.58 respectively. The results indicate the usefulness of the proposed modulation spectra features for the automatic evaluation of voice roughness, which could lead to new parameters useful to clinicians.

    Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

    In an era when the Internet of Things (IoT) market segment tops the charts in various business reports, the field of medicine is expected to benefit greatly from the explosion of wearables and internet-connected sensors that surround us and that acquire and communicate unprecedented data on symptoms, medication, food intake, and daily-life activities affecting one's health and wellness. However, IoT-driven healthcare has to overcome many barriers: 1) there is an increasing demand for data storage on cloud servers, where the analysis of medical big data becomes increasingly complex; 2) the data, when communicated, are vulnerable to security and privacy issues; 3) the communication of the continuously collected data is not only costly but also energy hungry; 4) operating and maintaining the sensors directly from the cloud servers are non-trivial tasks. This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT that provides the interfaces between the sensors and cloud servers to facilitate connectivity, data transfer, and a queryable local database. The centerpiece of Fog Computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers an efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage and communication of various medical data, such as pathological speech data of individuals with speech disorders, Phonocardiogram (PCG) signals for heart rate estimation, and Electrocardiogram (ECG)-based Q, R, S detection. Comment: 29 pages, 30 figures, 5 tables.
Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices. Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springer

    Non uniform embedding based on relevance analysis with reduced computational complexity: application to the detection of pathologies from biosignal recordings

    Nonlinear analysis tools for studying and characterizing the dynamics of physiological signals have gained popularity, mainly because tracking sudden alterations of the inherent complexity of biological processes might indicate altered physiological states. Typically, to perform an analysis with such tools, the physiological variables that describe the biological process under study are used to reconstruct its underlying dynamics. For that purpose, a procedure called time-delay or uniform embedding is usually employed. Nonetheless, there is evidence of its inability to deal with non-stationary signals, such as those recorded from many physiological processes. To address this drawback, this paper evaluates the utility of non-conventional time series reconstruction procedures based on non-uniform embedding, applying them to automatic pattern recognition tasks. The paper compares a state-of-the-art non-uniform approach with a novel scheme which fuses embedding and feature selection at once, searching for better reconstructions of the dynamics of the system. Moreover, results are also compared with two classic uniform embedding techniques. Thus, the goal is to compare uniform and non-uniform reconstruction techniques, including the one proposed in this work, for pattern recognition in biomedical signal processing tasks. Once the state space is reconstructed, the scheme characterizes it with three classic nonlinear dynamic features (Largest Lyapunov Exponent, Correlation Dimension and Recurrence Period Density Entropy), while classification is carried out by means of a simple k-NN classifier. To test its generalization capabilities, the approach was tested with three different physiological databases (Speech Pathologies, Epilepsy and Heart Murmurs).
In terms of the accuracy obtained in automatically detecting the presence of pathologies, and for the three types of biosignals analyzed, the non-uniform techniques used in this work slightly outperformed the uniform methods, suggesting their usefulness for characterizing non-stationary biomedical signals in pattern recognition applications. Moreover, in view of the results obtained and its low computational load, the proposed technique appears suitable for the applications under study.
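To make the distinction between uniform and non-uniform embedding concrete, here is a minimal sketch of state-space reconstruction from a scalar time series. It only illustrates the general idea: a uniform embedding uses equally spaced lags, while a non-uniform one uses an arbitrary set of lags. The function names are hypothetical, and the lag-selection step (the relevance analysis that the paper proposes) is not shown; the non-uniform lags below are simply picked by hand.

```python
def uniform_lags(dim, tau):
    """Lags 0, tau, 2*tau, ... used by classic time-delay embedding."""
    return [k * tau for k in range(dim)]


def embed(signal, lags):
    """Build state vectors [x[t - lag] for lag in lags] for every valid t."""
    start = max(lags)
    return [[signal[t - lag] for lag in lags]
            for t in range(start, len(signal))]


# Toy signal; a real application would use a sampled biosignal.
x = [0, 1, 2, 3, 4, 5]

# Uniform embedding: dimension 3, delay 2 -> lags [0, 2, 4].
print(embed(x, uniform_lags(3, 2)))

# Non-uniform embedding: hand-picked, unequally spaced lags (an assumption).
print(embed(x, [0, 1, 3]))
```

Both calls return a list of three-dimensional state vectors; only the choice of which past samples enter each vector differs, which is the degree of freedom a non-uniform scheme optimises.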

    Fault Analysis of Electromechanical Systems using Information Entropy Concepts

    Fault analysis of mechanical and electromechanical systems has been a subject of considerable interest in the systems and control research community. Entropy, under its various formulations, is an important variable that is unrivaled when it comes to measuring order (or organization) and disorder (or disorganization). Researchers have successfully used entropy-based concepts to solve various challenging problems in engineering, mathematics, meteorology, biotechnology, medicine, statistics and other fields. This research analyzes faults in electromechanical systems using information entropy concepts. Its objectives are to develop a method to evaluate the signal entropy of a dynamical system using only input/output measurements, and to use this entropy measure to analyze faults within a dynamical system. Given discrete-time signals corresponding to the three-phase voltages and currents of an electromechanical system being monitored, the problem is to determine whether or not this system is healthy. The concepts of Shannon entropy and relative entropy come from the field of Information Theory. They measure the degree of uncertainty that exists in a system. The main idea behind this approach is that the system's dynamics may have regularities hidden in the measurements that are not obvious. The Shannon entropy and relative entropy measures are calculated using probability distribution functions (PDFs) formed by sampling the time series of the currents and voltages of a system. The system's health is monitored by first sampling the currents and voltages at certain time intervals, then generating the corresponding PDFs and, finally, calculating the information entropy measures. If the system dynamics are unchanged, in other words if the system continues to be healthy, then the relative entropy measures will be consistently low or constant.
If, however, the system dynamics change due to damage, then the corresponding relative entropy and Shannon entropy measures will increase relative to those of the system with less damage.
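The monitoring procedure described above (sample the signal, form a PDF, compute entropy measures) can be sketched as follows. This is an illustrative sketch only: the fixed-width histogram estimator, the bin count, the `eps` floor that guards against empty reference bins, and all function names are assumptions, and the sample values are made up rather than taken from any measured system.

```python
import math


def histogram_pdf(samples, bins, lo, hi):
    """Estimate a discrete PDF with a fixed-width histogram on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        idx = int((s - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range samples
        counts[idx] += 1
    return [c / len(samples) for c in counts]


def shannon_entropy(p):
    """Shannon entropy (in nats) of a discrete PDF."""
    return -sum(x * math.log(x) for x in p if x > 0)


def relative_entropy(p, q, eps=1e-12):
    """Relative entropy D(p || q) in nats; eps floors empty bins of q."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


# Made-up normalised current samples: a healthy baseline and a later window.
baseline = histogram_pdf([0.1, 0.3, 0.6, 0.9], bins=2, lo=0.0, hi=1.0)
current = histogram_pdf([0.1, 0.2, 0.3, 0.9], bins=2, lo=0.0, hi=1.0)

# Near zero while the distribution matches the baseline, positive once it drifts.
print(relative_entropy(baseline, baseline), relative_entropy(current, baseline))
```

In this sketch the healthy baseline plays the role of the reference PDF, so a relative entropy that stays near zero corresponds to unchanged dynamics and a growing value corresponds to a distribution drifting away from the baseline.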

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the MAVEBA Workshop, which is held on a biennial basis, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images as a support to the clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by: Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.), IEEE Biomedical Engineering Soc. Special Issues of International Journals have been, and will be, published, collecting selected papers from the conference.

    Automatic voice disorder detection using self-supervised representations

    Many speech features and models, including Deep Neural Networks (DNN), are used for classification tasks between healthy and pathological speech with the Saarbruecken Voice Database (SVD). However, accuracy values of 80.71% for phrases or 82.8% for vowels /aiu/ are the highest reported for audio samples in SVD when the evaluation includes the wide range of pathologies in the database, instead of a selection of some pathologies. This paper targets this top performance in state-of-the-art Automatic Voice Disorder Detection (AVDD) systems. In the framework of a DNN-based AVDD system, we study the capability of Self-Supervised (SS) representation learning to describe discriminative cues between healthy and pathological speech. The system processes the SS temporal sequence of features with a single feed-forward layer and a Class-Token (CT) Transformer to classify speech as healthy or pathological. Furthermore, a suitable extension of the training set with out-of-domain data is evaluated to deal with the low availability of data for DNN-based models in voice pathology detection. Experimental results using audio samples corresponding to phrases in the SVD dataset, including all pathologies available, show classification accuracy values of up to 93.36%. This means that the proposed AVDD system achieved accuracy improvements of 4.1% without the training data extension, and 15.62% after the training data extension, compared to the baseline system. Beyond the novelty of using SS representations for AVDD, obtaining accuracies over 90% in these conditions while using the whole set of pathologies in the SVD is a milestone for voice disorder-related research. Furthermore, the study of how the amount of in-domain data in the training set relates to system performance provides guidance for the data preparation stage.
Lessons learned in this work suggest guidelines for taking advantage of DNNs to boost performance when developing automatic systems for the diagnosis, treatment, and monitoring of voice pathologies.

    COVID-19 activity screening by a smart-data-driven multi-band voice analysis

    COVID-19 is a disease caused by the new coronavirus SARS-CoV-2, which can lead to severe respiratory infections. Since its first detection it has caused more than six million deaths worldwide. Non-invasive, low-cost COVID-19 diagnostic methods with fast and accurate results are still needed for rapid disease control. In this research, three different signal analyses (per broadband, per sub-bands, and per broadband & sub-bands) have been applied to the Cough, Breathing & Speech signals of the Coswara dataset to extract non-linear patterns (Energy, Entropies, Correlation Dimension, Detrended Fluctuation Analysis, Lyapunov Exponent & Fractal Dimensions) that feed an XGBoost classifier to discriminate COVID-19 activity at its different stages. Classification accuracies ranging between 83.33% and 98.46% have been achieved, surpassing state-of-the-art methods in some comparisons. The 98.46% accuracy reached on the pair Healthy Controls vs all COVID-19 stages should be emphasized. The results show that the method may be adequate for assisting COVID-19 diagnostic screening.

    Big Data analytics to assess personality based on voice analysis

    Bachelor's thesis (Trabajo Fin de Grado) in Telecommunication Technologies and Services Engineering. When humans speak, the series of acoustic signs they produce encodes not only the linguistic message they wish to communicate, but also several other types of information about themselves and their states, which show glimpses of their personalities and can be perceived by judges. As there is nowadays a trend to film job candidates' interviews, the aim of this thesis is to explore possible correlations between speech features extracted from interviews and personality characteristics established by experts, and to try to predict a candidate's Big Five personality traits: Conscientiousness, Agreeableness, Neuroticism, Openness to Experience and Extraversion. The features were extracted from a genuine database of video recordings of women: 44 acquired in 2020, and 78 acquired in 2019 and before for a previous study. Even though many significant correlations were found for each year's dataset, many of them proved inconsistent across the two studies. Only extraversion, and openness to a more limited extent, showed a good number of clear correlations. Essentially, extraversion was found to be related to the variation in the slope of the pitch (usually at the end of sentences), which indicates that a more "singing" voice could be associated with a higher score. In addition, spectral entropy and roll-off measurements indicate that larger changes in the spectrum (which may also be related to more "singing" voices) could be associated with greater extraversion too. Regarding the predictive modelling algorithms, which aimed to estimate personality traits from the speech features obtained for the study, results were very limited in terms of accuracy and RMSE, as also seen in the scatter plots for the regression models and the confusion matrices for the classification evaluation.
Nevertheless, various results encourage the belief that there is some predictive capability, and extraversion and openness also ended up being the most predictable personality traits. Better outcomes were achieved when predictions were based on one specific feature instead of all of them or a reduced group, as was the case for openness when estimated through linear and logistic regression based on time over 90% of the variation range of the deltas from the entropy of the spectrum module. The same held for extraversion, which correlates well with features relating to variation in the F0 decreasing slope and variations in the spectrum. For the predictions, several machine learning algorithms were used, such as linear regression, logistic regression and random forests.