6 research outputs found
A Semi-Supervised Algorithm for Improving the Consistency of Crowdsourced Datasets: The COVID-19 Case Study on Respiratory Disorder Classification
Cough audio signal classification is a potentially useful tool in screening
for respiratory disorders, such as COVID-19. Since it is dangerous to collect
data from patients with such contagious diseases, many research teams have
turned to crowdsourcing to quickly gather cough sound data, as was done to
generate the COUGHVID dataset. The COUGHVID dataset enlisted expert physicians
to diagnose the underlying diseases present in a limited number of uploaded
recordings. However, this approach suffers from potential mislabeling of the
coughs, as well as notable disagreement between experts. In this work, we use a
semi-supervised learning (SSL) approach to improve the labeling consistency of
the COUGHVID dataset and the robustness of COVID-19 versus healthy cough sound
classification. First, we leverage existing SSL expert knowledge aggregation
techniques to overcome the labeling inconsistencies and sparsity in the
dataset. Next, our SSL approach is used to identify a subsample of re-labeled
COUGHVID audio samples that can be used to train or augment future cough
classification models. The consistency of the re-labeled data is demonstrated
in that it exhibits a high degree of class separability, 3x higher than that of
the user-labeled data, despite the expert label inconsistency present in the
original dataset. Furthermore, the spectral differences in the user-labeled
audio segments are amplified in the re-labeled data, resulting in significantly
different power spectral densities between healthy and COVID-19 coughs, which
demonstrates both the increased consistency of the new dataset and its
explainability from an acoustic perspective. Finally, we demonstrate how the
re-labeled dataset can be used to train a cough classifier. This SSL approach
can be used to combine the medical knowledge of several experts to improve the
database consistency for any diagnostic classification task.
The COUGHVID crowdsourcing dataset: A corpus for the study of large-scale cough analysis algorithms
Cough audio signal classification has been successfully used to diagnose a variety of respiratory conditions, and there has been significant interest in leveraging Machine Learning (ML) to provide widespread COVID-19 screening. The COUGHVID dataset provides over 20,000 crowdsourced cough recordings representing a wide range of subject ages, genders, geographic locations, and COVID-19 statuses. Furthermore, experienced pulmonologists labeled more than 2,000 recordings to diagnose medical abnormalities present in the coughs, thereby contributing one of the largest expert-labeled cough datasets in existence that can be used for a plethora of cough audio classification tasks. As a result, the COUGHVID dataset contributes a wealth of cough recordings for training ML models to address the world's most urgent health crises. For more information about the data collection, pre-processing, validation, and data structure, please refer to the following publication: https://arxiv.org/abs/2009.11644
The cough pre-processing and feature extraction code is available from the following c4science repository: https://c4science.ch/diffusion/10770
Towards Continuous and Ambulatory Blood Pressure Monitoring: Methods for Efficient Data Acquisition for Pulse Transit Time Estimation
We developed a prototype for measuring physiological data for pulse transit time (PTT) estimation that will be used for ambulatory blood pressure (BP) monitoring. The device comprises an embedded system with multimodal sensors that streams high-throughput data to a custom Android application. The primary focus of this paper is on the hardware–software codesign that we developed to address the challenges associated with reliably recording data over Bluetooth on a resource-constrained platform. In particular, we developed a lossless compression algorithm that is based on optimally selective Huffman coding and Huffman prefixed coding, which yields virtually identical compression ratios to the standard algorithm, but with a 67–99% reduction in the size of the compression tables. In addition, we developed a hybrid software–hardware flow control method to eliminate microcontroller (MCU) interrupt-latency related data loss when multi-byte packets are sent from the phone to the embedded system via a Bluetooth module at baud rates exceeding 115,200 bit/s. The empirical error rate obtained with the proposed method with the baud rate set to 460,800 bit/s was identically equal to 0%. Our robust and computationally efficient physiological data acquisition system will enable field experiments that will drive the development of novel algorithms for PTT-based continuous BP monitoring.
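For orientation, the "standard algorithm" that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration of textbook Huffman coding over byte values, not the optimally selective or prefixed variant the paper develops; the function names `huffman_code`, `encode`, and `decode` are hypothetical and chosen for this example only.

```python
import heapq
from collections import Counter


def huffman_code(data):
    """Build a standard Huffman table mapping each byte value to a bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, unique tie-breaker, {symbol: code so far}).
    # The unique tie-breaker guarantees the dicts are never compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code inside them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data, table):
    """Concatenate the code of every byte into one bit string."""
    return "".join(table[b] for b in data)


def decode(bits, table):
    """Invert the table; valid because Huffman codes are prefix-free."""
    inverse = {code: sym for sym, code in table.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return bytes(out)
```

The table built here holds one entry per distinct symbol, which is the table size that the paper reports reducing on the resource-constrained platform.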
Unobtrusive Estimation of Cardiac Contractility and Stroke Volume Changes Using Ballistocardiogram Measurements on a High Bandwidth Force Plate
Unobtrusive and inexpensive technologies for monitoring the cardiovascular health of heart failure (HF) patients outside the clinic can potentially improve their continuity of care by enabling therapies to be adjusted dynamically based on the changing needs of the patients. Specifically, cardiac contractility and stroke volume (SV) are two key aspects of cardiovascular health that change significantly for HF patients as their condition worsens, yet these parameters are typically measured only in hospital/clinical settings, or with implantable sensors. In this work, we demonstrate accurate measurement of cardiac contractility (based on pre-ejection period, PEP, timings) and SV changes in subjects using ballistocardiogram (BCG) signals detected via a high bandwidth force plate. The measurement is unobtrusive, as it simply requires the subject to stand still on the force plate while holding electrodes in the hands for simultaneous electrocardiogram (ECG) detection. Specifically, we aimed to assess whether the high bandwidth force plate can provide accuracy beyond what is achieved using modified weighing scales we have developed in prior studies, based on timing intervals, as well as signal-to-noise ratio (SNR) estimates. Our results indicate that the force plate BCG measurement provides more accurate timing information and allows for better estimation of PEP than the scale BCG (r² = 0.85 vs. r² = 0.81) during resting conditions. This correlation is stronger during recovery after exercise due to more significant changes in PEP (r² = 0.92). The improvement in accuracy can be attributed to the wider bandwidth of the force plate. ΔSV (i.e., changes in stroke volume) estimations from the force plate BCG resulted in an average error percentage of 5.3% with a standard deviation of ±4.2% across all subjects. Finally, SNR calculations showed slightly better SNR in the force plate measurements among all subjects, but the small difference confirmed that SNR is limited by motion artifacts rather than instrumentation.
REWARD: Design, Optimization, and Evaluation of a Real-Time Relative-Energy Wearable R-Peak Detection Algorithm
Wearable devices are an unobtrusive, cost-effective means of continuous ambulatory monitoring of chronic cardiovascular diseases. However, on these resource-constrained systems, electrocardiogram (ECG) processing algorithms must consume minimal power and memory, yet robustly provide accurate physiological information. This work presents REWARD, the Relative-Energy-based WeArable R-Peak Detection algorithm, which is a novel ECG R-peak detection mechanism based on a nonlinear filtering method called Relative-Energy (Rel-En). REWARD is designed and optimized for real-time execution on wearable systems. Then, this novel algorithm is compared against three state-of-the-art real-time R-peak detection algorithms in terms of accuracy, memory footprint, and energy consumption. The Physionet QT and NST Databases were employed to evaluate the algorithms' accuracy and robustness to noise, respectively. Then, a 32-bit ARM Cortex-M3-based microcontroller was used to measure the energy usage, computational burden, and memory footprint of the four algorithms. REWARD consumed at least 63% less energy and 32% less RAM than the other algorithms while obtaining comparable accuracy results. Therefore, REWARD would be a suitable choice of R-peak detection mechanism for wearable devices that perform more complex ECG analysis, whose algorithms require additional energy and memory resources.
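The abstract does not spell out how Rel-En works, so the following is only a rough sketch of one plausible reading: each sample is scaled by the ratio of short-window to long-window signal energy, which emphasizes brief, high-energy events such as R-peaks, and peaks are then picked with a fixed threshold and refractory period. This is an assumption for illustration, not the REWARD implementation; the window lengths, threshold, refractory period, and all function names are hypothetical.

```python
import math


def sliding_energy(x, width):
    """Mean squared value over a centered window (truncated at the edges)."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        window = x[lo:hi]
        out.append(sum(v * v for v in window) / len(window))
    return out


def relative_energy(x, short_width, long_width, eps=1e-12):
    """Scale each sample by short-term energy divided by long-term energy.

    Assumed form of a relative-energy filter; eps avoids division by zero.
    """
    short = sliding_energy(x, short_width)
    long_ = sliding_energy(x, long_width)
    return [v * s / (l + eps) for v, s, l in zip(x, short, long_)]


def detect_peaks(y, threshold, refractory):
    """Return indices of local maxima above threshold, spaced by refractory."""
    peaks = []
    for i in range(1, len(y) - 1):
        if y[i] > threshold and y[i] >= y[i - 1] and y[i] > y[i + 1]:
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
    return peaks


def synthetic_signal(length=1000, spikes=(200, 450, 700)):
    """Toy test signal: a slow sine baseline with unit spikes (not real ECG)."""
    x = [0.1 * math.sin(2 * math.pi * i / 50) for i in range(length)]
    for p in spikes:
        x[p] += 1.0
    return x


if __name__ == "__main__":
    signal = synthetic_signal()
    filtered = relative_energy(signal, short_width=11, long_width=201)
    print(detect_peaks(filtered, threshold=2.0, refractory=50))
```

The list-based sliding windows are written for clarity rather than efficiency; a real-time version on a microcontroller would more likely keep running sums so that each new sample costs constant time and memory.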