8 research outputs found
Sounds of COVID-19: exploring realistic performance of audio-based digital testing.
To identify Coronavirus disease (COVID-19) cases efficiently, affordably, and at scale, recent work has shown how audio (including cough, breathing and voice) based approaches can be used for testing. However, there is a lack of exploration of how biases and methodological decisions impact these tools' performance in practice. In this paper, we explore the realistic performance of audio-based digital testing of COVID-19. To investigate this, we collected a large crowdsourced respiratory audio dataset through a mobile app, alongside symptoms and COVID-19 test results. Within the collected dataset, we selected 5240 samples from 2478 English-speaking participants and split them into participant-independent sets for model development and validation. In addition to controlling the language, we also balanced demographics for model training to avoid potential acoustic bias. We used these audio samples to construct an audio-based COVID-19 prediction model. The unbiased model took features extracted from breathing, coughs and voice signals as predictors and yielded an AUC-ROC of 0.71 (95% CI: 0.65-0.77). We further explored several scenarios with different types of unbalanced data distributions to demonstrate how biases and participant splits affect the performance. With these different, but less appropriate, evaluation strategies, the performance could be overestimated, reaching an AUC up to 0.90 (95% CI: 0.85-0.95) in some circumstances. We found that an unrealistic experimental setting can result in misleading, sometimes over-optimistic, performance. Instead, we reported complete and reliable results on crowd-sourced data, which would allow medical professionals and policy makers to accurately assess the value of this technology and facilitate its deployment
Recommended from our members
A causal perspective on model robustness: case studies in health and sensor data
Robustness of predictive deep models is a challenging problem with many implications.
It is of particular importance when models are used in safety-critical applications,
such as healthcare. However, there is yet to be agreement on a comprehensive definition
on what it means for a model to be robust, and a theory on why these issues arise.
Given the general nature of the problem, existing work related to robustness is spread
across different areas of research. Existing research has considered a range of robustness
aspects, for instance robustness to small input perturbations, which arise from the study
of adversarial examples, but there is also robustness to different domains for the same
task, and robustness issues which arise from object placement, transplanting, lighting,
weather conditions, or object style, as some examples.
This thesis explores a formulation of robustness in terms of the assumed structural causal
model (SCM) which generates the observed data.The SCM allows these different types of
robustness issues to be viewed in a unifying way. Using this view, this work furthers
the connection between prediction robustness and the assumed structural causal model by
suggesting that optimising for prediction performance across a diverse set of distributions
from the same SCM will move the model closer to the causal predictor of the target variable,
providing a theoretical foundation to optimise purely for prediction in the setting where
training and testing data are not independently and identically distributed.
Formulating robustness in this way suggests that large deep models should, in general,
be more susceptible to robustness issues; while some of these issues have been observed
in applications such as computer vision, it has been less discussed in others. We
investigate the robustness of state-of-the-art deep (SotA) classifiers in human activity
recognition using a new proposed benchmark informed by the causal formulation, and show
that a simpler model is at least as robust as SotA deep models whilst being at least ten
times faster to train. The causal view of robustness additionally hints at the idea that
less data can be beneficial for robustness, contrary to popular belief that more data
is always better. To test this idea, a data selection algorithm is proposed based on
inverting the idea of a popular causal inference procedure for tabular data. The robustness
of a model trained on the selected subset of data is evaluated through synthetic and
semi-synthetic data experiments. Under certain conditions the data subset improves
robustness and subsequently data efficiency.Cambridge Trust and King's Colleg
Exploring Automatic COVID-19 Diagnosis via Voice and Symptoms from Crowdsourced Data.
The development of fast and accurate screening tools, which could facilitate testing and prevent more costly clinical tests, is key to the current pandemic of COVID-19. In this context, some initial work shows promise in detecting diagnostic signals of COVID-19 from audio sounds. In this paper, we propose a voice-based framework to automatically detect individuals who have tested positive for COVID-19. We evaluate the performance of the proposed framework on a subset of data crowdsourced from our app, containing 828 samples from 343 participants. By combining voice signals and reported symptoms, an AUC of 0.79 has been attained, with a sensitivity of 0.68 and a specificity of 0.82. We hope that this study opens the door to rapid, low-cost, and convenient pre-screening tools to automatically detect the disease.ERC Project 833296 (EAR
Exploring Longitudinal Cough, Breath, and Voice Data for COVID-19 Progression Prediction via Sequential Deep Learning: Model Development and Validation.
BACKGROUND: Recent work has shown the potential of using audio data (eg, cough, breathing, and voice) in the screening for COVID-19. However, these approaches only focus on one-off detection and detect the infection, given the current audio sample, but do not monitor disease progression in COVID-19. Limited exploration has been put forward to continuously monitor COVID-19 progression, especially recovery, through longitudinal audio data. Tracking disease progression characteristics and patterns of recovery could bring insights and lead to more timely treatment or treatment adjustment, as well as better resource management in health care systems. OBJECTIVE: The primary objective of this study is to explore the potential of longitudinal audio samples over time for COVID-19 progression prediction and, especially, recovery trend prediction using sequential deep learning techniques. METHODS: Crowdsourced respiratory audio data, including breathing, cough, and voice samples, from 212 individuals over 5-385 days were analyzed, alongside their self-reported COVID-19 test results. We developed and validated a deep learning-enabled tracking tool using gated recurrent units (GRUs) to detect COVID-19 progression by exploring the audio dynamics of the individuals' historical audio biomarkers. The investigation comprised 2 parts: (1) COVID-19 detection in terms of positive and negative (healthy) tests using sequential audio signals, which was primarily assessed in terms of the area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity, with 95% CIs, and (2) longitudinal disease progression prediction over time in terms of probability of positive tests, which was evaluated using the correlation between the predicted probability trajectory and self-reported labels. RESULTS: We first explored the benefits of capturing longitudinal dynamics of audio biomarkers for COVID-19 detection. The strong performance, yielding an AUROC of 0.79, a sensitivity of 0.75, and a specificity of 0.71 supported the effectiveness of the approach compared to methods that do not leverage longitudinal dynamics. We further examined the predicted disease progression trajectory, which displayed high consistency with longitudinal test results with a correlation of 0.75 in the test cohort and 0.86 in a subset of the test cohort with 12 (57.1%) of 21 COVID-19-positive participants who reported disease recovery. Our findings suggest that monitoring COVID-19 evolution via longitudinal audio data has potential in the tracking of individuals' disease progression and recovery. CONCLUSIONS: An audio-based COVID-19 progression monitoring system was developed using deep learning techniques, with strong performance showing high consistency between the predicted trajectory and the test results over time, especially for recovery trend predictions. This has good potential in the postpeak and postpandemic era that can help guide medical treatment and optimize hospital resource allocations. The changes in longitudinal audio samples, referred to as audio dynamics, are associated with COVID-19 progression; thus, modeling the audio dynamics can potentially capture the underlying disease progression process and further aid COVID-19 progression prediction. This framework provides a flexible, affordable, and timely tool for COVID-19 tracking, and more importantly, it also provides a proof of concept of how telemonitoring could be applicable to respiratory diseases monitoring, in general
A summary of the ComParE COVID-19 challenges.
Peer reviewed: TrueThe COVID-19 pandemic has caused massive humanitarian and economic damage. Teams of scientists from a broad range of disciplines have searched for methods to help governments and communities combat the disease. One avenue from the machine learning field which has been explored is the prospect of a digital mass test which can detect COVID-19 from infected individuals' respiratory sounds. We present a summary of the results from the INTERSPEECH 2021 Computational Paralinguistics Challenges: COVID-19 Cough, (CCS) and COVID-19 Speech, (CSS)
The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates
The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation Sub- Challenge, a three-way assessment of the level of escalation in a dialogue is featured; and in the Primates Sub-Challenge, four species vs background need to be classified. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the 'usual' COMPARE and BoAW features as well as deep unsupervised representation learning using the AUDEEP toolkit, and deep feature extraction from pre-trained CNNs using the DEEP SPECTRUM toolkit; in addition, we add deep end-to-end sequential modelling, and partially linguistic analysis
The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 cough, COVID-19 speech, escalation & primates
The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation Sub- Challenge, a three-way assessment of the level of escalation in a dialogue is featured; and in the Primates Sub-Challenge, four species vs background need to be classified. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the 'usual' COMPARE and BoAW features as well as deep unsupervised representation learning using the AUDEEP toolkit, and deep feature extraction from pre-trained CNNs using the DEEP SPECTRUM toolkit; in addition, we add deep end-to-end sequential modelling, and partially linguistic analysis