Pediatric Sleep Scoring In-the-wild from Millions of Multi-channel EEG Signals
Sleep is critical to the health and development of infants, children, and
adolescents, but pediatric sleep is severely under-researched compared to adult
sleep in the context of machine learning for health and well-being. Here, we
present the first automated pediatric sleep scoring results on a recent
large-scale sleep study dataset that was collected during standard clinical
care. We develop a transformer-based deep neural network model that learns to
classify five sleep stages from millions of multi-channel electroencephalogram
(EEG) signals with 78% overall accuracy. Further, we conduct an in-depth
analysis of the model's performance across patient demographics and EEG
channels.
On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors
Out-of-distribution (OOD) detection is concerned with identifying data points
that do not belong to the same distribution as the model's training data. For
the safe deployment of predictive models in a real-world environment, it is
critical to avoid making confident predictions on OOD inputs as it can lead to
potentially dangerous consequences. However, OOD detection largely remains an
under-explored area in the audio (and speech) domain. This is despite the fact
that audio is a central modality for many tasks, such as speaker diarization,
automatic speech recognition, and sound event detection. To address this, we
propose to leverage feature-space of the model with deep k-nearest neighbors to
detect OOD samples. We show that this simple and flexible method effectively
detects OOD inputs across a broad category of audio (and speech) datasets.
Specifically, it improves the false positive rate (FPR@TPR95) by 17% and the
AUROC score by 7% over prior techniques.
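As a rough sketch of the deep nearest-neighbor idea, each test embedding can be scored by its distance to the k-th nearest training embedding; the L2 normalization, distance metric, and value of k below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def knn_ood_scores(train_feats, test_feats, k=5):
    """Score each test embedding by the distance to its k-th nearest
    (L2-normalized) training embedding; a larger score suggests the
    input is out-of-distribution."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    tr, te = normalize(train_feats), normalize(test_feats)
    # Pairwise Euclidean distances between test and train embeddings.
    dists = np.linalg.norm(te[:, None, :] - tr[None, :, :], axis=-1)
    dists.sort(axis=1)
    return dists[:, k - 1]
```

A threshold on this score (e.g., chosen so that 95% of in-distribution validation data is accepted) then decides whether to reject an input.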
Multi-task Self-Supervised Learning for Human Activity Detection
Deep learning methods are successfully used in applications pertaining to
ubiquitous computing, health, and well-being. Specifically, the area of human
activity recognition (HAR) has been transformed by convolutional and
recurrent neural networks, thanks to their ability to learn semantic
representations from raw input. However, extracting generalizable features
requires massive amounts of well-curated data, which are notoriously
difficult to obtain due to privacy concerns and annotation costs. Therefore,
unsupervised representation learning is of prime importance to leverage the
vast amount of unlabeled data produced by smart devices. In this work, we
propose a novel self-supervised technique for feature learning from sensory
data that does not require access to any form of semantic labels. We learn a
multi-task temporal convolutional network to recognize transformations applied
on an input signal. By exploiting these transformations, we demonstrate that
simple binary-classification auxiliary tasks yield a strong
supervisory signal for extracting features useful in downstream tasks. We
extensively evaluate the proposed approach on several publicly available
datasets for smartphone-based HAR in unsupervised, semi-supervised, and
transfer learning settings. Our method achieves performance levels superior to
or comparable with fully-supervised networks, and it performs significantly
better than autoencoders. Notably, in the semi-supervised setting, the
self-supervised features substantially boost the detection rate, attaining a
kappa score between 0.7 and 0.8 with only 10 labeled examples per class. We
observe similarly strong performance even when the features are transferred
from a different data source. While this paper focuses on HAR as the
application domain, the proposed technique is general and could be applied to
a wide variety of problems in other areas.
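The transformation-recognition idea can be sketched roughly as follows; the specific transformations and their parameters here are illustrative assumptions (the paper's set is larger, e.g., rotation, permutation, and time-warping), and each transformation defines one binary auxiliary task.

```python
import numpy as np

# Illustrative signal transformations; parameters are assumptions.
TRANSFORMS = {
    "noise":  lambda x, rng: x + rng.normal(0, 0.2, size=x.shape),
    "scale":  lambda x, rng: x * rng.uniform(0.7, 1.3),
    "negate": lambda x, rng: -x,
    "flip":   lambda x, rng: x[::-1].copy(),
}

def make_task_examples(signal, name, rng):
    """For one auxiliary task, return (original, 0) and (transformed, 1):
    a per-transformation binary classification problem whose labels come
    for free, providing self-supervision for the shared encoder."""
    return [(signal, 0), (TRANSFORMS[name](signal, rng), 1)]
```

A multi-task network with a shared temporal-convolutional trunk and one binary head per transformation would then be trained on such pairs.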
Active Learning of Non-semantic Speech Tasks with Pretrained Models
Pretraining neural networks with massive unlabeled datasets has become
popular as it equips the deep models with a better prior to solve downstream
tasks. However, this approach generally assumes that for downstream tasks, we
have access to annotated data of sufficient size. In this work, we propose
ALOE, a novel system for improving the data- and label-efficiency of
non-semantic speech tasks with active learning (AL). ALOE uses pre-trained
models in conjunction with active learning to label data incrementally and
learns classifiers for downstream tasks, thereby mitigating the need to acquire
labeled data beforehand. We demonstrate the effectiveness of ALOE on a wide
range of tasks, uncertainty-based acquisition functions, and model
architectures. Training a linear classifier on top of a frozen encoder with
ALOE is shown to achieve performance similar to several baselines that utilize
the entire labeled dataset.
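One common uncertainty-based acquisition function is entropy sampling, sketched minimally below; the function name and budget parameter are illustrative, not ALOE's actual API.

```python
import numpy as np

def entropy_acquisition(probs, budget):
    """Given predicted class probabilities for a pool of unlabeled
    examples (shape: n_examples x n_classes), return the indices of the
    `budget` most uncertain examples, i.e., those whose predictive
    distribution has the highest entropy."""
    eps = 1e-12  # avoid log(0)
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    return np.argsort(-entropy)[:budget]
```

In an AL loop, the selected indices are sent to an annotator, the new labels are added to the training set, and the downstream classifier is retrained incrementally.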
Federated Self-Supervised Learning of Multi-Sensor Representations for Embedded Intelligence
Smartphones, wearables, and Internet of Things (IoT) devices produce a wealth
of data that cannot be accumulated in a centralized repository for learning
supervised models due to privacy, bandwidth limitations, and the prohibitive
cost of annotations. Federated learning provides a compelling framework for
learning models from decentralized data, but conventionally, it assumes the
availability of labeled samples, whereas on-device data are generally either
unlabeled or cannot be annotated readily through user interaction. To address
these issues, we propose a self-supervised approach termed
"scalogram-signal correspondence learning", based on the wavelet transform, to
learn useful representations from unlabeled sensor inputs, such as
electroencephalography, blood volume pulse, accelerometer, and WiFi channel
state information. Our auxiliary task requires a deep temporal neural network
to determine whether a given signal and its complementary view (i.e.,
a scalogram generated with a wavelet transform) align with each other,
by optimizing a contrastive objective. We extensively assess the quality
of learned features with our multi-view strategy on diverse public datasets,
achieving strong performance in all domains. We demonstrate the effectiveness
of representations learned from an unlabeled input collection on downstream
tasks by training a linear classifier over the pretrained network, and show
their usefulness in the low-data regime, transfer learning, and
cross-validation. Our methodology
achieves competitive performance with fully-supervised networks, and it
outperforms pre-training with autoencoders in both central and federated
contexts. Notably, it improves the generalization in a semi-supervised setting
as it reduces the volume of labeled data required through leveraging
self-supervised learning.
Comment: Accepted for publication at IEEE Internet of Things Journal
Global, regional, and national burden of disorders affecting the nervous system, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021
Background: Disorders affecting the nervous system are diverse and include neurodevelopmental disorders, late-life neurodegeneration, and newly emergent conditions, such as cognitive impairment following COVID-19. Previous publications from the Global Burden of Disease, Injuries, and Risk Factor Study estimated the burden of 15 neurological conditions in 2015 and 2016, but these analyses did not include neurodevelopmental disorders, as defined by the International Classification of Diseases (ICD)-11, or a subset of cases of congenital, neonatal, and infectious conditions that cause neurological damage. Here, we estimate nervous system health loss caused by 37 unique conditions and their associated risk factors globally, regionally, and nationally from 1990 to 2021.
Methods: We estimated mortality, prevalence, years lived with disability (YLDs), years of life lost (YLLs), and disability-adjusted life-years (DALYs), with corresponding 95% uncertainty intervals (UIs), by age and sex in 204 countries and territories, from 1990 to 2021. We included morbidity and deaths due to neurological conditions, for which health loss is directly due to damage to the CNS or peripheral nervous system. We also isolated neurological health loss from conditions for which nervous system morbidity is a consequence, but not the primary feature, including a subset of congenital conditions (ie, chromosomal anomalies and congenital birth defects), neonatal conditions (ie, jaundice, preterm birth, and sepsis), infectious diseases (ie, COVID-19, cystic echinococcosis, malaria, syphilis, and Zika virus disease), and diabetic neuropathy. By conducting a sequela-level analysis of the health outcomes for these conditions, only cases where nervous system damage occurred were included, and YLDs were recalculated to isolate the non-fatal burden directly attributable to nervous system health loss. A comorbidity correction was used to calculate the total prevalence of all conditions that affect the nervous system combined.
Findings: Globally, the 37 conditions affecting the nervous system were collectively ranked as the leading group cause of DALYs in 2021 (443 million, 95% UI 378–521), affecting 3·40 billion (3·20–3·62) individuals (43·1%, 40·5–45·9 of the global population); global DALY counts attributed to these conditions increased by 18·2% (8·7–26·7) between 1990 and 2021. Age-standardised rates of deaths per 100 000 people attributed to these conditions decreased from 1990 to 2021 by 33·6% (27·6–38·8), and age-standardised rates of DALYs attributed to these conditions decreased by 27·0% (21·5–32·4). Age-standardised prevalence was almost stable, with a change of 1·5% (0·7–2·4). The ten conditions with the highest age-standardised DALYs in 2021 were stroke, neonatal encephalopathy, migraine, Alzheimer's disease and other dementias, diabetic neuropathy, meningitis, epilepsy, neurological complications due to preterm birth, autism spectrum disorder, and nervous system cancer.
Interpretation: As the leading cause of overall disease burden in the world, with increasing global DALY counts, effective prevention, treatment, and rehabilitation strategies for disorders affecting the nervous system are needed.