552 research outputs found

    AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

    Get PDF
    This paper introduces a video dataset of spatio-temporally localized Atomic Visual Actions (AVA). The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently. The key characteristics of our dataset are: (1) the definition of atomic visual actions, rather than composite actions; (2) precise spatio-temporal annotations with possibly multiple annotations for each person; (3) exhaustive annotation of these atomic actions over 15-minute video clips; (4) people temporally linked across consecutive segments; and (5) using movies to gather a varied set of action representations. This departs from existing datasets for spatio-temporal action recognition, which typically provide sparse annotations for composite actions in short video clips. We will release the dataset publicly. AVA, with its realistic scene and action complexity, exposes the intrinsic difficulty of action recognition. To benchmark this, we present a novel approach for action localization that builds upon the current state-of-the-art methods, and demonstrates better performance on JHMDB and UCF101-24 categories. While setting a new state of the art on existing datasets, the overall results on AVA are low at 15.6% mAP, underscoring the need for developing new approaches for video understanding.Comment: To appear in CVPR 2018. Check dataset page https://research.google.com/ava/ for detail

    A Graph-Temporal fused dual-input Convolutional Neural Network for Detecting Sleep Stages from EEG Signals

    Get PDF
    This work was supported by the National Natural Science Foundation of China under Grant Nos. 61922062, and 61873181.Peer reviewedPostprin

    Deep learning with convolutional neural networks for decoding and visualization of EEG pathology

    Get PDF
    We apply convolutional neural networks (ConvNets) to the task of distinguishing pathological from normal EEG recordings in the Temple University Hospital EEG Abnormal Corpus. We use two basic, shallow and deep ConvNet architectures recently shown to decode task-related information from EEG at least as well as established algorithms designed for this purpose. In decoding EEG pathology, both ConvNets reached substantially better accuracies (about 6% better, ~85% vs. ~79%) than the only published result for this dataset, and were still better when using only 1 minute of each recording for training and only six seconds of each recording for testing. We used automated methods to optimize architectural hyperparameters and found intriguingly different ConvNet architectures, e.g., with max pooling as the only nonlinearity. Visualizations of the ConvNet decoding behavior showed that they used spectral power changes in the delta (0-4 Hz) and theta (4-8 Hz) frequency range, possibly alongside other features, consistent with expectations derived from spectral analysis of the EEG data and from the textual medical reports. Analysis of the textual medical reports also highlighted the potential for accuracy increases by integrating contextual information, such as the age of subjects. In summary, the ConvNets and visualization techniques used in this study constitute a next step towards clinically useful automated EEG diagnosis and establish a new baseline for future work on this topic.Comment: Published at IEEE SPMB 2017 https://www.ieeespmb.org/2017

    Visualising Convolutional Neural Network Decisions in Automatic Sleep Scoring

    Get PDF
    Current sleep medicine relies on the supervised analysis of polysomnographic recordings, which comprise amongst others electroencephalogram (EEG), electromyogram (EMG), and electrooculogram (EOG) signals. Convolutional neural networks (CNN) provide an interesting framework for automated sleep classification, however, the lack of interpretability of its results has hampered CNN's further use in medicine. In this study, we train a CNN using as input Continuous Wavelet transformed EEG, EOG and EMG recordings from a publicly available dataset. The network achieved a 10-fold cross-validation Cohen's Kappa score of κ=0.71±0.01\kappa = 0.71 \pm 0.01. Further, we provide insights on how this network classifies individual epochs of sleep using Guided Gradient-weighted Class Activation Maps (Guided Grad-CAM). The proposed approach is able to produce fine-grained activation maps on time-frequency domain for each signal providing a useful tool for identifying relevant features in CNNs

    Deep Learning in Cardiology

    Full text link
    The medical field is creating large amount of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient in solving complicated medical tasks or for creating insights using big data. Deep learning has emerged as a more accurate and effective technology in a wide range of medical problems such as diagnosis, prediction and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus, revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology that also apply in medicine in general, while proposing certain directions as the most viable for clinical use.Comment: 27 pages, 2 figures, 10 table

    U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging

    Full text link
    Neural networks are becoming more and more popular for the analysis of physiological time-series. The most successful deep learning systems in this domain combine convolutional and recurrent layers to extract useful features to model temporal relations. Unfortunately, these recurrent models are difficult to tune and optimize. In our experience, they often require task-specific modifications, which makes them challenging to use for non-experts. We propose U-Time, a fully feed-forward deep learning approach to physiological time series segmentation developed for the analysis of sleep data. U-Time is a temporal fully convolutional network based on the U-Net architecture that was originally proposed for image segmentation. U-Time maps sequential inputs of arbitrary length to sequences of class labels on a freely chosen temporal scale. This is done by implicitly classifying every individual time-point of the input signal and aggregating these classifications over fixed intervals to form the final predictions. We evaluated U-Time for sleep stage classification on a large collection of sleep electroencephalography (EEG) datasets. In all cases, we found that U-Time reaches or outperforms current state-of-the-art deep learning models while being much more robust in the training process and without requiring architecture or hyperparameter adaptation across tasks.Comment: To appear in Advances in Neural Information Processing Systems (NeurIPS), 201

    Data-efficient Deep Learning Approach for Single-Channel EEG-Based Sleep Stage Classification with Model Interpretability

    Full text link
    Sleep, a fundamental physiological process, occupies a significant portion of our lives. Accurate classification of sleep stages serves as a crucial tool for evaluating sleep quality and identifying probable sleep disorders. Our work introduces a novel methodology that utilizes a SE-Resnet-Bi-LSTM architecture to classify sleep into five separate stages. The classification process is based on the analysis of single-channel electroencephalograms (EEGs). The suggested framework consists of two fundamental elements: a feature extractor that utilizes SE-ResNet, and a temporal context encoder that uses stacks of Bi-LSTM units. The effectiveness of our approach is substantiated by thorough assessments conducted on three different datasets, namely SleepEDF-20, SleepEDF-78, and SHHS. The proposed methodology achieves significant model performance, with Macro-F1 scores of 82.5, 78.9, and 81.9 for the respective datasets. We employ 1D-GradCAM visualization as a methodology to elucidate the decision-making process inherent in our model in the realm of sleep stage classification. This visualization method not only provides valuable insights into the model's classification rationale but also aligns its outcomes with the annotations made by sleep experts. One notable feature of our research lies in the incorporation of an efficient training approach, which adeptly upholds the model's resilience in terms of performance. The experimental evaluations provide a comprehensive evaluation of the effectiveness of our proposed model in comparison to the existing approaches, highlighting its potential for practical applications

    U-Sleep's resilience to AASM guidelines

    Full text link
    AASM guidelines are the result of decades of efforts aiming at standardizing sleep scoring procedure, with the final goal of sharing a worldwide common methodology. The guidelines cover several aspects from the technical/digital specifications,e.g., recommended EEG derivations, to detailed sleep scoring rules accordingly to age. Automated sleep scoring systems have always largely exploited the standards as fundamental guidelines. In this context, deep learning has demonstrated better performance compared to classical machine learning. Our present work shows that a deep learning based sleep scoring algorithm may not need to fully exploit the clinical knowledge or to strictly adhere to the AASM guidelines. Specifically, we demonstrate that U-Sleep, a state-of-the-art sleep scoring algorithm, can be strong enough to solve the scoring task even using clinically non-recommended or non-conventional derivations, and with no need to exploit information about the chronological age of the subjects. We finally strengthen a well-known finding that using data from multiple data centers always results in a better performing model compared with training on a single cohort. Indeed, we show that this latter statement is still valid even by increasing the size and the heterogeneity of the single data cohort. In all our experiments we used 28528 polysomnography studies from 13 different clinical studies
    corecore