thesis

Detection of microsleeps from the EEG via optimized classification techniques.

Abstract

Microsleeps are complete breaks in responsiveness lasting 0.5–15 s. They can lead to multiple fatalities in certain occupational fields (e.g., transportation and military) that demand extended and continuous vigilance. An automated microsleep detection system may therefore help reduce poor job performance and occupational fatalities. An EEG-based microsleep detector offers advantages over a video-based microsleep detector, including speed and temporal resolution. A series of software modules was implemented to examine different feature sets and determine the optimal circumstances for automated EEG-based microsleep detection. The microsleep detection system was organized in a similar manner to an EEG-based brain-computer interface (BCI). EEG data underwent baseline removal and filtering to remove overhead noise. Feature extraction then generated spectral features based upon an estimate of the power spectrum or its logarithmic transform. Next, feature selection/reduction (FS/R) was used to select the most relevant information across all the spectral features. A trained classifier was then tested on data from a subject it had not seen before. In certain cases, an ensemble of classifiers was used instead of a single classifier. The performance measures from all cases were then averaged together in leave-one-out cross-validation (LOOCV). Sets of artificial data, consisting of a combination of EEG and 2-s bursts of 15 Hz sinusoids at signal-to-noise ratios (SNRs) ranging from 16 down to 0.03, were generated to test a prototype EEG-based microsleep detection system. The balance between events and non-events was varied between evenly balanced and highly imbalanced (e.g., events occurring only 2% of the time). Features were spectral estimates of various EEG bands (e.g., alpha band power) or ratios between them. With 34 features for each of the 16 channels, this yielded a total of 544 features.
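The leave-one-out cross-validation described above, in which a classifier is trained on all subjects but one and tested on the held-out subject, can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: the function names, the dictionary layout of the data, and the toy one-feature threshold classifier are all assumptions introduced here for demonstration.

```python
from statistics import mean


def leave_one_subject_out(subject_ids):
    """Yield (training_ids, held_out_id) pairs, one fold per subject."""
    for held_out in subject_ids:
        training = [s for s in subject_ids if s != held_out]
        yield training, held_out


def loocv_mean_score(data, fit, score):
    """Average a per-subject score over all leave-one-subject-out folds.

    data  : dict mapping subject id -> (features, labels)   (hypothetical layout)
    fit   : callable(list of (features, labels)) -> model
    score : callable(model, features, labels) -> float
    """
    fold_scores = []
    for training_ids, test_id in leave_one_subject_out(list(data)):
        model = fit([data[s] for s in training_ids])
        features, labels = data[test_id]
        fold_scores.append(score(model, features, labels))
    return mean(fold_scores)


# Toy stand-ins for a real classifier and metric (illustrative only).
def fit_threshold(training_sets):
    """Place a threshold midway between the event and non-event means."""
    events = [x for f, l in training_sets for x, y in zip(f, l) if y == 1]
    non_events = [x for f, l in training_sets for x, y in zip(f, l) if y == 0]
    return (mean(events) + mean(non_events)) / 2


def accuracy(threshold, features, labels):
    """Fraction of observations on the correct side of the threshold."""
    hits = sum((x > threshold) == (y == 1) for x, y in zip(features, labels))
    return hits / len(labels)


toy_data = {
    "s1": ([0.1, 0.2, 0.9, 1.0], [0, 0, 1, 1]),
    "s2": ([0.0, 0.3, 0.8, 1.1], [0, 0, 1, 1]),
    "s3": ([0.2, 0.1, 1.2, 0.7], [0, 0, 1, 1]),
}
print(loocv_mean_score(toy_data, fit_threshold, accuracy))
```

Passing `fit` and `score` as callables keeps the fold logic independent of any particular feature selection method, classifier, or performance measure, so the same loop could wrap a single classifier or an ensemble.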
Five minutes of EEG from eight subjects were used to generate the artificial data, and each subject yielded a matrix of 300 observations of 544 features. Datasets from two prior microsleep studies were employed after validating the system on the artificial data. The first, Study A (N = 8), had 16 channels sampled at 256 Hz from two 1-hour sessions per subject, and the second, Study C (N = 10), had one 50-min session with 30–62 channels per subject sampled at 250 Hz. A vector of 34 spectral features from each channel was concatenated into a feature vector for each 2-s interval, with each interval having a 1-s overlap with the prior one. In both cases, microsleeps had been identified via a combination of video recording and performance on a continuous tracking task. Study A provided four datasets to compare the effects of various preprocessing techniques on performance: (1) Study A bipolar EEG with Independent Component Analysis (ICA) preprocessing and artefact pruning (total automated rejection of artefact-containing epochs) and logarithmic transforms of the spectral features (SABIL); (2) Study A bipolar EEG with ICA-based eye blink removal and artefact removal with pruning of epochs with major artefacts, and linear spectral features (SABIS); (3) Study A referential EEG unprocessed by ICA with spectral features (SARUS); and (4) Study A bipolar EEG unprocessed by ICA with spectral features (SABUS). The second study had one primary feature set, the Study C referential EEG ICA-preprocessed spectral feature (SCRIS) variant. LOOCV performance was evaluated using the phi correlation coefficient. After replicating prior work, several FS/R and classifier structures were investigated with both the artificially balanced and unbalanced data.
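For reference, the phi correlation coefficient of a binary classifier can be computed from the four cells of its confusion matrix as

    phi = (TP·TN − FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)).

The sketch below is a minimal standard-library illustration, not code from the thesis. The function name is hypothetical, and returning 0 when the denominator vanishes is a common convention assumed here rather than something the abstract specifies.

```python
import math


def phi_coefficient(y_true, y_pred):
    """Phi (Matthews) correlation between binary labels and predictions."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif pred:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: a degenerate table carries no correlation.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Highly imbalanced example: events occur 2% of the time.
labels = [0] * 98 + [1] * 2
always_non_event = [0] * 100

accuracy = sum(t == p for t, p in zip(labels, always_non_event)) / len(labels)
print(accuracy)                                   # 0.98
print(phi_coefficient(labels, always_non_event))  # 0.0
print(phi_coefficient(labels, labels))            # 1.0
```

The example shows why phi is informative on imbalanced data: a detector that never reports an event scores 98% accuracy here but a phi of 0.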
Feature selection/reduction methods included principal component analysis (PCA), common spatial patterns (CSP), projection to latent structures (PLS), a new method based on average distance between events and non-events (ADEN), ADEN normalized with a z-score transform (ADENZ), genetic algorithms in concert with ADEN (GADEN), and genetic algorithms in concert with ADENZ (GADENZ). Several pattern recognition algorithms were investigated: linear discriminant analysis (LDA), radial basis functions (RBFs), and support vector machines with Gaussian (SVMG) and polynomial (SVMP) kernels. Classifier structures examined included single classifiers, bagging, boosting, stacking, and adaptive boosting (AdaBoost). The highest LOOCV results on artificial data (SNR = 0.3) corresponded to GADEN with 10 features and a single LDA classifier, with a mean phi value of 0.96. Across the four Study A datasets, PCA with 150 features and a stacking ensemble achieved the highest mean phi of 0.40, with the SABIL feature set. For Study C, ADEN with 20 features and a single LDA classifier achieved the highest mean phi of 0.10. Other machine-learning methodologies, such as training on artificially balanced data, decreasing the training size, within-subject training and testing, and randomly mixed data from across subjects, were also examined. Training on artificially balanced data did not improve performance. Within-subject training and testing revealed that, for certain subjects, the performance of a classifier trained on one half of the subject's data and tested on the other half dropped to random guessing. The low phi values on within-subject tests occurred independently of the feature selection/reduction method explored. As such, performance of a standard LOOCV was often dependent on whether a particular testing subject had a low (< 0.15) within-subject mean phi correlation coefficient.
Training only on subjects with higher mean phi values did not boost performance. Additional tests found correlations (r = 0.57, p = 0.003 for Study A and r = 0.67, p 0.15) and longer mean microsleep durations. Other individual subject characteristics, such as number of microsleeps and subject age, showed no significant differences. The primary findings highlighted the strengths and limitations of supervised feature selection and linear classifiers trained upon highly variable between-subject features across two studies. Findings suggested that a classifier performs best when individuals have high mean microsleep durations. For the configurations investigated, preprocessing factors, such as ICA preprocessing, feature extraction method, and artefact pruning, affected performance more than changing specific module configurations. No significant differences between the SABIL features and the lower performing Study A feature sets were found due to overlapping ranges of performance (p = 0.15). The findings suggest that the investigated techniques plateaued in performance on the Study A data, reaching a point of diminishing returns without fundamentally changing the nature of the classification problem. The different number of channels of varying quality across all subjects in Study C rendered microsleep classification extremely difficult, but even a linear classifier can properly generalize if exposed to a large enough variety of data from across the entire set. Many of the techniques explored are also relevant to other fields, such as brain-computer interface (BCI) and machine learning.
