
    Data-efficient Deep Learning Approach for Single-Channel EEG-Based Sleep Stage Classification with Model Interpretability

    Sleep, a fundamental physiological process, occupies a significant portion of our lives. Accurate classification of sleep stages is a crucial tool for evaluating sleep quality and identifying probable sleep disorders. Our work introduces a novel methodology that utilizes an SE-ResNet-Bi-LSTM architecture to classify sleep into five separate stages based on the analysis of single-channel electroencephalograms (EEGs). The proposed framework consists of two fundamental elements: a feature extractor that utilizes SE-ResNet, and a temporal context encoder that uses stacks of Bi-LSTM units. The effectiveness of our approach is substantiated by thorough assessments conducted on three different datasets, namely SleepEDF-20, SleepEDF-78, and SHHS. The proposed methodology achieves strong performance, with Macro-F1 scores of 82.5, 78.9, and 81.9 on the respective datasets. We employ 1D-GradCAM visualization to elucidate the decision-making process of our model in sleep stage classification. This visualization not only provides valuable insights into the model's classification rationale but also aligns its outcomes with the annotations made by sleep experts. One notable feature of our research is the incorporation of an efficient training approach that preserves the model's performance. The experimental evaluations provide a comprehensive comparison of the proposed model against existing approaches, highlighting its potential for practical applications.
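
As a rough illustration of the SE-ResNet feature extractor followed by a Bi-LSTM encoder, the PyTorch sketch below wires a squeeze-and-excitation residual stack into a bidirectional LSTM for a single-channel 30-second epoch. Layer sizes, kernel widths, and the sampling rate are illustrative assumptions, and the Bi-LSTM here runs over the intra-epoch feature sequence rather than across neighbouring epochs as a full temporal context encoder would.

```python
# Minimal sketch of an SE-ResNet feature extractor + Bi-LSTM encoder for
# single-channel EEG epochs. All sizes are illustrative assumptions, not the
# configuration reported in the paper.
import torch
import torch.nn as nn

class SEBlock1d(nn.Module):
    """Squeeze-and-excitation over the channel dimension of a 1-D feature map."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                          # x: (batch, channels, time)
        w = self.fc(x.mean(dim=-1))                # squeeze: global average pool
        return x * w.unsqueeze(-1)                 # excite: channel-wise rescaling

class SEResBlock1d(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, 7, padding=3), nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, 7, padding=3), nn.BatchNorm1d(channels),
            SEBlock1d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))        # residual connection

class SleepStager(nn.Module):
    def __init__(self, n_stages=5, channels=64, lstm_hidden=128):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv1d(1, channels, 49, stride=6, padding=24),
                                  nn.BatchNorm1d(channels), nn.ReLU(), nn.MaxPool1d(4))
        self.blocks = nn.Sequential(SEResBlock1d(channels), SEResBlock1d(channels))
        self.bilstm = nn.LSTM(channels, lstm_hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * lstm_hidden, n_stages)

    def forward(self, x):                          # x: (batch, 1, samples), one epoch
        f = self.blocks(self.stem(x))              # (batch, channels, reduced_time)
        seq, _ = self.bilstm(f.transpose(1, 2))    # feature sequence over time steps
        return self.head(seq.mean(dim=1))          # pool over time, then classify

logits = SleepStager()(torch.randn(2, 1, 3000))    # e.g. 30 s at an assumed 100 Hz
print(logits.shape)                                # torch.Size([2, 5])
```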

    Non-invasive fetal electrocardiogram: analysis and interpretation

    High-risk pregnancies are becoming more and more prevalent because of the progressively higher age at which women get pregnant. Nowadays, about twenty percent of all pregnancies are complicated to some degree, for instance because of preterm delivery, fetal oxygen deficiency, fetal growth restriction, or hypertension. Early detection of these complications is critical to permit timely medical intervention, but is hampered by strong limitations of existing monitoring technology. This technology is either only applicable in hospital settings, is obtrusive, or is incapable of providing, in a robust way, reliable information for diagnosis of the well-being of the fetus. The most prominent method for monitoring the fetal health condition is monitoring of heart rate variability in response to activity of the uterus (cardiotocography; CTG). Generally, in obstetrical practice, the heart rate is determined in either of two ways: unobtrusively with a (Doppler) ultrasound probe on the maternal abdomen, or obtrusively with an invasive electrode fixed onto the fetal scalp. The first method is relatively inaccurate but is non-invasive and applicable in all stages of pregnancy. The latter method is far more accurate but can only be applied following rupture of the membranes and sufficient dilatation, restricting its applicability to only the very last phase of pregnancy. Besides these accuracy and applicability issues, the use of CTG in obstetrical practice also has another limitation: despite its high sensitivity, the specificity of CTG is relatively low. This means that in most cases of fetal distress the CTG reveals specific patterns of heart rate variability, but that these specific patterns can also be encountered for healthy fetuses, complicating accurate diagnosis of the fetal condition. Hence, a prerequisite for preventing unnecessary interventions that are based on CTG alone is the inclusion of additional information in diagnostics. Monitoring of the fetal electrocardiogram (ECG), as a supplement to CTG, has been demonstrated to have added value for monitoring of the fetal health condition. Unfortunately, the application of the fetal ECG in obstetrical diagnostics is limited because at present the fetal ECG can only be measured reliably by means of an invasive scalp electrode. To overcome this limited applicability, many attempts have been made to record the fetal ECG non-invasively from the maternal abdomen, but these attempts have not yet led to approaches that permit widespread clinical application. One key difficulty is that the signal-to-noise ratio (SNR) of the transabdominal ECG recordings is relatively low. Perhaps even more importantly, the abdominal ECG recordings yield ECG signals whose morphology depends strongly on the orientation of the fetus within the maternal uterus; for each fetal orientation, the ECG morphology is different. This renders correct clinical interpretation of the recorded ECG signals complicated, if not impossible. This thesis aims to address these difficulties and to provide new contributions on the clinical interpretation of the fetal ECG. First, the SNR of the recorded signals is enhanced through a series of signal processing steps that exploit specific and a priori known properties of the fetal ECG. In particular, the dominant interference (i.e. the maternal ECG) is suppressed by exploiting the absence of temporal correlation between the maternal and fetal ECG.
In this suppression, the maternal ECG complex is dynamically segmented into individual ECG waves and each of these waves is estimated through averaging corresponding waves from preceding ECG complexes. The maternal ECG template generated by combining the estimated waves is subsequently subtracted from the original signal to yield a non-invasive recording in which the maternal ECG has been suppressed. This suppression method is demonstrated to be more accurate than existing methods. Other interferences and noise are (partly) suppressed by exploiting the quasiperiodicity of the fetal ECG through averaging consecutive ECG complexes or by exploiting the spatial correlation of the ECG. The averaging of several consecutive ECG complexes, synchronized on their QRS complex, enhances the SNR of the ECG but also can suppress morphological variations in the ECG that are clinically relevant. The number of ECG complexes included in the average hence constitutes a trade-off between SNR enhancement on the one hand and loss of morphological variability on the other hand. To relax this trade-off, in this thesis a method is presented that can adaptively estimate the number of ECG complexes included in the average. In cases of morphological variations, this number is decreased ensuring that the variations are not suppressed. In cases of no morphological variability, this number is increased to ensure adequate SNR enhancement. The further suppression of noise by exploiting the spatial correlation of the ECG is based on the fact that all ECG signals recorded at several locations on the maternal abdomen originate from the same electrical source, namely the fetal heart. The electrical activity of the fetal heart at any point in time can be modeled as a single electrical field vector with stationary origin. This vector varies in both amplitude and orientation in three-dimensional space during the cardiac cycle and the time-path described by this vector is referred to as the fetal vectorcardiogram (VCG). In this model, the abdominal ECG constitutes the projection of the VCG onto the vector that describes the position of the abdominal electrode with respect to a reference electrode. This means that when the VCG is known, any desired ECG signal can be calculated. Equivalently, this also means that when enough ECG signals (i.e. at least three independent signals) are known, the VCG can be calculated. By using more than three ECG signals for the calculation of the VCG, redundancy in the ECG signals can be exploited for added noise suppression. Unfortunately, when calculating the fetal VCG from the ECG signals recorded from the maternal abdomen, the distance between the fetal heart and the electrodes is not the same for each electrode. Because the amplitude of the ECG signals decreases with propagation to the abdominal surface, these different distances yield a specific, unknown attenuation for each ECG signal. Existing methods for estimating the VCG operate with a fixed linear combination of the ECG signals and, hence, cannot account for variations in signal attenuation. To overcome this problem and be able to account for fetal movement, in this thesis a method is presented that estimates both the VCG and, to some extent, also the signal attenuation. This is done by determining for which VCG and signal attenuation the joint probability over both these variables is maximal given the observed ECG signals. 
The underlying joint probability distribution is determined by assuming the ECG signals to originate from scaled VCG projections and additive noise. With this method, a VCG tailored to each specific patient is obtained. Compared to the fixed linear combinations, the presented method performs significantly better in accurately estimating the VCG. Besides describing the electrical activity of the fetal heart in three dimensions, the fetal VCG also provides a framework to account for the fetal orientation in the uterus. This framework enables the detection of the fetal orientation over time and allows for rotating the fetal VCG towards a prescribed orientation. From the normalized fetal VCG obtained in this manner, standardized ECG signals can be calculated, facilitating correct clinical interpretation of the non-invasive fetal ECG signals. The potential of the presented approach (i.e. the combination of all methods described above) is illustrated for three different clinical cases. In the first case, the fetal ECG is analyzed to demonstrate that the electrical behavior of the fetal heart differs significantly from that of the adult heart. In fact, this difference is so substantial that diagnostics based on the fetal ECG should follow different guidelines than those for adult ECG diagnostics. In the second case, the fetal ECG is used to visualize the origin of fetal supraventricular extrasystoles, and the results suggest that the fetal ECG might in the future serve as a diagnostic tool for relating fetal arrhythmias to congenital heart diseases. In the last case, the non-invasive fetal ECG is compared to the invasively recorded fetal ECG to gauge the SNR of the transabdominal recordings and to demonstrate the suitability of the non-invasive fetal ECG in clinical applications that, as yet, are only possible with the invasive fetal ECG.
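
As a rough illustration of the projection model underlying the VCG estimation described above, the numpy sketch below treats each abdominal lead as a scaled projection of a 3-D vectorcardiogram plus noise and jointly estimates the VCG and the per-lead attenuations by alternating least squares. The lead direction vectors are assumed known, and the alternating scheme is a simplified stand-in for the thesis's maximum joint-probability estimator.

```python
# Minimal sketch of the linear model behind VCG estimation: each abdominal
# lead y_i(t) is modelled as a_i * (l_i . v(t)) + noise, with l_i an (assumed
# known) lead direction vector and a_i an unknown attenuation. Alternating
# least squares is a simplified stand-in for the thesis's estimator.
import numpy as np

def estimate_vcg(Y, L, n_iter=20):
    """Y: (n_leads, T) abdominal ECG; L: (n_leads, 3) lead direction vectors."""
    a = np.ones(Y.shape[0])                           # start from unit attenuation
    for _ in range(n_iter):
        # 1) with attenuations fixed, solve (a[:, None] * L) @ V ~= Y for the VCG V (3, T)
        V = np.linalg.lstsq(a[:, None] * L, Y, rcond=None)[0]
        # 2) with the VCG fixed, fit each lead's scalar attenuation by least squares
        P = L @ V                                     # unscaled projections, (n_leads, T)
        a = np.sum(P * Y, axis=1) / np.sum(P * P, axis=1)
        a /= np.linalg.norm(a)                        # remove the scale ambiguity between a and V
    return V, a

# Synthetic check with 6 leads and 500 samples: the fitted model should match Y closely.
rng = np.random.default_rng(0)
V_true = rng.standard_normal((3, 500))
L = rng.standard_normal((6, 3))
a_true = 0.5 + np.abs(rng.standard_normal(6))
Y = (a_true[:, None] * L) @ V_true + 0.01 * rng.standard_normal((6, 500))
V_hat, a_hat = estimate_vcg(Y, L)
print(np.linalg.norm((a_hat[:, None] * L) @ V_hat - Y) / np.linalg.norm(Y))  # relative fit residual
```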

    Novel Low Complexity Biomedical Signal Processing Techniques for Online Applications

    Biomedical signal processing has become a very active domain of research nowadays. With the advent of portable monitoring devices, from accelerometer-enabled bracelets and smartphones to more advanced vital-sign tracking body area networks, this field has been receiving unprecedented attention. Indeed, portable health monitoring can help uncover the underlying dynamics of human health in a way that has not been possible before. Several challenges have emerged, however, as these devices present key differences in terms of signal acquisition and processing in comparison with conventional methods. Hardware constraints such as processing power and limited battery capacity make most established techniques unsuitable and, therefore, the need for low-complexity yet robust signal processing methods has emerged. Another issue that needs to be addressed is the quality of the signals captured by these devices. Unlike in clinical scenarios, in portable health monitoring subjects are constantly performing their daily activities. Moreover, signals may be captured from unconventional locations and, consequently, be prone to perturbations. In order to obtain reliable measures from these monitoring devices, one needs dependable signal quality measures to avoid false alarms. Indeed, hardware limitations and low-quality signals can greatly influence the performance of portable monitoring devices. Nevertheless, most devices offer simultaneous acquisition of multiple physiological parameters, such as the electrocardiogram (ECG) and photoplethysmogram (PPG). Through multi-modal signal processing the overall performance can be improved, for instance by deriving parameters such as heart rate from the most reliable and uncontaminated source. This thesis is therefore dedicated to proposing novel low-complexity biomedical signal processing techniques for real-time/online applications. Throughout this dissertation, several bio-signals such as the ECG, PPG, and electroencephalogram (EEG) are investigated. The main contribution of this dissertation consists of two signal processing techniques: 1) a novel ECG QRS-complex detection and delineation technique, and 2) a short-term event extraction technique for biomedical signals. The former is based on a processing technique called mathematical morphology (MM) and adaptively uses the subject's QRS-complex amplitude and morphological attributes for robust detection and delineation. This method is generalized to intra-cardiac electrograms for atrial activation detection during atrial fibrillation. The second method, called the Relative-Energy algorithm, uses short- and long-term signal energies to highlight events of interest and discard unwanted activities. Collectively, the results obtained by these methods suggest that, while presenting low computational costs, they can efficiently and robustly extract biomedical events of interest. Using the Relative-Energy algorithm, a continuous, non-binary ECG signal quality index is presented. The ECG quality is determined by creating a cleaned-up version of the input ECG and calculating the correlation coefficient between the cleaned-up and the original ECG. The proposed quality index is fast and can be implemented online, making it suitable for portable monitoring scenarios.
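
As a rough illustration of the short-term/long-term energy idea behind the Relative-Energy algorithm, the numpy sketch below takes the ratio of a short moving energy to a long one so that brief, energetic events such as QRS complexes stand out. The window lengths and the exact ratio form are assumptions for illustration, not the published parameterisation.

```python
# Minimal sketch of a relative-energy style event enhancer: the ratio of a
# short-term to a long-term moving energy highlights transient events such as
# QRS complexes. Window lengths and the ratio form are illustrative only.
import numpy as np

def relative_energy(x, fs, short_win=0.03, long_win=0.30, eps=1e-12):
    """Return a non-negative curve that peaks around short, energetic events."""
    def moving_energy(sig, win_s):
        n = max(1, int(win_s * fs))
        kernel = np.ones(n) / n
        return np.convolve(sig ** 2, kernel, mode="same")

    short_e = moving_energy(x, short_win)          # tracks fast transients
    long_e = moving_energy(x, long_win)            # tracks background activity
    return short_e / (long_e + eps)                # large where a short event stands out

# Usage on a toy signal: a low-level baseline with one sharp spike.
fs = 250
x = 0.01 * np.random.default_rng(1).standard_normal(5 * fs)
x[600:605] += 1.0                                  # pretend QRS complex
r = relative_energy(x, fs)
print(int(np.argmax(r)))                           # prints an index near sample 600
```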

    Overlapped speech and music segmentation using singular spectrum analysis and random forests

    Recent years have seen ever-increasing volumes of digital media archives and an enormous amount of user-contributed content. As demand for indexing and searching these resources has increased, and new technologies such as multimedia content management systems, enhanced digital broadcasting, and the semantic web have emerged, audio information mining and automated metadata generation have received much attention. Manual indexing and metadata tagging are time-consuming and subject to the biases of individual workers. An automated architecture able to extract information from audio signals, generate content-related text descriptors or metadata, and enable further information mining and searching would be a tangible and valuable solution. In the field of audio classification, audio signals may be broadly divided into speech or music. Most studies, however, neglect the fact that real audio soundtracks may contain either speech or music, or a combination of the two, and this is considered the major hurdle to achieving high performance in automatic audio classification, since overlapping can contaminate relevant characteristics and features, causing incorrect classification or information loss. This research undertakes an extensive review of the state of the art by outlining the well-established audio features and machine learning techniques that have been applied in a broad range of audio segmentation and recognition areas. Audio classification systems and the suggested solutions for the mixed-soundtracks problem are presented. The suggested solutions can be listed as follows: developing augmented and modified features for recognising audio classes even in the presence of overlaps between them; robust segmentation of a given overlapped soundtrack stream based on an innovative method of audio decomposition using Singular Spectrum Analysis (SSA), which has been studied extensively and has received increasing attention in the past two decades as a time series decomposition method with many applications; adoption and development of driven classification methods; and, finally, a technique for continuous time series tasks. In this study, SSA has been investigated and found to be an efficient way to discriminate speech/music in mixed soundtracks by two different methods, each of which has been developed and validated in this research. The first method serves to mitigate the overlapping ratio between speech and music in the mixed soundtracks by generating two new soundtracks with a lower level of overlapping. Next, a feature space is calculated for the output audio streams, and these are classified using random forests into either speech or music. One of the distinct characteristics of this method is the separation of the speech/music key features, which leads to improved classification performance. Nevertheless, it encounters a few obstacles, including excessively long processing time and increased storage requirements (each frame is represented by two outputs), which together impose a greater computational load than before. Meanwhile, the second method employs the SSA technique to decompose a given audio signal into a series of Principal Components (PCs), where each PC corresponds to a particular pattern of oscillation. Then, the transformed well-established feature is measured for each PC in order to classify it into either speech or music based on the baseline classification system using an RF machine learning technique.
The classification performance on real-world soundtracks is effectively improved, as demonstrated by comparing speech/music recognition using conventional classification methods with the proposed SSA method. The second proposed and developed method can detect pure speech, pure music, and mixed content at a much lower level of complexity.
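
As a rough illustration of the SSA decomposition step described above, the numpy sketch below embeds a signal in a trajectory (Hankel) matrix, takes its SVD, and reconstructs the leading components by diagonal averaging. The window length and any grouping of components into speech-like and music-like subsets are illustrative choices, not the parameters used in this work.

```python
# Minimal sketch of basic SSA: embed the signal in a trajectory (Hankel)
# matrix, compute its SVD, and reconstruct one series per singular triple by
# diagonal averaging. Window length and grouping are left to the caller.
import numpy as np

def ssa_components(x, window, n_components):
    """Return an (n_components, len(x)) array of SSA-reconstructed series."""
    N = len(x)
    K = N - window + 1
    X = np.column_stack([x[i:i + window] for i in range(K)])   # trajectory matrix (window, K)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    counts = np.zeros(N)
    for i in range(window):                                    # how often each sample appears
        counts[i:i + K] += 1
    comps = np.zeros((n_components, N))
    for c in range(n_components):
        Xc = s[c] * np.outer(U[:, c], Vt[c])                   # rank-1 elementary matrix
        series = np.zeros(N)
        for i in range(window):
            series[i:i + K] += Xc[i]                           # sum over anti-diagonals
        comps[c] = series / counts                             # diagonal averaging
    return comps

# Usage: decompose a tone-plus-noise mixture into its leading components.
t = np.arange(2000) / 8000.0
x = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.default_rng(2).standard_normal(t.size)
pcs = ssa_components(x, window=200, n_components=4)
```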

    Reduction of wind induced microphone noise using singular spectrum analysis technique

    Wind-induced noise in microphone signals is one of the major concerns of outdoor acoustic signal acquisition. It affects many field measurement and audio recording scenarios. Filtering such noise is known to be difficult due to its broadband and time-varying nature. This thesis is presented in the context of handling microphone signals acquired outdoors for acoustic sensing and environmental noise monitoring or soundscape sampling. The thesis presents a new approach to the wind noise problem. Instead of filtering, a separation technique is developed. Signals are separated into wanted sounds of specific interest and wind noise based on the statistical features of wind noise. The new technique is based on the Singular Spectrum Analysis method, which has recently seen many successful paradigms in the separation of biomedical signals, e.g., separating heart sound from lung noise. It has also been successfully implemented to de-noise signals in various applications. The thesis sets out with particular emphasis on investigating the factors that determine and improve the separability towards obtaining satisfactory results in separating wind noise components from noisy acoustic signals. A systematic approach has been established and developed within the framework of singular spectral separation of acoustic signals contaminated by wind noise. This approach, which utilises a conceptual framework, has, in its final form, three key objectives: grouping, reconstruction, and separability. It is offered through introducing new mathematical models, particularly for window length optimisation, along with new descriptive figures. The research question has therefore been addressed by developing algorithms according to updated requirements, from method justification to verification and validation of the developed system. The thesis follows suitable testing criteria by conducting several experiments and a case-study design, with in-depth analysis of the results using visual tools of the method and related techniques. For system verification, an empirical study using test signals, comprising a large number of experiments, has been conducted. An empirical study with real-world sounds was then carried out in the system validation phase, after rigorously selecting and preparing the dataset, which is drawn from two main sources: the freefield1010 dataset and internet-based Freesound recordings. After objectively validating and critically evaluating the developed system, the results show that microphone wind noise is separable in the singular spectrum domain. The findings indicate the effectiveness of the developed grouping and reconstruction techniques, with significant improvement in separability evidenced by the w-correlation matrix. The developed method might be generalised to other outdoor sound acquisition applications.
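
As a rough illustration of the w-correlation measure cited above as evidence of separability, the numpy sketch below computes the weighted correlation matrix between reconstructed components using the standard SSA weights; how small an off-diagonal entry must be to call two components well separated is left open, since no threshold is taken from the thesis.

```python
# Minimal sketch of the w-correlation matrix used to judge how well
# reconstructed SSA components are separated: weights w_n count how often
# sample n appears in the trajectory matrix, and near-zero off-diagonal
# entries suggest the components belong to separate sources.
import numpy as np

def w_correlation(components, window):
    """components: (n_components, N) reconstructed series; returns an (n, n) matrix."""
    n_comp, N = components.shape
    K = N - window + 1
    idx = np.arange(1, N + 1)
    w = np.minimum(np.minimum(idx, N + 1 - idx), min(window, K))   # standard SSA weights
    G = components @ (components * w).T                            # weighted inner products
    norms = np.sqrt(np.diag(G))
    return np.abs(G / np.outer(norms, norms))

# Usage: two sinusoids at different frequencies should give a near-zero
# off-diagonal w-correlation, i.e. they are well separated.
t = np.arange(1000)
comps = np.vstack([np.sin(2 * np.pi * 0.01 * t), np.sin(2 * np.pi * 0.07 * t)])
print(np.round(w_correlation(comps, window=100), 3))
```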

    Leveraging Artificial Intelligence to Improve EEG-fNIRS Data Analysis

    Functional near-infrared spectroscopy (fNIRS) has emerged as a neuroimaging technique that allows for non-invasive and long-term monitoring of cortical hemodynamics. Multimodal neuroimaging technologies in clinical settings allow for the investigation of acute and chronic neurological diseases. In this work, we focus on epilepsy, a chronic disorder of the central nervous system affecting almost 50 million people worldwide and predisposing affected individuals to recurrent seizures. Seizures are transient aberrations in the brain's electrical activity that lead to disruptive physical symptoms such as acute or chronic changes in cognitive skills, sensory hallucinations, or whole-body convulsions. Approximately a third of epileptic patients are refractory to pharmacological treatment, and these intractable seizures pose a serious risk of injury and decrease overall quality of life. In this work, we study 1) the utility of hemodynamic information derived from fNIRS signals in a seizure detection task and the benefit it provides in a multimodal setting as compared to electroencephalographic (EEG) signals alone, and 2) the ability of neural signals, derived from EEG, to predict hemodynamics in the brain in an effort to better understand the epileptic brain. Based on retrospective EEG-fNIRS data collected from 40 epileptic patients and utilizing novel deep learning models, the first study in this thesis suggests that fNIRS signals offer increased sensitivity and specificity for seizure detection when compared to EEG alone. Model validation was performed using the well-documented and widely referenced open-source CHBMIT dataset before using our in-house multimodal EEG-fNIRS dataset. The results from this study demonstrated that fNIRS improves seizure detection as compared to EEG alone and motivated the subsequent experiments, which determined the predictive capacity of an in-house developed deep learning model to decode hemodynamic resting-state signals from full-spectrum and specific frequency-band-encoded neural resting-state signals (seizure-free signals). These results suggest that a multimodal autoencoder can learn multimodal relations to predict resting-state signals. Findings further suggested that higher EEG frequency ranges predict hemodynamics with lower reconstruction error in comparison to lower EEG frequency ranges. Furthermore, functional connections show similar spatial patterns between the experimental resting state and the model's fNIRS predictions. This demonstrates for the first time that intermodal autoencoding from neural signals can predict cerebral hemodynamics to a certain extent. The results of this thesis advance the potential of using EEG-fNIRS for practical clinical tasks (seizure detection, hemodynamic prediction) as well as examining fundamental relationships present in the brain using deep learning models. If the number of available datasets increases in the future, these models may be able to generalize predictions, which could eventually lead to EEG-fNIRS technology being routinely used as a viable clinical tool in a wide variety of neuropathological disorders.
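
As a rough illustration of the intermodal prediction idea, the PyTorch sketch below trains a small encoder-decoder network to map an EEG window to its paired fNIRS window under a reconstruction-error objective. The channel counts, window lengths, and layer sizes are placeholders; this is not the thesis's actual architecture or training protocol.

```python
# Minimal sketch of an intermodal autoencoder that encodes an EEG window and
# decodes the corresponding fNIRS hemodynamic window. All shapes and layer
# sizes are placeholder assumptions.
import torch
import torch.nn as nn

class EEGtoFNIRSAutoencoder(nn.Module):
    def __init__(self, eeg_ch=19, eeg_len=1024, fnirs_ch=16, fnirs_len=64, latent=128):
        super().__init__()
        self.encoder = nn.Sequential(              # EEG window -> shared latent code
            nn.Flatten(),
            nn.Linear(eeg_ch * eeg_len, 512), nn.ReLU(),
            nn.Linear(512, latent), nn.ReLU(),
        )
        self.decoder = nn.Sequential(              # latent code -> fNIRS window
            nn.Linear(latent, 512), nn.ReLU(),
            nn.Linear(512, fnirs_ch * fnirs_len),
        )
        self.fnirs_shape = (fnirs_ch, fnirs_len)

    def forward(self, eeg):                        # eeg: (batch, eeg_ch, eeg_len)
        z = self.encoder(eeg)
        out = self.decoder(z)
        return out.view(-1, *self.fnirs_shape)     # predicted hemodynamic window

model = EEGtoFNIRSAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
eeg = torch.randn(8, 19, 1024)                     # e.g. band-filtered EEG segments
fnirs = torch.randn(8, 16, 64)                     # paired hemodynamic segments
loss = nn.functional.mse_loss(model(eeg), fnirs)   # reconstruction-error objective
loss.backward()
optimizer.step()
```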