15 research outputs found

    Improvement of P300-Based Brain-Computer Interfaces for Home Appliances Control by Data Balancing Techniques

    The oddball paradigm used in P300-based brain-computer interfaces (BCIs) intrinsically poses the issue of data imbalance between target stimuli and nontarget stimuli. Data imbalance can cause overfitting and, consequently, poor classification performance. The purpose of this study is to improve BCI performance by addressing this data imbalance with sampling techniques. The sampling techniques were applied to BCI data from subjects controlling three home appliances: a door lock (15 subjects), an electric light (15 subjects), and a Bluetooth speaker (14 subjects). We explored two categories of sampling techniques: oversampling and undersampling. Oversampling techniques, including random oversampling, the synthetic minority oversampling technique (SMOTE), borderline-SMOTE, support vector machine (SVM) SMOTE, and adaptive synthetic sampling, were used to increase the number of samples in the target-stimulus class. Undersampling techniques, including random undersampling, the neighborhood cleaning rule, Tomek's links, and weighted undersampling bagging, were used to reduce the size of the nontarget-stimulus class. The over- or undersampled data were classified by an SVM classifier. Overall, some oversampling techniques improved BCI performance, while undersampling techniques often degraded it. In particular, borderline-SMOTE yielded the highest accuracy (87.27%) and information transfer rate (8.82 bpm) across all three appliances, and it improved performance especially for poor performers. A further analysis showed that borderline-SMOTE improved the SVM by generating more support vectors within the target class and enlarging the margins. However, there was no difference in accuracy between borderline-SMOTE and weighting the regularization parameter of the SVM. Our results suggest that oversampling improves the performance of P300-based BCIs not through the oversampling techniques themselves, but through their resolution of the data imbalance problem.
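
    Since the pipeline described here (oversample the minority class, then classify with an SVM) is standard, a minimal sketch can be given with off-the-shelf libraries. The sketch below uses imbalanced-learn's BorderlineSMOTE and scikit-learn's SVC on synthetic placeholder data; the feature matrix and the roughly 1:5 target/nontarget ratio are illustrative assumptions, not the study's data.

        # Minimal sketch of the oversample-then-classify pipeline, with
        # synthetic placeholder data standing in for P300 epochs.
        import numpy as np
        from imblearn.over_sampling import BorderlineSMOTE
        from sklearn.svm import SVC
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        X = rng.normal(size=(600, 32))                   # hypothetical epoch features
        y = np.array([1 if i % 6 == 0 else 0 for i in range(600)])  # ~1:5 imbalance
        X[y == 1] += 0.4                                 # give targets a small separable shift

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        # Oversample only the training set so the test distribution stays untouched.
        X_bal, y_bal = BorderlineSMOTE(random_state=0).fit_resample(X_tr, y_tr)

        clf = SVC(kernel="linear").fit(X_bal, y_bal)
        print("held-out accuracy after balancing:", clf.score(X_te, y_te))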

    Probabilistic Graphical Models for ERP-Based Brain Computer Interfaces

    An event-related potential (ERP) is an electrical potential recorded from the nervous system of humans or other animals, observed after the presentation of a stimulus. Examples of ERPs include the P300 and the N400, among others. Although ERPs are used very often in neuroscience, their generation is not yet well understood, and different theories have been proposed to explain the phenomenon. ERPs could be generated by changes in the alpha rhythm, by an internal neural control that resets the ongoing oscillations in the brain, or by separate and distinct additive neuronal phenomena. When different repetitions of the same stimulus are averaged, a coherent addition of the oscillations is obtained, which explains the increase in the amplitude of the signals. Two ERPs are the most studied: the N400 and the P300. N400 signals arise when a subject performs semantic operations that engage the neural circuits supporting explicit memory; N400 potentials have been observed mostly in the rhinal cortex. P300 signals are related to attention and memory operations. When a novel stimulus appears, a P300 ERP (named P3a) is generated in the frontal lobe. In contrast, when a subject perceives an expected stimulus, a P300 ERP (named P3b) is generated in the temporal–parietal areas. This implies that P3a and P3b are related, suggesting a circuit pathway between the frontal and temporal–parietal regions whose existence has not been verified. A brain-computer interface (BCI) is a communication system in which the messages or commands that a subject sends to the external world come from brain signals rather than from peripheral nerves and muscles. BCIs use sensorimotor rhythms or ERP signals, so a classifier is needed to distinguish correct from incorrect stimuli. In this work, we propose using probabilistic graphical models to model the temporal and spatial dynamics of brain signals, with applications to BCIs. Graphical models were selected for their flexibility and their capacity to incorporate prior information. Previously, this flexibility has been used to model only the temporal dynamics. By including both spatial and temporal information, we expect the model to reflect some aspects of brain function related to ERPs.
    Doctorado: Doctor en Ingeniería Eléctrica y Electrónica
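
    As one concrete instance of a probabilistic graphical model applied to the temporal dynamics of ERP signals, the sketch below fits one Gaussian hidden Markov model per stimulus class and labels an epoch by comparing per-class log-likelihoods. This is a generic HMM-based approach on synthetic placeholder data, not the specific spatio-temporal model proposed in the thesis.

        # One Gaussian HMM per class; classify an epoch by log-likelihood.
        import numpy as np
        from hmmlearn.hmm import GaussianHMM

        rng = np.random.default_rng(1)
        n_epochs, n_samples, n_channels = 40, 50, 8

        def fit_class_hmm(epochs):
            # hmmlearn expects all sequences stacked, with per-sequence lengths.
            X = np.concatenate(epochs)
            lengths = [len(e) for e in epochs]
            return GaussianHMM(n_components=3, covariance_type="diag",
                               n_iter=50, random_state=0).fit(X, lengths)

        target = [rng.normal(loc=0.5, size=(n_samples, n_channels)) for _ in range(n_epochs)]
        nontarget = [rng.normal(loc=0.0, size=(n_samples, n_channels)) for _ in range(n_epochs)]

        hmm_t, hmm_n = fit_class_hmm(target), fit_class_hmm(nontarget)

        epoch = target[0]                                # classify one held-out epoch
        label = "target" if hmm_t.score(epoch) > hmm_n.score(epoch) else "nontarget"
        print(label)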

    Advanced Biometrics with Deep Learning

    Biometrics, such as fingerprint, iris, face, hand print, hand vein, speech and gait recognition, etc., as a means of identity management have become commonplace nowadays for various applications. Biometric systems follow a typical pipeline that is composed of separate preprocessing, feature extraction and classification. Deep learning, as a data-driven representation-learning approach, has been shown to be a promising alternative to conventional data-agnostic and handcrafted preprocessing and feature extraction for biometric systems. Furthermore, deep learning offers an end-to-end learning paradigm to unify preprocessing, feature extraction, and recognition, based solely on biometric data. This Special Issue has collected 12 high-quality, state-of-the-art research papers that deal with challenging issues in advanced biometric systems based on deep learning. The 12 papers can be divided into 4 categories according to biometric modality, namely: face biometrics, medical electronic signals (EEG and ECG), voice print, and others.
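
    To make the end-to-end paradigm concrete, here is a minimal sketch of a 1-D convolutional network that maps a raw biosignal window (for example, a single-lead ECG segment) directly to an identity prediction, with preprocessing and feature extraction absorbed into the learned layers. The architecture, window length, and subject count are illustrative assumptions, not taken from any of the collected papers.

        # A tiny end-to-end biosignal identifier: raw window in, identity logits out.
        import torch
        import torch.nn as nn

        class BiosignalNet(nn.Module):
            def __init__(self, n_subjects: int = 10):
                super().__init__()
                self.features = nn.Sequential(   # learned "preprocessing + features"
                    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
                    nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
                    nn.AdaptiveAvgPool1d(1),
                )
                self.classifier = nn.Linear(32, n_subjects)  # identity logits

            def forward(self, x):                # x: (batch, 1, n_samples)
                return self.classifier(self.features(x).squeeze(-1))

        net = BiosignalNet()
        dummy = torch.randn(4, 1, 512)           # 4 raw windows of 512 samples
        print(net(dummy).shape)                  # torch.Size([4, 10])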

    Meta Heuristics based Machine Learning and Neural Mass Modelling Allied to Brain Machine Interface

    New understanding of brain function and the increasing availability of low-cost, non-invasive electroencephalogram (EEG) recording devices have made the brain-computer interface (BCI) an alternative means of augmenting human capabilities: it provides a new non-muscular channel for sending commands, which can be used to activate electronic or mechanical devices through the modulation of thoughts. In this project, our emphasis is on how to develop such a BCI using fuzzy rule-based systems (FRBSs), metaheuristics, and neural mass models (NMMs). In particular, we treat the BCI system as an integrated problem consisting of mathematical modelling, machine learning, and classification. Four main steps are involved in designing a BCI system: 1) data acquisition, 2) feature extraction, 3) classification, and 4) transferring the classification outcome into control commands for extended peripheral capability. Our focus is placed on the first three steps. This research project aims to investigate and develop a novel BCI framework encompassing classification based on machine learning, optimisation, and neural mass modelling. The primary aim is to bridge the gap between these three areas in a bid to design a more reliable and accurate communication path between the brain and the external world. To achieve this goal, the following objectives have been investigated: 1) steady-state visual evoked potential (SSVEP) EEG data are collected from human subjects and pre-processed; 2) a feature-extraction procedure is implemented to detect and quantify the characteristics of brain activity that indicate the intention of the subject; 3) a classification mechanism called the Immune Inspired Multi-Objective Fuzzy Modelling Classification algorithm (IMOFM-C) is adapted as a binary classification approach for EEG data, and the DDAG-Distance aggregation approach is proposed to aggregate the outcomes of IMOFM-C-based binary classifiers for multi-class classification; 4) building on IMOFM-C, a preference-based ensemble classification framework known as IMOFM-CP is proposed to enhance the convergence performance and diversity of each component classifier, leading to improved overall classification accuracy on multi-class EEG data; and 5) finally, a robust parameterisation approach, which combines a single-objective GA and a clustering algorithm with a set of newly devised objective and penalty functions, is proposed to obtain robust sets of synaptic connectivity parameters for a thalamic neural mass model. The parameterisation approach aims to cope with the nonlinear nature of the multifarious features of brain signals.
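
    For the feature-extraction step, a common way to detect and quantify SSVEP activity is canonical correlation analysis (CCA) against sinusoidal reference signals at the candidate stimulation frequencies. The sketch below illustrates that standard technique on synthetic data; it is not the thesis's IMOFM-C classifier, and the sampling rate, channel count, and frequencies are assumptions.

        # CCA-based SSVEP frequency detection on a synthetic EEG segment.
        import numpy as np
        from sklearn.cross_decomposition import CCA

        fs, n_samples, n_channels = 250, 500, 8
        t = np.arange(n_samples) / fs

        def reference(freq, harmonics=2):
            # Sin/cos references at the stimulation frequency and its harmonics.
            return np.column_stack([f(2 * np.pi * h * freq * t)
                                    for h in range(1, harmonics + 1)
                                    for f in (np.sin, np.cos)])

        rng = np.random.default_rng(2)
        eeg = np.sin(2 * np.pi * 10 * t)[:, None] + 0.5 * rng.normal(size=(n_samples, n_channels))

        def canon_corr(X, Y):
            u, v = CCA(n_components=1).fit_transform(X, Y)
            return np.corrcoef(u.ravel(), v.ravel())[0, 1]

        scores = {f: canon_corr(eeg, reference(f)) for f in (8.0, 10.0, 12.0)}
        print(max(scores, key=scores.get))       # expected: 10.0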

    Mixtures of Heterogeneous Experts

    Computer Science

    Unified processing framework of high-dimensional and overly imbalanced chemical datasets for virtual screening.

    Virtual screening in drug discovery involves processing large datasets of unknown molecules in order to find the ones that are likely to have the desired effect on a biological target, typically a protein receptor or an enzyme. Molecules are thereby classified as active or non-active in relation to the target. Misclassification of molecules in settings such as drug discovery and medical diagnosis is costly, both in time and in finances. In the process of discovering a drug, it is mainly the inactive molecules classified as active towards the biological target (i.e., false positives) that cause delays in progress and high late-stage attrition. However, despite the pool of available techniques, selecting the suitable approach for each situation remains a major challenge. This PhD thesis develops a pioneering framework that enables the analysis of virtual-screening chemical-compound datasets in a wide range of settings in a unified fashion. The proposed method provides a better understanding of the dynamics of innovatively combining data processing and classification methods in order to screen massive, potentially high-dimensional, and overly imbalanced datasets more efficiently.
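
    As a minimal sketch of the kind of combined processing the framework targets, the pipeline below chains feature selection, oversampling of the minority (active) class, and an SVM, using scikit-learn and imbalanced-learn on synthetic placeholder descriptors; the thesis's actual method is not reproduced here. Using imbalanced-learn's Pipeline keeps the resampling inside the training folds during cross-validation, so the evaluation stays honest.

        # Feature selection + oversampling + SVM for imbalanced, high-dimensional data.
        import numpy as np
        from imblearn.pipeline import Pipeline           # resampling stays in training folds
        from imblearn.over_sampling import SMOTE
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(3)
        X = rng.normal(size=(400, 1000))                 # hypothetical molecular descriptors
        y = np.zeros(400, dtype=int)
        y[:30] = 1                                       # 30 actives out of 400: heavy imbalance
        rng.shuffle(y)

        pipe = Pipeline([
            ("select", SelectKBest(f_classif, k=50)),    # tame the dimensionality first
            ("balance", SMOTE(random_state=0)),          # oversample actives in training folds
            ("clf", SVC(kernel="rbf", class_weight="balanced")),
        ])
        print(cross_val_score(pipe, X, y, cv=5, scoring="balanced_accuracy").mean())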

    Emotion and Stress Recognition Related Sensors and Machine Learning Technologies

    This book includes impactful chapters which present scientific concepts, frameworks, architectures and ideas on sensing technologies and machine learning techniques. These are relevant in tackling the following challenges: (i) the field readiness and use of intrusive sensor systems and devices for capturing biosignals, including EEG sensor systems, ECG sensor systems and electrodermal activity sensor systems; (ii) the quality assessment and management of sensor data; (iii) data preprocessing, noise filtering and calibration concepts for biosignals; (iv) the field readiness and use of nonintrusive sensor technologies, including visual sensors, acoustic sensors, vibration sensors and piezoelectric sensors; (v) emotion recognition using mobile phones and smartwatches; (vi) body area sensor networks for emotion and stress studies; (vii) the use of experimental datasets in emotion recognition, including dataset generation principles and concepts, quality assurance, and emotion elicitation material and concepts; (viii) machine learning techniques for robust emotion recognition, including graphical models, neural network methods, deep learning methods, statistical learning and multivariate empirical mode decomposition; (ix) subject-independent emotion and stress recognition concepts and systems, including facial expression-based systems, speech-based systems, EEG-based systems, ECG-based systems, electrodermal activity-based systems, multimodal recognition systems and sensor fusion concepts; and (x) emotion and stress estimation and forecasting from a nonlinear dynamical systems perspective.

    Outils statistiques pour la sélection de variables et l'intégration de données "omiques" (Statistical tools for variable selection and the integration of omics data)

    Recent advances in biotechnology allow the monitoring of large quantities of biological data of various types, such as genomics, proteomics, metabolomics, phenotypes..., that are often characterized by a small number of samples or observations. The aim of this thesis was to develop, or adapt, appropriate statistical methodologies to analyse high-dimensional data, and to provide biologists with efficient tools for selecting the most biologically relevant variables. In the first part, we focus on microarray data in a classification framework, and on the selection of discriminative genes. In the second part, in the context of data integration, we focus on the selection of different types of variables with two-block omics data. Firstly, we propose a wrapper method, which aggregates two classifiers (CART or SVM), to select discriminative genes for binary or multiclass biological conditions. Secondly, we develop a PLS variant called sparse PLS, which adapts l1 penalization and allows for the selection of a subset of variables measured on the same biological samples. Either a regression or a canonical analysis framework is proposed to answer the biological question appropriately. We assess each of the proposed approaches by comparing them, on numerous real data sets, to similar methods from the literature. The statistical criteria that we use are often limited by the small number of samples, so we always try to combine statistical assessments with a thorough biological interpretation of the results. The approaches that we propose are easy to apply and give relevant results that answer the biologists' needs.
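
    To illustrate the sparse-PLS idea, the sketch below computes a first latent component in a NIPALS-style loop in which the X-loading vector is soft-thresholded, i.e., l1-penalized, so that uninformative variables receive exactly zero weight. This is a simplified illustration with synthetic data and an assumed threshold level, not the exact algorithm developed in the thesis.

        # First sparse-PLS component: NIPALS iteration with an l1 soft-threshold step.
        import numpy as np

        def soft_threshold(v, lam):
            return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

        def sparse_pls_first_component(X, Y, lam=3.0, n_iter=20):
            X = X - X.mean(0); Y = Y - Y.mean(0)
            u = Y[:, [0]]                                # initialize Y-score
            for _ in range(n_iter):
                u = u / (np.linalg.norm(u) + 1e-12)
                w = soft_threshold(X.T @ u, lam)         # sparse X-loading vector
                w = w / (np.linalg.norm(w) + 1e-12)
                t = X @ w                                # X-score
                q = Y.T @ t
                q = q / (np.linalg.norm(q) + 1e-12)
                u = Y @ q                                # Y-score for the next pass
            return w, t

        rng = np.random.default_rng(4)
        X = rng.normal(size=(100, 200))                  # 100 samples, 200 "genes"
        Y = 3.0 * X[:, :5].sum(axis=1, keepdims=True)    # outcome driven by genes 0..4
        w, _ = sparse_pls_first_component(X, Y, lam=3.25)
        # Typically recovers (most of) indices 0..4, perhaps with a stray variable.
        print("selected variables:", np.flatnonzero(w.ravel()))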