
    New strategies for pre-processing, feature extraction and classification in BCI systems

    Get PDF
    Advisor: Romis Ribeiro de Faissol Attux. Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
Abstract: Brain-computer interfaces (BCIs) aim to control an external device by directly employing the user's brain signals. Such systems require a series of steps to process the observed signals and extract relevant features in order to interpret the user's intentions correctly and efficiently. Although the field has developed continuously and some difficulties have been overcome, it is still necessary to increase usability by enhancing classification capacity and improving the reliability of the response. The classical objective of BCI research is to support communication and control for users whose communication is impaired by illness or injury. Typical BCI applications are the operation of interface cursors, spelling programs, or external devices such as wheelchairs, robots and different types of prostheses. The user sends modulated information to the BCI by engaging in mental tasks that produce distinct brain patterns; the BCI acquires signals from the user's brain and translates them into suitable communication. This thesis aims to develop faster and more reliable non-invasive BCI communication based on the study of different techniques that act on the signal processing stages, considering two principal aspects: the machine learning approach and the reduction of the complexity of the task of learning the mental patterns by the user. The research focused on two BCI paradigms, Motor Imagery (MI) and the P300 event-related potential (ERP), and signal processing algorithms for detecting both brain patterns were applied and evaluated.
Pre-processing was the first perspective studied: how to highlight the response of the brain phenomena with respect to noise and other sources of information that may distort the EEG signal, a step that directly influences the response of the subsequent processing and classification blocks. Independent Component Analysis (ICA) was used together with feature selection methods and different classifiers to separate the original sources related to the desynchronization produced by the MI phenomenon; the goal was to create a type of spatial filter that pre-processes the signal and reduces the influence of noise. The classification results were also compared with standard pre-processing methods such as the common average reference (CAR) filter. The results showed that it is possible to separate the components related to motor activity: on average, the ICA proposal yielded classification accuracy 4% higher than that obtained with CAR or with no filter at all.
The role of methods that study the connectivity of different brain areas was evaluated as the second contribution of this work; this made it possible to consider aspects that reflect the complexity of a user's brain response, since the BCI field needs a deeper interpretation of what happens at the brain level in several of the studied phenomena. The technique used to build functional connectivity graphs was correntropy, employed here to quantify similarity and compared with the Spearman and Pearson correlations. Functional connectivity relates the activity of different brain areas, so the resulting graph was evaluated using three centrality measures, which quantify the importance of a node in the network. Two types of classifiers were also tested and compared in terms of classification accuracy. In conclusion, correntropy can bring more information to the study of connectivity than simple correlation, which improved the classification results, especially when used with the ELM classifier.
Finally, this thesis demonstrates that BCIs can provide effective communication in an application where the classification response was modeled predictively, allowing the optimization of the signal processing parameters of the xDAWN spatial filter and an FLDA classifier for the P300 speller problem, seeking the best response for each user. The prediction model was Bayesian and confirmed the results obtained with the on-line operation of the system, allowing the parameters of both the filter and the classifier to be optimized. With filters using few input channels, the optimized model gave better classification accuracy than the values initially obtained when training the xDAWN filter for the same cases. The results showed that improvements in the BCI transducer methods (pre-processing, feature extraction and classification) constituted the basis for achieving faster and more reliable BCI communication. Improved classification results were obtained in all cases compared with techniques that have been widely used and have already shown effectiveness for this type of problem. Nevertheless, aspects of the subjects' responses to specific paradigms remain to be considered, bearing in mind that their responses can vary across days, with real implications for the definition and use of different signal processing methods.
Doctorate, Computer Engineering; degree: Doctor in Electrical Engineering. Grant 153311/2014-2 (CNPq).
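
The abstract describes building functional connectivity graphs from EEG by using correntropy as the pairwise similarity between channels and then extracting graph centrality features for classification. The following is a minimal sketch of that idea, not the thesis code: the Gaussian kernel bandwidth, the binarization threshold and the use of degree centrality are illustrative assumptions.

```python
import numpy as np

def correntropy(x, y, sigma=1.0):
    """Sample estimate of correntropy V(x, y) with a Gaussian kernel."""
    e = x - y
    return np.mean(np.exp(-e**2 / (2.0 * sigma**2)))

def connectivity_graph(eeg, sigma=1.0, threshold=0.55):
    """Build a functional connectivity matrix from EEG (channels x samples)
    using pairwise correntropy, then binarize it into an adjacency matrix."""
    n_ch = eeg.shape[0]
    sim = np.eye(n_ch)
    for i in range(n_ch):
        for j in range(i + 1, n_ch):
            sim[i, j] = sim[j, i] = correntropy(eeg[i], eeg[j], sigma)
    adj = (sim >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)
    return sim, adj

def degree_centrality(adj):
    """Degree centrality of each node (channel), normalized by n - 1."""
    n = adj.shape[0]
    return adj.sum(axis=1) / (n - 1)

# Toy usage: 8 channels, 1000 samples of standardized EEG-like noise.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((8, 1000))
sim, adj = connectivity_graph(eeg, sigma=1.0, threshold=0.55)
features = degree_centrality(adj)   # could feed an ELM/LDA classifier
print(features)
```

In the thesis, several centrality measures and classifiers (including ELM) are compared; the sketch uses degree centrality only to illustrate how the connectivity graph becomes a feature vector.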

    Relevant data representation by a Kernel-based framework

    Get PDF
    Nowadays, the analysis of large amounts of data has emerged as an issue of great interest in the scientific community, especially in automation, signal processing, pattern recognition, and machine learning. In this sense, the identification, description, classification, visualization, and clustering of events or patterns are important problems for engineering developments and scientific issues such as biology, medicine, economics, artificial vision, artificial intelligence, and industrial production. Nonetheless, it is difficult to interpret the available information due to its complexity and the large number of obtained features. In addition, the analysis of the input data requires methodologies that reveal the relevant behaviors of the studied process, particularly when the signals contain hidden structures varying over a given domain, e.g., space and/or time. When the analyzed signal has such properties, directly applying signal processing and machine learning procedures without a suitable model that accounts for both the statistical distribution and the data structure can lead to unstable performance. In this regard, kernel functions appear as an alternative approach to address these issues by providing flexible mathematical tools that enhance data representation in support of signal processing and machine learning systems. Moreover, kernel-based methods are powerful tools for developing better-performing solutions by adapting the kernel to a given problem, instead of learning data relationships from explicit raw vector representations. However, building suitable kernels requires some prior user knowledge about the input data, which is not available in most practical cases. Furthermore, directly using the definitions of traditional kernel methods poses a challenging estimation problem that often leads to strong simplifications restricting the kind of representation that can be used on the data.
In this study, we propose a data representation framework based on kernel methods to automatically learn relevant sample relationships in learning systems. The proposed framework is divided into five kernel-based approaches, which compute relevant data representations by adapting them according to both the imposed sample-relationship constraints and the learning scenario (unsupervised or supervised task). First, we develop a kernel-based representation approach that reveals the main input sample relations by including relevant data structures through graph-based sparse constraints. Salient data structures are thus highlighted to favor further unsupervised clustering stages. This approach can be viewed as a graph pruning strategy within a spectral clustering framework that enhances both the local and global data consistencies of a given input similarity matrix. Second, we introduce a kernel-based representation methodology that captures meaningful data relations in terms of their statistical distribution: an information theoretic learning (ITL) penalty function is introduced to estimate a kernel-based similarity that maximizes the overall information potential variability. That is, we seek a reproducing kernel Hilbert space (RKHS) that spans the widest information force magnitudes among data points to support further clustering stages. Third, an entropy-like functional on positive definite matrices based on Renyi's definition is adapted to develop a kernel-based representation approach that considers both the statistical distribution and the salient data structures, so that relevant input patterns are highlighted in unsupervised learning tasks. In particular, the approach is tested as a tool to encode relevant local and global input data relationships in dimensionality reduction applications. Fourth, a supervised kernel-based representation is introduced via a metric learning procedure in RKHS that takes advantage of the user's prior knowledge, when available, about the studied learning task. This approach incorporates the proposed ITL-based kernel functional estimation strategy to automatically adapt the representation using both the supervised information and the statistical distribution of the input data. As a result, relevant sample dependencies are highlighted by weighting the input features that mostly encode the supervised learning task. Finally, a new generalized kernel-based measure is proposed that takes advantage of different RKHSs, so relevant dependencies are highlighted automatically by considering the domain-varying behavior of the input data and the user's prior knowledge (supervised information) when available. The proposed measure is an extension of the well-known cross-correntropy function based on Hilbert space embeddings. Throughout the study, the proposed kernel-based framework is applied to biosignal and image data as an alternative to support aided-diagnosis systems and image-based object analysis. Indeed, the introduced kernel-based framework improves, in most cases, unsupervised and supervised learning performance, favoring task accuracy and aiding the interpretation of complex data.
Doctorate.
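
The abstract refers to an entropy-like functional on positive definite matrices based on Renyi's definition. As a minimal sketch of one such functional (the matrix-based formulation of Renyi's alpha-entropy computed from a normalized kernel Gram matrix), the code below illustrates the idea; the Gaussian kernel, its bandwidth, and the alpha value are illustrative assumptions and not necessarily the exact choices of the thesis.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gaussian kernel Gram matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def matrix_renyi_entropy(K, alpha=2.0):
    """Entropy-like functional on a positive definite Gram matrix:
    S_alpha(A) = 1 / (1 - alpha) * log2(tr(A^alpha)),
    where A is K normalized to unit trace."""
    A = K / np.trace(K)
    eigvals = np.linalg.eigvalsh(A)
    eigvals = np.clip(eigvals, 1e-12, None)   # guard against round-off
    return np.log2(np.sum(eigvals**alpha)) / (1.0 - alpha)

# Toy usage: entropy of a 2-D Gaussian sample under a Gaussian kernel.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
K = gaussian_gram(X, sigma=1.0)
print(matrix_renyi_entropy(K, alpha=2.0))
```

Because the functional depends only on the Gram matrix spectrum, it can be evaluated directly in the RKHS induced by the kernel, which is what makes it usable as a penalty for kernel learning.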

    Broad Learning System Based on Maximum Correntropy Criterion

    Full text link
    As an effective and efficient discriminative learning method, the Broad Learning System (BLS) has received increasing attention due to its outstanding performance in various regression and classification problems. However, the standard BLS is derived under the minimum mean square error (MMSE) criterion, which is not always a good choice because of its sensitivity to outliers. To enhance the robustness of BLS, we propose in this work to adopt the maximum correntropy criterion (MCC) to train the output weights, obtaining a correntropy-based broad learning system (C-BLS). Thanks to the inherent advantages of MCC, the proposed C-BLS is expected to achieve excellent robustness to outliers while maintaining the original performance of the standard BLS in Gaussian or noise-free environments. In addition, three alternative incremental learning algorithms for C-BLS are developed, derived from a weighted regularized least-squares solution rather than the pseudoinverse formula. With these incremental learning algorithms, the system can be updated quickly when new samples arrive or the network needs to be expanded, without retraining from scratch. Experiments on various regression and classification datasets demonstrate the desirable performance of the new methods.
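
A common way to train output weights under MCC is a fixed-point (half-quadratic) iteration that alternates between computing per-sample Gaussian weights from the current residuals and solving a weighted ridge regression. The sketch below shows that generic scheme for a linear readout over a precomputed feature matrix H; it is not the paper's exact algorithm, and sigma, lam, and n_iter are illustrative hyperparameters.

```python
import numpy as np

def mcc_output_weights(H, Y, sigma=1.0, lam=1e-3, n_iter=20):
    """Fixed-point (half-quadratic) training of linear output weights W under
    the maximum correntropy criterion with ridge regularization.
    H: (n_samples, n_features) feature matrix; Y: (n_samples, n_outputs) targets."""
    n, d = H.shape
    W = np.linalg.solve(H.T @ H + lam * np.eye(d), H.T @ Y)  # MMSE warm start
    for _ in range(n_iter):
        E = Y - H @ W
        # Per-sample Gaussian weights: outliers (large residuals) get small weight.
        w = np.exp(-np.sum(E**2, axis=1) / (2.0 * sigma**2))
        HtD = H.T * w            # equivalent to H.T @ diag(w)
        W = np.linalg.solve(HtD @ H + lam * np.eye(d), HtD @ Y)
    return W

# Toy usage: robust linear regression with a few gross outliers.
rng = np.random.default_rng(0)
H = rng.standard_normal((200, 5))
W_true = rng.standard_normal((5, 1))
Y = H @ W_true + 0.05 * rng.standard_normal((200, 1))
Y[:10] += 20.0                   # inject outliers
W_hat = mcc_output_weights(H, Y, sigma=1.0, lam=1e-3)
print(np.linalg.norm(W_hat - W_true))
```

In a BLS, H would be the concatenation of mapped-feature and enhancement-node outputs; only the readout changes, which is why incremental sample and node updates remain possible.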

    Wind Power Forecasting Methods Based on Deep Learning: A Survey

    Get PDF
    Accurate wind power forecasting for wind farms can effectively reduce the impact on grid operation safety when intermittent power sources with high penetration are connected to the power grid. Aiming to provide reference strategies for researchers as well as for practical applications, this paper surveys the literature and analyzes methods of deep learning, reinforcement learning and transfer learning in wind speed and wind power forecasting. Wind speed and wind power forecasting around a wind farm usually requires predicting the next state from the current one, based on the state of the atmosphere, which encompasses nearby atmospheric pressure, temperature, surface roughness and obstacles. As an effective method for high-dimensional feature extraction, deep neural networks can in theory handle arbitrary nonlinear transformations through proper structural design, for example by adding noise to outputs, using evolutionary learning to optimize hidden-layer weights, or shaping the objective function to retain information that improves output accuracy while filtering out information that is irrelevant or of little influence for forecasting. Establishing high-precision wind speed and wind power forecasting models remains a challenge due to the randomness, instantaneity and seasonal characteristics of wind.
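
As a concrete framing of the forecasting task described above, the sketch below builds a supervised dataset from lagged measurements (wind speed plus exogenous atmospheric variables) and fits a small feed-forward network for one-step-ahead prediction; the window length, feature set and architecture are illustrative assumptions, not a method proposed in the survey.

```python
import torch
import torch.nn as nn

def make_windows(series, exog, window=24):
    """Turn a wind-speed series and exogenous variables (pressure, temperature, ...)
    into (input window, next-step target) pairs for one-step-ahead forecasting."""
    X, y = [], []
    for t in range(window, len(series)):
        past = torch.cat([series[t - window:t], exog[t - window:t].flatten()])
        X.append(past)
        y.append(series[t])
    return torch.stack(X), torch.stack(y).unsqueeze(1)

# Toy data: 1000 hourly samples, wind speed plus 2 exogenous variables.
torch.manual_seed(0)
speed = torch.rand(1000) * 12.0
exog = torch.rand(1000, 2)
X, y = make_windows(speed, exog, window=24)

model = nn.Sequential(nn.Linear(X.shape[1], 64), nn.ReLU(),
                      nn.Linear(64, 32), nn.ReLU(),
                      nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(50):                       # short illustrative training loop
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print(float(loss))
```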

    Developing reliable anomaly detection system for critical hosts: a proactive defense paradigm

    Full text link
    Current host-based anomaly detection systems have limited accuracy and incur high processing costs. This is due to the need to process massive audit data of the critical host(s) while detecting complex zero-day attacks that can leave minor, stealthy and dispersed artefacts. In this research study, this observation is validated using existing datasets and state-of-the-art algorithms related to the construction of features from a host's audit data, such as the popular semantic-based extraction, and decision engines including Support Vector Machines, Extreme Learning Machines and Hidden Markov Models. There is a challenging trade-off between achieving accuracy at a minimum processing cost and processing massive amounts of audit data that can include complex attacks. There is also a lack of a realistic experimental dataset that reflects the normal and abnormal activities of current real-world computers. This thesis investigates the development of new methodologies for host-based anomaly detection systems with the specific aims of improving accuracy at a minimum processing cost while considering challenges such as complex attacks which, in some cases, are only visible via a quantified computing resource (for example, the execution times of programs), the processing of massive amounts of audit data, the unavailability of a realistic experimental dataset, and the automatic minimization of the false positive rate while dealing with the dynamics of normal activities. This study provides three original and significant contributions to this field of research which represent a marked advance in its body of knowledge. The first major contribution is the generation and release of a realistic intrusion detection systems dataset, as well as the development of a metric based on fuzzy qualitative modeling for embedding the possible quality of realism in a dataset's design process and for assessing this quality in existing or future datasets. The second key contribution is constructing and evaluating hidden host features to identify the subtle differences between the normal and abnormal artefacts of hosts' activities at a minimum processing cost. Linux-centric features include the frequencies and ranges, frequency-domain representations and Gaussian interpretations of system call identifiers with execution times, while for Windows a count of the distinct core Dynamic Link Library calls is identified as a hidden host feature. The final key contribution is the development of two new anomaly-based statistical decision engines for capitalizing on the potential of some of the suggested hidden features and reliably detecting anomalies. The first engine, which has a forensic module, is based on stochastic theories including hierarchical hidden Markov models, and the second is modeled using Gaussian Mixture Modeling and correntropy. The results demonstrate that the proposed host features and engines are capable of meeting the identified challenges.
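
One of the described feature constructions is a frequency-domain representation of system-call identifier sequences, later scored by a Gaussian Mixture Model decision engine. The sketch below shows that generic pipeline on synthetic traces; the FFT feature length, the number of mixture components and the percentile threshold rule are illustrative assumptions rather than the thesis's exact design.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fft_features(trace, n_bins=16):
    """Map a sequence of system-call IDs to a fixed-length frequency-domain
    feature vector: magnitudes of the lowest FFT bins, normalized by length."""
    x = np.asarray(trace, dtype=float)
    spec = np.abs(np.fft.rfft(x)) / len(x)
    return spec[:n_bins]

# Synthetic "normal" traces and a few "abnormal" ones with a different call mix.
rng = np.random.default_rng(0)
normal = [rng.integers(0, 50, size=400) for _ in range(200)]
abnormal = [rng.integers(40, 90, size=400) for _ in range(20)]

X_train = np.array([fft_features(t) for t in normal[:150]])
X_test = np.array([fft_features(t) for t in normal[150:] + abnormal])

gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(X_train)

scores = gmm.score_samples(X_test)                        # per-trace log-likelihood
threshold = np.percentile(gmm.score_samples(X_train), 1)  # flag ~1% of normal traces
flags = scores < threshold                                # True -> anomalous
print(flags.sum(), "traces flagged as anomalous")
```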

    Novel Deep Learning Techniques For Computer Vision and Structure Health Monitoring

    Get PDF
    This thesis proposes novel techniques for building a generic framework for both regression and classification tasks in vastly different application domains, such as computer vision and civil engineering. Many frameworks have been proposed and combined into a complex deep network design to provide a complete solution to a wide variety of problems. The experimental results demonstrate significant improvements of all the proposed techniques in terms of accuracy and efficiency.

    Mathematics and Digital Signal Processing

    Get PDF
    Modern computer technology has opened up new opportunities for the development of digital signal processing methods. The applications of digital signal processing have expanded significantly and today include audio and speech processing, sonar, radar and other sensor array processing, spectral density estimation, statistical signal processing, digital image processing, signal processing for telecommunications, control systems, biomedical engineering, and seismology, among others. This Special Issue aims at wide coverage of the problems of digital signal processing, from mathematical modeling to the implementation of problem-oriented systems. The basis of digital signal processing is digital filtering. Wavelet analysis implements multiscale signal processing and is used to solve applied problems of de-noising and compression. Processing of visual information, including image and video processing and pattern recognition, is actively used today in robotic systems and industrial process control. Improving digital signal processing circuits and developing new signal processing systems can improve the technical characteristics of many digital devices. The development of new methods of artificial intelligence, including artificial neural networks and brain-computer interfaces, opens up new prospects for the creation of smart technology. This Special Issue contains the latest technological developments in mathematics and digital signal processing. The presented results are of interest to researchers in the field of applied mathematics and to developers of modern digital signal processing systems.
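
The Special Issue description mentions wavelet analysis for multiscale de-noising. As a small illustration of that standard technique (not tied to any particular paper in the issue), the sketch below applies soft-threshold wavelet de-noising to a noisy signal; the wavelet family, decomposition level and universal-threshold rule are common defaults assumed here.

```python
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=4):
    """Soft-threshold wavelet de-noising with the universal threshold
    sigma * sqrt(2 * log(N)), sigma estimated from the finest detail scale."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # robust noise estimate
    thr = sigma * np.sqrt(2.0 * np.log(len(signal)))
    denoised = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[:len(signal)]

# Toy usage: a noisy sine wave.
t = np.linspace(0, 1, 1024)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.3 * np.random.default_rng(0).standard_normal(t.size)
recovered = wavelet_denoise(noisy)
print(np.sqrt(np.mean((recovered - clean) ** 2)))   # RMSE after de-noising
```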