2,061 research outputs found

    Unsupervised clustering of IoT signals through feature extraction and self organizing maps

    The scope of this thesis is to build a clustering model to inspect the structural properties of a dataset composed of IoT signals and to classify these signals through unsupervised clustering algorithms. To this end, a feature-based representation of the signals is used. Different feature selection algorithms are then used to obtain reduced feature spaces, so as to decrease the computational cost and the memory demand. The IoT signals are then clustered using Self-Organizing Maps (SOM) and evaluated

    To Deconvolve, or Not to Deconvolve: Inferences of Neuronal Activities using Calcium Imaging Data

    With the increasing popularity of calcium imaging data in neuroscience research, methods for analyzing calcium trace data are critical to address various questions. The observed calcium traces are either analyzed directly or deconvolved to spike trains to infer neuronal activities. When both approaches are applicable, it is unclear whether deconvolving calcium traces is a necessary step. In this article, we compare the performance of using calcium traces or their deconvolved spike trains for three common analyses: clustering, principal component analysis (PCA), and population decoding. Our simulations and applications to real data suggest that the estimated spike data outperform calcium trace data for both clustering and PCA. Although calcium trace data show higher predictability than spike data at each time point, spike history or cumulative spike counts are comparable to or better than calcium traces in population decoding

    Statistical Machine Learning Methods for High-dimensional Neural Population Data Analysis

    Advances in techniques have been producing increasingly complex neural recordings, posing significant challenges for data analysis. This thesis discusses novel statistical methods for analyzing high-dimensional neural data. Part one discusses two extensions of state space models tailored to neural data analysis. First, we propose using a flexible count data distribution family in the observation model to faithfully capture over-dispersion and under-dispersion of the neural observations. Second, we incorporate nonlinear observation models into state space models to improve the flexibility of the model and obtain a more concise representation of the data. For both extensions, novel variational inference techniques are developed for model fitting, and experiments on simulated and real data show the advantages of our extensions. Part two discusses a fast region of interest (ROI) detection method for large-scale calcium imaging data based on structured matrix factorization. Part three discusses a method for sampling from a maximum entropy distribution with complicated constraints, which is useful for hypothesis testing in neural data analysis and many other applications related to maximum entropy formulation. We conclude the thesis with a discussion and future work

    Detecció automàtica i robusta de Bursts en EEG de nounats amb HIE. Enfocament tensorial

    Hypoxic-Ischemic Encephalopathy (HIE) is an important cause of brain injury in the newborn, and can result in devastating long-term consequences. The burst-suppression pattern is one of several indicators of severe pathology in the EEG signal that may occur after brain damage caused by, for example, asphyxia around the time of birth. The goal of this thesis is to design a robust method to detect burst patterns automatically, regardless of the physiologic and extra-physiologic artifacts that may occur at any time. First, a pre-detector was designed to obtain potential burst candidates from different patients. Then, a post-classification step was implemented, applying high-dimensional feature extraction methods, to identify the real burst patterns from these patients with high sensitivity

    Multi-faceted Structure-Activity Relationship Analysis Using Graphical Representations

    A core focus in medicinal chemistry is the interpretation of structure-activity relationships (SARs) of small molecules. SAR analysis is typically carried out on a case-by-case basis for compound sets that share activity against a given target. Although SAR investigations are not a priori dependent on computational approaches, limitations imposed by the steady rise in activity information have necessitated the use of such methodologies. Moreover, understanding SARs in multi-target space is extremely difficult. Conceptually different computational approaches are reported in this thesis for graphical SAR analysis in single- as well as multi-target space. Activity landscape models are often used to describe the underlying SAR characteristics of compound sets. Theoretical activity landscapes that are reminiscent of topological maps intuitively represent distributions of pair-wise similarity and potency difference information as three-dimensional surfaces. These models provide easy access to identification of various SAR features. Therefore, such landscapes for actual data sets are generated and compared with graph-based representations. Existing graphical data structures are adapted to include mechanism of action information for receptor ligands to facilitate simultaneous SAR and mechanism-related analyses with the objective of identifying structural modifications responsible for switching molecular mechanisms of action. Typically, SAR analysis focuses on systematic pair-wise relationships of compound similarity and potency differences. Therefore, an approach is reported to calculate SAR feature probabilities on the basis of these pair-wise relationships for individual compounds in a ligand set. The consequent expansion of feature categories improves the analysis of local SAR environments. Graphical representations are designed to avoid a dependence on preconceived SAR models. Such representations are suitable for systematic large-scale SAR exploration. Methods for the navigation of SARs in multi-target space using simple and interpretable data structures are introduced. In summary, multi-faceted SAR analysis aided by computational means forms the primary objective of this dissertation

    Deep Cellular Recurrent Neural Architecture for Efficient Multidimensional Time-Series Data Processing

    Efficient processing of time series data is a fundamental yet challenging problem in pattern recognition. Though recent developments in machine learning and deep learning have enabled remarkable improvements in processing large scale datasets in many application domains, most are designed and regulated to handle inputs that are static in time. Many real-world data, such as in biomedical, surveillance and security, financial, manufacturing and engineering applications, are rarely static in time, and demand models able to recognize patterns in both space and time. Current machine learning (ML) and deep learning (DL) models adapted for time series processing tend to grow in complexity and size to accommodate the additional dimensionality of time. Specifically, the biologically inspired learning based models known as artificial neural networks that have shown extraordinary success in pattern recognition, tend to grow prohibitively large and cumbersome in the presence of large scale multi-dimensional time series biomedical data such as EEG. Consequently, this work aims to develop representative ML and DL models for robust and efficient large scale time series processing. First, we design a novel ML pipeline with efficient feature engineering to process a large scale multi-channel scalp EEG dataset for automated detection of epileptic seizures. With the use of a sophisticated yet computationally efficient time-frequency analysis technique known as harmonic wavelet packet transform and an efficient self-similarity computation based on fractal dimension, we achieve state-of-the-art performance for automated seizure detection in EEG data. Subsequently, we investigate the development of a novel efficient deep recurrent learning model for large scale time series processing. For this, we first study the functionality and training of a biologically inspired neural network architecture known as cellular simultaneous recurrent neural network (CSRN). 
We obtain a generalization of this network for multiple topological image processing tasks and investigate the learning efficacy of the complex cellular architecture using several state-of-the-art training methods. Finally, we develop a novel deep cellular recurrent neural network (DCRNN) architecture based on the biologically inspired distributed processing used in CSRN for processing time series data. The proposed DCRNN leverages the cellular recurrent architecture to promote extensive weight sharing and efficient, individualized, synchronous processing of multi-source time series data. Experiments on a large-scale multi-channel scalp EEG dataset and a machine fault detection dataset show that the proposed DCRNN offers state-of-the-art recognition performance while using substantially fewer trainable recurrent units