
    Scalable Tools for Information Extraction and Causal Modeling of Neural Data

    Systems neuroscience has, over the past 20 years, entered an era that one might call "large scale systems neuroscience". From tuning curves and single neuron recordings there has been a conceptual shift towards a more holistic understanding of how neural circuits work and, as a result, how their representations produce neural tunings. With the introduction of a plethora of datasets at various scales and across modalities, animals, and systems, we as a community have witnessed the invaluable insights that can be gained from the collective view of a neural circuit, which was not possible with small scale experimentation. The concurrence of advances in neural recordings, such as wide field imaging technologies and Neuropixels, with developments in statistical machine learning, and specifically deep learning, has brought systems neuroscience one step closer to data science. With this abundance of data, the need for developing computational models has become crucial. We need to make sense of the data, and thus we need to build models that are constrained up to an acceptable amount of biological detail and probe those models in search of neural mechanisms. This thesis consists of sections covering a wide range of ideas from computer vision, statistics, machine learning, and dynamical systems. But all of these ideas share a common purpose, which is to help automate the process of neuroscientific experimentation at different levels. In chapters 1, 2, and 3, I develop tools that automate the process of extracting useful information from raw neuroscience data in the model organism C. elegans. The goal is to avoid manual labor and pave the way for high throughput data collection aimed at better quantification of variability across the population of worms. Due to its high level of structural and functional stereotypy, and its relative simplicity, the nematode C. elegans has been an attractive model organism for systems and developmental research.
With 383 neurons in males and 302 neurons in hermaphrodites, the positions and functions of neurons are remarkably conserved across individuals. Furthermore, C. elegans remains the only organism for which a complete cellular, lineage, and anatomical map of the entire nervous system has been described for both sexes. Here, I describe the analysis pipeline that we developed for the recently proposed NeuroPAL technique in C. elegans. Our proposed pipeline consists of atlas building (chapter 1), registration, segmentation, neural tracking (chapter 2), and signal extraction (chapter 3). I emphasize that categorizing the analysis techniques as a pipeline consisting of the above steps is general and can be applied to virtually every animal model and emerging imaging modality. I use the language of probabilistic generative modeling and graphical models to communicate the ideas in a rigorous form; some familiarity with those concepts could therefore help the reader navigate the chapters of this thesis more easily. In chapters 4 and 5, I build models that aim to automate hypothesis testing and causal interrogation of neural circuits. The notion of functional connectivity (FC) has been instrumental in our understanding of how information propagates in a neural circuit. However, an important limitation is that current techniques do not dissociate causal connections from purely functional connections with no mechanistic correspondence. I start chapter 4 by introducing causal inference as a unifying language for the following chapters. In chapter 4, I define the notion of interventional connectivity (IC) as a way to summarize the effect of stimulation in a neural circuit, providing a more mechanistic description of the information flow. I then investigate which functional connectivity metrics best predict IC in simulations and real data.
Following this framework, I discuss how stimulations and interventions can be used to improve the fitting and generalization properties of time series models. Building on the literature of model identification and active causal discovery, I develop a switching time series model and a method for finding stimulation patterns that help the model to generalize to the vicinity of the observed neural trajectories. Finally, in chapter 5, I develop a new FC metric that separates the information transferred from one variable to another into unique and synergistic sources. In all projects, I have abstracted out concepts that are specific to the datasets at hand and developed the methods in the most general form. This makes the presented methods applicable to a broad range of datasets, potentially leading to new findings. In addition, all projects are accompanied by extensible and documented code packages, allowing theorists to repurpose the modules for novel applications and experimentalists to run analyses on their datasets efficiently and scalably. In summary, my main contributions in this thesis are the following: 1) Building the first atlases of hermaphrodite and male C. elegans and developing a generic statistical framework for constructing atlases for a broad range of datasets. 2) Developing a semi-automated analysis pipeline for neural registration, segmentation, and tracking in C. elegans. 3) Extending the framework of non-negative matrix factorization to datasets with deformable motion and developing algorithms for joint tracking and signal demixing from videos of semi-immobilized C. elegans. 4) Defining the notion of interventional connectivity (IC) as a way to summarize the effect of stimulation in a neural circuit and investigating which functional connectivity metrics best predict IC in simulations and real data.
5) Developing a switching time series model and a method for finding stimulation patterns that help the model to generalize to the vicinity of the observed neural trajectories. 6) Developing a new functional connectivity metric that separates the information transferred from one variable to another into unique and synergistic sources. 7) Implementing extensible, well-documented, open-source code packages for each of the above contributions.
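The abstract above mentions extending non-negative matrix factorization (NMF) for signal demixing. As background only, here is a minimal sketch of plain NMF with the standard multiplicative updates for the Frobenius objective. It assumes a motion-free movie arranged as a pixels-by-frames matrix and does not attempt the deformable-motion extension described in the thesis. The function name `nmf`, the variable names, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Plain NMF, V ~= W @ H, via multiplicative updates (Frobenius loss).

    V : (pixels, frames) non-negative matrix
    W : (pixels, rank) spatial footprints, H : (rank, frames) temporal traces
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic example (assumed data): 3 non-negative sources mixed into 20 pixels.
rng = np.random.default_rng(42)
true_W = rng.random((20, 3))
true_H = rng.random((3, 50))
V = true_W @ true_H

W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

The factors are recovered only up to scaling and reordering of the components, so individual columns of `W` should not be expected to match `true_W` one to one.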

    New approaches for EEG signal processing: artifact EOG removal by ICA-RLS scheme and tracks extraction method

    Localizing the bioelectric phenomena that originate in the cerebral cortex and are evoked by auditory and somatosensory stimuli is a clear objective, both for understanding how the brain works and for recognizing different pathologies. Diseases such as Parkinson’s, Alzheimer’s, schizophrenia and epilepsy are intensively studied to find a cure or an accurate diagnosis. Epilepsy is considered the most prevalent disorder of neurological origin. The recurrent and sudden incidence of seizures can lead to dangerous and possibly life-threatening situations. Since disturbance of consciousness and sudden loss of motor control often occur without any warning, the ability to predict epileptic seizures would reduce patients’ anxiety, thus considerably improving quality of life and safety. The common procedure for epileptic seizure detection is based on monitoring brain activity via electroencephalogram (EEG) data. This process consumes a lot of time, especially in the case of long recordings, but the major problem is the subjective nature of the analysis: specialists can disagree when analyzing the same record. From this perspective, the identification of hidden dynamical patterns is necessary because they could provide insight into the underlying physiological mechanisms that occur in the brain. Time-frequency distributions (TFDs) and adaptive methods have proven to be good alternatives in designing systems for detecting neurodegenerative diseases. TFDs are appropriate transformations because they offer the possibility of analyzing relatively long continuous segments of EEG data even when the dynamics of the signal are rapidly changing. On the other hand, most of the detection methods proposed in the literature assume a clean EEG signal free of artifacts or noise, leaving the preprocessing problem open to any denoising algorithm.
In this thesis we have developed two proposals for EEG signal processing. The first approach is an electrooculogram (EOG) removal method based on a combination of ICA and RLS algorithms, which automatically cancels the artifacts produced by eye movement without the use of an external “ad hoc” electrode. This method, called ICA-RLS, has been compared with other state-of-the-art techniques and has been shown to be a good alternative for artifact rejection. The second approach is a novel method for EEG feature extraction called tracks extraction (LFE features). This method is based on TFDs and partial tracking. Our results in extracting patterns related to epileptic seizures have shown that tracks extraction is appropriate for EEG detection and classification tasks, being practical, easily applicable in a medical environment, and of acceptable computational cost.
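To illustrate the adaptive-filtering half of the scheme described above, here is a minimal sketch of a textbook recursive least squares (RLS) noise canceller. It is not the thesis's ICA-RLS implementation: it assumes a reference artifact signal is already available (for example, an ICA component judged to be ocular, which is an assumption here), and the function name `rls_cancel`, the filter order, the forgetting factor, and the synthetic data are all illustrative choices.

```python
import numpy as np

def rls_cancel(primary, reference, order=4, lam=0.999, delta=100.0):
    """Remove the part of `primary` that is linearly predictable from `reference`.

    primary   : contaminated channel (EEG + artifact)
    reference : signal correlated with the artifact only (assumed available)
    Returns the cleaned signal (a priori error) and the final filter weights.
    """
    n = len(primary)
    w = np.zeros(order)            # adaptive FIR weights
    P = np.eye(order) * delta      # inverse correlation matrix estimate
    cleaned = np.zeros(n)
    for t in range(n):
        # Most recent `order` reference samples, newest first, zero-padded at start.
        u = np.zeros(order)
        k = min(order, t + 1)
        u[:k] = reference[t::-1][:k]
        Pu = P @ u
        gain = Pu / (lam + u @ Pu)
        err = primary[t] - w @ u   # sample with the estimated artifact removed
        w = w + gain * err
        P = (P - np.outer(gain, Pu)) / lam
        cleaned[t] = err
    return cleaned, w

# Synthetic example (assumed data): a sinusoidal "EEG" plus a filtered artifact.
rng = np.random.default_rng(0)
n = 3000
eeg = np.sin(2 * np.pi * np.arange(n) / 50)
ref = rng.standard_normal(n)
ref_lag1 = np.concatenate(([0.0], ref[:-1]))
contaminated = eeg + 0.8 * ref + 0.3 * ref_lag1

cleaned, weights = rls_cancel(contaminated, ref)
mse_before = np.mean((contaminated[1000:] - eeg[1000:]) ** 2)
mse_after = np.mean((cleaned[1000:] - eeg[1000:]) ** 2)
print(f"artifact power before: {mse_before:.3f}, after: {mse_after:.4f}")
```

The forgetting factor `lam` trades tracking speed against steady-state noise in the weights; values close to 1 suit artifacts whose coupling to the EEG channel changes slowly.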

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed, in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at a fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernova

    Advanced source separation methods with applications to spatio-temporal datasets

    Latent variable models are useful tools for statistical data analysis in many applications. Examples of popular models include factor analysis, state-space models and independent component analysis. These types of models can be used for solving the source separation problem, in which the latent variables should have a meaningful interpretation and represent the actual sources generating the data. Source separation methods are the main focus of this work. Bayesian statistical theory provides a principled way to learn latent variable models and therefore to solve the source separation problem. The first part of this work studies variational Bayesian methods and their application to different latent variable models. The properties of variational Bayesian methods are investigated both theoretically and experimentally using linear source separation models. A new nonlinear factor analysis model, which restricts the generative mapping to the practically important case of post-nonlinear mixtures, is presented. The variational Bayesian approach to learning nonlinear state-space models is studied as well. This method is applied to the practical problem of detecting changes in the dynamics of complex nonlinear processes. The main drawback of Bayesian methods is their high computational burden. This complicates their use for exploratory data analysis, in which observed data regularities often suggest what kind of models could be tried. Therefore, the second part of this work proposes several faster source separation algorithms implemented in a common algorithmic framework. The proposed approaches separate the sources by analyzing their spectral contents, decoupling their dynamic models or optimizing their prominent variance structures. These algorithms are applied to spatio-temporal datasets containing global climate measurements from a long period of time.

    Design Data Collection with Skylab Microwave Radiometer-Scatterometer S-193, Volume 1

    The author has identified the following significant results. Observations with S-193 have provided radar design information for systems to be flown on spacecraft, but only at 13.9 GHz and for land areas over the United States and Brazil plus a few other areas of the world for which this kind of analysis was not made. Observations only extended out to about 50 deg angle of incidence. The value of a sensor with such a gross resolution for most overland resource and status monitoring systems seems marginal, with the possible exception of monitoring soil moisture and major vegetation variations. The complementary nature of the scatterometer and radiometer systems was demonstrated by the correlation analysis. Although radiometers must have spatial resolutions dictated by antenna size, radars can use synthetic aperture techniques to achieve much finer resolutions. Multiplicity of modes in the S-193 sensors complicated both the system development and its employment. An attempt was made in the design of the S-193 to arrange optimum integration times for each angle and type of measurement. This unnecessarily complicated the design of the instrument, since the gains in precision achieved in this way were marginal. Either a software-controllable integration time or a set of only two or three integration times would have been better.

    ON THE INTERPLAY BETWEEN BRAIN-COMPUTER INTERFACES AND MACHINE LEARNING ALGORITHMS: A SYSTEMS PERSPECTIVE

    Today, computer algorithms use traditional human-computer interfaces (e.g., keyboard, mouse, gestures, etc.) to interact with and extend human capabilities across all knowledge domains, allowing them to make complex decisions underpinned by massive datasets and machine learning. Machine learning has seen remarkable success in the past decade in obtaining deep insights and recognizing unknown patterns in complex data sets, in part by emulating how the brain performs certain computations. As we increase our understanding of the human brain, brain-computer interfaces can benefit from the power of machine learning, both as an underlying model of how the brain performs computations and as a tool for processing high-dimensional brain recordings. The technology (machine learning) has come full circle and is being applied back to understanding the brain and any electric residues of the brain activity over the scalp (EEG). Similarly, domains such as natural language processing, machine translation, and scene understanding remain beyond the scope of true machine learning algorithms and require human participation to be solved. In this work, we investigate the interplay between brain-computer interfaces and machine learning through the lens of end-user usability. Specifically, we propose systems and algorithms to enable synergistic and user-friendly integration between computers (machine learning) and the human brain (brain-computer interfaces). In this context, we provide our research contributions in two interrelated aspects by (i) applying machine learning to solve challenges with EEG-based BCIs, and (ii) enabling human-assisted machine learning with EEG-based human input and implicit feedback.

    Learning Identifiable Representations: Independent Influences and Multiple Views

    Intelligent systems, whether biological or artificial, perceive unstructured information from the world around them: deep neural networks designed for object recognition receive collections of pixels as inputs; living beings capture visual stimuli through photoreceptors that convert incoming light into electrical signals. Sophisticated signal processing is required to extract meaningful features (e.g., the position, dimension, and colour of objects in an image) from these inputs: this motivates the field of representation learning. But what features should be deemed meaningful, and how to learn them? We will approach these questions based on two metaphors. The first one is the cocktail-party problem, where a number of conversations happen in parallel in a room, and the task is to recover (or separate) the voices of the individual speakers from recorded mixtures, a task also termed blind source separation. The second one is what we call the independent-listeners problem: given two listeners in front of some loudspeakers, the question is whether, when processing what they hear, they will make the same information explicit, identifying similar constitutive elements. The notion of identifiability is crucial when studying these problems, as it specifies suitable technical assumptions under which representations are uniquely determined, up to tolerable ambiguities like latent source reordering. A key result of this theory is that, when the mixing is nonlinear, the model is provably non-identifiable. A first question is, therefore, under what additional assumptions (ideally as mild as possible) the problem becomes identifiable; a second one is what algorithms can be used to estimate the model. The contributions presented in this thesis address these questions and revolve around two main principles. The first principle is to learn representations in which the latent components influence the observations independently.
Here the term “independently” is used in a non-statistical sense, which can be loosely thought of as the absence of fine-tuning between distinct elements of a generative process. The second principle is that representations can be learned from paired observations or views, where mixtures of the same latent variables are observed, and they (or a subset thereof) are perturbed in one of the views (also termed the multi-view setting). I will present work characterizing these two problem settings, studying their identifiability and proposing suitable estimation algorithms. Moreover, I will discuss how the success of popular representation learning methods may be explained in terms of the principles above and describe an application of the second principle to the statistical analysis of group studies in neuroimaging.