14 research outputs found
Scalable Spike Source Localization in Extracellular Recordings using Amortized Variational Inference
Scalable software and models for large-scale extracellular recordings
The brain represents information about the world through the electrical activity of
populations of neurons. By placing an electrode near a neuron that is firing (spiking), it
is possible to detect the resulting extracellular action potential (EAP) that is transmitted
down an axon to other neurons. In this way, it is possible to monitor the communication
of a group of neurons to uncover how they encode and transmit information. As the
number of recorded neurons continues to increase, however, so do the data processing
and analysis challenges. It is crucial that scalable software and analysis tools are developed
and made available to the neuroscience community to keep up with the large
amounts of data that are already being gathered.
This thesis is composed of three pieces of work which I develop in order to better
process and analyze large-scale extracellular recordings. My work spans all stages of extracellular
analysis from the processing of raw electrical recordings to the development
of statistical models to reveal underlying structure in neural population activity.
In the first work, I focus on developing software to improve the comparison and adoption
of different computational approaches for spike sorting. When analyzing neural
recordings, most researchers are interested in the spiking activity of individual neurons,
which must be extracted from the raw electrical traces through a process called
spike sorting. Much development has been directed towards improving the performance
and automation of spike sorting. This continuous development, while essential,
has contributed to an over-saturation of new, incompatible tools that hinders rigorous
benchmarking and complicates reproducible analysis. To address these limitations, I
develop SpikeInterface, an open-source, Python framework designed to unify preexisting
spike sorting technologies into a single toolkit and to facilitate straightforward
benchmarking of different approaches. With this framework, I demonstrate that modern,
automated spike sorters have low agreement when analyzing the same dataset, i.e.
they find different numbers of neurons with different activity profiles. This result holds
true for a variety of simulated and real datasets. Also, I demonstrate that utilizing a
consensus-based approach to spike sorting, where the outputs of multiple spike sorters
are combined, can dramatically reduce the number of falsely detected neurons.
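The consensus idea above can be sketched in a few lines. This is a minimal illustration, not the SpikeInterface API: the function names, the 0.4 ms matching tolerance, the agreement threshold, and the toy spike trains are all assumptions chosen for the example. Agreement is scored here as matched spikes divided by the union of spikes, which is one common definition.

```python
import numpy as np

def count_matches(t1, t2, tol):
    """Greedy one-to-one count of spikes in t1 and t2 that lie within tol of each other."""
    t1 = np.sort(np.asarray(t1, dtype=float))
    t2 = np.sort(np.asarray(t2, dtype=float))
    i = j = n = 0
    while i < len(t1) and j < len(t2):
        d = t1[i] - t2[j]
        if abs(d) <= tol:
            n += 1
            i += 1
            j += 1
        elif d < 0:
            i += 1
        else:
            j += 1
    return n

def agreement(t1, t2, tol=0.4):
    """Matched spikes divided by the size of the union of both trains (0 to 1)."""
    n = count_matches(t1, t2, tol)
    denom = len(t1) + len(t2) - n
    return n / denom if denom else 0.0

def consensus_units(sortings, min_agreement=0.5, min_sorters=2, tol=0.4):
    """Keep units that at least `min_sorters` sorters (including their own) agree on.

    sortings: dict sorter_name -> dict unit_id -> spike times (ms); hypothetical layout.
    """
    kept = []
    for name, units in sortings.items():
        for uid, train in units.items():
            supporters = 1  # the sorter that produced the unit
            for other, other_units in sortings.items():
                if other == name:
                    continue
                if any(agreement(train, o, tol) >= min_agreement
                       for o in other_units.values()):
                    supporters += 1
            if supporters >= min_sorters:
                kept.append((name, uid))
    return kept

# Toy example: sorters A and B agree on one unit; the remaining units are unsupported.
sortings = {
    "A": {0: [10, 20, 30, 40], 1: [5, 55]},
    "B": {0: [10.1, 20.2, 29.9, 40.0]},
    "C": {0: [100, 200]},
}
kept = consensus_units(sortings)
```

Units found by only one sorter are dropped, which is how a consensus step can remove falsely detected neurons at the cost of possibly discarding real units that only one sorter finds.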
In the second work, I focus on developing an unsupervised machine learning approach
for determining the source location of individually detected spikes that are
recorded by high-density, microelectrode arrays. By localizing the source of individual
spikes, my method is able to determine the approximate position of the recorded
neurons in relation to the microelectrode array. To allow my model to work with large-scale
datasets, I utilize deep neural networks, a family of machine learning algorithms that
can be trained to approximate complicated functions in a scalable fashion. I evaluate
my method on both simulated and real extracellular datasets, demonstrating that it is
more accurate than other commonly used methods. Also, I show that location estimates
for individual spikes can be utilized to improve the efficiency and accuracy of spike
sorting. After training, my method allows for localization of one million spikes in approximately
37 seconds on a TITAN X GPU, enabling real-time analysis of massive
extracellular datasets.
In my third and final presented work, I focus on developing an unsupervised machine
learning model that can uncover patterns of activity from neural populations
associated with a behaviour being performed. Specifically, I introduce Targeted Neural
Dynamical Modelling (TNDM), a statistical model that jointly models the neural activity
and any external behavioural variables. TNDM decomposes neural dynamics (i.e.
temporal activity patterns) into behaviourally relevant and behaviourally irrelevant dynamics;
the behaviourally relevant dynamics constitute all activity patterns required
to generate the behaviour of interest while behaviourally irrelevant dynamics may be
completely unrelated (e.g. other behavioural or brain states), or even related to behaviour
execution (e.g. dynamics that are associated with behaviour generally but are not
task specific). Again, I implement TNDM using a deep neural network to improve its
scalability and expressivity. On synthetic data and on real recordings from the premotor
(PMd) and primary motor cortex (M1) of a monkey performing a center-out reaching
task, I show that TNDM is able to extract low-dimensional neural dynamics that are
highly predictive of behaviour without sacrificing its fit to the neural data.
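The split into behaviourally relevant and irrelevant dynamics can be illustrated with a toy objective. This is a sketch under stated assumptions, not the TNDM implementation: it assumes linear readouts, a Poisson likelihood for spike counts, and a squared-error term for behaviour, and it omits any variational (KL) terms. All names, shapes, and the `weight` parameter are hypothetical.

```python
import numpy as np

def joint_loss(z_rel, z_irr, spikes, behaviour, W_n, b_n, W_b, weight=1.0):
    """Toy joint objective over neural activity and behaviour.

    Neural activity is reconstructed from BOTH latent groups;
    behaviour is decoded from the relevant latents ONLY.
    """
    z = np.concatenate([z_rel, z_irr], axis=-1)                  # (T, d_rel + d_irr)
    log_rate = z @ W_n + b_n                                     # (T, n_neurons)
    # Poisson negative log-likelihood, dropping the constant log(spikes!) term.
    poisson_nll = np.sum(np.exp(log_rate) - spikes * log_rate)
    prediction = z_rel @ W_b                                     # (T, n_behaviour)
    behaviour_mse = np.mean((prediction - behaviour) ** 2)
    total = poisson_nll + weight * behaviour_mse
    return total, poisson_nll, behaviour_mse

# Toy data with arbitrary sizes: 50 time bins, 2 relevant and 3 irrelevant latents.
rng = np.random.default_rng(0)
T = 50
z_rel = rng.standard_normal((T, 2))
z_irr = rng.standard_normal((T, 3))
W_n = 0.1 * rng.standard_normal((5, 4))
b_n = np.zeros(4)
W_b = rng.standard_normal((2, 2))
spikes = rng.poisson(1.0, size=(T, 4))
behaviour = rng.standard_normal((T, 2))

total, nll, mse = joint_loss(z_rel, z_irr, spikes, behaviour, W_n, b_n, W_b)
# Perturbing the irrelevant latents changes the neural fit but not the behaviour term.
_, nll_shifted, mse_shifted = joint_loss(z_rel, z_irr + 1.0, spikes, behaviour, W_n, b_n, W_b)
```

Because the behaviour readout never sees `z_irr`, only the relevant latents are pushed to carry behavioural information, while both groups remain free to explain the neural data.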
Building population models for large-scale neural recordings: opportunities and pitfalls
Modern recording technologies now enable simultaneous recording from large
numbers of neurons. This has driven the development of new statistical models
for analyzing and interpreting neural population activity. Here we provide a
broad overview of recent developments in this area. We compare and contrast
different approaches, highlight strengths and limitations, and discuss
biological and mechanistic insights that these methods provide.
Statistical Machine Learning Methods for the Large Scale Analysis of Neural Data
Modern neurotechnologies enable the recording of neural activity at the scale of entire brains and with single-cell resolution. However, the lack of principled approaches to extract structure from these massive data streams prevents us from fully exploiting the potential of these technologies. This thesis, divided into three parts, introduces new statistical machine learning methods to enable the large-scale analysis of some of these complex neural datasets. In the first part, I present a method that leverages Gaussian quadrature to accelerate inference of neural encoding models from a certain type of observed neural point process (spike trains), resulting in substantial improvements over existing methods.
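One place where quadrature can enter a point-process model is the integral of the intensity function in the log-likelihood, log L = Σ log λ(t_i) − ∫ λ(t) dt. The sketch below approximates that integral with Gauss-Legendre nodes. It is a generic illustration and an assumption about where quadrature helps; the abstract does not say how the thesis applies it. The function name and the example intensity are hypothetical.

```python
import numpy as np

def poisson_process_loglik(spike_times, intensity, t0, t1, n_nodes=20):
    """Log-likelihood of an inhomogeneous Poisson process on [t0, t1].

    The integral of the intensity is approximated by Gauss-Legendre quadrature.
    """
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    # Map the nodes from the reference interval [-1, 1] to [t0, t1].
    t = 0.5 * (t1 - t0) * nodes + 0.5 * (t1 + t0)
    integral = 0.5 * (t1 - t0) * np.sum(weights * intensity(t))
    spike_times = np.asarray(spike_times, dtype=float)
    return np.sum(np.log(intensity(spike_times))) - integral

# Example intensity with a closed-form integral, so the result can be checked.
intensity = lambda t: np.exp(0.5 * t)
ll = poisson_process_loglik([0.5, 1.0, 1.5], intensity, 0.0, 2.0)
exact = 1.5 - 2.0 * (np.e - 1.0)  # sum of log-intensities minus the exact integral
```

For smooth intensities a small, fixed number of nodes replaces a fine time discretization, which is where the speed-up would come from in this kind of setup.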
The second part focuses on the simultaneous electrical stimulation and recording of neurons using large electrode arrays. There, identification of neural activity is hindered by stimulation artifacts that are much larger than spikes and overlap temporally with them. To surmount this challenge, I develop an algorithm to infer and cancel this artifact, enabling inference of the neural signal of interest. This algorithm is based on a Bayesian generative model for recordings, where a structured Gaussian process is used to represent prior knowledge of the artifact. The algorithm achieves near-perfect accuracy and enables the analysis of data hundreds of times faster than previous approaches.
The third part is motivated by the problem of inference of neural dynamics in the worm C. elegans: when taking a data-driven approach to this question, e.g., when using whole-brain calcium imaging data, one is faced with the need to match neural recordings to canonical neural identities, which in practice is resolved by tedious human labor. Alternatively, in a Bayesian setup this problem may be cast as posterior inference of a latent permutation. I introduce methods that enable gradient-based approximate posterior inference of permutations, overcoming the difficulties imposed by the combinatorial and discrete nature of this object. Results suggest the feasibility of automating neural identification, and demonstrate that variational inference in permutations is a sensible alternative to MCMC.
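A well-known way to make permutations amenable to gradients is to relax them to doubly stochastic matrices via Sinkhorn normalization. The sketch below shows that relaxation only; the abstract does not name the relaxation used, so treating it as Sinkhorn-based is an assumption, and the function names, temperature, and score matrix are illustrative.

```python
import numpy as np

def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along an axis, keeping dimensions."""
    m = np.max(a, axis=axis, keepdims=True)
    return m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))

def sinkhorn(scores, temperature=0.1, n_iters=50):
    """Map a square score matrix to an (approximately) doubly stochastic matrix.

    Rows and columns are normalized alternately in log-space. As the temperature
    shrinks, the output approaches a hard permutation matrix.
    """
    log_p = np.asarray(scores, dtype=float) / temperature
    for _ in range(n_iters):
        log_p = log_p - _logsumexp(log_p, axis=1)  # each row sums to 1
        log_p = log_p - _logsumexp(log_p, axis=0)  # each column sums to 1
    return np.exp(log_p)

# Scores favouring the assignment 0 -> 0, 1 -> 2, 2 -> 1.
scores = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 3.0],
                   [0.0, 1.0, 0.0]])
P = sinkhorn(scores)
assignment = P.argmax(axis=1)
```

Every step is differentiable with respect to `scores`, so the relaxed matrix can sit inside a variational objective that is optimized by gradient descent.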
Statistical Machine Learning & Deep Neural Networks Applied to Neural Data Analysis
Computational neuroscience seeks to discover the underlying mechanisms by which neural activity is generated. With recent advances in neural data acquisition methods, the bottleneck of this pursuit is the analysis of the ever-growing volume of neural data acquired in numerous labs from various experiments. These analyses can be broadly divided into two categories: first, extraction of high-quality neuronal signals from noisy large-scale recordings; second, inference for statistical models aimed at explaining the neuronal signals and the underlying processes that give rise to them. Conventionally, the majority of the methodologies employed for this effort are based on statistics and signal processing. However, in recent years the use of Artificial Neural Networks (ANNs) for neural data analysis has been gaining traction. This is due to their immense success in computer vision and natural language processing, and the stellar track record of ANN architectures in generalizing to a wide variety of problems. In this work we investigate and improve upon statistical and ANN machine learning methods applied to multi-electrode array recordings and inference for dynamical systems that play critical roles in computational neuroscience.
In the first and second parts of this thesis, we focus on the spike sorting problem. The analysis of large-scale multi-neuronal spike train data is crucial for current and future neuroscience research. However, this type of data is not available directly from recordings and requires further processing to be converted into spike trains. Dense multi-electrode arrays (MEAs) are a standard tool for collecting such recordings. The processing needed to extract spike trains from these raw electrical signals is carried out by "spike sorting" algorithms. We introduce a robust and scalable MEA spike sorting pipeline, YASS (Yet Another Spike Sorter), to address many challenges that are inherent to this task. We focus primarily on MEA data collected from the primate retina, both because of the unique challenges it poses and because the available side information ultimately helps us score different spike sorting pipelines. We also introduce a neural network architecture and an accompanying training scheme specifically devised to address the challenging task of deconvolution in MEA recordings.
In the last part, we shift our attention to inference for nonlinear dynamics. Dynamical systems are the governing force behind many real-world phenomena and temporally correlated data. Recently, a number of neural network architectures have been proposed to address inference for nonlinear dynamical systems. We introduce two different methods based on normalizing flows for posterior inference in latent nonlinear dynamical systems. We also present gradient-based amortized posterior inference approaches using the auto-encoding variational Bayes framework that can be applied to a wide range of generative models with nonlinear dynamics. We call our method (FNF). FNF performs favorably against state-of-the-art inference methods in terms of accuracy of predictions and quality of uncovered codes and dynamics on synthetic data.
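The core mechanism behind normalizing flows is the change-of-variables formula: a sample from a simple base density is pushed through an invertible map, and its log-density is corrected by the log-determinant of the Jacobian. The sketch below uses the simplest possible flow, an elementwise affine map. It illustrates the general principle only and is not the FNF architecture; all names are hypothetical.

```python
import numpy as np

def affine_flow_sample(mu, log_sigma, rng):
    """Draw z = mu + sigma * eps and return its log-density under the flow.

    Base density: standard normal. The Jacobian of the map is diag(sigma),
    so log q(z) = log N(eps; 0, I) - sum(log_sigma).
    """
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(log_sigma) * eps
    log_base = -0.5 * np.sum(eps ** 2 + np.log(2.0 * np.pi))
    log_q = log_base - np.sum(log_sigma)
    return z, log_q

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])
log_sigma = np.array([0.0, 0.3, -1.0])
z, log_q = affine_flow_sample(mu, log_sigma, rng)

# Check against the closed-form diagonal Gaussian log-density evaluated at z.
sigma = np.exp(log_sigma)
analytic = -0.5 * np.sum(((z - mu) / sigma) ** 2 + np.log(2.0 * np.pi)) - np.sum(log_sigma)
```

Richer flows stack several such invertible maps with learned, state-dependent parameters, summing the log-determinant corrections along the way.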