Computing Scalable Multivariate Glocal Invariants of Large (Brain-) Graphs
Graphs are quickly emerging as a leading abstraction for the representation
of data. One important application domain originates from an emerging
discipline called "connectomics". Connectomics studies the brain as a graph;
vertices correspond to neurons (or collections thereof) and edges correspond to
structural or functional connections between them. To explore the variability
of connectomes---to address both basic science questions regarding the
structure of the brain, and medical health questions about psychiatry and
neurology---one can study the topological properties of these brain-graphs. We
define multivariate glocal graph invariants: these are features of the graph
that capture various local and global topological properties of the graphs. We
show that the collection of features can collectively be computed via a
combination of daisy-chaining, sparse matrix representation and computations,
and efficient approximations. Our custom open-source Python package serves as a
back-end to a Web-service that we have created to enable researchers to upload
graphs, and download the corresponding invariants in a number of different
formats. Moreover, we built this package to support distributed processing on
multicore machines. This is therefore an enabling technology for network
science, lowering the barrier of entry by providing tools to biologists and
analysts who otherwise lack these capabilities. As a demonstration, we run our
code on 120 brain-graphs, each with approximately 16M vertices and up to 90M
edges.

Comment: Published as part of the 2013 IEEE GlobalSIP conference.
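As a toy illustration of the sparse-matrix approach described above (a minimal sketch assuming an undirected graph, not the authors' actual package), the following computes a few example local and global invariants from a SciPy sparse adjacency matrix:

```python
# Minimal sketch: "glocal" invariants from a sparse adjacency matrix.
import numpy as np
from scipy import sparse

def glocal_invariants(adj: sparse.csr_matrix) -> dict:
    """Compute example local and global invariants of an undirected graph."""
    degrees = np.asarray(adj.sum(axis=1)).ravel()   # local: vertex degrees
    n_edges = int(degrees.sum()) // 2               # global: edge count
    # Local clustering via triangle counts: diag(A^3)/2 triangles per vertex.
    triangles = (adj @ adj @ adj).diagonal() / 2
    denom = degrees * (degrees - 1)
    clustering = np.divide(2 * triangles, denom,
                           out=np.zeros_like(triangles, dtype=float),
                           where=denom > 0)
    return {
        "num_edges": n_edges,
        "max_degree": int(degrees.max()),
        "mean_clustering": float(clustering.mean()),  # global summary of a local invariant
    }

# Toy graph: a triangle (0-1-2) plus a pendant vertex 3 attached to 2.
rows = [0, 1, 1, 2, 0, 2, 2, 3]
cols = [1, 0, 2, 1, 2, 0, 3, 2]
A = sparse.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(4, 4))
inv = glocal_invariants(A)
```

Computing everything through sparse matrix products is what keeps this feasible at the 16M-vertex scale reported above, since dense adjacency matrices of that size would not fit in memory.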
An Automated Images-to-Graphs Framework for High Resolution Connectomics
Reconstructing a map of neuronal connectivity is a critical challenge in
contemporary neuroscience. Recent advances in high-throughput serial section
electron microscopy (EM) have produced massive 3D image volumes of nanoscale
brain tissue for the first time. The resolution of EM allows for individual
neurons and their synaptic connections to be directly observed. Recovering
neuronal networks by manually tracing each neuronal process at this scale is
unmanageable, and therefore researchers are developing automated image
processing modules. Thus far, state-of-the-art algorithms focus only on the
solution to a particular task (e.g., neuron segmentation or synapse
identification).
In this manuscript we present the first fully automated images-to-graphs
pipeline (i.e., a pipeline that begins with an imaged volume of neural tissue
and produces a brain graph without any human interaction). To evaluate overall
performance and select the best parameters and methods, we also develop a
metric to assess the quality of the output graphs. We evaluate a set of
algorithms and parameters, searching possible operating points to identify the
best available brain graph for our assessment metric. Finally, we deploy a
reference end-to-end version of the pipeline on a large, publicly available
data set. This provides a baseline result and framework for community analysis
and future algorithm development and testing. All code and data derivatives
have been made publicly available toward eventually unlocking new biofidelic
computational primitives and understanding of neuropathologies.Comment: 13 pages, first two authors contributed equally V2: Added additional
experiments and clarifications; added information on infrastructure and
pipeline environmen
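The paper's assessment metric is its own contribution; as a much simpler, hedged stand-in for the idea, an estimated brain graph can be scored against a ground-truth graph edge by edge (the function below is illustrative, not the paper's metric):

```python
# Edge-wise precision/recall/F1 between an estimated and a ground-truth graph
# on the same vertex set (undirected adjacency matrices).
import numpy as np

def edge_f1(true_adj: np.ndarray, est_adj: np.ndarray) -> dict:
    """Score an estimated adjacency matrix against ground truth."""
    iu = np.triu_indices_from(true_adj, k=1)   # count each undirected edge once
    t, e = true_adj[iu] > 0, est_adj[iu] > 0
    tp = np.sum(t & e)                         # correctly recovered edges
    precision = tp / max(e.sum(), 1)
    recall = tp / max(t.sum(), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return {"precision": precision, "recall": recall, "f1": f1}

true_A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
est_A  = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
scores = edge_f1(true_A, est_A)
```

A scalar score of this kind is what makes the parameter search described above possible: each operating point of the pipeline yields one graph, and the best available graph is the one maximizing the metric.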
The topology of large Open Connectome networks for the human brain
The structural human connectome (i.e.\ the network of fiber connections in
the brain) can be analyzed at ever finer spatial resolution thanks to advances
in neuroimaging. Here we analyze several large data sets for the human brain
network made available by the Open Connectome Project. We apply statistical
model selection to characterize the degree distributions of graphs containing
up to nodes and edges. A three-parameter
generalized Weibull (also known as a stretched exponential) distribution is a
good fit to most of the observed degree distributions. For almost all networks,
simple power laws cannot fit the data, but in some cases there is statistical
support for power laws with an exponential cutoff. We also calculate the
topological (graph) dimension and the small-world coefficient of these
networks. While the small-world coefficient suggests a small-world topology,
the topological dimension shows that long-distance connections provide only a
small correction to the topology of the embedding three-dimensional space.

Comment: 14 pages, 6 figures; accepted version in Scientific Reports.
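A toy NumPy illustration of why a stretched exponential can beat a simple power law on such data (synthetic degrees drawn from a Weibull law, and a crude R²-on-transformed-axes comparison rather than the likelihood-based model selection used in the paper):

```python
# Compare power-law vs stretched-exponential fits to an empirical degree CCDF.
import numpy as np

rng = np.random.default_rng(0)
# Synthetic degree sample drawn from a stretched-exponential (Weibull) law,
# standing in for an observed degree sequence.
degrees = 20.0 * rng.weibull(0.5, size=5000)

# Empirical complementary CDF, dropping the last point where it reaches 0.
n = degrees.size
x = np.sort(degrees)[:-1]
p = 1.0 - np.arange(1, n) / n

def linear_r2(u, v):
    """R^2 of the ordinary least-squares line v ~ a*u + b."""
    a, b = np.polyfit(u, v, 1)
    return 1.0 - np.var(v - (a * u + b)) / np.var(v)

# Power law      p ~ c * x^(-alpha)     => log p       linear in log x.
# Stretched exp. p ~ exp(-(x/b)^beta)   => log(-log p) linear in log x
#                                          (the classical Weibull plot).
r2_powerlaw  = linear_r2(np.log(x), np.log(p))
r2_stretched = linear_r2(np.log(x), np.log(-np.log(p)))
```

Since the synthetic data are generated from the stretched-exponential model, its transformed plot is nearly straight while the power-law plot is visibly curved; on real connectome degree sequences the paper reaches the analogous conclusion with proper statistical model selection.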
Enabling Scalable Neurocartography: Images to Graphs for Discovery
In recent years, advances in technology have enabled researchers to ask new questions predicated on the collection and analysis of big datasets that were previously too large to study. More specifically, many fundamental questions in neuroscience require studying brain tissue at a large scale to discover emergent properties of neural computation, consciousness, and etiologies of brain disorders. A major challenge is to construct larger, more detailed maps (e.g., structural wiring diagrams) of the brain, known as connectomes.
Although raw data exist, obstacles remain in both algorithm development and scalable image analysis to enable access to the knowledge within these data volumes. This dissertation develops, combines and tests state-of-the-art algorithms to estimate graphs and glean other knowledge across six orders of magnitude, from millimeter-scale magnetic resonance imaging to nanometer-scale electron microscopy.
This work enables scientific discovery across the community and contributes to the tools and services offered by NeuroData and the Open Connectome Project. Contributions include creating, optimizing, and evaluating the first known fully automated brain graphs in electron microscopy data and magnetic resonance imaging data; pioneering approaches to generate knowledge from X-ray tomography imaging; and identifying and solving a variety of image analysis challenges associated with building graphs suitable for discovery. These methods were applied across diverse datasets to answer questions at scales not previously explored.
Analysis of Biochemical Reaction Networks using Tropical and Polyhedral Geometry Methods
The field of systems biology attempts to realise various biological functions and processes as emergent properties of an underlying biochemical network model, and the area of computational systems biology develops the computational methods to compute such properties. In this context, the thesis primarily discusses novel computational methods to compute emergent properties and to recognize the essential structure of complex network models. The methods described in the thesis are based on computer algebra techniques, namely tropical geometry and extreme currents. Tropical geometry is based on ideas of dominance among the monomials appearing in a system of differential equations, which are often used to describe the dynamics of a network model. In such differential-equation-based models, tropical geometry identifies metastable regimes, defined as low-dimensional regions of the phase space close to which the dynamics is much slower than in the rest of the phase space. The application of such properties to model reduction and symbolic dynamics is demonstrated on network models obtained from a public database, namely BioModels. Extreme currents are the limiting edges of the convex polyhedra describing the admissible fluxes in biochemical networks, and they help decompose a biochemical network into a set of irreducible pathways. These pathways are shown to be associated with given clinical outcomes, thereby providing mechanistic insights into the clinical phenotypes. As with tropical geometry, the method based on extreme currents is evaluated on network models derived from a public database, namely KEGG. The thesis thus attempts to explain the emergent properties of a network model by determining its extreme currents or metastable regimes, and discusses their applicability to real-world network models.
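The "dominance of monomials" idea behind tropicalization can be illustrated with a toy example (the reaction rates and exponents below are hypothetical, not taken from the thesis): each monomial k·x₁^a₁·x₂^a₂ of an ODE right-hand side is mapped to its order of magnitude at a point of the phase space, and tropicalization keeps the maximal one.

```python
# Toy tropical dominance: which monomial of an ODE right-hand side dominates
# at a given concentration point? (Hypothetical rate constants.)
import math

# Monomials of one right-hand side, as (rate_constant, (exp_x, exp_y)).
monomials = [
    (1e3,  (1, 1)),   # k1 * x * y
    (1e-2, (2, 0)),   # k2 * x^2
    (1e1,  (0, 0)),   # k3
]

def log10_magnitude(k, exps, point):
    """log10 |k * prod(point_i ** e_i)| -- the tropical image of a monomial."""
    return math.log10(k) + sum(e * math.log10(x) for e, x in zip(exps, point))

def dominant_monomial(monomials, point):
    """Index of the monomial with the largest order of magnitude at `point`."""
    vals = [log10_magnitude(k, e, point) for k, e in monomials]
    return max(range(len(vals)), key=vals.__getitem__), vals

# At low concentrations x = 1e-4, y = 1e-2, the constant term k3 dominates.
idx, vals = dominant_monomial(monomials, point=(1e-4, 1e-2))
```

Metastable regimes correspond, roughly, to regions of the phase space where two or more such maximal terms balance each other, which is what the tropical equilibration computations in the thesis solve for systematically.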
Shape Representations Using Nested Descriptors
The problem of shape representation is a core problem in computer vision. It can be argued that shape representation is the most central representational problem for computer vision, since unlike texture or color, shape alone can be used for perceptual tasks such as image matching, object detection and object categorization.
This dissertation introduces a new shape representation called the nested descriptor. A nested descriptor represents shape both globally and locally by pooling salient scaled and oriented complex gradients in a large nested support set. We show that this nesting property introduces a nested correlation structure that enables a new local distance function called the nesting distance, which provides a provably robust similarity function for image matching. Furthermore, the nesting property suggests an elegant flower-like normalization strategy called a log-spiral difference. We show that this normalization enables a compact binary representation and is equivalent to a form of bottom-up saliency. This suggests that the nested descriptor's representational power comes from representing salient edges, which makes a fundamental connection between the saliency and local feature descriptor literatures. In this dissertation, we introduce three examples of shape representation using nested descriptors: nested shape descriptors for imagery, nested motion descriptors for video, and nested pooling for activities. We show evaluation results for these representations that demonstrate state-of-the-art performance on image matching, wide-baseline stereo, and activity recognition tasks.
Evolutionary Genomics: Statistical and Computational Methods
This open access book addresses the challenge of analyzing and understanding the evolutionary dynamics of complex biological systems at the genomic level, and elaborates on some promising strategies that would bring us closer to uncovering the vital relationships between genotype and phenotype. After a few educational primers, the book continues with sections on sequence homology and alignment, phylogenetic methods to study genome evolution, methodologies for evaluating selective pressures on genomic sequences, genomic evolution in light of protein domain architecture and transposable elements, population genomics and other omics, and discussions of current bottlenecks in handling and analyzing genomic data. Written for the highly successful Methods in Molecular Biology series, chapters include the kind of detail and expert implementation advice that lead to the best results. Authoritative and comprehensive, Evolutionary Genomics: Statistical and Computational Methods, Second Edition aims to serve both novices in biology with strong statistics and computational skills, and molecular biologists with a good grasp of standard mathematical concepts, in moving this important field of study forward.