Annotating Synapses in Large EM Datasets
Reconstructing neuronal circuits at the level of synapses is a central problem in neuroscience and is becoming a focus of the emerging field of connectomics. To date, electron microscopy (EM) is the most proven technique for identifying and quantifying synaptic connections. As advances in EM make acquiring larger datasets possible, subsequent manual synapse identification (i.e., proofreading) for deciphering a connectome becomes a major time bottleneck. Here we introduce a large-scale, high-throughput, and semi-automated methodology to efficiently identify synapses. We successfully applied our methodology to the Drosophila medulla optic lobe, annotating many more synapses than previous connectome efforts. Our approaches are extensible and will make the often complicated process of synapse identification accessible to a wider community of potential proofreaders.
Analyze Large Multidimensional Datasets Using Algebraic Topology
This paper presents an efficient algorithm to extract knowledge from high-dimensionality, high-complexity datasets using algebraic topology, namely simplicial complexes. Based on the concept of isomorphism of relations, our method turns a relational table into a geometric object (a simplicial complex, which is a polyhedron). Conceptually, association-rule searching is thus turned into a geometric traversal problem. By leveraging the core concepts behind simplicial complexes, we use a technique new to computer science that improves performance over existing methods and uses far less memory. It was designed and developed with a strong emphasis on scalability, reliability, and extensibility. This paper also investigates the possibility of Hadoop integration and the challenges that come with that framework.
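As a rough illustration of the idea (not the paper's implementation), the Python sketch below treats each row of a toy transaction table as a simplex and counts how many rows share each face; itemset support, the quantity association-rule search needs, then falls out of a traversal over the faces of the resulting complex. The function and variable names here are invented for this example.

from itertools import combinations
from collections import defaultdict

# Hypothetical illustration: build a simplicial complex from a relational
# (transaction) table. Each row's item set becomes a maximal simplex, and
# every subset (face) of that simplex belongs to the complex. Counting how
# many rows generate each face recovers itemset support, so association-rule
# search becomes a walk over the faces of the complex.

def build_complex(rows, max_dim=3):
    """Map each row (an iterable of items) to its faces with up to max_dim items."""
    support = defaultdict(int)
    for row in rows:
        items = sorted(set(row))
        for k in range(1, min(len(items), max_dim) + 1):
            for face in combinations(items, k):   # every k-item face (simplex)
                support[face] += 1
    return support

# Toy relational table: each row is one transaction.
table = [
    ("bread", "milk"),
    ("bread", "butter", "milk"),
    ("butter", "milk"),
]

support = build_complex(table)
# A 1-simplex (edge) shared by two rows, i.e. an itemset with support 2:
print(support[("bread", "milk")])   # -> 2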
Bayesian Nonstationary Spatial Modeling for Very Large Datasets
With the proliferation of modern high-resolution measuring instruments
mounted on satellites, planes, ground-based vehicles and monitoring stations, a
need has arisen for statistical methods suitable for the analysis of large
spatial datasets observed on large spatial domains. Statistical analyses of
such datasets pose two main challenges: First, traditional
spatial-statistical techniques are often unable to handle large numbers of
observations in a computationally feasible way. Second, for large and
heterogeneous spatial domains, it is often not appropriate to assume that a
process of interest is stationary over the entire domain.
We address the first challenge by using a model combining a low-rank
component, which allows for flexible modeling of medium-to-long-range
dependence via a set of spatial basis functions, with a tapered remainder
component, which allows for modeling of local dependence using a compactly
supported covariance function. Addressing the second challenge, we propose two
extensions to this model that result in increased flexibility: First, the model
is parameterized based on a nonstationary Matérn covariance, where the
parameters vary smoothly across space. Second, in our fully Bayesian model, all
components and parameters are considered random, including the number,
locations, and shapes of the basis functions used in the low-rank component.
Using simulated data and a real-world dataset of high-resolution soil
measurements, we show that both extensions can result in substantial
improvements over the current state of the art.
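As a minimal, self-contained sketch of the covariance structure described above (not the authors' code), the Python snippet below combines a low-rank term built from a few spatial basis functions with a remainder covariance multiplied elementwise by a compactly supported taper. The bisquare basis, exponential (Matérn with smoothness 1/2) remainder, and Wendland taper are illustrative assumptions, as are all names in the code.

import numpy as np

# Hedged sketch: covariance = low-rank part (basis functions) + tapered remainder.

def bisquare_basis(s, centers, radius):
    """Bisquare basis functions at 1-D locations s for the given centers."""
    d = np.abs(s[:, None] - centers[None, :])
    B = (1.0 - (d / radius) ** 2) ** 2
    B[d > radius] = 0.0
    return B                                   # shape (n, r)

def exponential_cov(s, sigma2=1.0, range_=0.3):
    """Exponential (Matern nu = 1/2) covariance of the remainder process."""
    d = np.abs(s[:, None] - s[None, :])
    return sigma2 * np.exp(-d / range_)

def wendland_taper(s, taper_range=0.2):
    """Compactly supported Wendland taper: exactly zero beyond taper_range."""
    d = np.abs(s[:, None] - s[None, :])
    t = np.clip(1.0 - d / taper_range, 0.0, None)
    return (t ** 4) * (1.0 + 4.0 * d / taper_range)

s = np.linspace(0.0, 1.0, 200)                 # 1-D locations for illustration
B = bisquare_basis(s, centers=np.linspace(0.1, 0.9, 5), radius=0.4)
K = np.eye(B.shape[1])                         # covariance of basis-function weights
Sigma = B @ K @ B.T + exponential_cov(s) * wendland_taper(s)

# The tapered remainder is sparse (zeros beyond taper_range), which is what
# keeps computations feasible for large numbers of observations.
print(Sigma.shape, np.mean(wendland_taper(s) == 0.0))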
