Developing and applying heterogeneous phylogenetic models with XRate
Modeling sequence evolution on phylogenetic trees is a useful technique in
computational biology. Especially powerful are models which take account of the
heterogeneous nature of sequence evolution according to the "grammar" of the
encoded gene features. However, beyond a modest level of model complexity,
manual coding of models becomes prohibitively labor-intensive. We demonstrate,
via a set of case studies, the new built-in model-prototyping capabilities of
XRate (macros and Scheme extensions). These features allow rapid implementation
of phylogenetic models which would have previously been far more
labor-intensive. XRate's new capabilities for lineage-specific models,
ancestral sequence reconstruction, and improved annotation output are also
discussed. XRate's flexible model-specification capabilities and computational
efficiency make it well-suited to developing and prototyping phylogenetic
grammar models. XRate is available as part of the DART software package:
http://biowiki.org/DART.
Comment: 34 pages, 3 figures, glossary of XRate model terminology.
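As a concrete illustration of the kind of model a phylogenetic grammar tool evaluates, here is a minimal sketch of Felsenstein's pruning algorithm under a Jukes-Cantor substitution model. The tree, branch lengths, and rate below are hypothetical, and this is not XRate's implementation or model syntax, only the underlying likelihood computation.

```python
import math

BASES = "ACGT"

def jc_prob(a, b, t, mu=1.0):
    """Jukes-Cantor transition probability from base a to base b
    after branch length t at substitution rate mu (closed form)."""
    e = math.exp(-4.0 * mu * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def prune(node):
    """Felsenstein pruning: per-base partial likelihoods at `node`.
    A node is ('leaf', base) or ('internal', (child, branch_len), ...)."""
    if node[0] == "leaf":
        return {x: 1.0 if x == node[1] else 0.0 for x in BASES}
    likes = {x: 1.0 for x in BASES}
    for child, t in node[1:]:
        child_likes = prune(child)
        for x in BASES:
            likes[x] *= sum(jc_prob(x, y, t) * child_likes[y] for y in BASES)
    return likes

# Hypothetical two-leaf "cherry": an A and a G separated by two 0.1 branches.
tree = ("internal", (("leaf", "A"), 0.1), (("leaf", "G"), 0.1))
partial = prune(tree)
site_likelihood = sum(0.25 * partial[x] for x in BASES)  # uniform root
```

A grammar-based tool generalizes exactly this recursion: different site classes ("gene features") get different rate matrices, which is where hand-coding becomes labor-intensive and macro support pays off.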
Memoizing a monadic mixin DSL
Modular extensibility is a highly desirable property of a domain-specific language (DSL): the ability to add new features without affecting the implementation of existing features. Functional mixins (also known as open recursion) are very suitable for this purpose.
We study the use of mixins in Haskell for a modular DSL for search heuristics used in systematic solvers for combinatorial problems, that generate optimized C++ code from a high-level specification. We show how to apply memoization techniques to tackle performance issues and code explosion due to the high recursion inherent to the semantics of combinatorial search.
As such heuristics are conventionally implemented as highly entangled imperative algorithms, our Haskell mixins are monadic. Memoization of monadic components causes further complications, which we also deal with.
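The paper's DSL is Haskell-based; purely to illustrate the underlying pattern in a language-neutral way (components written in open-recursion style, tied together by a fixpoint, with memoization added as just another mixin), here is a small Python sketch with hypothetical names:

```python
def fix(component):
    """Tie the knot: recursive calls inside a component go through `tied`,
    so any mixin wrapped around the component intercepts them."""
    def tied(n):
        return component(tied)(n)
    return tied

def base(self):
    """A component in open-recursion style: it recurses via `self`,
    not via itself directly."""
    def fib(n):
        return n if n < 2 else self(n - 1) + self(n - 2)
    return fib

def memoized(component):
    """Memoization as a mixin: wraps any component, caching its results.
    Because recursion goes through the tied-up `self`, inner calls hit
    the cache too, turning exponential recursion into linear work."""
    cache = {}
    def with_memo(self):
        inner = component(self)
        def f(n):
            if n not in cache:
                cache[n] = inner(n)
            return cache[n]
        return f
    return with_memo

fib_slow = fix(base)            # exponential: no mixin applied
fib_fast = fix(memoized(base))  # linear: memo mixin intercepts recursion
```

Further mixins (tracing, pruning, code generation) stack the same way, which is the modular-extensibility property the abstract refers to; the monadic complications arise when each component also threads effects.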
Image to Image Translation for Domain Adaptation
We propose a general framework for unsupervised domain adaptation, which
allows deep neural networks trained on a source domain to be tested on a
different target domain without requiring any training annotations in the
target domain. This is achieved by adding extra networks and losses that help
regularize the features extracted by the backbone encoder network. To this end,
we propose the novel use of the recently proposed unpaired image-to-image
translation framework to constrain the features extracted by the encoder
network. Specifically, we require that the features extracted are able to
reconstruct the images in both domains. In addition, we require that the
distributions of features extracted from images in the two domains be
indistinguishable. Many recent works can be seen as specific cases of our
general framework. We apply our method for domain adaptation between MNIST,
USPS, and SVHN datasets, and Amazon, Webcam and DSLR Office datasets in
classification tasks, and also between GTA5 and Cityscapes datasets for a
segmentation task. We demonstrate state-of-the-art performance on each of these
datasets.
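The three constraints the abstract describes (the task loss, reconstruction in both domains, and domain-indistinguishable features) combine into one objective. The sketch below uses plain Python stand-ins; the weights `lam_rec` and `lam_adv` and the 0.5-target confusion loss are illustrative assumptions, not the paper's actual networks or hyperparameters.

```python
import math

def mse(a, b):
    """Mean squared error between a reconstruction and its original."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def bce(p, label):
    """Binary cross-entropy of a predicted domain probability p."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def total_loss(cls_loss, rec_src, rec_tgt, domain_probs_src, domain_probs_tgt,
               lam_rec=1.0, lam_adv=0.1):
    """Combine the three constraints from the abstract:
    task loss + reconstruction in both domains + domain confusion.
    rec_src/rec_tgt are (reconstruction, original) pairs."""
    rec = mse(*rec_src) + mse(*rec_tgt)
    # The encoder is rewarded when the domain classifier is maximally
    # unsure (probability 0.5), i.e. features from the two domains are
    # indistinguishable.
    probs = domain_probs_src + domain_probs_tgt
    adv = sum(bce(p, 0.5) for p in probs) / len(probs)
    return cls_loss + lam_rec * rec + lam_adv * adv
```

In a real system the classification loss comes from labeled source data only, and the reconstruction and confusion terms are what regularize the shared encoder on the unlabeled target domain.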
An Automated Images-to-Graphs Framework for High Resolution Connectomics
Reconstructing a map of neuronal connectivity is a critical challenge in
contemporary neuroscience. Recent advances in high-throughput serial section
electron microscopy (EM) have produced massive 3D image volumes of nanoscale
brain tissue for the first time. The resolution of EM allows for individual
neurons and their synaptic connections to be directly observed. Recovering
neuronal networks by manually tracing each neuronal process at this scale is
unmanageable, and therefore researchers are developing automated image
processing modules. Thus far, state-of-the-art algorithms focus only on the
solution to a particular task (e.g., neuron segmentation or synapse
identification).
In this manuscript we present the first fully automated images-to-graphs
pipeline (i.e., a pipeline that begins with an imaged volume of neural tissue
and produces a brain graph without any human interaction). To evaluate overall
performance and select the best parameters and methods, we also develop a
metric to assess the quality of the output graphs. We evaluate a set of
algorithms and parameters, searching possible operating points to identify the
best available brain graph for our assessment metric. Finally, we deploy a
reference end-to-end version of the pipeline on a large, publicly available
data set. This provides a baseline result and framework for community analysis
and future algorithm development and testing. All code and data derivatives
have been made publicly available toward eventually unlocking new biofidelic
computational primitives and understanding of neuropathologies.Comment: 13 pages, first two authors contributed equally V2: Added additional
experiments and clarifications; added information on infrastructure and
pipeline environmen
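A graph-quality metric of the kind the abstract mentions can be sketched by scoring the recovered connectivity graph against a reference by edge overlap. This F1-style comparison is a simple stand-in for illustration, not the paper's actual assessment metric.

```python
def graph_f1(predicted_edges, reference_edges):
    """Score a recovered brain graph against a reference by edge overlap.

    Edges are (pre_neuron, post_neuron) pairs. Returns the F1 score:
    the harmonic mean of edge precision and edge recall.
    """
    pred, ref = set(predicted_edges), set(reference_edges)
    tp = len(pred & ref)          # edges found in both graphs
    if tp == 0:
        return 0.0
    precision = tp / len(pred)    # fraction of predicted edges that are real
    recall = tp / len(ref)        # fraction of real edges that were found
    return 2 * precision * recall / (precision + recall)
```

Sweeping the pipeline's parameters and keeping the operating point that maximizes such a score is the kind of search the abstract describes for selecting the best available brain graph.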