3,133 research outputs found
SInC: An accurate and fast error-model based simulator for SNPs, Indels and CNVs coupled with a read generator for short-read sequence data
We report SInC (SNV, Indel and CNV) simulator and read generator, an
open-source tool capable of simulating biological variants taking into account
a platform-specific error model. SInC is capable of simulating and generating
single- and paired-end reads with user-defined insert size with high efficiency
compared to the other existing tools. SInC, due to its multi-threaded
capability during read generation, has a low time footprint. SInC is currently
optimised to work in limited infrastructure setup and can efficiently exploit
the commonly used quad-core desktop architecture to simulate short sequence
reads with deep coverage for large genomes. SInC can be downloaded from
https://sourceforge.net/projects/sincsimulator/
An Efficient Approximate kNN Graph Method for Diffusion on Image Retrieval
The application of diffusion in many computer vision and artificial
intelligence projects has been shown to give excellent improvements in
performance. One of the main bottlenecks of this technique is the quadratic
growth of the kNN graph size due to the large number of new connections
between nodes in the graph, resulting in long computation times. Several
strategies have been proposed to address this, but none is both effective and
efficient. Our novel technique, based on LSH projections, obtains the same
performance as the exact kNN graph after diffusion, but in less time
(approximately 18 times faster on a dataset of a hundred thousand images). The
proposed method was validated and compared with other state-of-the-art methods on
several public image datasets, including Oxford5k, Paris6k, and Oxford105k.
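As a rough illustration of the idea of building an approximate kNN graph from LSH buckets, here is a minimal sketch. It uses generic random-hyperplane LSH with cosine similarity, which is an assumption: the abstract only says the technique is "based on LSH projections", so this is not the authors' method. The function name `lsh_knn_graph` and the parameters `n_bits` and `n_tables` are hypothetical.

```python
import random
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def lsh_knn_graph(vectors, k=2, n_bits=4, n_tables=8, seed=0):
    """Approximate kNN graph: only vectors sharing an LSH bucket are compared.

    Hypothetical sketch using random-hyperplane hashing, not the paper's method.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    candidates = defaultdict(set)
    for _ in range(n_tables):
        # Each table hashes a vector to the sign pattern of n_bits projections.
        planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
        buckets = defaultdict(list)
        for idx, vec in enumerate(vectors):
            key = tuple(
                sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in planes
            )
            buckets[key].append(idx)
        # Vectors that collide in a bucket become neighbour candidates.
        for members in buckets.values():
            for i in members:
                candidates[i].update(j for j in members if j != i)
    graph = {}
    for i, vec in enumerate(vectors):
        # Rank only the candidates, not the whole dataset.
        ranked = sorted(
            candidates[i], key=lambda j: cosine(vec, vectors[j]), reverse=True
        )
        graph[i] = ranked[:k]
    return graph
```

The saving comes from ranking only bucket collisions rather than all pairs; more tables raise recall at the cost of more candidate comparisons.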
Farms, pipes, streams and reforestation : reasoning about structured parallel processes using types and hylomorphisms
The increasing importance of parallelism has motivated the creation of better abstractions for writing parallel software, including structured parallelism using nested algorithmic skeletons. Such approaches provide high-level abstractions that avoid common problems, such as race conditions, and often allow strong cost models to be defined. However, choosing a combination of algorithmic skeletons that yields good parallel speedups for a program on some specific parallel architecture remains a difficult task. In order to achieve this, it is necessary to simultaneously reason both about the costs of different parallel structures and about the semantic equivalences between them. This paper presents a new type-based mechanism that enables strong static reasoning about these properties. We exploit well-known properties of a very general recursion pattern, hylomorphisms, and give a denotational semantics for structured parallel processes in terms of these hylomorphisms. Using our approach, it is possible to determine formally whether a desired parallel structure can be introduced into a program without altering its functional behaviour, and also to choose a version of that parallel structure that minimises some given cost model.
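To make the recursion pattern named in the abstract concrete, the following is a minimal sketch of a hylomorphism (an unfold followed by a fold) over a binary-tree shape, with merge sort as the example. The names `hylo`, `split`, and `combine` and the tagged-tuple encoding are assumptions chosen for illustration; the paper's own formulation is type-based and is not reproduced here.

```python
import heapq


def hylo(combine, split, seed):
    """Hylomorphism over a binary-tree shape.

    split (the coalgebra) unfolds a seed into ('leaf', value) or
    ('node', left_seed, right_seed); combine (the algebra) folds the
    results back. The two recursive calls are independent of each other,
    which is where a parallel divide-and-conquer structure could be
    introduced without changing the result.
    """
    shape = split(seed)
    if shape[0] == "leaf":
        return combine(shape)
    _, left, right = shape
    return combine(("node", hylo(combine, split, left), hylo(combine, split, right)))


def split(xs):
    """Divide step: halve the list until it has at most one element."""
    if len(xs) <= 1:
        return ("leaf", xs)
    mid = len(xs) // 2
    return ("node", xs[:mid], xs[mid:])


def combine(shape):
    """Conquer step: merge two sorted halves."""
    if shape[0] == "leaf":
        return list(shape[1])
    _, left, right = shape
    return list(heapq.merge(left, right))


def merge_sort(xs):
    return hylo(combine, split, xs)
```

This version runs sequentially; it only shows that the divide and conquer steps are separate functions that can be reasoned about independently of how the recursion is scheduled.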
A Divide-and-Conquer Approach Towards Understanding Deep Networks
Deep neural networks have achieved tremendous success in various fields including medical image segmentation. However, they have long been criticized for being a black box, in that interpretation, understanding and correcting architectures is difficult as there is no general theory for deep neural network design. Previously, precision learning was proposed to fuse deep architectures and traditional approaches. Deep networks constructed in this way benefit from the original known operator, have fewer parameters, and improved interpretability. However, they do not yield state-of-the-art performance in all applications. In this paper, we propose to analyze deep networks using known operators, by adopting a divide-and-conquer strategy to replace network components, whilst retaining network performance. The task of retinal vessel segmentation is investigated for this purpose. We start with a high-performance U-Net and show by step-by-step conversion that we are able to divide the network into modules of known operators. The results indicate that a combination of a trainable guided filter and a trainable version of the Frangi filter yields a performance at the level of U-Net (AUC 0.974 vs. 0.972) with a tremendous reduction in parameters (111,536 vs. 9,575). In addition, the trained layers can be mapped back into their original algorithmic interpretation and analyzed using standard tools of signal processing.
Large Scale SfM with the Distributed Camera Model
We introduce the distributed camera model, a novel model for
Structure-from-Motion (SfM). This model describes image observations in terms
of light rays with ray origins and directions rather than pixels. As such, the
proposed model is capable of describing a single camera or multiple cameras
simultaneously as the collection of all light rays observed. We show how the
distributed camera model is a generalization of the standard camera model and
describe a general formulation and solution to the absolute camera pose problem
that works for standard or distributed cameras. The proposed method computes a
solution that is up to 8 times more efficient and robust to rotation
singularities in comparison with gDLS. Finally, this method is used in a novel
large-scale incremental SfM pipeline where distributed cameras are accurately
and robustly merged together. This pipeline is a direct generalization of
traditional incremental SfM; however, instead of incrementally adding one
camera at a time to grow the reconstruction, the reconstruction is grown by
adding a distributed camera. Our pipeline produces highly accurate
reconstructions efficiently by avoiding the need for many bundle adjustment
iterations and is capable of computing a 3D model of Rome from over 15,000
images in just 22 minutes. Comment: Published at 2016 3DV Conference
Advancing Divide-And-Conquer Phylogeny Estimation Using Robinson-Foulds Supertrees
One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy for phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and then merged together using a "supertree method". Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees, using the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. We also present GreedyRFS (a greedy heuristic that operates by repeatedly using Exact-RFS-2 on pairs of trees, until all the trees are merged into a single supertree). We evaluate Exact-RFS-2 and GreedyRFS, and show that they have better accuracy than the current leading heuristic for RFS.
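The RFS criterion builds on the Robinson-Foulds distance between trees. As background only, here is a minimal sketch of that distance for two trees on the same leaf set, counting the non-trivial bipartitions that appear in one tree but not the other. It is not Exact-RFS-2 or GreedyRFS, which handle trees with different, overlapping leaf sets. The nested-tuple tree encoding and the function names are assumptions made for this example.

```python
def bipartitions(tree):
    """Non-trivial leaf bipartitions of a tree given as nested tuples.

    Leaves are any non-tuple values (here, strings). Each bipartition is
    stored as the side that does NOT contain the smallest leaf label, so
    that the same split is represented identically in both trees.
    """
    leaves = set()
    clusters = []

    def collect(node):
        if not isinstance(node, tuple):
            leaves.add(node)
            return frozenset([node])
        below = frozenset().union(*(collect(child) for child in node))
        clusters.append(below)
        return below

    collect(tree)
    all_leaves = frozenset(leaves)
    reference = min(all_leaves)
    parts = set()
    for cluster in clusters:
        side = cluster if reference not in cluster else all_leaves - cluster
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            parts.add(side)
    return parts


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: size of the symmetric difference of splits."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
t3 = ("A", ("B", ("C", "D")))  # same unrooted topology as t1
```

Because splits are compared on the unrooted topology, `t1` and `t3` are at distance zero even though their rootings differ.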
Parallel processing and expert systems
Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient use of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real time demands are met for large expert systems. Speed-up via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial labs in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems was surveyed. The survey is divided into three major sections: (1) multiprocessors for parallel expert systems; (2) parallel languages for symbolic computations; and (3) measurements of parallelism of expert systems. Results to date indicate that the parallelism achieved for these systems is small. In order to obtain greater speed-ups, data parallelism and application parallelism must be exploited.