Reference Based Genome Compression
DNA sequencing technology has advanced to a point where storage is becoming
the central bottleneck in the acquisition and mining of more data. Large
amounts of data are vital for genomics research, and generic compression tools,
while viable, cannot offer the same savings as approaches tuned to inherent
biological properties. We propose an algorithm to compress a target genome
given a known reference genome. The proposed algorithm first generates a
mapping from the reference to the target genome, and then compresses this
mapping with an entropy coder. As an illustration of the performance: applying
our algorithm to James Watson's genome with hg18 as a reference, we are able to
reduce the 2991 megabyte (MB) genome down to 6.99 MB, while Gzip compresses it
to 834.8 MB.
Comment: 5 pages; Submitted to the IEEE Information Theory Workshop (ITW) 201
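The two stages named in the abstract (a reference-to-target mapping, then an entropy coder) can be illustrated with a minimal sketch. This is not the authors' algorithm: the mapping here comes from Python's `difflib.SequenceMatcher`, and `zlib` (DEFLATE, which includes Huffman coding) stands in for the entropy coder. The function names `build_mapping`, `compress_target`, and `decompress_target` are hypothetical.

```python
import difflib
import json
import zlib


def build_mapping(reference: str, target: str) -> list:
    """Describe the target as copies from the reference plus literal bases."""
    matcher = difflib.SequenceMatcher(None, reference, target, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(["C", i1, i2 - i1])    # copy reference[i1:i2]
        elif j2 > j1:
            ops.append(["L", target[j1:j2]])  # literal bases (replace or insert)
        # A pure deletion needs no op: the next copy simply starts further along.
    return ops


def compress_target(reference: str, target: str) -> bytes:
    """Serialize the mapping and pass it through zlib (stand-in entropy coder)."""
    payload = json.dumps(build_mapping(reference, target), separators=(",", ":"))
    return zlib.compress(payload.encode("ascii"), 9)


def decompress_target(reference: str, blob: bytes) -> str:
    """Rebuild the target from the reference and the compressed mapping."""
    ops = json.loads(zlib.decompress(blob).decode("ascii"))
    parts = []
    for op in ops:
        if op[0] == "C":
            parts.append(reference[op[1]:op[1] + op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)
```

Calling `decompress_target(reference, compress_target(reference, target))` should return `target` unchanged. The sketch only suits short strings, since `SequenceMatcher` is far too slow for whole genomes.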
Ellipsoidal halo finders and implications for models of triaxial halo formation
We describe an algorithm for identifying ellipsoidal haloes in numerical
simulations, and quantify how the resulting estimates of halo mass and shape
differ with respect to spherical halo finders. Haloes become more prolate when
fit with ellipsoids, the difference being most pronounced for the more
aspherical objects. Although the ellipsoidal mass is systematically larger,
this is less than 10% for most of the haloes. However, even this small
difference in mass corresponds to a significant difference in shape. We
quantify these effects also on the initial mass and deformation tensors, on
which most models of triaxial collapse are based. By studying the properties of
protohaloes in the initial conditions, we find that models in which protohaloes
are identified in Lagrangian space by three positive eigenvalues of the
deformation tensor are tenable only at the masses well-above . The
overdensity within almost any protohalo is larger than the critical
value associated with spherical collapse (increasing as mass decreases); this
is in good qualitative agreement with models which identify haloes requiring
that collapse have occurred along all three principal axes, each axis having
turned around from the universal expansion at a different time. The
distributions of initial values are in agreement with the simplest predictions
associated with ellipsoidal collapse, assuming initially spherical protohaloes,
collapsed around random positions which were sufficiently overdense. However,
most protohaloes are not spherical and departures from sphericity increase as
protohalo mass decreases. [Abridged]
Comment: 18 pages, 17 figures, accepted for publication in MNRA
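The criterion mentioned above, three positive eigenvalues of the deformation tensor, can be checked without computing the eigenvalues. The sketch below is not taken from the paper: it assumes the tensor is a real symmetric 3x3 matrix given as nested lists, and applies Sylvester's criterion (all leading principal minors positive). The function name is hypothetical.

```python
def has_three_positive_eigenvalues(d) -> bool:
    """Sylvester's criterion for a real symmetric 3x3 matrix (nested lists).

    A symmetric matrix has only positive eigenvalues exactly when its three
    leading principal minors are all positive.
    """
    m1 = d[0][0]
    m2 = d[0][0] * d[1][1] - d[0][1] * d[1][0]
    m3 = (d[0][0] * (d[1][1] * d[2][2] - d[1][2] * d[2][1])
          - d[0][1] * (d[1][0] * d[2][2] - d[1][2] * d[2][0])
          + d[0][2] * (d[1][0] * d[2][1] - d[1][1] * d[2][0]))
    return m1 > 0 and m2 > 0 and m3 > 0
```

For example, a diagonal tensor with entries (1, 1, -1) fails the test, because one axis is expanding rather than contracting.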
Refractive Index Matched Scanning and Detection of Soft Particle
We describe here how to apply the three-dimensional imaging technique of
refractive index matched scanning to hydrogel spheres. Hydrogels are water
based materials with a low refractive index, which allows for index matching
with water-based solvent mixtures. We discuss here various experimental
techniques required to handle specifically hydrogel spheres as opposed to other
transparent materials. The deformability of hydrogel spheres makes their
identification in three dimensional images non-trivial. We will also discuss
numerical techniques that can be used in general to detect contacting,
non-spherical particles in a three dimensional image. The experimental and
numerical techniques presented here give experimental access to the stress
tensor of a packing of deformed particles.
Comment: 9 pages, 9 figures, submitted to Review of Scientific Instruments,
Issue 1
File Updates Under Random/Arbitrary Insertions And Deletions
A client/encoder edits a file, as modeled by an insertion-deletion (InDel)
process. An old copy of the file is stored remotely at a data-centre/decoder,
and is also available to the client. We consider the problem of throughput- and
computationally-efficient communication from the client to the data-centre, to
enable the server to update its copy to the newly edited file. We study two
models for the source files/edit patterns: the random pre-edit sequence
left-to-right random InDel (RPES-LtRRID) process, and the arbitrary pre-edit
sequence arbitrary InDel (APES-AID) process. In both models, we consider the
regime in which the number of insertions/deletions is a small (but constant)
fraction of the original file. For both models we prove information-theoretic
lower bounds on the best possible compression rates that enable file updates.
Conversely, our compression algorithms use dynamic programming (DP) and entropy
coding, and achieve rates that are approximately optimal.
Comment: The paper is an extended version of our paper to appear at ITW
201
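As a rough illustration of how dynamic programming can produce an update message for an InDel-edited file, the sketch below computes a shortest insertion/deletion script from a longest-common-subsequence table and replays it on the old copy. It is a textbook construction, not the scheme analysed in the paper, and it omits the entropy-coding step. The names `indel_script` and `apply_script` are hypothetical.

```python
def indel_script(old: str, new: str) -> list:
    """Return a shortest insertion/deletion script that turns old into new."""
    n, m = len(old), len(new)
    # lcs[i][j] = length of the longest common subsequence of old[i:] and new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    script, i, j = [], 0, 0
    while i < n or j < m:
        if i < n and j < m and old[i] == new[j]:
            script.append(("keep",))          # symbol unchanged
            i += 1
            j += 1
        elif j < m and (i == n or lcs[i][j + 1] >= lcs[i + 1][j]):
            script.append(("ins", new[j]))    # symbol inserted by the client
            j += 1
        else:
            script.append(("del",))           # symbol deleted by the client
            i += 1
    return script


def apply_script(old: str, script: list) -> str:
    """Decoder side: replay the script against the stored old copy."""
    out, i = [], 0
    for op in script:
        if op[0] == "keep":
            out.append(old[i])
            i += 1
        elif op[0] == "del":
            i += 1
        else:
            out.append(op[1])
    return "".join(out)
```

When edits are a small fraction of the file, the script is dominated by long runs of `keep`, which is what makes it highly compressible before transmission.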
Multi-rate, real time image compression for images dominated by point sources
An image compression system recently developed for compression of digital images dominated by point sources is presented. Encoding consists of minimum-mean removal, vector quantization, adaptive threshold truncation, and modified Huffman encoding. Simulations are presented showing that the peaks corresponding to point sources can be transmitted losslessly for low signal-to-noise ratios (SNR) and high point source densities while maintaining a reduced output bit rate. Encoding and decoding hardware has been built and tested which processes 552,960 12-bit pixels per second at compression rates of 10:1 and 4:1. Simulation results are presented for the 10:1 case only.
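The last encoding stage listed above is a modified Huffman coder. The abstract does not describe the modification, so the sketch below shows only standard Huffman code construction, applied to a hypothetical stream of pixel values in which background pixels dominate and a few bright point-source peaks are rare. The names `huffman_code` and `decode` are assumptions for illustration.

```python
import heapq
from collections import Counter


def huffman_code(symbols) -> dict:
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                        # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, k, {s: ""}) for k, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)       # two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def decode(bits: str, code: dict) -> list:
    """Invert the prefix code by scanning the bit string left to right."""
    inverse = {c: s for s, c in code.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return out
```

Frequent background values receive the shortest codewords, so a frame that is mostly empty sky costs far fewer bits than a fixed 12 bits per pixel.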