    Reference Based Genome Compression

    DNA sequencing technology has advanced to the point where storage is becoming the central bottleneck in acquiring and mining more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target genome, and then compresses this mapping with an entropy coder. As an illustration of the performance: applying our algorithm to James Watson's genome with hg18 as a reference, we are able to reduce the 2991 megabyte (MB) genome down to 6.99 MB, while Gzip compresses it to 834.8 MB. Comment: 5 pages; Submitted to the IEEE Information Theory Workshop (ITW) 201
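The two-stage idea in this abstract (build a reference-to-target mapping, then compress the mapping) can be sketched roughly as follows. This is an illustrative toy, not the authors' algorithm: it assumes `difflib.SequenceMatcher` as the mapping step and `zlib` (DEFLATE) as a stand-in for the entropy coder, and the names `encode` and `decode` are hypothetical.

```python
import json
import zlib
from difflib import SequenceMatcher


def encode(reference: str, target: str) -> bytes:
    """Describe target as copies from reference plus literal inserts."""
    # autojunk=False keeps frequent bases from being treated as junk.
    matcher = SequenceMatcher(None, reference, target, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(["C", i1, i2 - i1])    # copy `length` bases from reference
        elif tag in ("replace", "insert"):
            ops.append(["I", target[j1:j2]])  # literal bases absent from reference
        # "delete" needs no op: that reference span is simply never copied.
    payload = json.dumps(ops, separators=(",", ":")).encode("ascii")
    # Assumption: zlib stands in for the entropy coder named in the abstract.
    return zlib.compress(payload, 9)


def decode(reference: str, blob: bytes) -> str:
    """Rebuild the target from the reference and the compressed mapping."""
    out = []
    for op in json.loads(zlib.decompress(blob)):
        if op[0] == "C":
            _, start, length = op
            out.append(reference[start:start + length])
        else:
            out.append(op[1])
    return "".join(out)


if __name__ == "__main__":
    ref = "ACGTTAGCTAGGATCCGATC" * 10
    tgt = ref[:50] + "T" + ref[51:120] + "GGG" + ref[130:]
    assert decode(ref, encode(ref, tgt)) == tgt
```

The decoder needs the same reference as the encoder, so the savings come entirely from the target being close to the reference.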

    Ellipsoidal halo finders and implications for models of triaxial halo formation

    We describe an algorithm for identifying ellipsoidal haloes in numerical simulations, and quantify how the resulting estimates of halo mass and shape differ with respect to spherical halo finders. Haloes become more prolate when fit with ellipsoids, the difference being most pronounced for the more aspherical objects. Although the ellipsoidal mass is systematically larger, this is less than 10% for most of the haloes. However, even this small difference in mass corresponds to a significant difference in shape. We quantify these effects also on the initial mass and deformation tensors, on which most models of triaxial collapse are based. By studying the properties of protohaloes in the initial conditions, we find that models in which protohaloes are identified in Lagrangian space by three positive eigenvalues of the deformation tensor are tenable only at masses well above $M_*$. The overdensity $\delta$ within almost any protohalo is larger than the critical value associated with spherical collapse (increasing as mass decreases); this is in good qualitative agreement with models which identify haloes requiring that collapse have occurred along all three principal axes, each axis having turned around from the universal expansion at a different time. The distributions of initial values are in agreement with the simplest predictions associated with ellipsoidal collapse, assuming initially spherical protohaloes, collapsed around random positions which were sufficiently overdense. However, most protohaloes are not spherical and departures from sphericity increase as protohalo mass decreases. [Abridged] Comment: 18 pages, 17 figures, accepted for publication in MNRA

    Refractive Index Matched Scanning and Detection of Soft Particle

    We describe here how to apply the three-dimensional imaging technique of refractive index matched scanning to hydrogel spheres. Hydrogels are water-based materials with a low refractive index, which allows for index matching with water-based solvent mixtures. We discuss here various experimental techniques required to handle hydrogel spheres specifically, as opposed to other transparent materials. The deformability of hydrogel spheres makes their identification in three-dimensional images non-trivial. We will also discuss numerical techniques that can be used in general to detect contacting, non-spherical particles in a three-dimensional image. The experimental and numerical techniques presented here give experimental access to the stress tensor of a packing of deformed particles. Comment: 9 pages, 9 figures, submitted to Review of Scientific Instruments, Issue 1

    File Updates Under Random/Arbitrary Insertions And Deletions

    A client/encoder edits a file, as modeled by an insertion-deletion (InDel) process. An old copy of the file is stored remotely at a data-centre/decoder, and is also available to the client. We consider the problem of throughput- and computationally-efficient communication from the client to the data-centre, to enable the server to update its copy to the newly edited file. We study two models for the source files/edit patterns: the random pre-edit sequence left-to-right random InDel (RPES-LtRRID) process, and the arbitrary pre-edit sequence arbitrary InDel (APES-AID) process. In both models, we consider the regime in which the number of insertions/deletions is a small (but constant) fraction of the original file. For both models we prove information-theoretic lower bounds on the best possible compression rates that enable file updates. Conversely, our compression algorithms use dynamic programming (DP) and entropy coding, and achieve rates that are approximately optimal. Comment: The paper is an extended version of our paper to appear at ITW 201
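As a rough illustration of the dynamic-programming step mentioned in this abstract, the sketch below computes a shortest insertion/deletion script between an old and a new file using the textbook longest-common-subsequence table, then replays it on the decoder side. It is not the paper's algorithm; the function names are hypothetical, and the entropy-coding stage that would compress the script is omitted.

```python
def indel_script(old: str, new: str) -> list:
    """Return a minimal list of ("keep" | "del" | "ins", char) operations."""
    n, m = len(old), len(new)
    # lcs[i][j] = length of the longest common subsequence of old[i:] and new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the top-left corner to recover one optimal script.
    ops, i, j = [], 0, 0
    while i < n and j < m:
        if old[i] == new[j]:
            ops.append(("keep", old[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append(("del", old[i]))
            i += 1
        else:
            ops.append(("ins", new[j]))
            j += 1
    ops.extend(("del", ch) for ch in old[i:])
    ops.extend(("ins", ch) for ch in new[j:])
    return ops


def apply_script(old: str, ops: list) -> str:
    """Decoder side: replay the script against the stored old copy."""
    out, i = [], 0
    for kind, ch in ops:
        if kind == "keep":
            out.append(old[i])
            i += 1
        elif kind == "del":
            i += 1
        else:  # "ins"
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    old, new = "the quick brown fox", "the quack brown foxes"
    assert apply_script(old, indel_script(old, new)) == new
```

Only the operation types and the inserted characters need to be transmitted, since the decoder already holds the old copy. The table costs O(nm) time and memory, which is fine for a sketch but not for large files.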

    Multi-rate, real time image compression for images dominated by point sources

    An image compression system recently developed for compression of digital images dominated by point sources is presented. Encoding consists of minimum-mean removal, vector quantization, adaptive threshold truncation, and modified Huffman encoding. Simulations are presented showing that the peaks corresponding to point sources can be transmitted losslessly for low signal-to-noise ratios (SNR) and high point source densities while maintaining a reduced output bit rate. Encoding and decoding hardware has been built and tested which processes 552,960 12-bit pixels per second at compression rates of 10:1 and 4:1. Simulation results are presented for the 10:1 case only
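For a sense of scale, the quoted throughput can be turned into bit rates. This assumes that "10:1" and "4:1" mean the ratio of input bits to output bits; the symbols $R_{\text{in}}$ and $R_{\text{out}}$ are introduced here for illustration only.

```latex
\begin{align*}
R_{\text{in}} &= 552{,}960\ \text{pixels/s} \times 12\ \text{bits/pixel} = 6{,}635{,}520\ \text{bits/s} \\
R_{\text{out}}^{(10:1)} &= R_{\text{in}} / 10 = 663{,}552\ \text{bits/s} \\
R_{\text{out}}^{(4:1)} &= R_{\text{in}} / 4 = 1{,}658{,}880\ \text{bits/s}
\end{align*}
```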