
    Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform

    Motivation: The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being widely applied to the large sets of sequences often encountered as the outcome of DNA sequencing experiments. In previous work, we presented a novel algorithm that allows the BWT of human genome scale data to be computed on very moderate hardware, thus enabling us to investigate the BWT as a tool for the compression of such datasets. Results: We first used simulated reads to explore the relationship between the level of compression and the error rate, the length of the reads and the level of sampling of the underlying genome, and to compare choices of second-stage compression algorithm. We demonstrate that compression may be greatly improved by a particular reordering of the sequences in the collection and give a novel `implicit sorting' strategy that enables these benefits to be realised without the overhead of sorting the reads. With these techniques, a 45x coverage of real human genome sequence data compresses losslessly to under 0.5 bits per base, allowing the 135.3 Gbp of sequence to fit into only 8.2 Gbytes of space (trimming a small proportion of low-quality bases from the reads improves the compression still further). This is more than 4 times smaller than the size achieved by a standard BWT-based compressor (bzip2) on the untrimmed reads, but an important further advantage of our approach is that it facilitates the building of compressed full text indexes such as the FM-index on large-scale DNA sequence collections. Comment: Version here is as submitted to Bioinformatics and is the same as the previously archived version. This submission registers the fact that the advanced access version is now available at http://bioinformatics.oxfordjournals.org/content/early/2012/05/02/bioinformatics.bts173.abstract . Bioinformatics should be considered the original place of publication of this article; please cite accordingly.
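
For readers unfamiliar with the transform named in this abstract, the following is a minimal textbook sketch of the BWT and its inverse on a single short string. It is not the authors' large-scale algorithm or their implicit sorting strategy; the function names, the `$` sentinel, and the quadratic rotation-sorting approach are illustrative assumptions. The last function only re-derives the quoted bits-per-base figure, assuming that "Gbytes" means 10^9 bytes.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of text+sentinel, take the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Naive inversion: repeatedly prepend the last column and re-sort."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


def bits_per_base(size_bytes: float, bases: float) -> float:
    """Compressed size expressed in bits per base (assumes 8-bit bytes)."""
    return size_bytes * 8 / bases


if __name__ == "__main__":
    print(bwt("banana"))                   # annb$aa
    print(inverse_bwt(bwt("GATTACA")))     # GATTACA
    # Figures quoted in the abstract, with Gbytes assumed to be 10^9 bytes.
    print(bits_per_base(8.2e9, 135.3e9))
```

The transform tends to group identical symbols into runs, which is what a second-stage compressor then exploits; the sketch sorts whole rotations and is only practical for short inputs.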

    A Rapid Cloning Method Employing Orthogonal End Protection

    We describe a novel in vitro cloning strategy that combines standard tools in molecular biology with a basic protecting group concept to create a versatile framework for the rapid and seamless assembly of modular DNA building blocks into functional open reading frames. Analogous to chemical synthesis strategies, our assembly design yields idempotent composite synthons amenable to iterative and recursive split-and-pool reaction cycles. As an example, we illustrate the simplicity, versatility and efficiency of the approach by constructing an open reading frame composed of tandem arrays of a human fibronectin type III (FNIII) domain and the von Willebrand Factor A2 domain (VWFA2), as well as chimeric (FNIII)n-VWFA2-(FNIII)n constructs. Although we primarily designed this strategy to accelerate assembly of repetitive constructs for single-molecule force spectroscopy, we anticipate that this approach is equally applicable to the reconstitution and modification of complex modular sequences, including structural and functional analysis of multi-domain proteins, synthetic biology or the modular construction of episomal vectors.

    Towards Communication-Efficient Quantum Oblivious Key Distribution

    Oblivious Transfer, a fundamental problem in the field of secure multi-party computation, is defined as follows: A database DB of N bits held by Bob is queried by a user Alice who is interested in the bit DB_b in such a way that (1) Alice learns DB_b and only DB_b and (2) Bob does not learn anything about Alice's choice b. While solutions to this problem in the classical domain rely largely on unproven computational complexity theoretic assumptions, it is also known that perfect solutions that guarantee both database and user privacy are impossible in the quantum domain. Jakobi et al. [Phys. Rev. A, 83(2), 022301, Feb 2011] proposed a protocol for Oblivious Transfer using well-known QKD techniques to establish an Oblivious Key to solve this problem. Their solution provided a good degree of database and user privacy (using physical principles like the impossibility of perfectly distinguishing non-orthogonal quantum states and the impossibility of superluminal communication) while being loss-resistant and implementable with commercial QKD devices (due to the use of SARG04). However, their Quantum Oblivious Key Distribution (QOKD) protocol requires a communication complexity of O(N log N). Since modern databases can be extremely large, it is important to reduce this communication as much as possible. In this paper, we first suggest a modification of their protocol wherein the number of qubits that need to be exchanged is reduced to O(N). A subsequent generalization reduces the quantum communication complexity even further, in such a way that only a few hundred qubits need to be transferred even for very large databases. Comment: 7 pages

    Weak and strong electronic correlations in Fe superconductors

    In this chapter the strength of electronic correlations in the normal phase of Fe-superconductors is discussed. It will be shown that the agreement between a wealth of experiments and DFT+DMFT or similar approaches supports a scenario in which strongly-correlated and weakly-correlated electrons coexist in the conduction bands of these materials. I will then reverse-engineer the realistic calculations and justify this scenario in terms of simpler behaviors easily interpreted through model results. All pieces come together to show that Hund's coupling, besides being responsible for the electronic correlations even in the absence of a strong Coulomb repulsion, is also the origin of a subtle emergent behavior: orbital decoupling. Indeed, Hund's exchange decouples the charge excitations in the different iron orbitals involved in the conduction bands, thus causing an independent tuning of the degree of electronic correlation in each one of them. The latter becomes sensitive almost only to the offset of the orbital population from half-filling, where a Mott insulating state is invariably realized at these interaction strengths. Depending on the difference in orbital population, a different 'Mottness' affects each orbital, and is thus reflected in the conduction bands and in the Fermi surfaces depending on the orbital content. Comment: Book Chapter

    Practical private database queries based on a quantum key distribution protocol

    Private queries allow a user Alice to learn an element of a database held by a provider Bob without revealing which element she was interested in, while limiting her information about the other elements. We propose to implement private queries based on a quantum key distribution protocol, with changes only in the classical post-processing of the key. This approach makes our scheme both easy to implement and loss-tolerant. While unconditionally secure private queries are known to be impossible, we argue that an interesting degree of security can be achieved, relying on fundamental physical principles instead of unverifiable security assumptions in order to protect both user and database. We think that there is scope for such practical private queries to become another remarkable application of quantum information in the footsteps of quantum key distribution. Comment: 7 pages, 2 figures, new and improved version, clarified claims, expanded security discussion

    Competition of crystal field splitting and Hund's rule coupling in two-orbital magnetic metal-insulator transitions

    The competition between crystal field splitting and Hund's rule coupling in magnetic metal-insulator transitions of the half-filled two-orbital Hubbard model is investigated by multi-orbital slave-boson mean field theory. We show that with increasing Coulomb correlation, the system first undergoes a transition from a paramagnetic (PM) metal to either a Néel antiferromagnetic (AFM) Mott insulator or a nonmagnetic orbital insulator, depending on the competition between the crystal field splitting and the Hund's rule coupling. The AFM Mott insulator, PM metal and orbital insulating phases are not, partially and fully orbital polarized, respectively. For a small $J_{H}$ and a finite crystal field, the orbital insulator is robust. Although the system is nonmagnetic, the phase boundary of the orbital insulator transition clearly shifts to the small $U$ regime once magnetic correlations are taken into account. These results demonstrate that large crystal field splitting favors the formation of the orbital insulating phase, while large Hund's rule coupling tends to destroy it, driving the low-spin to high-spin transition. Comment: 4 pages, 4 figures

    The Magic Number Problem for Subregular Language Families

    We investigate the magic number problem, that is, the question of whether there exists a minimal n-state nondeterministic finite automaton (NFA) whose equivalent minimal deterministic finite automaton (DFA) has alpha states, for all n and alpha satisfying n ≤ alpha ≤ 2^n. A number alpha not satisfying this condition is called a magic number (for n). It was shown in [11] that no magic numbers exist for general regular languages, while in [5] trivial and non-trivial magic numbers for unary regular languages were identified. We obtain similar results for automata accepting subregular languages such as combinational languages, star-free, prefix-, suffix-, and infix-closed languages, and prefix-, suffix-, and infix-free languages, showing that there are only trivial magic numbers, when they exist. For finite languages we obtain some partial results showing that certain numbers are non-magic. Comment: In Proceedings DCFS 2010, arXiv:1008.127
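
The quantity alpha in this abstract can be computed for a concrete NFA by determinizing it and then minimizing the result. The sketch below does this with the standard subset construction followed by Moore-style partition refinement; it is an illustration, not the paper's proof technique. The function names and the example automaton (a 3-state NFA for "the second-to-last symbol is a") are assumptions chosen for the example, and the count refers to a complete DFA, so a reachable dead state would be included.

```python
from itertools import chain


def determinize(alphabet, delta, start, accepting):
    """Subset construction restricted to subsets reachable from {start}."""
    start_set = frozenset([start])
    states = {start_set}
    trans = {}
    stack = [start_set]
    while stack:
        subset = stack.pop()
        for symbol in alphabet:
            target = frozenset(
                chain.from_iterable(delta.get((q, symbol), ()) for q in subset)
            )
            trans[(subset, symbol)] = target
            if target not in states:
                states.add(target)
                stack.append(target)
    dfa_accepting = {s for s in states if s & accepting}
    return states, trans, dfa_accepting


def minimal_dfa_size(alphabet, states, trans, dfa_accepting):
    """Moore refinement: split classes until stable, return the class count."""
    block = {s: int(s in dfa_accepting) for s in states}
    while True:
        signature = {
            s: (block[s], tuple(block[trans[(s, a)]] for a in alphabet))
            for s in states
        }
        ids = {sig: i for i, sig in enumerate(set(signature.values()))}
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = {s: ids[signature[s]] for s in states}


def alpha_for_nfa(alphabet, delta, start, accepting):
    """Number of states of the minimal complete DFA equivalent to the NFA."""
    states, trans, dfa_accepting = determinize(alphabet, delta, start, accepting)
    return minimal_dfa_size(alphabet, states, trans, dfa_accepting)


# Hypothetical example: words over {a, b} whose second-to-last symbol is 'a'.
EXAMPLE_DELTA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "a"): {2},
    (1, "b"): {2},
}

if __name__ == "__main__":
    n = 3
    alpha = alpha_for_nfa("ab", EXAMPLE_DELTA, 0, frozenset({2}))
    print(n, alpha, n <= alpha <= 2 ** n)  # 3 4 True
```

Here alpha = 4 lies in the range from n = 3 to 2^3 = 8, so 4 is not magic for n = 3, provided the example NFA is itself minimal (which the sketch does not check).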

    Experimental validation of 4D log file-based proton dose reconstruction for interplay assessment considering amplitude-sorted 4DCTs

    Purpose: The unpredictable interplay between dynamic proton therapy delivery and target motion in the thorax can lead to severe dose distortions. A fraction-wise four-dimensional (4D) dose reconstruction workflow allows for the assessment of the applied dose after patient treatment while considering the actual beam delivery sequence extracted from machine log files, the recorded breathing pattern and the geometric information from a 4D computed tomography scan (4DCT). Such an algorithm capable of accounting for amplitude-sorted 4DCTs was implemented and its accuracy as well as its sensitivity to input parameter variations was experimentally evaluated. Methods: An anthropomorphic thorax phantom with a movable insert containing a target surrogate and a radiochromic film was irradiated with a monoenergetic field for various 1D target motion forms (sin, sin^4) and peak-to-peak amplitudes (5/10/15/20/30 mm). The measured characteristic film dose distributions were compared to the respective sections in the 4D reconstructed doses using a 2D gamma-analysis (3 mm, 3%); gamma-pass rates were derived for different dose grid resolutions (1 mm/3 mm) and deformable image registrations (DIR, automatic/manual) applied during the 4D dose reconstruction process. In an additional analysis, the sensitivity of reconstructed dose distributions to potential asynchronous timing of the motion and machine log files was investigated for both a monoenergetic field and more realistic 4D robustly optimized fields by artificially introducing offsets of ±1/5/25/50/250 ms. The resulting dose distributions with asynchronized log files were compared to those with synchronized log files by means of a 3D gamma-analysis (1 mm, 1%) and the evaluation of absolute dose differences. Results: The induced characteristic interplay patterns on the films were well reproduced by the 4D dose reconstruction with 2D gamma-pass rates ≥ 95% for almost all cases with motion magnitude