370 research outputs found

    GPU-Accelerated BWT Construction for Large Collection of Short Reads

    Advances in DNA sequencing technology have stimulated the development of algorithms and tools for processing very large collections of short strings (reads). Short-read alignment and assembly are among the most well-studied problems. Many state-of-the-art aligners, at their core, use the Burrows-Wheeler transform (BWT) as a main-memory index of a reference genome (a typical example is the NCBI human genome). Recently, the BWT has also found use in string-graph assembly, for indexing the reads (i.e., raw data from DNA sequencers). In a typical data set, the volume of reads is tens of times the size of the sequenced genome and can be up to 100 Gigabases. Note that a reference genome is relatively stable, so computing its index is not a frequent task. For reads, the index has to be computed from scratch for each given input, so efficient BWT construction becomes a much bigger concern than before. In this paper, we present a practical method called CX1 for constructing the BWT of very large string collections. CX1 is the first tool that can take advantage of the parallelism given by a graphics processing unit (GPU, a relatively cheap device providing a thousand or more primitive cores), simultaneously with the parallelism from a multi-core CPU and, more interestingly, from a cluster of GPU-enabled nodes. Using CX1, the BWT of a short-read collection of up to 100 Gigabases can be constructed in less than 2 hours using a machine equipped with a quad-core CPU and a GPU, or in about 43 minutes using a cluster with 4 such machines (the speedup is almost linear after excluding the first 16 minutes for loading the reads from the hard disk). The previously fastest tool, BRC, is measured to take 12 hours to process 100 Gigabases on one machine; it is non-trivial how BRC can be parallelized to take advantage of a cluster of machines, let alone GPUs. Comment: 11 pages
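For readers unfamiliar with the transform this abstract refers to, the following is a minimal sketch of the textbook Burrows-Wheeler transform on a single string. It is not CX1's method: it sorts all rotations naively, which is only practical for tiny inputs, and the function names `bwt` and `inverse_bwt` and the `$` sentinel are illustrative assumptions.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Return the Burrows-Wheeler transform of `text` (naive rotation sort).

    Assumption: `sentinel` does not occur in `text` and sorts before
    every other character (true for "$" versus ASCII letters).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Sort every rotation of the string, then read off the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Recover the original text from its BWT by rebuilding the sorted table."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        # Prepend the BWT column to the current rows and re-sort.
        table = sorted(c + row for c, row in zip(transformed, table))
    for row in table:
        if row.endswith(sentinel):
            return row[:-1]
    raise ValueError("sentinel not found in transformed text")


if __name__ == "__main__":
    print(bwt("banana"))                # annb$aa
    print(inverse_bwt(bwt("GATTACA")))  # GATTACA
```

The naive sort costs far more time and memory than anything usable on a 100-Gigabase read collection, which is why dedicated construction tools exist.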

    Structural roles of CTG repeats in slippage expansion during DNA replication

    CTG triplet repeat sequences have been found to form slipped-strand structures leading to self-expansion during DNA replication. The lengthening of these repeats causes the onset of neurodegenerative diseases, such as myotonic dystrophy. In this study, electrophoretic and NMR spectroscopic studies have been carried out to investigate the roles of the length and structure of CTG repeats in affecting hairpin formation propensity. Direct NMR evidence has been obtained for the first time to support the presence of three types of hairpin structures in sequences containing 1–10 CTG repeats. The first type contains no intra-loop hydrogen bond and occurs when the number of repeats is less than four. The second type has a 4 nt TGCT-loop and occurs in sequences with an even number of repeats. The third type contains a 3 nt CTG-loop and occurs in sequences with an odd number of repeats. Although stabilizing interactions have been identified between CTG repeats in both the second and third types of hairpins, the structural differences observed account for the higher hairpin formation propensity in sequences containing an even number of CTG repeats. The results of this study confirm the hairpin loop structures and explain how slippage occurs during DNA replication.

    MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph

    MEGAHIT is an NGS de novo assembler for assembling large and complex metagenomics data in a time- and cost-efficient manner. It finished assembling a soil metagenomics dataset of 252 Gbps in 44.1 hours and 99.6 hours on a single computing node with and without a GPU, respectively. MEGAHIT assembles the data as a whole, i.e., it avoids pre-processing such as partitioning and normalization, which might compromise result integrity. MEGAHIT generates an assembly 3 times larger, with longer contig N50 and average contig length, than the previous assembly. 55.8% of the reads were aligned to the assembly, which is 4 times higher than for the previous assembly. The source code of MEGAHIT is freely available at https://github.com/voutcn/megahit under the GPLv3 license. Comment: 2 pages, 2 tables, 1 figure, submitted to Oxford Bioinformatics as an Application Note
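The abstract compares assemblies by contig N50. As a reminder of what that statistic measures, here is a small sketch using the common definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions, not values from the MEGAHIT paper.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative length first reaches at least
    half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


if __name__ == "__main__":
    # Hypothetical contig lengths: total is 54, and 10 + 9 + 8 = 27 reaches half.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

A larger N50 means more of the assembly sits in long contigs, but it says nothing about correctness, which is why the abstract also reports the fraction of reads that align back to the assembly.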

    The regulation of Sox3 function in zebrafish embryonic development

    Embryogenesis in vertebrates is regulated by highly complicated signaling networks, which involve various signaling pathways and factors. Sox3 is known to have critical roles throughout embryonic development in vertebrates. In zebrafish, it has been shown that Sox3 restricts organizer formation and inhibits Fgf signaling, which is required for the expression of organizer genes. On the other hand, SUMOylation has been suggested to be a regulator of the transcriptional activity of Sox3. SUMOylation has previously been demonstrated for mouse Sox3 and chick Sox3. In this study, I examined the role and regulation of Sox3 in early embryogenesis in zebrafish. In the first part of the study, I inspected the detailed expression of gsc and chd, in comparison to foxd3 expression. It was demonstrated that Sox3 and Fgf signaling could repress both gsc and chd independently of each other. Sox3 was also found to be able to act directly on the promoter region of gsc. In the second part of the study, it was shown that the SUMOylation of Sox3 appeared to occur in zebrafish embryos. The SUMOylation of Sox3 was also shown to correlate with the chromatin fraction, although only a very small fraction of Sox3 was SUMOylated. The biological effect of SUMOylation of Sox3 has also been analysed. It was shown that SUMOylation of Sox3 could enhance the transcriptional repressor activity of Sox3. The results of a luciferase reporter assay on the promoter region of boz also suggested that SUMOylation eliminated the transcriptional activator activity of Sox3. These data raise the possibility that SUMOylation of Sox3 might act as the switch from a transcriptional activator to a repressor.

    Managing inter-agency co-ordination : an analysis of district level administration in Hong Kong

    published_or_final_version; Politics and Public Administration; Master; Master of Public Administration