
    Oculus: faster sequence alignment by streaming read compression

    Abstract

    Background: Despite significant advancement in alignment algorithms, the exponential growth of nucleotide sequencing throughput threatens to outpace bioinformatic analysis. Computation may become the bottleneck of genome analysis if growing alignment costs are not mitigated by further improvement in algorithms. Much gain has been gleaned from indexing and compressing alignment databases, but many widely used alignment tools process input reads sequentially and are oblivious to any underlying redundancy in the reads themselves.

    Results: Here we present Oculus, a software package that attaches to standard aligners and exploits read redundancy by performing streaming compression, alignment, and decompression of input sequences. This nearly lossless process (> 99.9%) led to alignment speedups of up to 270% across a variety of data sets, while requiring a modest amount of memory. We expect that streaming read compressors such as Oculus could become a standard addition to existing RNA-Seq and ChIP-Seq alignment pipelines, and potentially other applications in the future as throughput increases.

    Conclusions: Oculus efficiently condenses redundant input reads and wraps existing aligners to provide nearly identical SAM output in a fraction of the aligner runtime. It includes a number of useful features, such as tunable performance and fidelity options, compatibility with FASTA or FASTQ files, and adherence to the SAM format. The platform-independent C++ source code is freely available online, at http://code.google.com/p/oculus-bio.

    http://deepblue.lib.umich.edu/bitstream/2027.42/112673/1/12859_2012_Article_5548.pd
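The general idea in the abstract above (collapse redundant reads, align each distinct sequence once, then expand the results back to every original read) can be illustrated with a minimal Python sketch. This is not Oculus's implementation, which is written in C++ and operates in a streaming fashion; the function `align_with_dedup` and the `align_unique` callback are hypothetical names introduced here, and the sketch works on an in-memory batch and ignores quality scores.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def align_with_dedup(
    reads: Iterable[Tuple[str, str]],
    align_unique: Callable[[List[str]], Dict[str, str]],
) -> List[Tuple[str, str]]:
    """Align (name, sequence) reads while calling the aligner once per distinct sequence.

    `align_unique` is a hypothetical stand-in for a wrapped aligner: it takes a
    list of distinct sequences and returns a dict mapping each one to its result.
    """
    reads = list(reads)
    # "Compression": keep each distinct sequence once, in first-seen order.
    unique = list(dict.fromkeys(seq for _, seq in reads))
    # Only the distinct sequences are handed to the aligner.
    hits = align_unique(unique)
    # "Decompression": every original read receives its sequence's result.
    return [(name, hits[seq]) for name, seq in reads]
```

In this sketch the saving comes entirely from the aligner seeing fewer inputs, so the benefit would depend on how redundant the read set is.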

    Inflammation-Induced Oxidative Stress Mediates Gene Fusion Formation in Prostate Cancer.

    Approximately 50% of prostate cancers are associated with gene fusions of the androgen-regulated gene TMPRSS2 to the oncogenic erythroblast transformation-specific (ETS) transcription factor ERG. The three-dimensional proximity of TMPRSS2 and ERG genes, in combination with DNA breaks, facilitates the formation of TMPRSS2-ERG gene fusions. However, the origins of DNA breaks that underlie gene fusion formation in prostate cancers are far from clear. We demonstrate a role for inflammation-induced oxidative stress in the formation of DNA breaks leading to recurrent TMPRSS2-ERG gene fusions. The transcriptional status and epigenetic features of the target genes influence this effect. Importantly, inflammation-induced de novo genomic rearrangements are blocked by homologous recombination (HR) and promoted by non-homologous end-joining (NHEJ) pathways. In conjunction with the association of proliferative inflammatory atrophy (PIA) with human prostate cancer, our results support a working model in which recurrent genomic rearrangements induced by inflammatory stimuli lead to the development of prostate cancer.