11,129 research outputs found

    Discovery of large genomic inversions using long range information.

    Background: Although many algorithms are now available that aim to characterize different classes of structural variation, discovery of balanced rearrangements such as inversions remains an open problem. This is mainly due to the fact that breakpoints of such events typically lie within segmental duplications or common repeats, which reduces the mappability of short reads. The algorithms developed within the 1000 Genomes Project to identify inversions are limited to relatively short inversions, and there are currently no available algorithms to discover large inversions using high throughput sequencing technologies. Results: Here we propose a novel algorithm, VALOR, to discover large inversions using new sequencing methods that provide long range information such as 10X Genomics linked-read sequencing, pooled clone sequencing, or other similar technologies that we commonly refer to as long range sequencing. We demonstrate the utility of VALOR using both pooled clone sequencing and 10X Genomics linked-read sequencing generated from the genome of an individual from the HapMap project (NA12878). We also provide a comprehensive comparison of VALOR against several state-of-the-art structural variation discovery algorithms that use whole genome shotgun sequencing data. Conclusions: In this paper, we show that VALOR is able to accurately discover all previously identified and experimentally validated large inversions in the same genome with a low false discovery rate. Using VALOR, we also predicted a novel inversion, which we validated using fluorescent in situ hybridization. VALOR is available at https://github.com/BilkentCompGen/VALOR

    Periodic shock-emission from acoustically driven cavitation clouds: a source of the subharmonic signal

    Single clouds of cavitation bubbles, driven by 254 kHz focused ultrasound at pressure amplitudes in the range of 0.48–1.22 MPa, have been observed via high-speed shadowgraphic imaging at 1 × 10⁶ frames per second. Clouds underwent repetitive growth, oscillation and collapse (GOC) cycles, with shock-waves emitted periodically at the instant of collapse during each cycle. The frequency of cloud collapse, and coincident shock-emission, was primarily dependent on the intensity of the focused ultrasound driving the activity. The lowest peak-to-peak pressure amplitude of 0.48 MPa generated shock-waves with an average period of 7.9 ± 0.5 μs, corresponding to a frequency of f₀/2, half-harmonic to the fundamental driving. Increasing the intensity gave rise to GOC cycles and shock-emission periods of 11.8 ± 0.3, 15.8 ± 0.3, 19.8 ± 0.2 μs, at pressure amplitudes of 0.64, 0.92 and 1.22 MPa, corresponding to the higher-order subharmonics of f₀/3, f₀/4 and f₀/5, respectively. Parallel passive acoustic detection, filtered for the fundamental driving, revealed features that correlated temporally to the shock-emissions observed via high-speed imaging, p(two-tailed) 200 μm diameter, at maximum inflation), that developed under insonations of peak-to-peak pressure amplitudes >1.0 MPa, emitted shock-waves with two or more fronts suggesting non-uniform collapse of the cloud. The observations indicate that periodic shock-emissions from acoustically driven cavitation clouds provide a source for the cavitation subharmonic signal, and that shock structure may be used to study intra-cloud dynamics at sub-microsecond timescales
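
The correspondence between the reported shock-emission periods and the subharmonics f₀/2 to f₀/5 is plain arithmetic: the f₀/n subharmonic has period n/f₀. The following is a minimal sketch of that check, using only the driving frequency, periods and uncertainties quoted in the abstract; the function and variable names are illustrative and not taken from the paper.

```python
# Driving frequency quoted in the abstract (254 kHz).
F0_HZ = 254e3


def subharmonic_period_us(order: int, f0_hz: float = F0_HZ) -> float:
    """Period, in microseconds, of the f0/order subharmonic."""
    return order / f0_hz * 1e6


# Measured shock-emission periods and uncertainties (microseconds),
# keyed by the subharmonic order the abstract assigns to each.
measured_us = {2: (7.9, 0.5), 3: (11.8, 0.3), 4: (15.8, 0.3), 5: (19.8, 0.2)}

for order, (period, uncertainty) in measured_us.items():
    expected = subharmonic_period_us(order)
    within = abs(period - expected) <= uncertainty
    print(f"f0/{order}: expected {expected:.2f} us, "
          f"measured {period} +/- {uncertainty} us, consistent: {within}")
```

Under these numbers, each measured period lies within its stated uncertainty of the corresponding multiple of the driving period.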

    The elusive evidence for chromothripsis.

    The chromothripsis hypothesis suggests an extraordinary one-step catastrophic genomic event allowing a chromosome to 'shatter into many pieces' and reassemble into a functioning chromosome. Recent efforts have aimed to detect chromothripsis by looking for a genomic signature, characterized by a large number of breakpoints (50-250), but a limited number of oscillating copy number states (2-3) confined to a few chromosomes. The chromothripsis phenomenon has become widely reported in different cancers, but using inconsistent and sometimes relaxed criteria for determining whether rearrangements occur simultaneously rather than progressively. We revisit the original simulation approach and show that the signature is not clearly exceptional, and can be explained using only progressive rearrangements. For example, 3.9% of progressively simulated chromosomes with 50-55 breakpoints were dominated by two or three copy number states. In addition, by adjusting the parameters of the simulation, the proposed footprint appears more frequently. Lastly, we provide an algorithm to find a sequence of progressive rearrangements that explains all observed breakpoints from a proposed chromothripsis chromosome. Thus, the proposed signature cannot be considered a sufficient proof for this extraordinary hypothesis. Great caution should be exercised when labeling complex rearrangements as chromothripsis from genome hybridization and sequencing experiments
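
As a rough illustration of the signature described above, the sketch below tests whether a chromosome has a breakpoint count in the quoted 50-250 range and is dominated by two or three copy number states. It is not the authors' simulation or algorithm. The function name, the input representation (one copy number per segment) and the `dominance` threshold of 0.9 are assumptions made here for illustration, since the abstract does not define "dominated" numerically.

```python
from collections import Counter


def matches_proposed_signature(segment_copy_numbers, n_breakpoints,
                               min_breakpoints=50, max_breakpoints=250,
                               dominance=0.9):
    """Heuristic check of the breakpoint / copy-number-state signature.

    `dominance` is a hypothetical threshold: the fraction of segments
    that the three most common copy number states must cover.
    """
    if not (min_breakpoints <= n_breakpoints <= max_breakpoints):
        return False
    counts = Counter(segment_copy_numbers)
    if len(counts) < 2:
        # A single state cannot oscillate.
        return False
    top_three = sum(n for _, n in counts.most_common(3))
    return top_three / len(segment_copy_numbers) >= dominance


# Toy profile: 52 segments alternating between copy numbers 1 and 2.
oscillating = [1, 2] * 26
print(matches_proposed_signature(oscillating, n_breakpoints=51))
```

In line with the abstract's conclusion, a chromosome passing such a check is not evidence that its rearrangements occurred simultaneously.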

    Structural Variation Discovery and Genotyping from Whole Genome Sequencing: Methodology and Applications: A Dissertation

    A comprehensive understanding of how genetic variants and mutations contribute to phenotypic variations and alterations entails experimental technologies and analytical methodologies that are able to detect genetic variants/mutations from various biological samples in a timely and accurate manner. High-throughput sequencing technology represents the latest achievement in a series of efforts to facilitate genetic variant discovery and genotyping and promises to transform the way we tackle healthcare and biomedical problems. The tremendous amount of data generated by this new technology, however, needs to be processed and analyzed in an accurate and efficient way in order to fully harness its potential. Structural variation (SV) encompasses a wide range of genetic variations with different sizes and generated by diverse mechanisms. Due to the technical difficulties of reliably detecting SVs, their characterization lags behind that of SNPs and indels. In this dissertation I presented two novel computational methods: one for detecting transposable element (TE) transpositions and the other for detecting SVs in general using a local assembly approach. Both methods are able to pinpoint breakpoint junctions at single-nucleotide resolution and estimate variant allele frequencies in the sample. I also applied those methods to study the impact of TE transpositions on genomic stability, the inheritance patterns of TE insertions in the population and the molecular mechanisms and potential functional consequences of somatic SVs in cancer genomes

    Targeted next-generation sequencing of DNA regions proximal to a conserved GXGXXG signaling motif enables systematic discovery of tyrosine kinase fusions in cancer

    Tyrosine kinase (TK) fusions are attractive drug targets in cancers. However, rapid identification of these lesions has been hampered by experimental limitations. Our in silico analysis of known cancer-derived TK fusions revealed that most breakpoints occur within a defined region upstream of a conserved GXGXXG kinase motif. We therefore designed a novel DNA-based targeted sequencing approach to screen systematically for fusions within the 90 human TKs; it should detect 92% of known TK fusions. We deliberately paired ‘in-solution’ DNA capture with 454 sequencing to minimize starting material requirements, take advantage of long sequence reads, and facilitate mapping of fusions. To validate this platform, we analyzed genomic DNA from thyroid cancer cells (TPC-1) and leukemia cells (KG-1) with fusions known only at the mRNA level. We readily identified for the first time the genomic fusion sequences of CCDC6-RET in TPC-1 cells and FGFR1OP2-FGFR1 in KG-1 cells. These data demonstrate the feasibility of this approach to identify TK fusions across multiple human cancers in a high-throughput, unbiased manner. This method is distinct from other similar efforts, because it focuses specifically on targets with therapeutic potential, uses only 1.5 µg of DNA, and circumvents the need for complex computational sequence analysis
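
The GXGXXG motif named above is a pattern of three glycines separated by one and then two arbitrary residues. The sketch below shows one way to locate such a pattern in a protein sequence with a regular expression. It only illustrates the motif itself and is not the authors' capture-design or analysis pipeline. The function name is illustrative, and the example sequence is a made-up toy string, not a real kinase.

```python
import re

# G, any residue, G, any two residues, G. The lookahead lets
# overlapping occurrences be reported as well.
GXGXXG = re.compile(r"(?=G[A-Z]G[A-Z]{2}G)")


def find_gxgxxg_motifs(protein: str) -> list[int]:
    """Return 0-based start positions of every GXGXXG occurrence."""
    return [m.start() for m in GXGXXG.finditer(protein.upper())]


# Toy sequence (hypothetical) containing one motif, GAGAFG, at index 3.
toy_sequence = "MKLGAGAFGEVYK"
print(find_gxgxxg_motifs(toy_sequence))  # [3]
```

Mapping a motif position back to genomic coordinates, as a targeted design would require, needs gene annotation and is outside the scope of this sketch.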

    Long noncoding RNAs in prostate cancer: overview and clinical implications.

    Prostate cancer is the second most common cause of cancer mortality among men in the United States. While many prostate cancers are indolent, an important subset of patients experiences disease recurrence after conventional therapy and progresses to castration-resistant prostate cancer (CRPC), which is currently incurable. Thus, there is a critical need to identify biomarkers that will distinguish indolent from aggressive disease, as well as novel therapeutic targets for the prevention or treatment of CRPC. In recent years, long noncoding RNAs (lncRNAs) have emerged as an important class of biological molecules. LncRNAs are polyadenylated RNA species that share many similarities with protein-coding genes despite the fact that they are noncoding (not translated into proteins). They are usually transcribed by RNA polymerase II and exhibit the same epigenetic signatures as protein-coding genes. LncRNAs have also been implicated in the development and progression of a variety of cancers, including prostate cancer. While a large number of lncRNAs exhibit tissue- and cancer-specific expression, their utility as diagnostic and prognostic biomarkers is just starting to be explored. In this review, we highlight recent findings on the functional role and molecular mechanisms of lncRNAs in the progression of prostate cancer and evaluate their use as potential biomarkers and therapeutic targets

    Discovering cancer-associated transcripts by RNA sequencing

    High-throughput sequencing of poly-adenylated RNA (RNA-Seq) in human cancers shows remarkable potential to identify uncharacterized aspects of tumor biology, including gene fusions with therapeutic significance and disease markers such as long non-coding RNA (lncRNA) species. However, the analysis of RNA-Seq data places unprecedented demands upon computational infrastructures and algorithms, requiring novel bioinformatics approaches. To meet these demands, we present two new open-source software packages - ChimeraScan and AssemblyLine - designed to detect gene fusion events and novel lncRNAs, respectively. RNA-Seq studies utilizing ChimeraScan led to discoveries of new families of recurrent gene fusions in breast cancers and solitary fibrous tumors. Further, ChimeraScan was one of the key components of the repertoire of computational tools utilized in data analysis for MI-ONCOSEQ, a clinical sequencing initiative to identify potentially informative and actionable mutations in cancer patients’ tumors. AssemblyLine, by contrast, reassembles RNA sequencing data into full-length transcripts ab initio. In head-to-head analyses AssemblyLine compared favorably to existing ab initio approaches and unveiled abundant novel lncRNAs, including antisense and intronic lncRNAs disregarded by previous studies. Moreover, we used AssemblyLine to define the prostate cancer transcriptome from a large patient cohort and discovered myriad lncRNAs, including 121 prostate cancer-associated transcripts (PCATs) that could potentially serve as novel disease markers. Functional studies of two PCATs - PCAT-1 and SChLAP1 - revealed cancer-promoting roles for these lncRNAs. PCAT1, a lncRNA expressed from chromosome 8q24, promotes cell proliferation and represses the tumor suppressor BRCA2. SChLAP1, located in a chromosome 2q31 ‘gene desert’, independently predicts poor patient outcomes, including metastasis and cancer-specific mortality. Mechanistically, SChLAP1 antagonizes the genome-wide localization and regulatory functions of the SWI/SNF chromatin-modifying complex. Collectively, this work demonstrates the utility of ChimeraScan and AssemblyLine as open-source bioinformatics tools. Our applications of ChimeraScan and AssemblyLine led to the discovery of new classes of recurrent and clinically informative gene fusions, and established a prominent role for lncRNAs in coordinating aggressive prostate cancer, respectively. We expect that the methods and findings described herein will establish a precedent for RNA-Seq-based studies in cancer biology and assist the research community at large in making similar discoveries. PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/120814/1/mkiyer_1.pd