27 research outputs found

    Identification of a deletion in an individual genome by split read analysis (middle), and by depth of coverage analysis (bottom).

    A strobe with 3 subreads.

    An inversion resulting from non-allelic homologous recombination (NAHR) between two nearly identical segmental duplications (blue boxes) with opposite orientations (arrows).

    <p>The inversion flips the orientation of the subsequence, or block, in one genome relative to the other genome.</p>

    Paired end mapping (PEM).

    <p>Fragments from an individual genome are sequenced from both ends and the resulting paired reads are aligned to a reference genome. Most paired reads are concordant pairs, where the distance between the alignments of the two reads agrees with the distribution of fragment lengths (right). The remaining discordant pairs suggest structural variants (here a deletion) that distinguish the individual genome from the reference genome.</p>
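
The concordant/discordant distinction can be sketched in a few lines. This is a minimal illustration, not the chapter's method: the `ReadPair` type, the function names, and the bounds `l_min` and `l_max` are assumptions, and the sketch assumes both reads map to the same chromosome in the expected orientation.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical record of one aligned pair (reference coordinates)."""
    left_start: int  # where the leftmost read begins on the reference
    right_end: int   # where the rightmost read ends on the reference


def mapped_span(pair: ReadPair) -> int:
    """Distance on the reference covered by the aligned pair."""
    return pair.right_end - pair.left_start


def classify_pair(pair: ReadPair, l_min: int, l_max: int) -> str:
    """Compare the mapped span with the accepted fragment-length range."""
    span = mapped_span(pair)
    if l_min <= span <= l_max:
        return "concordant"
    if span > l_max:
        # The reads map farther apart than any fragment could be long,
        # so sequence present in the reference is likely missing here.
        return "discordant: candidate deletion"
    # The reads map closer together than the shortest fragment.
    return "discordant: candidate insertion"
```

In practice the bounds would be estimated from the empirical fragment-length distribution of the sequencing library rather than fixed by hand.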

    Chapter 6: Structural Variation and Medical Genomics

    <div><p>Differences between individual human genomes, or between human and cancer genomes, range in scale from single nucleotide variants (SNVs) through intermediate and large-scale duplications, deletions, and rearrangements of genomic segments. The latter class, called structural variants (SVs), has received considerable attention in the past several years as a previously underappreciated source of variation in human genomes. Much of this recent attention results from the availability of higher-resolution technologies for measuring these variants, including microarray-based techniques and, more recently, high-throughput DNA sequencing. We describe the genomic technologies and computational techniques currently used to measure SVs, focusing on applications in human and cancer genomics.</p> </div>

    Mutation, selection, and clonal expansion in tumor development lead to genomic heterogeneity between cells in a tumor.

    <p>Current DNA sequencing approaches sequence DNA from many cells and thus yield a heterogeneous mixture of mutations, with varying numbers of both passenger mutations (black) and driver mutations (red).</p>

    Figure 6

    <p>(Top) A discordant pair (arc) indicates a deletion whose two unknown breakpoints are located in the orange blocks. The positions of the paired reads, together with the minimum and maximum length of end-sequenced fragments, constrain the breakpoints to lie within the indicated orange blocks, as governed by the indicated linear inequalities. (Bottom) A polygon in 2D genome space expresses the linear dependency between the two breakpoints and records the uncertainty in their location.</p>
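
One way to read the linear inequalities is as a membership test on a candidate breakpoint pair. The sketch below is an assumption-laden illustration: the names `x`, `y`, `a`, `b`, `l_min`, and `l_max` are introduced here and are not taken from the figure, and the exact form of the constraints is assumed rather than quoted. It takes `x` and `y` as the outer reference coordinates of the discordant pair and `(a, b)` as the deletion breakpoints.

```python
def in_breakpoint_region(a: int, b: int, x: int, y: int,
                         l_min: int, l_max: int) -> bool:
    """Return True if breakpoints (a, b) are consistent with one discordant pair.

    Assumed model: the fragment spans reference positions x..a and b..y,
    with the segment between a and b deleted in the individual genome.
    """
    # The breakpoints must lie between the outer coordinates, in order.
    if not (x <= a < b <= y):
        return False
    # Fragment length in the individual genome once (a, b) is removed.
    fragment_length = (a - x) + (y - b)
    # Both bounds are linear in a and b, so the feasible set is a polygon
    # (a trapezoid) in the (a, b) plane.
    return l_min <= fragment_length <= l_max
```

Under this model, intersecting the regions from several discordant pairs that support the same deletion would narrow the uncertainty in the breakpoint location.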

    Two major approaches to detect structural variants in an individual genome from next-generation sequencing data are <i>de novo</i> assembly and resequencing.

    <p>In <i>de novo</i> assembly, the individual genome sequence is constructed by examining overlaps between reads. In resequencing approaches, reads from the individual genome are aligned to a closely related reference genome. Examination of the resulting alignments reveals differences between the individual genome and the reference genome.</p>
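
The "overlaps between reads" that drive assembly can be illustrated with an exact suffix-prefix match. This is a toy sketch under simplifying assumptions: it ignores sequencing errors and reverse complements, and the function name and `min_len` parameter are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no exact overlap of at least `min_len` bases exists.
    """
    # Try the longest possible overlap first and shrink until one matches.
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0
```

An assembler would compute such overlaps for many read pairs and chain reads whose overlaps agree. Resequencing skips this step by aligning each read directly to the reference.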

    Accurate Computation of Survival Statistics in Genome-Wide Studies

    <div><p>A key challenge in genomics is to identify genetic variants that distinguish patients with different <i>survival time</i> following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications, for two reasons: the two populations determined by a genetic variant may have very different sizes, and the evaluation of many possible variants demands highly accurate computation of very small <i>p</i>-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the <i>p</i>-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported <i>p</i>-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations.</p></div>
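
The gap between the asymptotic approximation and an exact calculation can be seen on a toy dataset. The sketch below is not ExaLT. It computes the usual normal-approximation <i>p</i>-value and, for comparison, a brute-force permutational <i>p</i>-value that enumerates every assignment of patients to the smaller group, which is feasible only for very small samples. All function names are hypothetical.

```python
from itertools import combinations
from math import erfc, sqrt


def logrank_parts(times, events, in_group1):
    """Return (observed - expected events in group 1, variance)."""
    diff, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group1[i])
        dead = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dead)
        d1 = sum(1 for i in dead if in_group1[i])
        diff += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 at this event time.
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return diff, var


def asymptotic_p(times, events, in_group1):
    """Two-sided p-value from the normal approximation."""
    diff, var = logrank_parts(times, events, in_group1)
    if var == 0:
        return 1.0
    return erfc(abs(diff) / sqrt(2 * var))


def permutation_p(times, events, in_group1):
    """Two-sided p-value by enumerating all group relabelings."""
    n, k = len(times), sum(1 for g in in_group1 if g)
    observed, _ = logrank_parts(times, events, in_group1)
    hits = total = 0
    for chosen in combinations(range(n), k):
        members = set(chosen)
        labels = [i in members for i in range(n)]
        diff, _ = logrank_parts(times, events, labels)
        total += 1
        # Small tolerance so the observed labeling always counts itself.
        if abs(diff) >= abs(observed) - 1e-12:
            hits += 1
    return hits / total


# Toy data: 10 patients, all events observed, 2 carry the variant.
times = [2, 3, 5, 7, 8, 11, 13, 17, 19, 23]
events = [1] * 10
carriers = [True, True] + [False] * 8
p_asym = asymptotic_p(times, events, carriers)
p_perm = permutation_p(times, events, carriers)
```

With only two carriers out of ten patients, the enumerated value cannot fall below 1/45, whereas the normal approximation has no such floor. This is the kind of imbalance the abstract describes.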

    Comparison of the <i>p</i>-values of association between somatic mutations and survival time for TCGA glioblastoma (GBM) and ovarian (OV) datasets.

    <p>Each data point represents a gene. (A) Comparison of the R survdiff <i>p</i>-values and the exact permutational <i>p</i>-values for the GBM dataset. (B) Comparison of the R survdiff <i>p</i>-values and the exact permutational <i>p</i>-values for the OV dataset.</p>