
    Change-point model on nonhomogeneous Poisson processes with application in copy number profiling by next-generation DNA sequencing

    We propose a flexible change-point model for inhomogeneous Poisson processes, which arise naturally from next-generation DNA sequencing, and derive score and generalized likelihood statistics for shifts in intensity functions. We construct a modified Bayesian information criterion (mBIC) to guide model selection, and point-wise approximate Bayesian confidence intervals for assessing the confidence in the segmentation. The model is applied to DNA copy number profiling with sequencing data and evaluated on simulated spike-in and real data sets. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/11-AOAS517 by the Institute of Mathematical Statistics (http://www.imstat.org).

    Pathway relevance ranking for tumor samples through network-based data integration

    The study of cancer, a highly heterogeneous disease with different causes and clinical outcomes, requires a multi-angle approach and the collection of large multi-omics datasets that, ideally, should be analyzed simultaneously. We present a new pathway relevance ranking method that prioritizes pathways according to the information contained in any combination of tumor-related omics datasets. Key to the method is the conversion of all available data into a single comprehensive network representation containing not only genes but also individual patient samples. Additionally, all data are linked through a network of previously identified molecular interactions. We demonstrate the performance of the new method by applying it to breast and ovarian cancer datasets from The Cancer Genome Atlas. By integrating gene expression, copy number, mutation and methylation data, we illustrate the method's potential to identify key pathways involved in breast cancer development that are shared by different molecular subtypes. Interestingly, certain pathways were ranked as equally important for different subtypes, even when the underlying (epi)genetic disturbances were diverse. In addition to prioritizing universally high-scoring pathways, the pathway ranking method was able to identify subtype-specific pathways. Often the score of a pathway could not be explained by a single mutation, copy number or methylation alteration, but rather by a combination of genetic and epigenetic disturbances, stressing the need for a network-based data integration approach. The analysis of ovarian tumors, as a function of survival-based subtypes, demonstrated the method's ability to correctly identify key pathways, irrespective of tumor subtype. A differential analysis of survival-based subtypes revealed several pathways with higher importance for the bad-outcome patient group than for the good-outcome patient group. Many of the pathways exhibiting higher importance for the bad-outcome patient group could be related to ovarian tumor proliferation and survival.

    Combining chromosomal arm status and significantly aberrant genomic locations reveals new cancer subtypes

    Many types of tumors exhibit chromosomal losses or gains, as well as local amplifications and deletions. Within any given tumor type, sample-specific amplifications and deletions are also observed. Typically, a region that is aberrant in more tumors, or whose copy number change is stronger, would be considered a more promising candidate for biological relevance to cancer. We sought an intuitive method to define such aberrations and prioritize them. We define V, the volume associated with an aberration, as the product of three factors: (a) the fraction of patients with the aberration, (b) the aberration's length and (c) its amplitude. Our algorithm compares the values of V derived from real data to a null distribution obtained by permutations, and yields the statistical significance (p value) of the measured value of V. We detected genetic locations that were significantly aberrant and combined them with chromosomal arm status to create a succinct fingerprint of the tumor genome. This genomic fingerprint is used to visualize the tumors, highlighting events that are co-occurring or mutually exclusive. We apply the method to three different public array CGH datasets of Medulloblastoma and Neuroblastoma, and demonstrate its ability to detect chromosomal regions that were known to be altered in the tested cancer types, as well as to suggest new genomic locations to be tested. We identified a potential new subtype of Medulloblastoma, which is analogous to Neuroblastoma type 1. Comment: 34 pages, 3 figures; to appear in Cancer Informatics
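The volume statistic and its permutation test can be illustrated with a minimal sketch. The abstract does not specify how a patient is called aberrant or how the permutations are drawn, so the following are assumptions made only for illustration: a sample counts as aberrant when its mean log-ratio over the region exceeds a hypothetical `threshold`, amplitude is the mean of those region means, and the null distribution comes from shuffling probe positions independently within each sample. The function names `volume` and `permutation_p_value` are hypothetical.

```python
import random
from statistics import mean


def volume(profiles, start, end, threshold=0.3):
    """Return V = fraction of aberrant samples * region length * mean amplitude.

    profiles: list of per-sample lists of log-ratios, one value per probe.
    start, end: half-open probe index range [start, end) of the region.
    threshold: assumed cut-off for calling a sample aberrant (illustrative).
    """
    region_means = [mean(p[start:end]) for p in profiles]
    aberrant = [m for m in region_means if m > threshold]
    if not aberrant:
        return 0.0
    fraction = len(aberrant) / len(profiles)
    length = end - start
    amplitude = mean(aberrant)
    return fraction * length * amplitude


def permutation_p_value(profiles, start, end, n_perm=1000, threshold=0.3, seed=0):
    """Compare the observed V with a null built by shuffling probes per sample.

    The shuffling scheme is an assumption, not the scheme of the paper.
    """
    rng = random.Random(seed)
    observed = volume(profiles, start, end, threshold)
    exceed = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(p, len(p)) for p in profiles]
        if volume(shuffled, start, end, threshold) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data: three of four samples carry a gain over probes 2..4.
gain = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
flat = [0.0] * 10
toy_profiles = [gain, list(gain), list(gain), flat]
observed_v, p_value = permutation_p_value(toy_profiles, 2, 5)
```

In this toy case the region scores well on all three factors at once, which is the situation the volume is designed to reward; a region that is long but rare, or common but weak, would receive a smaller V.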

    Pooled DNA sequencing to identify SNPs associated with a major QTL for bacterial wilt resistance in Italian ryegrass (Lolium multiflorum Lam.)

    Italian ryegrass (Lolium multiflorum Lam.) is one of the most important forage grass species in temperate regions. Its yield, quality and persistency can be significantly reduced by bacterial wilt, a serious disease caused by Xanthomonas translucens pv. graminis. Although a major QTL for bacterial wilt resistance has previously been reported, detailed knowledge of the underlying genes and of DNA markers that would allow efficient resistance breeding strategies is currently not available. We used pooled DNA sequencing to characterize a major QTL for bacterial wilt resistance of Italian ryegrass and to develop inexpensive sequence-based markers to efficiently target resistance alleles for marker-assisted recurrent selection. From the mapping population segregating for the QTL, DNA of 44 of the most resistant and 44 of the most susceptible F1 individuals was pooled and sequenced using the Illumina HiSeq 2000 platform. Allele frequencies of 18 × 10^6 single nucleotide polymorphisms (SNPs) were determined in the resistant and susceptible pools. A total of 271 SNPs on 140 scaffold sequences of the reference parental genome showed significantly different allele frequencies between the two pools. We converted 44 selected SNPs to KASP™ markers, genetically mapped these proximal to the major QTL and thus validated their association with bacterial wilt resistance. This study highlights the power of pooled DNA sequencing to efficiently target binary traits in biparental mapping populations. It delivers genome sequence data, SNP markers and potential candidate genes, which will allow the implementation of marker-assisted strategies to fix bacterial wilt resistance in outcrossing breeding populations of Italian ryegrass.
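The comparison of allele frequencies between the two pools can be sketched for a single SNP. The abstract does not name the statistical test that was used, so the two-proportion z-test below is an assumed stand-in, and the read counts in the usage lines are invented for illustration. The function name `two_proportion_z_test` is hypothetical.

```python
import math


def two_proportion_z_test(alt_res, total_res, alt_sus, total_sus):
    """Two-sided z-test for a difference in alternate-allele frequency.

    alt_res, total_res: alternate-allele and total read counts, resistant pool.
    alt_sus, total_sus: the same counts for the susceptible pool.
    Returns (z, p_value). This test is an illustrative assumption,
    not the method reported in the study.
    """
    freq_res = alt_res / total_res
    freq_sus = alt_sus / total_sus
    pooled = (alt_res + alt_sus) / (total_res + total_sus)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_res + 1 / total_sus))
    if se == 0:
        # Both pools are fixed for the same allele: no evidence of a difference.
        return 0.0, 1.0
    z = (freq_res - freq_sus) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented read counts: a SNP with identical and with contrasting frequencies.
z_same, p_same = two_proportion_z_test(50, 100, 50, 100)
z_diff, p_diff = two_proportion_z_test(90, 100, 10, 100)
```

With millions of SNPs tested, any such per-SNP p value would need a multiple-testing correction before being called significant; the sketch leaves that step out.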