41 research outputs found

    Exome sequencing from nanogram amounts of starting DNA: comparing three approaches

    Hybridization-based target enrichment protocols require relatively large starting amounts of genomic DNA, which are not always available. Here, we tested three approaches to pre-capture library preparation starting from 10 ng of genomic DNA: (i and ii) whole-genome amplification of DNA samples with REPLI-g (Qiagen) and GenomePlex (Sigma) kits followed by standard library preparation, and (iii) library construction with a low-input-oriented ThruPLEX kit (Rubicon Genomics). Exome capture with Agilent SureSelectXT2 Human AllExon v4+UTRs capture probes, and HiSeq2000 sequencing were performed for test libraries along with the control library prepared from 1 µg of starting DNA. Tested protocols were characterized in terms of mapping efficiency, enrichment ratio, coverage of the target region, and reliability of SNP genotyping. REPLI-g- and ThruPLEX-FD-based protocols seem to be adequate solutions for exome sequencing of low input samples

    A simple strand-specific RNA-Seq library preparation protocol combining the Illumina TruSeq RNA and the dUTP methods

    Preserving the original RNA orientation information in RNA-Sequencing (RNA-Seq) experiments is essential to the analysis and understanding of the complexity of mammalian transcriptomes. We describe herein a simple, robust, and time-effective protocol for generating strand-specific RNA-seq libraries suited for the Illumina sequencing platform. We modified the Illumina TruSeq RNA sample preparation by implementing the strand specificity feature using the dUTP method. This protocol uses low amounts of starting material and allows fast processing within two days. It can be easily implemented and requires only a few additional reagents to the original Illumina kit

    Use of high throughput sequencing to observe genome dynamics at a single cell level

    With the development of high throughput sequencing technology, it becomes possible to directly analyze mutation distribution in a genome-wide fashion, dissociating mutation rate measurements from the traditional underlying assumptions. Here, we sequenced several genomes of Escherichia coli from colonies obtained after chemical mutagenesis and observed a strikingly nonrandom distribution of the induced mutations. These include long stretches of exclusively G to A or C to T transitions along the genome and orders of magnitude intra- and inter-genomic differences in mutation density. Whereas most of these observations can be explained by the known features of enzymatic processes, the others could reflect stochasticity in the molecular processes at the single-cell level. Our results demonstrate how analysis of the molecular records left in the genomes of the descendants of an individual mutagenized cell allows for genome-scale observations of fixation and segregation of mutations, as well as recombination events, in the single genome of their progenitor. Comment: 22 pages, 9 figures (including 5 supplementary), one table

    Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences

    We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations

    Active medulloblastoma enhancers reveal subgroup-specific cellular origins

    Medulloblastoma is a highly malignant paediatric brain tumour, often inflicting devastating consequences on the developing child. Genomic studies have revealed four distinct molecular subgroups with divergent biology and clinical behaviour. An understanding of the regulatory circuitry governing the transcriptional landscapes of medulloblastoma subgroups, and how this relates to their respective developmental origins, is lacking. Here, using H3K27ac and BRD4 chromatin immunoprecipitation followed by sequencing (ChIP-seq) coupled with tissue-matched DNA methylation and transcriptome data, we describe the active cis-regulatory landscape across 28 primary medulloblastoma specimens. Analysis of differentially regulated enhancers and super-enhancers reinforced inter-subgroup heterogeneity and revealed novel, clinically relevant insights into medulloblastoma biology. Computational reconstruction of core regulatory circuitry identified a master set of transcription factors, validated by ChIP-seq, that is responsible for subgroup divergence, and implicates candidate cells of origin for Group 4. Our integrated analysis of enhancer elements in a large series of primary tumour samples reveals insights into cis-regulatory architecture, unrecognized dependencies, and cellular origins

    Integrative genomic analyses reveal an androgen-driven somatic alteration landscape in early-onset prostate cancer

    Early-onset prostate cancer (EO-PCA) represents the earliest clinical manifestation of prostate cancer. To compare the genomic alteration landscapes of EO-PCA with "classical" (elderly-onset) PCA, we performed deep sequencing-based genomics analyses in 11 tumors diagnosed at young age, and pursued comparative assessments with seven elderly-onset PCA genomes. Remarkable age-related differences in structural rearrangement (SR) formation became evident, suggesting distinct disease pathomechanisms. Whereas EO-PCAs harbored a prevalence of balanced SRs, with a specific abundance of androgen-regulated ETS gene fusions including TMPRSS2:ERG, elderly-onset PCAs displayed primarily non-androgen-associated SRs. Data from a validation cohort of > 10,000 patients showed age-dependent androgen receptor levels and a prevalence of SRs affecting androgen-regulated genes, further substantiating the activity of a characteristic "androgen-type" pathomechanism in EO-PCA

    Genomics and drug profiling of fatal TCF3-HLF-positive acute lymphoblastic leukemia identifies recurrent mutation patterns and therapeutic options.

    TCF3-HLF-positive acute lymphoblastic leukemia (ALL) is currently incurable. Using an integrated approach, we uncovered distinct mutation, gene expression and drug response profiles in TCF3-HLF-positive and treatment-responsive TCF3-PBX1-positive ALL. We identified recurrent intragenic deletions of PAX5 or VPREB1 in constellation with the fusion of TCF3 and HLF. Moreover somatic mutations in the non-translocated allele of TCF3 and a reduction of PAX5 gene dosage in TCF3-HLF ALL suggest cooperation within a restricted genetic context. The enrichment for stem cell and myeloid features in the TCF3-HLF signature may reflect reprogramming by TCF3-HLF of a lymphoid-committed cell of origin toward a hybrid, drug-resistant hematopoietic state. Drug response profiling of matched patient-derived xenografts revealed a distinct profile for TCF3-HLF ALL with resistance to conventional chemotherapeutics but sensitivity to glucocorticoids, anthracyclines and agents in clinical development. Striking on-target sensitivity was achieved with the BCL2-specific inhibitor venetoclax (ABT-199). This integrated approach thus provides alternative treatment options for this deadly disease

    Genetic Drivers of Epigenetic and Transcriptional Variation in Human Immune Cells

    Characterizing the multifaceted contribution of genetic and epigenetic factors to disease phenotypes is a major challenge in human genetics and medicine. We carried out high-resolution genetic, epigenetic, and transcriptomic profiling in three major human immune cell types (CD14+ monocytes, CD16+ neutrophils, and naive CD4+ T cells) from up to 197 individuals. We assess, quantitatively, the relative contribution of cis-genetic and epigenetic factors to transcription and evaluate their impact as potential sources of confounding in epigenome-wide association studies. Further, we characterize highly coordinated genetic effects on gene expression, methylation, and histone variation through quantitative trait locus (QTL) mapping and allele-specific (AS) analyses. Finally, we demonstrate colocalization of molecular trait QTLs at 345 unique immune disease loci. This expansive, high-resolution atlas of multi-omics changes yields insights into cell-type-specific correlation between diverse genomic inputs, more generalizable correlations between these inputs, and defines molecular events that may underpin complex disease risk. This work was predominantly funded by the EU FP7 High Impact Project BLUEPRINT (HEALTH-F5-2011-282510) and the Canadian Institutes of Health Research (CIHR EP1-120608). The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement no 282510 (BLUEPRINT), the European Molecular Biology Laboratory, the Max Planck society, the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Severo Ochoa 2013-2017’, SEV-2012-0208 and Spanish National Bioinformatics Institute (INB-ISCIII) PT13/0001/0021 co-funded by FEDER “Una Manera de hacer Europa”. D.G. is supported by a “la Caixa”-Severo Ochoa pre-doctoral fellowship, M.F. was supported by the BHF Cambridge Centre of Excellence [RE/13/6/30180], K.D. is funded as a HSST trainee by NHS Health Education England, S.E. is supported by a fellowship from La Caixa, V.P. is supported by a FEBS long-term fellowship and N.S.'s research is supported by the Wellcome Trust (Grant Codes WT098051 and WT091310), the EU FP7 (EPIGENESYS Grant Code 257082 and BLUEPRINT Grant Code HEALTH-F5-2011-282510) and the NIHR BRC. The Blood and Transplant Unit (BTRU) in Donor Health and Genomics is part of and funded by the National Institute for Health Research (NIHR) and is a partnership between the University of Cambridge and NHS Blood and Transplant (NHSBT) in collaboration with the University of Oxford and the Wellcome Trust Sanger Institute. The T-cell data was produced by the McGill Epigenomics Mapping Centre (EMC McGill). It is funded under the Canadian Epigenetics, Environment, and Health Research Consortium (CEEHRC) by the Canadian Institutes of Health Research and by Genome Quebec (CIHR EP1-120608), with additional support from Genome Canada and FRSQ. T.P. holds a Canada Research Chair

    Symposium on the Scottish labour market

    In the post-war period, up to the late 1960s, Britain enjoyed a modicum of unemployment, and government policies which were geared to producing Full Employment were considered a success. It was simple - boost demand and more people would find work. But by the mid-1970s the economic regency enjoyed by those advocating demand-side policies fell into disrepute as the OPEC nations raised prices dramatically and brought in a new era of both rising prices and unemployment. The laws of economics, which previously had viewed policy decisions as the choice between lower unemployment and higher inflation, were now redundant. Both unemployment and inflation were moving in the same direction. The era of stagflation had begun

    Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel

    A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants. © 2014 Macmillan Publishers Limited. All rights reserved