424 research outputs found
MuSiC: Identifying mutational significance in cancer genomes
Massively parallel sequencing technology and the associated rapidly decreasing sequencing costs have enabled systemic analyses of somatic mutations in large cohorts of cancer cases. Here we introduce a comprehensive mutational analysis pipeline that uses standardized sequence-based inputs along with multiple types of clinical data to establish correlations among mutation sites, affected genes and pathways, and to ultimately separate the commonly abundant passenger mutations from the truly significant events. In other words, we aim to determine the Mutational Significance in Cancer (MuSiC) for these large data sets. The integration of analytical operations in the MuSiC framework is widely applicable to a broad set of tumor types and offers the benefits of automation as well as standardization. Herein, we describe the computational structure and statistical underpinnings of the MuSiC pipeline and demonstrate its performance using 316 ovarian cancer samples from the TCGA ovarian cancer project. MuSiC correctly confirms many expected results, and identifies several potentially novel avenues for discovery
Final report on project SP1210: Lowland peatland systems in England and Wales – evaluating greenhouse gas fluxes and carbon balances
Lowland peatlands represent one of the most carbon-rich ecosystems in the UK. As a result of widespread habitat modification and drainage to support agriculture and peat extraction, they have been converted from natural carbon sinks into major carbon sources, and are now amongst the largest sources of greenhouse gas (GHG) emissions from the UK land-use sector. Despite this, they have previously received relatively little policy attention, and measures to reduce GHG emissions either through re-wetting and restoration or improved management of agricultural land remain at a relatively early stage. In part, this has stemmed from a lack of reliable measurements on the carbon and GHG balance of UK lowland peatlands. This project aimed to address this evidence gap via an unprecedented programme of consistent, multi-year field measurements at a total of 15 lowland peatland sites in England and Wales, ranging from conservation-managed ‘near-natural’ ecosystems to intensively managed agricultural and extraction sites. The use of standardised measurement and data analysis protocols allowed the magnitude of GHG emissions and removals by peatlands to be quantified across this heterogeneous data set, and for controlling factors to be identified. The network of seven flux towers established during the project is believed to be unique on peatlands globally, and has provided new insights into the processes that control GHG fluxes in lowland peatlands. The work undertaken is intended to support the future development and implementation of agricultural management and restoration measures aimed at reducing the contribution of these important ecosystems to UK GHG emissions
A novel retinoblastoma therapy from genomic and epigenetic analyses.
Retinoblastoma is an aggressive childhood cancer of the developing retina that is initiated by the biallelic loss of RB1. Tumours progress very quickly following RB1 inactivation but the underlying mechanism is not known. Here we show that the retinoblastoma genome is stable, but that multiple cancer pathways can be epigenetically deregulated. To identify the mutations that cooperate with RB1 loss, we performed whole-genome sequencing of retinoblastomas. The overall mutational rate was very low; RB1 was the only known cancer gene mutated. We then evaluated the role of RB1 in genome stability and considered non-genetic mechanisms of cancer pathway deregulation. For example, the proto-oncogene SYK is upregulated in retinoblastoma and is required for tumour cell survival. Targeting SYK with a small-molecule inhibitor induced retinoblastoma tumour cell death in vitro and in vivo. Thus, retinoblastomas may develop quickly as a result of the epigenetic deregulation of key cancer pathways as a direct or indirect result of RB1 loss
A vertebrate case study of the quality of assemblies derived from next-generation sequences
The unparalleled efficiency of next-generation sequencing (NGS) has prompted widespread adoption, but significant problems remain in the use of NGS data for whole genome assembly. We explore the advantages and disadvantages of chicken genome assemblies generated using a variety of sequencing and assembly methodologies. NGS assemblies are equivalent in some ways to a Sanger-based assembly yet deficient in others. Nonetheless, these assemblies are sufficient for the identification of the majority of genes and can reveal novel sequences when compared to existing assembly references
A high-resolution map of human evolutionary constraint using 29 mammals.
The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease
Clonal architecture of secondary acute myeloid leukemia
BACKGROUND: The myelodysplastic syndromes are a group of hematologic disorders that often evolve into secondary acute myeloid leukemia (AML). The genetic changes that underlie progression from the myelodysplastic syndromes to secondary AML are not well understood. METHODS: We performed whole-genome sequencing of seven paired samples of skin and bone marrow in seven subjects with secondary AML to identify somatic mutations specific to secondary AML. We then genotyped a bone marrow sample obtained during the antecedent myelodysplastic-syndrome stage from each subject to determine the presence or absence of the specific somatic mutations. We identified recurrent mutations in coding genes and defined the clonal architecture of each pair of samples from the myelodysplastic-syndrome stage and the secondary-AML stage, using the allele burden of hundreds of mutations. RESULTS: Approximately 85% of bone marrow cells were clonal in the myelodysplastic-syndrome and secondary-AML samples, regardless of the myeloblast count. The secondary-AML samples contained mutations in 11 recurrently mutated genes, including 4 genes that have not been previously implicated in the myelodysplastic syndromes or AML. In every case, progression to acute leukemia was defined by the persistence of an antecedent founding clone containing 182 to 660 somatic mutations and the outgrowth or emergence of at least one subclone, harboring dozens to hundreds of new mutations. All founding clones and subclones contained at least one mutation in a coding gene. CONCLUSIONS: Nearly all the bone marrow cells in patients with myelodysplastic syndromes and secondary AML are clonally derived. Genetic evolution of secondary AML is a dynamic process shaped by multiple cycles of mutation acquisition and clonal selection. Recurrent gene mutations are found in both founding clones and daughter subclones. (Funded by the National Institutes of Health and others.)
Water-level dynamics in natural and artificial pools in blanket peatlands
Perennial pools are common natural features of peatlands and their hydrological functioning and turnover may be important for carbon fluxes, aquatic ecology and downstream water quality. Peatland restoration methods such as ditch blocking result in many new pools. However, little is known about the hydrological function of either pool type. We monitored six natural and six artificial pools on a Scottish blanket peatland. Pool water levels were more variable in all seasons in artificial pools, which had greater water-level increases and faster recession responses to storms than natural pools. Pools overflowed by a median of 9 and 54 times pool volume per year for natural and artificial pools respectively, but this varied widely because some large pools had small upslope catchments and vice versa. Mean peat water-table depths were similar between natural and artificial pool sites but much more variable over time at the artificial pool site, possibly due to a lower bulk specific yield across this site. Pool levels and pool-level fluctuations were not the same as those of local water tables in the adjacent peat. Pool-level time series were much smoother, with more damped rainfall or recession responses than those for peat water tables. There were strong hydraulic gradients between the peat and pools, with absolute water tables often being 20-30 cm higher or lower than water levels in pools only 1-4 m away. However, as peat hydraulic conductivity was very low (median of 1.5×10⁻⁵ and 1.4×10⁻⁶ cm s⁻¹ at 30 and 50 cm depths at the natural pool site) there was little deep subsurface flow interaction.
We conclude that: 1) for peat restoration projects, a larger total pool surface area is likely to result in smaller flood peaks downstream, at least during summer months, because peatland bulk specific yield will be greater; and 2) surface and near-surface connectivity during storm events and topographic context, rather than pool size alone, must be taken into account in future peatland pool and stream chemistry studies
Design and implementation of a generalized laboratory data model
BACKGROUND: Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. Designing and implementing such systems is non-trivial and it appears that in many cases a database project ultimately proves unserviceable. RESULTS: We describe a general modeling framework for laboratory data and its implementation as an information management system. The model utilizes several abstraction techniques, focusing especially on the concepts of inheritance and meta-data. Traditional approaches commingle event-oriented data with regular entity data in ad hoc ways. Instead, we define distinct regular entity and event schemas, but fully integrate these via a standardized interface. The design allows straightforward definition of a "processing pipeline" as a sequence of events, obviating the need for separate workflow management systems. A layer above the event-oriented schema integrates events into a workflow by defining "processing directives", which act as automated project managers of items in the system. Directives can be added or modified in an almost trivial fashion, i.e., without the need for schema modification or re-certification of applications. Association between regular entities and events is managed via simple "many-to-many" relationships. We describe the programming interface, as well as techniques for handling input/output, process control, and state transitions. CONCLUSION: The implementation described here has served as the Washington University Genome Sequencing Center's primary information system for several years. It handles all transactions underlying a throughput rate of about 9 million sequencing reactions of various kinds per month and has handily weathered a number of major pipeline reconfigurations. The basic data model can be readily adapted to other high-volume processing environments
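The idea of keeping entity and event schemas separate, joined only by a many-to-many link, can be illustrated with a minimal sketch. This is not the authors' implementation: the class names (Entity, Event, LabStore), their fields, and the three pipeline steps are hypothetical, chosen only to show how a pipeline expressed as an ordered list of steps can drive "what happens next" without any change to the stored schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Entity:
    """A regular entity such as a sample or plate (hypothetical fields)."""
    entity_id: str
    kind: str


@dataclass(frozen=True)
class Event:
    """One processing event, i.e. a pipeline step applied to some entities."""
    event_id: str
    step: str


@dataclass
class LabStore:
    """Entities and events live in separate tables; links joins them."""
    entities: Dict[str, Entity] = field(default_factory=dict)
    events: Dict[str, Event] = field(default_factory=dict)
    # Many-to-many association stored as (entity_id, event_id) pairs.
    links: Set[Tuple[str, str]] = field(default_factory=set)

    def record(self, event: Event, touched: List[Entity]) -> None:
        """Store an event and link it to every entity it touched."""
        self.events[event.event_id] = event
        for entity in touched:
            self.entities[entity.entity_id] = entity
            self.links.add((entity.entity_id, event.event_id))

    def history(self, entity_id: str) -> List[str]:
        """Steps applied to one entity, ordered by event id."""
        event_ids = sorted(ev for (en, ev) in self.links if en == entity_id)
        return [self.events[ev].step for ev in event_ids]

    def next_step(self, entity_id: str, pipeline: List[str]) -> Optional[str]:
        """First pipeline step not yet recorded for this entity, if any."""
        done = set(self.history(entity_id))
        for step in pipeline:
            if step not in done:
                return step
        return None


# A pipeline is plain data, so reordering or extending it needs no schema change.
PIPELINE = ["pick", "amplify", "sequence"]

store = LabStore()
s1, s2 = Entity("s1", "sample"), Entity("s2", "sample")
store.record(Event("e1", "pick"), [s1, s2])   # one event, two entities
store.record(Event("e2", "amplify"), [s1])    # one entity, a second event
print(store.history("s1"))
print(store.next_step("s2", PIPELINE))
```

In this sketch the pipeline list plays a role loosely analogous to a processing directive: it is data layered above the event records, so it can be edited without touching the entity or event definitions.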
Genome modeling system: A knowledge management platform for genomics
In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms
The Origin and Evolution of Mutations in Acute Myeloid Leukemia
Most mutations in cancer genomes are thought to be acquired after the initiating event, which may cause genomic instability and drive clonal evolution. However, for acute myeloid leukemia (AML), normal karyotypes are common, and genomic instability is unusual. To better understand clonal evolution in AML, we sequenced the genomes of M3-AML samples with a known initiating event (PML-RARA) versus the genomes of normal karyotype M1-AML samples and the exomes of hematopoietic stem/progenitor cells (HSPCs) from healthy people. Collectively, the data suggest that most of the mutations found in AML genomes are actually random events that occurred in HSPCs before they acquired the initiating mutation; the mutational history of that cell is “captured” as the clone expands. In many cases, only one or two additional, cooperating mutations are needed to generate the malignant founding clone. Cells from the founding clone can acquire additional cooperating mutations, yielding subclones that can contribute to disease progression and/or relapse