424 research outputs found

    MuSiC: Identifying mutational significance in cancer genomes

    Massively parallel sequencing technology and the associated rapidly decreasing sequencing costs have enabled systemic analyses of somatic mutations in large cohorts of cancer cases. Here we introduce a comprehensive mutational analysis pipeline that uses standardized sequence-based inputs along with multiple types of clinical data to establish correlations among mutation sites, affected genes and pathways, and to ultimately separate the commonly abundant passenger mutations from the truly significant events. In other words, we aim to determine the Mutational Significance in Cancer (MuSiC) for these large data sets. The integration of analytical operations in the MuSiC framework is widely applicable to a broad set of tumor types and offers the benefits of automation as well as standardization. Herein, we describe the computational structure and statistical underpinnings of the MuSiC pipeline and demonstrate its performance using 316 ovarian cancer samples from the TCGA ovarian cancer project. MuSiC correctly confirms many expected results, and identifies several potentially novel avenues for discovery.

    Final report on project SP1210: Lowland peatland systems in England and Wales – evaluating greenhouse gas fluxes and carbon balances

    Lowland peatlands represent one of the most carbon-rich ecosystems in the UK. As a result of widespread habitat modification and drainage to support agriculture and peat extraction, they have been converted from natural carbon sinks into major carbon sources, and are now amongst the largest sources of greenhouse gas (GHG) emissions from the UK land-use sector. Despite this, they have previously received relatively little policy attention, and measures to reduce GHG emissions either through re-wetting and restoration or improved management of agricultural land remain at a relatively early stage. In part, this has stemmed from a lack of reliable measurements on the carbon and GHG balance of UK lowland peatlands. This project aimed to address this evidence gap via an unprecedented programme of consistent, multi-year field measurements at a total of 15 lowland peatland sites in England and Wales, ranging from conservation-managed ‘near-natural’ ecosystems to intensively managed agricultural and extraction sites. The use of standardised measurement and data analysis protocols allowed the magnitude of GHG emissions and removals by peatlands to be quantified across this heterogeneous data set, and for controlling factors to be identified. The network of seven flux towers established during the project is believed to be unique on peatlands globally, and has provided new insights into the processes that control GHG fluxes in lowland peatlands. The work undertaken is intended to support the future development and implementation of agricultural management and restoration measures aimed at reducing the contribution of these important ecosystems to UK GHG emissions.

    A vertebrate case study of the quality of assemblies derived from next-generation sequences

    The unparalleled efficiency of next-generation sequencing (NGS) has prompted widespread adoption, but significant problems remain in the use of NGS data for whole genome assembly. We explore the advantages and disadvantages of chicken genome assemblies generated using a variety of sequencing and assembly methodologies. NGS assemblies are equivalent in some ways to a Sanger-based assembly yet deficient in others. Nonetheless, these assemblies are sufficient for the identification of the majority of genes and can reveal novel sequences when compared to existing assembly references.

    Clonal architecture of secondary acute myeloid leukemia

    BACKGROUND: The myelodysplastic syndromes are a group of hematologic disorders that often evolve into secondary acute myeloid leukemia (AML). The genetic changes that underlie progression from the myelodysplastic syndromes to secondary AML are not well understood. METHODS: We performed whole-genome sequencing of seven paired samples of skin and bone marrow in seven subjects with secondary AML to identify somatic mutations specific to secondary AML. We then genotyped a bone marrow sample obtained during the antecedent myelodysplastic-syndrome stage from each subject to determine the presence or absence of the specific somatic mutations. We identified recurrent mutations in coding genes and defined the clonal architecture of each pair of samples from the myelodysplastic-syndrome stage and the secondary-AML stage, using the allele burden of hundreds of mutations. RESULTS: Approximately 85% of bone marrow cells were clonal in the myelodysplastic-syndrome and secondary-AML samples, regardless of the myeloblast count. The secondary-AML samples contained mutations in 11 recurrently mutated genes, including 4 genes that have not been previously implicated in the myelodysplastic syndromes or AML. In every case, progression to acute leukemia was defined by the persistence of an antecedent founding clone containing 182 to 660 somatic mutations and the outgrowth or emergence of at least one subclone, harboring dozens to hundreds of new mutations. All founding clones and subclones contained at least one mutation in a coding gene. CONCLUSIONS: Nearly all the bone marrow cells in patients with myelodysplastic syndromes and secondary AML are clonally derived. Genetic evolution of secondary AML is a dynamic process shaped by multiple cycles of mutation acquisition and clonal selection. Recurrent gene mutations are found in both founding clones and daughter subclones. (Funded by the National Institutes of Health and others.)

    Water-level dynamics in natural and artificial pools in blanket peatlands

    Perennial pools are common natural features of peatlands and their hydrological functioning and turnover may be important for carbon fluxes, aquatic ecology and downstream water quality. Peatland restoration methods such as ditch blocking result in many new pools. However, little is known about the hydrological function of either pool type. We monitored six natural and six artificial pools on a Scottish blanket peatland. Pool water levels were more variable in all seasons in artificial pools, which showed greater water-level increases and faster recession responses to storms than natural pools. Pools overflowed by a median of 9 and 54 times pool volume per year for natural and artificial pools, respectively, but this varied widely because some large pools had small upslope catchments and vice versa. Mean peat water-table depths were similar between natural and artificial pool sites but much more variable over time at the artificial pool site, possibly due to a lower bulk specific yield across this site. Pool levels and pool-level fluctuations were not the same as those of local water tables in the adjacent peat. Pool level time-series were much smoother, with more damped rainfall or recession responses than those for peat water tables. There were strong hydraulic gradients between the peat and pools, with absolute water tables often being 20-30 cm higher or lower than water levels in pools only 1-4 m away. However, as peat hydraulic conductivity was very low (median of 1.5×10⁻⁵ and 1.4×10⁻⁶ cm s⁻¹ at 30 and 50 cm depths at the natural pool site) there was little deep subsurface flow interaction. We conclude that: 1) for peat restoration projects, a larger total pool surface area is likely to result in smaller flood peaks downstream, at least during summer months, because peatland bulk specific yield will be greater; and 2) surface and near-surface connectivity during storm events and topographic context, rather than pool size alone, must be taken into account in future peatland pool and stream chemistry studies.

    Design and implementation of a generalized laboratory data model

    BACKGROUND: Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. Designing and implementing such systems is non-trivial and it appears that in many cases a database project ultimately proves unserviceable. RESULTS: We describe a general modeling framework for laboratory data and its implementation as an information management system. The model utilizes several abstraction techniques, focusing especially on the concepts of inheritance and meta-data. Traditional approaches commingle event-oriented data with regular entity data in ad hoc ways. Instead, we define distinct regular entity and event schemas, but fully integrate these via a standardized interface. The design allows straightforward definition of a "processing pipeline" as a sequence of events, obviating the need for separate workflow management systems. A layer above the event-oriented schema integrates events into a workflow by defining "processing directives", which act as automated project managers of items in the system. Directives can be added or modified in an almost trivial fashion, i.e., without the need for schema modification or re-certification of applications. Association between regular entities and events is managed via simple "many-to-many" relationships. We describe the programming interface, as well as techniques for handling input/output, process control, and state transitions. CONCLUSION: The implementation described here has served as the Washington University Genome Sequencing Center's primary information system for several years. It handles all transactions underlying a throughput rate of about 9 million sequencing reactions of various kinds per month and has handily weathered a number of major pipeline reconfigurations. The basic data model can be readily adapted to other high-volume processing environments.
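
    The separation of entity and event schemas described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the step names, and the list-based "directive" are all hypothetical, chosen only to show entities and events linked many-to-many, with the pipeline order held as data rather than in the schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Entity:
    """A regular entity, e.g. a sample (hypothetical fields)."""
    name: str
    events: List["Event"] = field(default_factory=list)


@dataclass(eq=False)
class Event:
    """One processing step applied to one or more entities."""
    step: str
    entities: List[Entity] = field(default_factory=list)


def record_event(step: str, entities: List[Entity]) -> Event:
    """Create an event and link it to every entity (many-to-many)."""
    event = Event(step=step, entities=list(entities))
    for entity in entities:
        entity.events.append(event)
    return event


# Assumption: a "processing directive" is modelled as an ordered list of
# step names. Changing the pipeline means editing this list, not the
# Entity or Event definitions.
PIPELINE: List[str] = ["receive", "prepare", "sequence"]


def next_step(entity: Entity, pipeline: List[str]) -> Optional[str]:
    """Return the first pipeline step not yet recorded, or None if done."""
    done = {event.step for event in entity.events}
    for step in pipeline:
        if step not in done:
            return step
    return None
```

    In this sketch, reconfiguring the pipeline amounts to passing a different list to next_step; the stored entities and events are untouched, which mirrors the abstract's point that directives change without schema modification.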

    Genome modeling system: A knowledge management platform for genomics

    In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms

    The Origin and Evolution of Mutations in Acute Myeloid Leukemia

    SUMMARY: Most mutations in cancer genomes are thought to be acquired after the initiating event, which may cause genomic instability and drive clonal evolution. However, for acute myeloid leukemia (AML), normal karyotypes are common, and genomic instability is unusual. To better understand clonal evolution in AML, we sequenced the genomes of M3-AML samples with a known initiating event (PML-RARA) versus the genomes of normal karyotype M1-AML samples and the exomes of hematopoietic stem/progenitor cells (HSPCs) from healthy people. Collectively, the data suggest that most of the mutations found in AML genomes are actually random events that occurred in HSPCs before they acquired the initiating mutation; the mutational history of that cell is “captured” as the clone expands. In many cases, only one or two additional, cooperating mutations are needed to generate the malignant founding clone. Cells from the founding clone can acquire additional cooperating mutations, yielding subclones that can contribute to disease progression and/or relapse.