
    Knowledge Driven Approaches and Machine Learning Improve the Identification of Clinically Relevant Somatic Mutations in Cancer Genomics

    For cancer genomics to fully expand its utility from research discovery to clinical adoption, somatic variant detection pipelines must be optimized and standardized to ensure identification of clinically relevant mutations and to reduce laborious and error-prone post-processing steps. To address the need for improved catalogues of clinically and biologically important somatic mutations, we developed DoCM, a Database of Curated Mutations in Cancer (http://docm.info), as described in Chapter 2. DoCM is an open source, openly licensed resource that enables the cancer research community to aggregate, store and track biologically and clinically important cancer variants. DoCM currently comprises 1,364 variants in 132 genes across 122 cancer subtypes, based on the curation of 876 publications. To demonstrate the utility of this resource, the mutations in DoCM were used to identify variants of established significance in cancer that were missed by standard variant discovery pipelines (Chapter 3). Sequencing data from 1,833 cases across four TCGA projects were reanalyzed, and 1,228 putative variants that were missed in the original TCGA reports were identified. Validation sequencing data were produced from 93 of these cases to confirm the putative variants we detected with DoCM. Here, we demonstrated that at least one functionally important variant in DoCM was recovered in 41% of cases studied. A major bottleneck in the DoCM analysis in Chapter 3 was the filtering and manual review of somatic variants. Several steps in this post-processing phase of somatic variant calling have already been automated. However, false positive filtering and manual review of variant candidates remain a major challenge, especially in high-throughput discovery projects or in clinical cancer diagnostics. Chapter 4 outlines an approach that systematized and standardized the post-processing of somatic variant calls using machine learning algorithms trained on 41,000 manually reviewed variants from 20 cancer genome projects. The approach accurately reproduced the manual review process on held-out test samples, and accurately predicted which variants would be confirmed by orthogonal validation sequencing data. When compared to traditional manual review, this approach increased identification of clinically actionable variants by 6.2%. These chapters outline studies that result in substantial improvements in the identification and interpretation of somatic variants, the use of which can standardize and streamline cancer genomics, enabling its use at high throughput as well as clinically.

    A mitochondrial-focused genetic interaction map reveals a scaffold-like complex required for inner membrane organization in mitochondria.

    To broadly explore mitochondrial structure and function as well as the communication of mitochondria with other cellular pathways, we constructed a quantitative, high-density genetic interaction map (the MITO-MAP) in Saccharomyces cerevisiae. The MITO-MAP provides a comprehensive view of mitochondrial function, including insights into the activity of uncharacterized mitochondrial proteins and the functional connection between mitochondria and the ER. The MITO-MAP also reveals a large inner membrane-associated complex, which we term MitOS for mitochondrial organizing structure, comprised of Fcj1/Mitofilin, a conserved inner membrane protein, and five additional components. MitOS physically and functionally interacts with both outer and inner membrane components and localizes to extended structures that wrap around the inner membrane. We show that MitOS acts in concert with ATP synthase dimers to organize the inner membrane and promote normal mitochondrial morphology. We propose that MitOS acts as a conserved mitochondrial skeletal structure that differentiates regions of the inner membrane to establish the normal internal architecture of mitochondria.

    The immunopeptidome landscape associated with T cell infiltration, inflammation and immune editing in lung cancer.

    One key barrier to improving efficacy of personalized cancer immunotherapies that are dependent on the tumor antigenic landscape remains patient stratification. Although patients with CD3+CD8+ T cell-inflamed tumors typically show better response to immune checkpoint inhibitors, it is still unknown whether the immunopeptidome repertoire presented in highly inflamed and noninflamed tumors is substantially different. We surveyed 61 tumor regions and adjacent nonmalignant lung tissues from 8 patients with lung cancer and performed deep antigen discovery combining immunopeptidomics, genomics, bulk and spatial transcriptomics, and explored the heterogeneous expression and presentation of tumor (neo)antigens. In the present study, we associated diverse immune cell populations with the immunopeptidome and found a relatively higher frequency of predicted neoantigens located within HLA-I presentation hotspots in CD3+CD8+ T cell-excluded tumors. We associated such neoantigens with immune recognition, supporting their involvement in immune editing. This could have implications for the choice of combination therapies tailored to the patient's mutanome and immune microenvironment.

    Genome mapping of a LYST mutation in corn snakes indicates that vertebrate chromatophore vesicles are lysosome-related organelles.

    Reptiles exhibit a spectacular diversity of skin colors and patterns brought about by the interactions among three chromatophore types: black melanophores with melanin-packed melanosomes, red and yellow xanthophores with pteridine- and/or carotenoid-containing vesicles, and iridophores filled with light-reflecting platelets generating structural colors. Whereas the melanosome, the only color-producing endosome in mammals and birds, has been documented as a lysosome-related organelle, the maturation paths of xanthosomes and iridosomes are unknown. Here, we first use 10x Genomics linked-reads and optical mapping to assemble and annotate a nearly chromosome-quality genome of the corn snake Pantherophis guttatus. The assembly is 1.71 Gb long, with an N50 of 16.8 Mb and L50 of 24. Second, we perform mapping-by-sequencing analyses and identify a 3.9-Mb genomic interval where the lavender variant resides. The lavender color morph in corn snakes is characterized by gray, rather than red, blotches on a pink, instead of orange, background. Third, our sequencing analyses reveal a single nucleotide polymorphism introducing a premature stop codon in the lysosomal trafficking regulator gene (LYST) that shortens the corresponding protein by 603 amino acids and removes evolutionarily conserved domains. Fourth, we use light and transmission electron microscopy comparative analyses of wild type versus lavender corn snakes and show that the color-producing endosomes of all chromatophores are substantially affected in the LYST mutant. Our work provides evidence characterizing xanthosomes in xanthophores and iridosomes in iridophores as lysosome-related organelles.
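
    The N50 and L50 figures quoted in this abstract are standard assembly contiguity statistics. As a minimal sketch (the function name `n50_l50` and the toy lengths are illustrative assumptions, not taken from the paper), they can be computed from a list of scaffold lengths as follows: N50 is the length of the scaffold at which the running total, counted from the longest scaffold down, first reaches half the assembly size, and L50 is the number of scaffolds needed to get there.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    Illustrative helper, not the authors' code.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50; 'count' sequences were needed (L50)
            return length, count


# Toy example: total length 30, half is 15; 10 + 8 = 18 reaches it.
print(n50_l50([10, 8, 6, 4, 2]))  # (8, 2)
```

    Read this way, an L50 of 24 means that the 24 longest scaffolds together cover at least half of the 1.71 Gb assembly, and the shortest of those 24 is 16.8 Mb long.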

    Einkorn genomics sheds light on history of the oldest domesticated wheat

    Einkorn (Triticum monococcum) was the first domesticated wheat species, and was central to the birth of agriculture and the Neolithic Revolution in the Fertile Crescent around 10,000 years ago [1,2]. Here we generate and analyse 5.2-Gb genome assemblies for wild and domesticated einkorn, including completely assembled centromeres. Einkorn centromeres are highly dynamic, showing evidence of ancient and recent centromere shifts caused by structural rearrangements. Whole-genome sequencing analysis of a diversity panel uncovered the population structure and evolutionary history of einkorn, revealing complex patterns of hybridizations and introgressions after the dispersal of domesticated einkorn from the Fertile Crescent. We also show that around 1% of the modern bread wheat (Triticum aestivum) A subgenome originates from einkorn. These resources and findings highlight the history of einkorn evolution and provide a basis to accelerate the genomics-assisted improvement of einkorn and bread wheat.

    Automated, Systematic and Parallel Approaches to Software Testing in Bioinformatics

    Software quality assurance becomes especially critical if bioinformatics tools are to be used in a translational medical setting, such as analysis and interpretation of biological data. We must ensure that only validated algorithms are used, and that they are implemented correctly in the analysis pipeline – and not disrupted by hardware or software failure. In this thesis, I review common quality assurance practices and guidelines for bioinformatics software testing. Furthermore, I present a novel cloud-based framework to enable automated testing of genetic sequence alignment programs. This framework performs testing based on gold standard simulation data sets and on metamorphic testing. I demonstrate the effectiveness of this cloud-based framework using two widely used sequence alignment programs, BWA and Bowtie, and some fault-seeded ‘mutant’ versions of BWA and Bowtie. This preliminary study demonstrates that this type of cloud-based software testing framework is an effective and promising way to implement quality assurance in bioinformatics software that is used in genomic medicine.
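
    Metamorphic testing checks relations between the outputs of related inputs rather than comparing against a known correct answer, which is useful when no gold standard exists. The sketch below illustrates the idea only: `toy_align` is a hypothetical exact-match stand-in, not BWA or Bowtie, and the two relations shown (read order does not affect per-read results; reverse-complementing a read keeps its position and flips its strand) are assumed examples, since the abstract does not say which relations the thesis uses.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement a DNA string over the alphabet ACGT."""
    return seq.translate(COMPLEMENT)[::-1]


def toy_align(reference, read):
    """Hypothetical stand-in aligner: exact match on either strand.

    Returns (position, strand), or None if the read does not map.
    """
    pos = reference.find(read)
    if pos != -1:
        return (pos, "+")
    pos = reference.find(reverse_complement(read))
    if pos != -1:
        return (pos, "-")
    return None


def align_all(reference, reads):
    """Align every read and collect results keyed by read sequence."""
    return {read: toy_align(reference, read) for read in reads}


def check_order_invariance(reference, reads, seed=0):
    """Relation 1: shuffling the input must not change per-read results."""
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    return align_all(reference, reads) == align_all(reference, shuffled)


def check_strand_flip(reference, read):
    """Relation 2: a reverse-complemented read maps to the same
    position on the opposite strand (or stays unmapped)."""
    original = toy_align(reference, read)
    flipped = toy_align(reference, reverse_complement(read))
    if original is None or flipped is None:
        return original == flipped
    return original[0] == flipped[0] and original[1] != flipped[1]


reference = "GATTACAGGCCTTTAGC"
reads = ["TTACAGG", "GCCTTTA", "CCCCCCC"]
print(check_order_invariance(reference, reads))
print(check_strand_flip(reference, "TTACAGG"))
```

    A fault-seeded mutant of the aligner, for example one that reports an off-by-one position only for reverse-strand hits, would violate the second relation even though no expected alignment was ever written down.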