3,415 research outputs found
A Spatial Simulation Approach to Account for Protein Structure When Identifying Non-Random Somatic Mutations
Background: Current research suggests that a small set of "driver" mutations
are responsible for tumorigenesis while a larger body of "passenger" mutations
occurs in the tumor but does not progress the disease. Due to recent
pharmacological successes in treating cancers caused by driver mutations, a
variety of methodologies that attempt to identify such mutations have been
developed. Based on the hypothesis that driver mutations tend to cluster in key
regions of the protein, the development of cluster identification algorithms
has become critical.
Results: We have developed a novel methodology, SpacePAC (Spatial Protein
Amino acid Clustering), that identifies mutational clustering by considering
the protein tertiary structure directly in 3D space. By combining the
mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC) and
the spatial information in the Protein Data Bank (PDB), SpacePAC is able to
identify novel mutation clusters in many proteins such as FGFR3 and CHRM2. In
addition, SpacePAC is better able to localize the most significant mutational
hotspots as demonstrated in the cases of BRAF and ALK. The R package is
available on Bioconductor at:
http://www.bioconductor.org/packages/release/bioc/html/SpacePAC.html
Conclusion: SpacePAC adds a valuable tool to the identification of mutational
clusters while considering protein tertiary structure.
Comment: 16 pages, 8 figures, 4 tables
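To make the idea of simulation-based clustering in 3D space concrete, here is a minimal sketch. It is not the SpacePAC implementation. It assumes one 3D coordinate per residue (for example, alpha-carbon positions from a PDB file), one mutation count per residue, and a fixed sphere radius, and it approximates the null distribution by permuting counts across residues. The function names `max_sphere_count` and `clustering_p_value` are hypothetical.

```python
import math
import random


def max_sphere_count(coords, counts, radius):
    """Largest number of mutations inside any sphere centered on a residue."""
    best = 0
    for center in coords:
        total = sum(
            count
            for xyz, count in zip(coords, counts)
            if math.dist(center, xyz) <= radius
        )
        best = max(best, total)
    return best


def clustering_p_value(coords, counts, radius, n_sim=1000, seed=0):
    """Permutation p-value for the observed maximum sphere count.

    Assumption: under the null, mutation counts are exchangeable across
    residues, so shuffling counts over positions gives the reference
    distribution. A published method may use a different null model.
    """
    rng = random.Random(seed)
    observed = max_sphere_count(coords, counts, radius)
    shuffled = list(counts)
    at_least_as_extreme = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if max_sphere_count(coords, shuffled, radius) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (at_least_as_extreme + 1) / (n_sim + 1)
```

Calling `clustering_p_value(coords, counts, radius=10.0)` returns the observed statistic and an estimated p-value. Using distances in 3D rather than positions along the sequence is what lets residues that are far apart in the primary sequence, but close in the folded protein, count toward the same cluster.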
Leveraging protein quaternary structure to identify oncogenic driver mutations.
BACKGROUND: Identifying key "driver" mutations which are responsible for tumorigenesis is critical in the development of new oncology drugs. Due to multiple pharmacological successes in treating cancers that are caused by such driver mutations, a large body of methods have been developed to differentiate these mutations from the benign "passenger" mutations which occur in the tumor but do not further progress the disease. Under the hypothesis that driver mutations tend to cluster in key regions of the protein, the development of algorithms that identify these clusters has become a critical area of research. RESULTS: We have developed a novel methodology, QuartPAC (Quaternary Protein Amino acid Clustering), that identifies non-random mutational clustering while utilizing the protein quaternary structure in 3D space. By integrating the spatial information in the Protein Data Bank (PDB) and the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), QuartPAC is able to identify clusters which are otherwise missed in a variety of proteins. The R package is available on Bioconductor at: http://bioconductor.jp/packages/3.1/bioc/html/QuartPAC.html . CONCLUSION: QuartPAC provides a unique tool to identify mutational clustering while accounting for the complete folded protein quaternary structure.
This work was supported in part by NSF Grant DMS 1106738 (GR, HZ); NIH Grants GM59507 and CA154295 (HZ), and GM102869 (YM); and Wellcome Trust Grant 101908/Z/13/Z (YM).
Selection and competition of somatic mutations in normal epithelia
Tumourigenesis occurs when a series of genome alterations occur in the same group of cells and cause uncontrolled cell proliferation. Therefore, to understand the journey from healthy to cancerous tissue, it is important to study the accumulation and spread of mutations in pre-cancerous normal tissues. Recent studies have shown that apparently normal epithelium contains a high burden of mutations in cancer-associated genes. This thesis explores the behaviour of mutant clones in normal epithelium and how this affects cancer development.
The nature of mutant clonal growth and competition in normal epidermis has been a subject of debate. A study found that mutant clone sizes inferred from DNA sequencing of normal human eyelid skin were consistent with a mathematical model of neutral cell dynamics, appearing to contradict a genetic analysis of the same dataset that found several genes under positive selection. I investigate this debate using computational modelling that takes into account the tissue structure and experimental tissue-sampling methods. The results show that mutant clone sizes in skin and oesophagus are consistent with non-neutral clonal competition.
Further evidence for non-neutral selection in normal epithelium is found in patterns of mutations detected by DNA sequencing. By adapting a statistical method used for driver gene discovery, I look for enrichment or depletion of structural categories of missense mutations and find biologically meaningful patterns of selection in several proteins. The method can associate changes to protein structure or function with cell fitness, even in the absence of hotspot mutations and in the presence of passenger mutations. I demonstrate how the method is flexible and could be widely applicable, but can also produce misleading results if confounding sources of selection are not accounted for.
Clonal competition in normal oesophageal epithelium is dominated by Notch1 loss-of-function mutations. I fit stochastic models of clonal dynamics to lineage tracing data to show that haploinsufficiency greatly accelerates Notch1 mutant expansion and that the loss of the second Notch1 allele provides a further strong selective advantage, consistent with the high frequency of NOTCH1 loss-of-heterozygosity events observed in human oesophagus. Finally, I examine a consequence of the spread of these highly fit mutant clones in the normal tissue. I use a mathematical model to analyse the results of a series of experiments in mutagen-treated mouse oesophagus, finding that microscopic tumours can be eliminated by highly fit clones in the surrounding normal tissue.
Harrison Watson Fund at Clare College, Cambridge
A Graph Theoretic Approach to Utilizing Protein Structure to Identify Non-Random Somatic Mutations
Background: It is well known that the development of cancer is caused by the
accumulation of somatic mutations within the genome. For oncogenes
specifically, current research suggests that there is a small set of "driver"
mutations that are primarily responsible for tumorigenesis. Further, due to
some recent pharmacological successes in treating these driver mutations and
their resulting tumors, a variety of methods have been developed to identify
potential driver mutations using methods such as machine learning and
mutational clustering. We propose a novel methodology that increases our power
to identify mutational clusters by taking into account protein tertiary
structure via a graph theoretical approach.
Results: We have designed and implemented GraphPAC (Graph Protein Amino Acid
Clustering) to identify mutational clustering while considering protein spatial
structure. Using GraphPAC, we are able to detect novel clusters in proteins
that are known to exhibit mutation clustering as well as identify clusters in
proteins without evidence of prior clustering based on current methods.
Specifically, by utilizing the spatial information available in the Protein
Data Bank (PDB) along with the mutational data in the Catalogue of Somatic
Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in
well known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory
to account for the tertiary structure, GraphPAC identifies clusters in DPP4,
NRP1 and other proteins not identified by existing methods. The R package is
available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html
Conclusion: GraphPAC provides an alternative to iPAC and an extension to
current methodology when identifying potential activating driver mutations by
utilizing a graph theoretic approach when considering protein tertiary
structure.
Comment: 25 pages, 8 figures, 3 tables
Inference of Natural Selection in Human Populations and Cancers: Testing, Extending, and Complementing dN/dS-Like Approaches
Heritable traits tend to rise or fall in prevalence over time in accordance with their effect on survival and reproduction; this is the law of natural selection, the driving force behind speciation. Natural selection is both a consequence and (in cancer) a cause of disease. The new abundance of sequencing data has spurred the development of computational techniques to infer the strength of selection across a genome. One technique, dN/dS, compares mutation rates at mutation-tolerant synonymous sites with those at nonsynonymous sites to infer selection. This dissertation tests, extends, and complements dN/dS for inferring selection from sequencing data. First, I test whether the genomic community’s understanding of mutational processes is sufficient to use synonymous mutations to set expectations for nonsynonymous mutations. Second, I extend a dN/dS-like approach to the noncoding genome, where dN/dS is otherwise undefined, using conservation data among mammals. Third, I use evolutionary theory to co-develop a new technique for inferring selection within an individual patient’s tumor. Overall, this work advances our ability to infer selection pressure, prioritize disease-related genomic elements, and ultimately identify new therapeutic targets for patients suffering from a broad range of genetically-influenced diseases.
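As an illustration of the dN/dS comparison described above, the following sketch computes the basic ratio from raw counts. It is the textbook form only, assuming the numbers of synonymous and nonsynonymous sites are already known. It omits the mutational-signature and context corrections that practical dN/dS-like methods apply, and the function name `dn_ds` is hypothetical.

```python
def dn_ds(nonsyn_mutations, syn_mutations, nonsyn_sites, syn_sites):
    """Ratio of per-site nonsynonymous to per-site synonymous mutation rates.

    Values near 1 suggest neutrality, above 1 positive selection,
    and below 1 negative (purifying) selection.
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if syn_mutations == 0:
        raise ValueError("dN/dS is undefined with zero synonymous mutations")
    dn = nonsyn_mutations / nonsyn_sites  # nonsynonymous rate per site
    ds = syn_mutations / syn_sites        # synonymous rate per site
    return dn / ds
```

For instance, a gene with 300 nonsynonymous sites and 100 synonymous sites carrying 60 and 10 mutations respectively has twice as many nonsynonymous mutations per site as synonymous ones, which points toward positive selection under this simple model.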
Protein Structure-Guided Approaches to Identify Functional Mutations in Cancer
Distinguishing driver mutations from passenger mutations within tumor cells continues to be a major challenge in cancer genomics. Many computational tools have been developed to address this challenge; however, they rely heavily on primary protein sequence context and frequency/mutation rate. Rare driver mutations not found in many cancer patients may be missed with these traditional approaches. Additionally, the structural context of mutations on tertiary/quaternary protein structures is not taken into account and may play a more prominent role in determining phenotype and function. This dissertation first presents a novel computational tool called HotSpot3D, which identifies regions of protein structures that are enriched in proximal mutations from cancer patients and identifies clusters of mutations within a single protein as well as along the interface of protein-protein complexes. This tool gives insight to potential rare driver mutations that may cluster closely to known hotspot driver mutations as well as critical regions of proteins specific to certain cancer types. A small subset of predictions from this tool are validated using high throughput phosphorylation data and an in vitro cell-based assay to support its biological utility. We then shift to studying the druggability of mutations and apply HotSpot3D to identify potential druggable mutations that cluster with known sensitive actionable mutations. We also demonstrate how utilizing integrative omics approaches better enables precision oncology. Combining multiple data types such as genomic mutations or mRNA/protein expression outliers as biomarkers of druggability can expand the druggable cohort, better inform treatment response, and nominate novel combinatorial therapies for clinical trials. Lastly, we improve driver predictions of HotSpot3D by creating a supervised learning approach that integrates additional biological features related to structural context beyond just positional clustering.
Overall, this dissertation provides a suite of computational methods to explore mutations in the context of protein structure and their potential implications in oncogenesis.
A theoretical investigation of the effect of proliferation & adhesion on monoclonal conversion in the colonic crypt
The surface epithelium lining the intestinal tract renews itself rapidly by a coordinated programme of cell proliferation, migration and differentiation events that is initiated in the crypts of Lieberkühn. It is generally believed that colorectal cancer arises due to mutations that disrupt the normal cellular dynamics of the crypts. Using a spatially structured cell-based model of a colonic crypt, we investigate the likelihood that the progeny of a mutated cell will dominate, or be sloughed out of, a crypt. Our approach is to perform multiple simulations, varying the spatial location of the initial mutation, and the proliferative and adhesive properties of the mutant cells, to obtain statistical distributions for the probability of their domination. Our simulations lead us to make a number of predictions. The process of monoclonal conversion always occurs, and does not require that the cell which initially gave rise to the population remains in the crypt. Mutations occurring more than one to two cells from the base of the crypt are unlikely to become the dominant clone. The probability of a mutant clone persisting in the crypt is sensitive to dysregulation of adhesion. By comparing simulation results with those from a simple one-dimensional stochastic model of population dynamics at the base of the crypt, we infer that this sensitivity is due to direct competition between wild-type and mutant cells at the base of the crypt. We also predict that increases in the extent of the spatial domain in which the mutant cells proliferate can give rise to counter-intuitive, non-linear changes to the probability of their fixation, due to effects that cannot be captured in simpler models.
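The abstract does not specify its one-dimensional stochastic model, so the sketch below uses a stand-in: a Moran-type birth-death process on a ring of cells, where a mutant clone of relative fitness `fitness` competes with wild-type neighbours. In that stand-in the clone is a contiguous block whose size grows or shrinks by one cell at its boundaries, with growth favoured by a factor `fitness`. The function names and the choice of a ring are assumptions made for illustration, not the authors' model.

```python
import random


def fixation_probability_exact(n_cells, fitness):
    """Closed-form fixation probability of one mutant in a Moran process."""
    if fitness == 1.0:
        return 1.0 / n_cells  # neutral drift: every cell is equally likely to win
    return (1.0 - 1.0 / fitness) / (1.0 - fitness ** -n_cells)


def simulate_fixation(n_cells, fitness, n_runs=20000, seed=0):
    """Monte Carlo estimate of the same quantity.

    Only boundary events change the clone size, so the clone size performs
    a biased random walk: +1 with probability fitness / (1 + fitness),
    otherwise -1, until the clone is lost (0) or fixed (n_cells).
    """
    rng = random.Random(seed)
    p_grow = fitness / (1.0 + fitness)
    fixed = 0
    for _ in range(n_runs):
        clone_size = 1  # start from a single mutant cell
        while 0 < clone_size < n_cells:
            clone_size += 1 if rng.random() < p_grow else -1
        if clone_size == n_cells:
            fixed += 1
    return fixed / n_runs
```

Comparing `simulate_fixation(8, 2.0)` with `fixation_probability_exact(8, 2.0)` is a quick consistency check on the simulation. A model this simple treats the ring as a closed, homogeneous population, so it cannot represent spatially varying proliferation or adhesion, which is the kind of effect the abstract says simpler models miss.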