602 research outputs found

Statistical method on nonrandom clustering with application to somatic mutations in cancer

Author: A Bardelli
A Torkamani
A Wagner
Adam Pavlicek
AJ Strongosky
B Vogelstein
C Greenman
Cancer Genome Atlas Research Network
CH Huang
Chi-Hse Teng
D Graur
DL Evans
DP Cahill
DW Parsons
Elizabeth A Lunney
H Davies
H Song
HM Berman
IB Weinstein
IF Mata
IW Burr
J Glaz
J Sved
JI Naus
Jingjing Ye
JL Bos
JM Nigro
JS Kaminker
L Ding
M Hollstein
N Balakrishnan
NL Johnson
PA Jones
Paul A Rejto
PJ Morin
R Inzelberg
S Jones
SA Forbes
T Hagen
T Sjöblom
T Tolkacheva
TL Wang
WP Yu
Y Benjamini
Y Samuels
Y Wang
Y-X Fan
YL Yip
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Human cancer is caused by the accumulation of tumor-specific mutations in oncogenes and tumor suppressors that confer a selective growth advantage to cells. As a consequence of genomic instability and high levels of proliferation, many passenger mutations that do not contribute to the cancer phenotype arise alongside mutations that drive oncogenesis. While several approaches have been developed to separate driver mutations from passengers, few approaches can specifically identify activating driver mutations in oncogenes, which are more amenable to pharmacological intervention. Results: We propose a new statistical method for detecting activating mutations in cancer by identifying nonrandom clusters of amino acid mutations in protein sequences. A probability model is derived using order statistics, assuming that the location of amino acid mutations on a protein follows a uniform distribution. Our statistical measure is the difference between pair-wise order statistics, which is equivalent to the size of an amino acid mutation cluster, and the probabilities are derived from exact and approximate distributions of the statistical measure. Using data in the Catalog of Somatic Mutations in Cancer (COSMIC) database, we have demonstrated that our method detects well-known clusters of activating mutations in KRAS, BRAF, PI3K, and β-catenin. The method can also identify new cancer targets as well as gain-of-function mutations in tumor suppressors. Conclusions: Our proposed method is useful for discovering activating driver mutations in cancer by identifying nonrandom clusters of somatic amino acid mutations in protein sequences.
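The order-statistic idea in this abstract can be illustrated with a minimal sketch. It assumes a continuous Uniform(0, 1) model for mutation positions, which is a simplification: the paper itself works with exact and approximate distributions on protein sequences, and none of its code is reproduced here. Under that assumption, the span between the i-th and (i+gap)-th ordered positions among n mutations follows a Beta(gap, n - gap + 1) distribution, whose CDF equals a binomial upper tail. The function name and parameters are hypothetical.

```python
from math import comb


def cluster_span_pvalue(n_mutations: int, gap: int, width: float) -> float:
    """P(X_(i+gap) - X_(i) <= width) for n iid Uniform(0, 1) positions.

    Assumption (not taken from the paper's code): positions are continuous
    and uniform, so the span between order statistics that are `gap` ranks
    apart is Beta(gap, n - gap + 1). For integer parameters that Beta CDF
    equals P(Binomial(n, width) >= gap).
    """
    if not 1 <= gap <= n_mutations:
        raise ValueError("gap must satisfy 1 <= gap <= n_mutations")
    if not 0.0 <= width <= 1.0:
        raise ValueError("width must be a fraction of protein length in [0, 1]")
    n = n_mutations
    # Binomial upper tail: at least `gap` of the n positions fall in a
    # window covering a fraction `width` of the protein.
    return sum(
        comb(n, j) * width**j * (1.0 - width) ** (n - j)
        for j in range(gap, n + 1)
    )


# Hypothetical usage: 20 mutations on a protein, with 6 of them (5 ranks
# apart) spanning 1% of the sequence length.
p = cluster_span_pvalue(n_mutations=20, gap=5, width=0.01)
```

This gives the probability for a single pair of order statistics only. Scanning every pair on a protein would require a multiple-testing correction, which this sketch leaves out.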

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Utilizing Protein Structure to Identify Non-Random Somatic Mutations

Author: Cheng Yuwei
Cheung Kei-Hoi
Modis Yorgo
Ryslik Gregory
Zhao Hongyu
Publication venue: Springer Science and Business Media LLC
Publication date: 27/02/2013

Motivation: Human cancer is caused by the accumulation of somatic mutations in tumor suppressors and oncogenes within the genome. In the case of oncogenes, recent theory suggests that there are only a few key "driver" mutations responsible for tumorigenesis. As there have been significant pharmacological successes in developing drugs that treat cancers that carry these driver mutations, several methods that rely on mutational clustering have been developed to identify them. However, these methods consider proteins as a single strand without taking their spatial structures into account. We propose a new methodology that incorporates protein tertiary structure in order to increase our power when identifying mutation clustering. Results: We have developed a novel algorithm, iPAC: identification of Protein Amino acid Clustering, for the identification of non-random somatic mutations in proteins that takes into account the three-dimensional protein structure. By using the tertiary information, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of clustering based on existing methods. For example, by combining the data in the Protein Data Bank (PDB) and the Catalogue of Somatic Mutations in Cancer, our algorithm identifies new mutational clusters in well-known cancer proteins such as KRAS and PI3KCa. Further, by utilizing the tertiary structure, our algorithm also identifies clusters in EGFR, EIF2AK2, and other proteins that are not identified by current methodology.

arXiv.org e-Print Archive

Springer - Publisher Connector

A Spatial Simulation Approach to Account for Protein Structure When Identifying Non-Random Somatic Mutations

Author: Bjornson Robert
Cheng Yuwei
Cheung Kei-Hoi
Modis Yorgo
Ryslik Gregory
Zelterman Daniel
Zhao Hongyu
Publication date: 28/10/2013

Background: Current research suggests that a small set of "driver" mutations are responsible for tumorigenesis while a larger body of "passenger" mutations occurs in the tumor but does not progress the disease. Due to recent pharmacological successes in treating cancers caused by driver mutations, a variety of methodologies that attempt to identify such mutations have been developed. Based on the hypothesis that driver mutations tend to cluster in key regions of the protein, the development of cluster identification algorithms has become critical. Results: We have developed a novel methodology, SpacePAC (Spatial Protein Amino acid Clustering), that identifies mutational clustering by considering the protein tertiary structure directly in 3D space. By combining the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC) and the spatial information in the Protein Data Bank (PDB), SpacePAC is able to identify novel mutation clusters in many proteins such as FGFR3 and CHRM2. In addition, SpacePAC is better able to localize the most significant mutational hotspots, as demonstrated in the cases of BRAF and ALK. The R package is available on Bioconductor at: http://www.bioconductor.org/packages/release/bioc/html/SpacePAC.html Conclusion: SpacePAC adds a valuable tool to the identification of mutational clusters while considering protein tertiary structure.

Comment: 16 pages, 8 figures, 4 tables
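To make the idea of simulation-based clustering in 3D space concrete, here is a simplified sketch. It is not the SpacePAC algorithm or its R API; the statistic (the largest number of mutations inside any sphere centered on a residue), the null model (mutations dropped uniformly at random onto residues), and all function and parameter names are assumptions chosen for illustration.

```python
import random
from math import dist


def max_sphere_count(coords, counts, radius):
    """Largest number of mutations within `radius` of any residue center."""
    best = 0
    for center in coords:
        total = sum(
            c for xyz, c in zip(coords, counts) if dist(center, xyz) <= radius
        )
        best = max(best, total)
    return best


def sphere_cluster_pvalue(coords, counts, radius, n_sim=1000, seed=0):
    """Monte Carlo p-value for the observed max sphere count.

    Null model (an assumption of this sketch): each mutation lands on a
    residue chosen uniformly at random, independently of the others.
    """
    rng = random.Random(seed)
    observed = max_sphere_count(coords, counts, radius)
    n_res, n_mut = len(coords), sum(counts)
    hits = 0
    for _sim in range(n_sim):
        simulated = [0] * n_res
        for _mut in range(n_mut):
            simulated[rng.randrange(n_res)] += 1
        if max_sphere_count(coords, simulated, radius) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_sim + 1)


# Hypothetical toy protein: 10 residues spaced 10 units apart on a line,
# with all 8 observed mutations on the first residue.
toy_coords = [(10.0 * i, 0.0, 0.0) for i in range(10)]
toy_counts = [8] + [0] * 9
toy_p = sphere_cluster_pvalue(toy_coords, toy_counts, radius=1.0, n_sim=200)
```

In practice the residue coordinates would come from a PDB structure and the counts from a mutation database such as COSMIC. The toy data above only exercises the code.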

arXiv.org e-Print Archive

Springer - Publisher Connector

Landscape of somatic single nucleotide variants and indels in colorectal cancer and impact on survival

Author: et al.
Cao Yin
Zaidi Syed H.
Publication venue: Digital Commons@Becker
Publication date: 01/01/2020

Digital Commons@Becker

mutation3D: Cancer Gene Prediction Through Atomic Clustering of Coding Variants in the Structural Proteome

Author: Adzhubei
Alexandrov
Berman
Cancer Genome Atlas
Das
Das
Forbes
Fu
Futreal
Grantham
Guedes
Hanahan
Hodis
Kamburov
Kan
Kucukkal
Lawrence
Lawrence
McLaren
Miller M
Muller
Nishi
Petukh
Pieper
Pylayeva-Gupta
Ryslik
Ryslik
Sjöblom
Sneath
Stenson
Sørensen
Tamborero
Tusche
Velankar
Vucic
Wagner
Wang
Wei
Wood
Zhou
Publication venue: Wiley
Publication date: 03/02/2016

A new algorithm and Web server, mutation3D (http://mutation3d.org), proposes driver genes in cancer by identifying clusters of amino acid substitutions within tertiary protein structures. We demonstrate the feasibility of using a 3D clustering approach to implicate proteins in cancer based on explorations of single proteins using the mutation3D Web interface. On a large scale, we show that clustering with mutation3D is able to separate functional from nonfunctional mutations by analyzing a combination of 8,869 known inherited disease mutations and 2,004 SNPs overlaid together upon the same sets of crystal structures and homology models. Further, we present a systematic analysis of whole-genome and whole-exome cancer datasets to demonstrate that mutation3D identifies many known cancer genes as well as previously underexplored target genes. The mutation3D Web interface allows users to analyze their own mutation data in a variety of popular formats and provides seamless access to explore mutation clusters derived from over 975,000 somatic mutations reported by 6,811 cancer sequencing studies. The mutation3D Web interface is freely available, with all major browsers supported.

Crossref

Repositorio Institucional CONACYT

Online Research @ Cardiff

Royal Holloway - Pure

PubMed Central

Discerning Drivers of Cancer: Computational Approaches to Somatic Exome Sequencing Data

Author: Kumar Runjun
Publication venue: Washington University Open Scholarship
Publication date: 15/05/2018

Paired tumor-normal sequencing of thousands of patients’ exomes has revealed millions of somatic mutations, but functional characterization and clinical decision making are stymied because biologically neutral ‘passenger’ mutations greatly outnumber pathogenic ‘driver’ mutations. Since most mutations will return negative results if tested, conventional resource-intensive experiments are reserved for mutations that are observed in multiple patients or rarer mutations found in well-established cancer genes. Most mutations are therefore never tested, diminishing the potential to discover new mechanisms of cancer development and treatment opportunities. Computational methods that reliably prioritize mutations for testing would greatly increase the translation of sequencing results to clinical care. The goal of this thesis is to develop new approaches that use datasets of protein-coding somatic mutations to identify putative cancer-causing genes and mutations, and to validate these predictions in silico and experimentally. This work is split among several inter-related efforts, which taken together will help experimental biologists and clinicians focus on hypotheses that can yield novel insights into cancer biology, development, and treatment.

Washington University St. Louis: Open Scholarship

The impact of chromosomal translocation locus and fusion oncogene coding sequence in synovial sarcomagenesis.

Author: Barrott JJ
Cairns BR
Capecchi MR
Ding L
Haldar M
Jin H
Jones KB
Langer EM
Monument MJ
Mosbruger TL
Randall RL
Wilson RK
Xie M
Zhu J-F
Publication venue: eScholarship, University of California
Publication date: 01/09/2016

Synovial sarcomas are aggressive soft-tissue malignancies that express chromosomal translocation-generated fusion genes, SS18-SSX1 or SS18-SSX2 in most cases. Here, we report a mouse sarcoma model expressing SS18-SSX1, complementing our prior model expressing SS18-SSX2. Exome sequencing identified no recurrent secondary mutations in tumors of either genotype. Most of the few mutations identified in single tumors were present in genes that were minimally or not expressed in any of the tumors. Chromosome 6, either entirely or around the fusion gene expression locus, demonstrated a copy number gain in a majority of tumors of both genotypes. Thus, by fusion oncogene coding sequence alone, SS18-SSX1 and SS18-SSX2 can each drive comparable synovial sarcomagenesis, independent from other genetic drivers. SS18-SSX1 and SS18-SSX2 tumor transcriptomes demonstrated very few consistent differences overall. In direct tumorigenesis comparisons, SS18-SSX2 was slightly more sarcomagenic than SS18-SSX1, but equivalent in its generation of biphasic histologic features. Meta-analysis of human synovial sarcoma patient series identified two tumor-genotype-phenotype correlations that were not modeled by the mice, namely a scarcity of male hosts and biphasic histologic features among SS18-SSX2 tumors. Re-analysis of human SS18-SSX1 and SS18-SSX2 tumor transcriptomes demonstrated very few consistent differences, but highlighted increased native SSX2 expression in SS18-SSX1 tumors. This suggests that the translocated locus may drive genotype-phenotype differences more than the coding sequence of the fusion gene created. Two possible roles for native SSX2 in synovial sarcomagenesis are explored. Thus, even specific partial failures of mouse genetic modeling can be instructive to human tumor biology.

eScholarship - University of California

Understanding how cancer mutations hinder the interactions inside proteins

Author: Sáenz Ausejo Carmen
Publication date: 01/01/2018

Master's thesis in Bioinformatics and Computational Biology. The acquisition of somatic mutations can induce cancer by dysregulating the delicate mechanisms controlling the balance between proliferation and apoptosis. Genomic alterations can be classified into driver and passenger mutations. Driver mutations confer a selective advantage to tumor development, in contrast to passenger mutations, which do not provide a growth advantage to tumorigenesis. Most driver mutations have an unknown functional impact on protein structure and function. Furthermore, not all driver alterations in a cancer gene have the same functional impact. The use of high-throughput sequencing technologies has facilitated the discovery of cancer-related mutations in case-control studies. The analysis of different tumor types facilitates the identification of recurrent mutations and the functional pathways involved in tumor development. One of the current challenges is to distinguish between driver and passenger mutations. Mutations occurring with high frequency in tumor samples are considered to be drivers. Therefore, a commonly used method is to consider mutations that occur with higher frequency than a background mutation rate. Tamborero et al. (2013) developed a method to identify cancer-related genes by grouping together residues that are close in the primary sequence of the protein and have a mutation rate significantly above the background model. The background model was generated from coding-silent mutations, based on evidence of nonrandom mutation processes along the genome (Amos, 2010). Recently, Gao et al. (2017) identified genomic mutations affecting residues located in 3-dimensional proximity in protein structures by comparing the mutation frequency against a random background. The first method used gene sequences, treating proteins as single strands, and overlooked the fact that distant genomic regions might be close in 3D space when the protein folds. The second method assumed a homogeneous mutation probability across the whole genome, which is likely an oversimplification that may introduce a bias in the expected mutation rates (Amos, 2010). Both problems were considered in this study in the development of the algorithm. This method identifies 3D protein clusters associated with BRCA-mutated breast cancer using coding-silent mutation frequency as a background. Furthermore, the method identified structural and catalytic roles of 3D protein clusters within relevant biological pathways in breast cancer. The method considers a 3D protein cluster significant when the residues within it have a higher non-synonymous mutation rate than the background mutation rate. Most of the significant 3D protein clusters were located within the PIK3CA gene. Additionally, most of the mutations in the 3D clusters were found in the kinase and helical domains of the corresponding protein (PI3K). These mutations destabilize the inactive conformation of the protein or lock the activation loop in an active conformation, resulting in constitutive protein activation. Thus, significant 3D protein clusters in PIK3CA contain ideal hot-spot mutants to target with anti-cancer agents (Gabelli, Mandelker, Schmidt-Kittler, Vogelstein, & Amzel, 2010). Treatments with PI3K inhibitors are now available. However, oncogenic PI3K pathway activation is achieved in different redundant ways, so monotherapies are not always effective. In conclusion, the results of this Master's thesis can help to better understand the interactions of non-synonymous mutations in 3D protein space in order to identify new targets, develop new therapies and consequently maximize the therapeutic benefi

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

A Graph Theoretic Approach to Utilizing Protein Structure to Identify Non-Random Somatic Mutations

Author: Cheng Yuwei
Cheung Kei-Hoi
Modis Yorgo
Ryslik Gregory
Zhao Hongyu
Publication date: 12/07/2013

Background: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to some recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches, such as machine learning and mutational clustering, have been developed to identify potential driver mutations. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach. Results: We have designed and implemented GraphPAC (Graph Protein Amino Acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well-known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC identifies clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html Conclusion: GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure.

Comment: 25 pages, 8 figures, 3 tables

arXiv.org e-Print Archive

Springer - Publisher Connector