8 research outputs found

A Graph Theoretic Approach to Utilizing Protein Structure to Identify Non-Random Somatic Mutations

Author: Cheng Yuwei
Cheung Kei-Hoi
Modis Yorgo
Ryslik Gregory
Zhao Hongyu
Publication date: 12/07/2013

Background: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to some recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of methods have been developed to identify potential driver mutations using methods such as machine learning and mutational clustering. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach. Results: We have designed and implemented GraphPAC (Graph Protein Amino Acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well-known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC identifies clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html Conclusion: GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure. Comment: 25 pages, 8 figures, 3 tables

arXiv.org e-Print Archive

Springer - Publisher Connector

A Spatial Simulation Approach to Account for Protein Structure When Identifying Non-Random Somatic Mutations

Author: Bjornson Robert
Cheng Yuwei
Cheung Kei-Hoi
Modis Yorgo
Ryslik Gregory
Zelterman Daniel
Zhao Hongyu
Publication date: 28/10/2013

Background: Current research suggests that a small set of "driver" mutations are responsible for tumorigenesis while a larger body of "passenger" mutations occurs in the tumor but does not progress the disease. Due to recent pharmacological successes in treating cancers caused by driver mutations, a variety of methodologies that attempt to identify such mutations have been developed. Based on the hypothesis that driver mutations tend to cluster in key regions of the protein, the development of cluster identification algorithms has become critical. Results: We have developed a novel methodology, SpacePAC (Spatial Protein Amino acid Clustering), that identifies mutational clustering by considering the protein tertiary structure directly in 3D space. By combining the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC) and the spatial information in the Protein Data Bank (PDB), SpacePAC is able to identify novel mutation clusters in many proteins such as FGFR3 and CHRM2. In addition, SpacePAC is better able to localize the most significant mutational hotspots as demonstrated in the cases of BRAF and ALK. The R package is available on Bioconductor at: http://www.bioconductor.org/packages/release/bioc/html/SpacePAC.html Conclusion: SpacePAC adds a valuable tool to the identification of mutational clusters while considering protein tertiary structure. Comment: 16 pages, 8 figures, 4 tables
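
The idea of testing for mutational clustering directly in 3D space, as summarized in the abstract above, can be illustrated with a minimal Python sketch. This is not the SpacePAC implementation or its API: the function names, the single fixed sphere radius, and the uniform null model (each mutation lands on a residue chosen uniformly at random) are all assumptions made for illustration. The sketch counts mutations inside a sphere centred on each residue, takes the largest count as the test statistic, and estimates a p-value by simulation.

```python
import numpy as np

def neighbour_matrix(coords, radius):
    """Return an integer matrix whose entry (i, j) is 1 if residue j lies within `radius` of residue i."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    return (dist <= radius).astype(int)

def max_sphere_count(neighbours, counts):
    """Largest number of mutations captured by a sphere centred on any residue."""
    return int((neighbours @ counts).max())

def sphere_cluster_pvalue(coords, counts, radius, n_sim=1000, seed=0):
    """Simulation-based p-value for the maximum sphere count (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    neighbours = neighbour_matrix(coords, radius)
    observed = max_sphere_count(neighbours, counts)
    total = int(counts.sum())
    n_res = len(counts)
    hits = 0
    for _ in range(n_sim):
        # Assumed null model: mutations fall uniformly at random on residues.
        sim = np.bincount(rng.integers(0, n_res, size=total), minlength=n_res)
        if max_sphere_count(neighbours, sim) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_sim + 1)

# Toy example: 20 residues on a straight line, 3.8 angstroms apart,
# with 10 mutations concentrated on three adjacent residues.
coords = np.zeros((20, 3))
coords[:, 0] = 3.8 * np.arange(20)
counts = np.zeros(20, dtype=int)
counts[[5, 6, 7]] = [4, 3, 3]
observed, p_value = sphere_cluster_pvalue(coords, counts, radius=4.0)
```

In practice the coordinates would come from a PDB structure and the counts from a mutation catalogue such as COSMIC; a real analysis would also need to consider several radii and correct for multiple testing, which this sketch leaves out.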

arXiv.org e-Print Archive

Springer - Publisher Connector

SomInaClust: detection of cancer genes based on somatic mutation patterns of inactivation and clustering

Author: Fierro Gutierrez Ana Carolina Elisa
Marchal Kathleen
Van den Eynden Jimmy
Verbeke Lieven
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Background: With the advances in high throughput technologies, increasing amounts of cancer somatic mutation data are being generated and made available. Only a small number of (driver) mutations occur in driver genes and are responsible for carcinogenesis, while the majority of (passenger) mutations do not influence tumour biology. In this study, SomInaClust is introduced, a method that accurately identifies driver genes based on their mutation pattern across tumour samples and then classifies them into oncogenes or tumour suppressor genes. Results: SomInaClust starts from the observation that oncogenes mainly contain mutations that, due to positive selection, cluster at similar positions in a gene across patient samples, whereas tumour suppressor genes contain a high number of protein-truncating mutations throughout the entire gene length. The method was shown to prioritize driver genes in 9 different solid cancers. Furthermore it was found to be complementary to existing similar-purpose methods with the additional advantages that it has a higher sensitivity, also for rare mutations (occurring in less than 1% of all samples), and it accurately classifies candidate driver genes into putative oncogenes and tumour suppressor genes. Pathway enrichment analysis showed that the identified genes belong to known cancer signalling pathways, and that the distinction between oncogenes and tumour suppressor genes is biologically relevant. Conclusions: SomInaClust was shown to detect candidate driver genes based on somatic mutation patterns of inactivation and clustering and to distinguish oncogenes from tumour suppressor genes. The method could be used for the identification of new cancer genes or to filter mutation data for further data-integration purposes.

Crossref

Ghent University Academic Bibliography

PubMed Central

mutation3D: Cancer Gene Prediction Through Atomic Clustering of Coding Variants in the Structural Proteome

Author: Adzhubei
Alexandrov
Berman
Cancer Genome Atlas
Das
Das
Forbes
Fu
Futreal
Grantham
Guedes
Hanahan
Hodis
Kamburov
Kan
Kucukkal
Lawrence
Lawrence
McLaren
Miller M
Muller
Nishi
Petukh
Pieper
Pylayeva-Gupta
Ryslik
Ryslik
Sjöblom
Sneath
Stenson
Sørensen
Tamborero
Tusche
Velankar
Vucic
Wagner
Wang
Wei
Wood
Zhou
Publication venue: 'Wiley'
Publication date: 03/02/2016

A new algorithm and Web server, mutation3D (http://mutation3d.org), proposes driver genes in cancer by identifying clusters of amino acid substitutions within tertiary protein structures. We demonstrate the feasibility of using a 3D clustering approach to implicate proteins in cancer based on explorations of single proteins using the mutation3D Web interface. On a large scale, we show that clustering with mutation3D is able to separate functional from nonfunctional mutations by analyzing a combination of 8,869 known inherited disease mutations and 2,004 SNPs overlaid together upon the same sets of crystal structures and homology models. Further, we present a systematic analysis of whole-genome and whole-exome cancer datasets to demonstrate that mutation3D identifies many known cancer genes as well as previously underexplored target genes. The mutation3D Web interface allows users to analyze their own mutation data in a variety of popular formats and provides seamless access to explore mutation clusters derived from over 975,000 somatic mutations reported by 6,811 cancer sequencing studies. The mutation3D Web interface is freely available with all major browsers supported.

Crossref

Repositorio Institucional CONACYT

Online Research @ Cardiff

Royal Holloway - Pure

PubMed Central

A spatial simulation approach to account for protein structure when identifying non-random somatic mutations

Author: A Bardelli
A Koch
A Motegi
A Siraj
A Torkamani
A Wagner
A Youn
AE Gould
B Reva
B Vogelstein
C Cortes
C Greenman
C Hafner
Daniel Zelterman
F Rousseau
G Ryslik
GA Ryslik
Gregory A Ryslik
H Carter
H Davies
H Pages
H Rajagopalan
Hongyu Zhao
I Borg
IA Adzhubei
IB Weinstein
J Qing
J Sved
J Ye
JT Hartmann
JW Lee
K Haga
KC Hart
Kei-Hoi Cheung
L Breiman
M Ferretti
M Hollstein
M Kreitman
M Mao
N Friedman
P Andreu-Pérez
P Legoix
P Mazot
P Moreau
PC Ng
RE George
Robert D Bjornson
RT Bossi
S Faivre
SK Olsen
SR Hingorani
T Sjöblom
T Wang
T Zhou
TJ Lynch
TU Consortium
W Ockenga
Y Benjamini
Y Gong
Y Hadari
YH Tan
YJ Bang
Yorgo Modis
Yuwei Cheng
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Developing a computational approach to investigate the impacts of disease-causing mutations on protein function

Author: Pang Camilla Sih Mai
Publication venue: UCL (University College London)
Publication date: 28/04/2018

This project uses bioinformatics protocols to explore the impacts of non-synonymous mutations (nsSNPs) in proteins associated with diseases, including germline, rare diseases and somatic diseases such as cancer. New approaches were explored for determining the impacts of disease-associated mutations on protein structure and function. Whilst this work has mainly concentrated on the analysis of cancer mutations, the methods developed are generic and could be applied to analysing other types of disease mutations. Different types of disease-causing mutations have been studied including germline diseases, somatic cancer mutations in oncogenes and tumour-suppressors, along with known activating and inactivating mutations in kinases. The proximity of disease-associated mutations has been analysed with respect to known functional sites reported by CSA, IBIS, along with predicted functional sites derived from the CATH classification of domain structure superfamilies. The latter are called FunSites, and are highly conserved residues within a CATH functional family (FunFam), which is a functionally coherent subset of a CATH superfamily. Such sites include key catalytic residues as well as specificity determining residues and interface residues. Clear differences were found between oncogenes, tumour suppressor and germ-line mutations with oncogene mutations more likely to locate close to FunSites. Functional families that are highly enriched in disease mutations were identified, and structural data were exploited to identify clusters within proteins in these families that are enriched in mutations (using our MutClust program). We examined the tendencies of these clusters to lie close to the functional sites discussed above. For selected genes, the stability effects of disease mutations in cancer have also been investigated with a particular focus on activating mutations in FGFR3.
These studies, which were supported by experimental validation, showed that activating mutations implicated in cancer tend to cause stabilisation of the active FGFR3 form, leading to its abnormal activity and oncogenesis. Mutationally enriched CATH FunFams were also used in the identification of cancer driver genes, which were then subjected to pathway and GO biological process analysis.

UCL Discovery