910 research outputs found

Data mining using the Catalogue of Somatic Mutations in Cancer BioMart

Author: A. Menzies
A. P. Butler
C. G. Cole
C. Y. Kok
D. Beare
den Dunnen
Ding
J. W. Teague
Janne
K. Leung
M. Jia
M. R. Stratton
McDermott
N. Bindal
P. A. Futreal
P. Gunasekaran
P. J. Campbell
Petitjean
R. Shepherd
S. A. Forbes
S. Bamford
S. Ward
Sharma
Stein
Vizcaino
Publication venue: Oxford University Press

The Catalogue of Somatic Mutations in Cancer (COSMIC) (http://www.sanger.ac.uk/cosmic) is a publicly available resource providing information on somatic mutations implicated in human cancer. Release v51 (January 2011) includes data from just over 19 000 genes, 161 787 coding mutations and 5573 gene fusions, described in more than 577 000 tumour samples. COSMICMart (COSMIC BioMart) provides a flexible way to mine these data and combine somatic mutations with other biologically relevant data sets. This article describes the data available in COSMIC along with examples of how to successfully mine and integrate data sets using COSMICMart.

Crossref

PubMed Central

Cancer3D: understanding cancer mutations through protein structures.

Author: Godzik Adam
Hrabe Thomas
Porta-Pardo Eduard
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

The new era of cancer genomics is providing us with extensive knowledge of mutations and other alterations in cancer. The Cancer3D database at http://www.cancer3d.org gives an open and user-friendly way to analyze cancer missense mutations in the context of structures of proteins in which they are found. The database also helps users analyze the distribution patterns of the mutations as well as their relationship to changes in drug activity through two algorithms: e-Driver and e-Drug. These algorithms use knowledge of modular structure of genes and proteins to separately study each region. This approach allows users to find novel candidate driver regions or drug biomarkers that cannot be found when similar analyses are done on the whole-gene level. The Cancer3D database provides access to the results of such analyses based on data from The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE). In addition, it displays mutations from over 14,700 proteins mapped to more than 24,300 structures from PDB. This helps users visualize the distribution of mutations and identify novel three-dimensional patterns in their distribution

CiteSeerX

PubMed Central

eScholarship - University of California

Deriving a mutation index of carcinogenicity using protein structure and protein interfaces

Author: A Custodio
A David
A Dixit
A Hamosh
A Pal
AJ Bass
Anna Tramontano
B Reva
B Vogelstein
CJ Richardson
CM Croce
D Chasman
D Sims
D Talavera
D Xu
E Krissinel
EC Chao
ER Mardis
F Damm
Frances Pearl
G Birrane
G De Baets
H Boutselakis
H Carter
H Makishima
IA Adzhubei
IS Moreira
J Carlsson
Jarle Hakas
JM Hurst
JM Izarzugaza
JR Morris
K Wang
Konstantinos Mitsopoulos
L Breiman
L Ding
M Li
M Magrane
Marketa Zvelebil
MR Stratton
MS Greenblatt
MW MacArthur
MY Frederic
Octavio Espinosa
P Flicek
P Kumar
P Srivastava
PA Chan
PA Futreal
PB Crowley
PC Ng
PD Stenson
PH Lee
PT Wan
PV Hornbeck
PY Chou
R Ferla
R Rajasekaran
RJ Kinsella
S Jones
S Sunyaev
S Velankar
SA Forbes
TM Anne
V Ramensky
W Huang da
W Kabsch
X Wang
Y Bromberg
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2014

With the advent of Next Generation Sequencing the identification of mutations in the genomes of healthy and diseased tissues has become commonplace. While much progress has been made to elucidate the aetiology of disease processes in cancer, the contributions to disease that many individual mutations make remain to be characterised and their downstream consequences on cancer phenotypes remain to be understood. Missense mutations commonly occur in cancers and their consequences remain challenging to predict. However, this knowledge is becoming more vital, for both assessing disease progression and for stratifying drug treatment regimes. Coupled with structural data, comprehensive genomic databases of mutations such as the 1000 Genomes project and COSMIC give an opportunity to investigate general principles of how cancer mutations disrupt proteins and their interactions at the molecular and network level. We describe a comprehensive comparison of cancer and neutral missense mutations; by combining features derived from structural and interface properties we have developed a carcinogenicity predictor, InCa (Index of Carcinogenicity). Upon comparison with other methods, we observe that InCa can predict mutations that might not be detected by other methods. We also discuss general limitations shared by all predictors that attempt to predict driver mutations and discuss how this could impact high-throughput predictions. A web interface to a server implementation is publicly available at http://inca.icr.ac.uk/

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Institute of Cancer Research Repository

Sussex Research Online

FigShare

Challenges in identifying cancer genes by analysis of exome sequencing data.

Author: Bandyopadhyay Sourav
Carter Hannah
Friend Stephen
Hofree Matan
Ideker Trey
Kreisberg Jason F
Mischel Paul S
Publication venue: eScholarship, University of California
Publication date: 01/07/2016

Massively parallel sequencing has permitted an unprecedented examination of the cancer exome, leading to predictions that all genes important to cancer will soon be identified by genetic analysis of tumours. To examine this potential, here we evaluate the ability of state-of-the-art sequence analysis methods to specifically recover known cancer genes. While some cancer genes are identified by analysis of recurrence, spatial clustering or predicted impact of somatic mutations, many remain undetected due to lack of power to discriminate driver mutations from the background mutational load (13-60% recall of cancer genes impacted by somatic single-nucleotide variants, depending on the method). Cancer genes not detected by mutation recurrence also tend to be missed by all types of exome analysis. Nonetheless, these genes are implicated by other experiments such as functional genetic screens and expression profiling. These challenges are only partially addressed by increasing sample size and will likely hold even as greater numbers of tumours are analysed
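The recall figure quoted in this abstract is the fraction of known cancer genes that a given method recovers. A minimal sketch of that calculation follows, assuming each method simply returns a set of predicted gene symbols; the gene sets and method names below are hypothetical toy data, not results from the paper.

```python
def recall(known_cancer_genes, predicted_genes):
    """Return the fraction of known cancer genes found among the predictions."""
    known = set(known_cancer_genes)
    if not known:
        raise ValueError("known_cancer_genes must not be empty")
    return len(known & set(predicted_genes)) / len(known)


# Hypothetical toy data for illustration only (not taken from the paper).
known = {"TP53", "KRAS", "PIK3CA", "PTEN", "EGFR"}
predictions = {
    "recurrence": {"TP53", "KRAS", "PIK3CA", "TTN"},  # TTN is a false positive
    "clustering": {"KRAS"},
}

# One recall value per method, so methods can be compared side by side.
recalls = {name: recall(known, genes) for name, genes in predictions.items()}
for name, value in recalls.items():
    print(f"{name}: recall = {value:.0%}")
```

Recall ignores false positives by design, so a method that predicts every gene would score 100%; a fuller comparison would also report precision.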

PubMed Central

eScholarship - University of California

Comparison of TCGA and GENIE genomic datasets for the detection of clinically actionable alterations in breast cancer.

Author: Carpten John D
Kaur Pushpinder
Lang Julie E
Porras Tania B
Ring Alexander
Publication venue: eScholarship, University of California
Publication date: 01/02/2019

Whole exome sequencing (WES), targeted gene panel sequencing and single nucleotide polymorphism (SNP) arrays are increasingly used for the identification of actionable alterations that are critical to cancer care. Here, we compared The Cancer Genome Atlas (TCGA) and the Genomics Evidence Neoplasia Information Exchange (GENIE) breast cancer genomic datasets (array and next generation sequencing (NGS) data) in detecting genomic alterations in clinically relevant genes. We performed an in silico analysis to determine the concordance in the frequencies of actionable mutations and copy number alterations/aberrations (CNAs) in the two most common breast cancer histologies, invasive lobular and invasive ductal carcinoma. We found that targeted sequencing identified a larger number of mutational hotspots and clinically significant amplifications that would have been missed by WES and SNP arrays in many actionable genes such as PIK3CA, EGFR, AKT3, FGFR1, ERBB2, ERBB3 and ESR1. The striking differences between the number of mutational hotspots and CNAs generated from these platforms highlight a number of factors that should be considered in the interpretation of array and NGS-based genomic data for precision medicine. Targeted panel sequencing was preferable to WES to define the full spectrum of somatic mutations present in a tumor

Directory of Open Access Journals

eScholarship - University of California

COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer

Author: A. Menzies
C. Cole
C. Y. Kok
D. Beare
Ding
J. W. Teague
K. Leung
M. Jia
M. R. Stratton
McLendon
N. Bindal
P. A. Futreal
P. J. Campbell
Petitjean
Pleasance
R. Shepherd
S. A. Forbes
S. Bamford
Stein
Stephens
Publication venue: Oxford University Press

COSMIC (http://www.sanger.ac.uk/cosmic) curates comprehensive information on somatic mutations in human cancer. Release v48 (July 2010) describes over 136 000 coding mutations in almost 542 000 tumour samples; of the 18 490 genes documented, 4803 (26%) have one or more mutations. Full scientific literature curations are available on 83 major cancer genes and 49 fusion gene pairs (19 new cancer genes and 30 new fusion pairs this year) and this number is continually increasing. Key amongst these is TP53, now available through a collaboration with the IARC p53 database. In addition to data from the Cancer Genome Project (CGP) at the Sanger Institute, UK, and The Cancer Genome Atlas project (TCGA), large systematic screens are also now curated. Major website upgrades now make these data much more mineable, with many new selection filters and graphics. A Biomart is now available allowing more automated data mining and integration with other biological databases. Annotation of genomic features has become a significant focus; COSMIC has begun curating full-genome resequencing experiments, developing new web pages, export formats and graphics styles. With all genomic information recently updated to GRCh37, COSMIC integrates many diverse types of mutation information and is making much closer links with Ensembl and other data resources

Crossref

PubMed Central

The Complete Spectrum of Yeast Chromosome Instability Genes Identifies Candidate CIN Cancer Genes and Functional Roles for ASTRA Complex Components

Author: A Breitkreutz
A Shevchenko
AH Tong
CM Anderson
DK Breslow
DP Cahill
E Jacinto
F Spencer
H Takai
II Ouspenski
J Huen
J McLellan
JA Daniel
JJ Tate
JM Schvartzman
JR Veatch
KE Hurov
KW Yuen
Kyungjae Myung
L Ungar
M Costanzo
M Kanemaki
Megan Kofoed
Michael Snyder
Michelle S. Bloom
MJL de Hoon
MP Andersen
N Izumi
P Kanellis
PA Futreal
Payal Sipahimalani
Peter C. Stirling
Philip Hieter
R Loewith
RD Paulsen
S Ben-Aroya
S Mnaimneh
S Smith
SA Forbes
Shay Ben-Aroya
Stephanie Smith
SY Carroll
T Kaizuka
TD Barber
Tejomayee Solanki-Patil
X Wang
Z Horejsí
Zhijian Li
Publication venue: Public Library of Science
Publication date: 01/01/2011

Chromosome instability (CIN) is observed in most solid tumors and is linked to somatic mutations in genome integrity maintenance genes. The spectrum of mutations that cause CIN is only partly known and it is not possible to predict a priori all pathways whose disruption might lead to CIN. To address this issue, we generated a catalogue of CIN genes and pathways by screening ∼2,000 reduction-of-function alleles for 90% of essential genes in Saccharomyces cerevisiae. Integrating this with published CIN phenotypes for other yeast genes generated a systematic CIN gene dataset comprising 692 genes. Enriched gene ontology terms defined cellular CIN pathways that, together with sequence orthologs, created a list of human CIN candidate genes, which we cross-referenced to published somatic mutation databases, revealing hundreds of mutated CIN candidate genes. Characterization of some poorly characterized CIN genes revealed short telomeres in mutants of the ASTRA/TTT components TTI1 and ASA1. High-throughput phenotypic profiling links ASA1 to TTT (Tel2-Tti1-Tti2) complex function and to TORC1 signaling via Tor1p stability, consistent with the role of TTT in PI3-kinase related kinase biogenesis. The comprehensive CIN gene list presented here in principle comprises all conserved eukaryotic genome integrity pathways. Deriving human CIN candidate genes from the list allows direct cross-referencing with tumor mutational data and thus identifies candidate mutations potentially driving CIN in tumors. Overall, the CIN gene spectrum reveals new chromosome biology and will help us to understand CIN phenotypes in human disease.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ScholarWorks@UNIST

Design of a liver cancer-specific selector for the analysis of circulating tumor DNA

Author: Guo Xueqin
Liu Jun
Ma Yuanyuan
Meng Rui
Mu Feng
Niu Mingshan
Sun Yan
Tang Heng
Wang Huimin
Wang Jun
Wei Xiaoming
Wu Gang
Xue Jun
Yang Yun
Publication venue: Spandidos Publications
Publication date: 01/01/2019

Copenhagen University Research Information System

A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE)

Author: Crichton Daniel J.
Mazumder Raja
Pan Yang
Shamsaddini Amirhossein
Simonyan Vahan
Smith Krista
Wu Tsung-Jung
Publication venue: Health Sciences Research Commons
Publication date: 01/01/2014

Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators have led to a rich repository of information on functional sites of genes and proteins. This information, along with variation-related annotation, can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for the presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform, HIVE (High-performance Integrated Virtual Environment), for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identification of novel or common nsSNVs that can be prioritized in validation studies.
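The core idea in this abstract, flagging nsSNVs that fall on annotated functional sites, can be sketched in a few lines. The snippet below is an illustrative assumption and not the BioMuta or HIVE implementation: the `Variant` record, the `nssnvs_at_functional_sites` helper, the accession `P00001` and all positions are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    protein: str   # protein accession (hypothetical values below)
    position: int  # 1-based residue position
    ref_aa: str    # reference amino acid
    alt_aa: str    # observed amino acid


def nssnvs_at_functional_sites(variants, functional_sites):
    """Keep variants that change the amino acid AND hit an annotated site.

    functional_sites maps a protein accession to a set of residue positions.
    """
    hits = []
    for v in variants:
        is_nonsynonymous = v.ref_aa != v.alt_aa
        sites = functional_sites.get(v.protein, set())
        if is_nonsynonymous and v.position in sites:
            hits.append(v)
    return hits


# Hypothetical toy annotation and variant calls, for illustration only.
functional_sites = {"P00001": {45, 102}}
variants = [
    Variant("P00001", 45, "R", "H"),  # non-synonymous, at a functional site
    Variant("P00001", 45, "R", "R"),  # synonymous, so excluded
    Variant("P00001", 60, "G", "D"),  # non-synonymous, but not at a site
    Variant("P99999", 45, "A", "T"),  # protein has no annotated sites
]

hits = nssnvs_at_functional_sites(variants, functional_sites)
print(hits)
```

Storing site positions in a set per protein keeps each lookup constant-time, which matters when scanning many variant calls.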

PubMed Central

George Washington University: Health Sciences Research Commons (HSRC)