Abstract

Background: Identification of novel cancer-causing genes is one of the main goals in cancer research. The rapid accumulation of genome-wide protein-protein interaction (PPI) data in humans has provided a new basis for studying the topological features of cancer genes in cellular networks. It is important to integrate multiple genomic data sources, including PPI networks, protein domains and Gene Ontology (GO) annotations, to facilitate the identification of cancer genes.

Methods: Topological features of the PPI network, protein domain compositions, enrichment of Gene Ontology categories, and sequence and evolutionary conservation features were extracted and compared between cancer genes and other genes. The predictive power of various classifiers for identifying cancer genes was evaluated by cross-validation. A subset of the predictions was validated experimentally using siRNA knockdown and viability assays in the human colon cancer cell line DLD-1.

Results: Cross-validation showed that classifiers based on support vector machines (SVMs) performed best when they included the topological features from the PPI network, protein domain compositions and GO annotations. We then applied the trained SVM classifier to human genes to prioritize putative cancer genes. siRNA knockdown of several SVM-predicted cancer genes greatly reduced cell viability in the human colon cancer cell line DLD-1.

Conclusion: Topological features of PPI networks, protein domain compositions and GO annotations are good predictors of cancer genes. The SVM classifier integrates multiple features and is therefore useful for prioritizing candidate cancer genes for experimental validation.
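As a rough illustration of what "topological features of the PPI network" can mean in practice, the following minimal sketch computes two common per-protein measures, degree and local clustering coefficient, from an undirected interaction list. The abstract does not say which topological features were used, so the choice of these two measures is an assumption, and the function names and the toy interaction list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def topological_features(adjacency):
    """Return {protein: (degree, local_clustering_coefficient)}.

    Assumption: degree and clustering coefficient stand in for the
    unspecified topological features mentioned in the abstract.
    """
    features = {}
    for protein, neighbours in adjacency.items():
        degree = len(neighbours)
        if degree < 2:
            clustering = 0.0  # undefined for fewer than two neighbours; use 0
        else:
            # Count interactions among the protein's own neighbours.
            links = sum(
                1 for u, v in combinations(neighbours, 2) if v in adjacency[u]
            )
            clustering = 2.0 * links / (degree * (degree - 1))
        features[protein] = (degree, clustering)
    return features


# Toy, made-up interaction list (hypothetical labels, not real PPI data).
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4")]
toy_features = topological_features(build_adjacency(toy_edges))
```

Feature vectors of this kind could then be combined with domain and GO features and passed to a classifier; that step is not shown here.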

David P Davis

James Lee

Kangyu Zhang

Li Li

Shaun Cordes

Zhijun Tang

English

Discovering cancer genes by integrating network and functional properties

Springer - Publisher Connector

Directory of Open Access Journals

BMC Medical Genomics
