Abstract

Background: Systematic approaches for identifying proteins involved in different types of cancer are needed. Experimental techniques such as microarrays are being used to characterize cancer, but validating their results can be a laborious task. Computational approaches are used to prioritize among genes putatively involved in cancer, usually by further analyzing experimental data.

Results: We implemented a systematic method using the PIANA software that predicts cancer involvement of genes by integrating heterogeneous datasets. Specifically, we produced lists of genes likely to be involved in cancer by relying on: (i) protein-protein interactions; (ii) differential expression data; and (iii) structural and functional properties of cancer genes. The integrative approach that combines multiple sources of data obtained positive predictive values ranging from 23% (on a list of 811 genes) to 73% (on a list of 22 genes), outperforming the use of any of the data sources alone. We analyze a list of 20 cancer gene predictions, finding that most of them have been recently linked to cancer in the literature.

Conclusion: Our approach to identifying and prioritizing candidate cancer genes can be used to produce lists of genes likely to be involved in cancer. Our results suggest that differential expression studies yielding high numbers of candidate cancer genes can be filtered using protein interaction networks.
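The filtering idea stated in the conclusion, and the positive predictive value (PPV) used to score each gene list, can be illustrated with a minimal sketch. This is not the PIANA API or the authors' actual pipeline: the function names, the `min_partners` threshold, the gene identifiers, and the toy data are all hypothetical and chosen only for illustration. The sketch assumes that a differentially expressed gene is predicted as a cancer gene when it interacts with at least `min_partners` known cancer genes, and that PPV is the fraction of predictions confirmed by an independent reference set.

```python
from typing import Dict, Set


def predict_cancer_genes(
    diff_expressed: Set[str],
    interactions: Dict[str, Set[str]],
    known_cancer_genes: Set[str],
    min_partners: int = 1,
) -> Set[str]:
    """Keep differentially expressed genes that interact with at least
    `min_partners` known cancer genes (hypothetical filtering rule)."""
    predictions = set()
    for gene in diff_expressed:
        partners = interactions.get(gene, set())
        if len(partners & known_cancer_genes) >= min_partners:
            predictions.add(gene)
    return predictions


def positive_predictive_value(predictions: Set[str], confirmed: Set[str]) -> float:
    """PPV = true positives / all predictions; 0.0 for an empty list."""
    if not predictions:
        return 0.0
    return len(predictions & confirmed) / len(predictions)


# Toy data (entirely made up): G* are candidates, K* are known cancer genes.
toy_interactions = {
    "G1": {"K1", "K2"},
    "G2": {"X1"},
    "G3": {"K1"},
    "G4": set(),
}
toy_known = {"K1", "K2"}
toy_diff_expressed = {"G1", "G2", "G3", "G4"}
toy_confirmed = {"G1"}  # hypothetical independent validation set

toy_predictions = predict_cancer_genes(toy_diff_expressed, toy_interactions, toy_known)
toy_ppv = positive_predictive_value(toy_predictions, toy_confirmed)
```

Raising `min_partners` in this sketch shortens the prediction list, which mirrors the trade-off reported in the abstract between list size and PPV (811 genes at 23% versus 22 genes at 73%), although the actual criteria behind those lists are not specified here.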




BMC Bioinformatics

Predicting cancer involvement of genes from heterogeneous data


RA is supported by a grant from the Spanish Ministerio de Ciencia y Tecnología (MCyT, BIO2002-03609). The work has been supported by grants from the Spanish Ministerio de Educación y Ciencia (MEC, BIO02005-00533) and from the Spanish Ministerio de Ciencia y Tecnología (PROFIT PSE-010000-2007-1 and FIT-350300-2006-40/41/42).

Aragüés Peleato, Ramón

Sander, Chris

Oliva Miguel, Baldomero
