38 research outputs found
Increased entropy of signal transduction in the cancer metastasis phenotype
Studies into the statistical properties of biological networks have led to
important biological insights, such as the presence of hubs and hierarchical
modularity. There is also a growing interest in studying the statistical
properties of networks in the context of cancer genomics. However, relatively
little is known as to what network features differ between the cancer and
normal cell physiologies, or between different cancer cell phenotypes. Based on
the observation that frequent genomic alterations underlie a more aggressive
cancer phenotype, we asked if such an effect could be detectable as an increase
in the randomness of local gene expression patterns. Using a breast cancer gene
expression data set and a model network of protein interactions we derive
constrained weighted networks defined by a stochastic information flux matrix
reflecting expression correlations between interacting proteins. Based on this
stochastic matrix we propose and compute an entropy measure that quantifies the
degree of randomness in the local pattern of information flux around single
genes. By comparing the local entropies in the non-metastatic versus metastatic
breast cancer networks, we here show that breast cancers that metastasize are
characterised by a small yet significant increase in the degree of randomness
of local expression patterns. We validate this result in three additional
breast cancer expression data sets and demonstrate that local entropy better
characterises the metastatic phenotype than other non-entropy based measures.
We show that increases in entropy can be used to identify genes and signalling
pathways implicated in breast cancer metastasis. Further exploration of such
integrated cancer expression and protein interaction networks will therefore be
a fruitful endeavour. Comment: 5 figures, 2 Supplementary Figures and Table
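The local entropy idea described in this abstract can be sketched in a few lines. The abstract does not give the formulas, so the details here are assumptions made for illustration: the mapping from a correlation to a non-negative weight (`correlation_to_weight`), the row normalisation into a stochastic matrix, and the division by the log of the node degree so that values lie in [0, 1]. All function and variable names are hypothetical.

```python
import math


def correlation_to_weight(correlation):
    """Map a correlation in [-1, 1] to a weight in [0, 1] (assumed transform)."""
    return (1.0 + correlation) / 2.0


def local_entropies(adjacency, weights):
    """Return the normalised local entropy of each node with degree >= 2.

    adjacency: dict mapping node -> list of neighbouring nodes
    weights:   dict mapping (node, neighbour) -> non-negative edge weight
    """
    entropies = {}
    for node, neighbours in adjacency.items():
        total = sum(weights[(node, nb)] for nb in neighbours)
        if len(neighbours) < 2 or total == 0:
            continue  # a single outgoing edge carries no uncertainty
        entropy = 0.0
        for nb in neighbours:
            p = weights[(node, nb)] / total  # one row of the stochastic matrix
            if p > 0:
                entropy -= p * math.log(p)
        # Dividing by log(degree) makes nodes of different degree comparable.
        entropies[node] = entropy / math.log(len(neighbours))
    return entropies
```

Under these assumptions, a gene whose outgoing weights are uniform across its neighbours scores 1, and the score falls towards 0 as the flux concentrates on a single neighbour.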
On dynamic network entropy in cancer
The cellular phenotype is described by a complex network of molecular
interactions. Elucidating network properties that distinguish disease from the
healthy cellular state is therefore of critical importance for gaining
systems-level insights into disease mechanisms and ultimately for developing
improved therapies. By integrating gene expression data with a protein
interaction network to induce a stochastic dynamics on the network, we here
demonstrate that cancer cells are characterised by an increase in the dynamic
network entropy, compared to cells of normal physiology. Using a fundamental
relation between the macroscopic resilience of a dynamical system and the
uncertainty (entropy) in the underlying microscopic processes, we argue that
cancer cells will be more robust to random gene perturbations. In addition, we
formally demonstrate that gene expression differences between normal and cancer
tissue are anticorrelated with local dynamic entropy changes, thus providing a
systemic link between gene expression changes at the nodes and their local
network dynamics. In particular, we also find that genes which drive
cell-proliferation in cancer cells and which often encode oncogenes are
associated with reductions in the dynamic network entropy. In summary, our
results support the view that the observed increased robustness of cancer cells
to perturbation and therapy may be due to an increase in the dynamic network
entropy that allows cells to adapt to the new cellular stresses. Conversely,
genes that exhibit local flux entropy decreases in cancer may render cancer
cells more susceptible to targeted intervention and may therefore represent
promising drug targets. Comment: 10 pages, 3 figures, 4 tables. Submitte
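One standard way to attach a single entropy value to a stochastic dynamics on a network is the entropy rate of the corresponding Markov chain: the local row entropies averaged under the stationary distribution. The sketch below assumes that this quantity is what "dynamic network entropy" refers to, which the abstract does not state explicitly. `P` is a hypothetical row-stochastic matrix given as a list of lists, and the chain is assumed to be irreducible.

```python
import math


def stationary_distribution(P, tol=1e-12, max_iter=100000):
    """Approximate the stationary distribution of a row-stochastic matrix P.

    A "lazy" update (average of the current vector and one step of the chain)
    is used so that the iteration also converges for periodic chains.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        stepped = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        new = [0.5 * (a + b) for a, b in zip(pi, stepped)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def entropy_rate(P):
    """Entropy rate: local row entropies weighted by the stationary distribution."""
    pi = stationary_distribution(P)
    rate = 0.0
    for i, row in enumerate(P):
        rate -= pi[i] * sum(p * math.log(p) for p in row if p > 0)
    return rate
```

In this picture a node lowers the overall value either by having a more deterministic row or by drawing stationary mass towards low-entropy rows.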
An Integrative -omics Approach to Identify Functional Sub-Networks in Human Colorectal Cancer
Emerging evidence indicates that gene products implicated in human cancers often cluster together in “hot spots” in protein-protein interaction (PPI) networks. Additionally, small sub-networks within PPI networks that demonstrate synergistic differential expression with respect to tumorigenic phenotypes were recently shown to be more accurate classifiers of disease progression when compared to single targets identified by traditional approaches. However, many of these studies rely exclusively on mRNA expression data, a useful but limited measure of cellular activity. Proteomic profiling experiments provide information at the post-translational level, yet they generally screen only a limited fraction of the proteome. Here, we demonstrate that integration of these complementary data sources with a “proteomics-first” approach can enhance the discovery of candidate sub-networks in cancer that are well-suited for mechanistic validation in disease. We propose that small changes in the mRNA expression of multiple genes in the neighborhood of a protein-hub can be synergistically associated with significant changes in the activity of that protein and its network neighbors. Further, we hypothesize that proteomic targets with significant fold change between phenotype and control may be used to “seed” a search for small PPI sub-networks that are functionally associated with these targets. To test this hypothesis, we select proteomic targets having significant expression changes in human colorectal cancer (CRC) from two independent 2-D gel-based screens. Then, we use random walk based models of network crosstalk and develop novel reference models to identify sub-networks that are statistically significant in terms of their functional association with these proteomic targets. Subsequently, using an information-theoretic measure, we evaluate synergistic changes in the activity of identified sub-networks based on genome-wide screens of mRNA expression in CRC. 
Cross-classification experiments to predict disease class show excellent performance using only a few sub-networks, underscoring the strength of the proposed approach in discovering relevant and reproducible sub-networks.
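The idea of seeding a network search from proteomic targets can be illustrated with a generic random walk with restart. This is a stand-in, not the authors' crosstalk model or their reference models, which the abstract does not specify. The function name, the `restart` probability, and the treatment of isolated nodes are assumptions; the adjacency dict is assumed to be symmetric and to contain every seed as a key.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=10000):
    """Return steady-state visiting probabilities of a walk that restarts at the seeds.

    adjacency: dict mapping node -> list of neighbours (undirected network)
    seeds:     nodes at which the walker restarts (e.g. proteomic targets)
    """
    nodes = list(adjacency)
    seeds = list(seeds)
    start = {n: 0.0 for n in nodes}
    for s in seeds:
        start[s] = 1.0 / len(seeds)
    prob = dict(start)
    for _ in range(max_iter):
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # Assumed convention: a walker on an isolated node jumps back to the seeds.
                for s in seeds:
                    nxt[s] += (1.0 - restart) * prob[n] / len(seeds)
                continue
            share = (1.0 - restart) * prob[n] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        if max(abs(nxt[n] - prob[n]) for n in nodes) < tol:
            return nxt
        prob = nxt
    return prob
```

Nodes with high steady-state probability are close to the seeds in the network sense, so they are natural candidates for a seeded sub-network. Deciding which scores are statistically significant would still need a reference model of the kind the abstract mentions.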
Prediction and Testing of Biological Networks Underlying Intestinal Cancer
Colorectal cancer progresses through an accumulation of somatic mutations, some of which reside in so-called “driver” genes that provide a growth advantage to the tumor. To identify points of intersection between driver gene pathways, we implemented a network analysis framework using protein interactions to predict likely connections – both precedented and novel – between key driver genes in cancer. We applied the framework to find significant connections between two genes, Apc and Cdkn1a (p21), known to be synergistic in tumorigenesis in mouse models. We then assessed the functional coherence of the resulting Apc-Cdkn1a network by engineering in vivo single node perturbations of the network: mouse models mutated individually at Apc (Apc1638N+/−) or Cdkn1a (Cdkn1a−/−), followed by measurements of protein and gene expression changes in intestinal epithelial tissue. We hypothesized that if the predicted network is biologically coherent (functional), then the predicted nodes should associate more specifically with dysregulated genes and proteins than do randomly selected genes and proteins. The predicted Apc-Cdkn1a network was significantly perturbed at the mRNA level by both single gene knockouts, and the predictions were also strongly supported based on physical proximity and mRNA coexpression of proteomic targets. These results support the functional coherence of the proposed Apc-Cdkn1a network and also demonstrate how network-based predictions can be statistically tested using high-throughput biological data.
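The comparison against randomly selected genes can be illustrated with a simple empirical test: count how many predicted nodes are dysregulated, then check how often a random gene set of the same size does at least as well. This is a generic sketch, not the statistical procedure used in the study. The function name, the number of trials, and the fixed random seed are assumptions.

```python
import random


def empirical_enrichment_p(predicted, dysregulated, background, trials=2000, seed=0):
    """Return (observed overlap, empirical p-value) for predicted vs random gene sets.

    predicted:    genes in the predicted network
    dysregulated: genes found to change in the perturbation experiment
    background:   all genes that could have been selected
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    predicted = set(predicted)
    dysregulated = set(dysregulated)
    observed = len(predicted & dysregulated)
    pool = sorted(background)
    at_least_as_good = 0
    for _ in range(trials):
        sample = rng.sample(pool, len(predicted))
        if len(dysregulated.intersection(sample)) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_good + 1) / (trials + 1)
```

A small p-value indicates that the predicted nodes overlap with dysregulated genes more than random sets of equal size would be expected to.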
Candidate gene prioritization by network analysis of differential expression using machine learning approaches
Background: Discovering novel disease genes is still challenging for diseases for which no prior knowledge - such as known disease genes or disease-related pathways - is available. Performing genetic studies frequently results in large lists of candidate genes of which only a few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge with experimental data on differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network.
Results: We have proposed three strategies for scoring disease candidate genes that rely on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that our four strategies could outperform this standard procedure and that the best results were obtained using the Heat Kernel Diffusion Ranking, leading to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure, which ranked the knockout gene on average at position 17 with an AUC value of 83.7%.
Conclusion: In this study we could identify promising candidate genes using network-based machine learning approaches even if no knowledge is available about the disease or phenotype.
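Heat kernel diffusion can be sketched as follows: differential expression values are spread over the network by applying exp(-beta L) to them, where L is the graph Laplacian, so that a candidate gene inherits signal from differentially expressed neighbours. The abstract does not give the exact kernel, normalisation, or parameters used, so the unnormalised Laplacian, the value of `beta`, and the truncated Taylor series below are assumptions for illustration only.

```python
def heat_kernel_scores(adjacency, diff_expr, beta=0.5, terms=40):
    """Approximate exp(-beta * L) applied to diff_expr, with L = D - A.

    adjacency: dict mapping node -> list of neighbours (symmetric network)
    diff_expr: dict mapping node -> differential expression level (missing = 0)
    """
    nodes = list(adjacency)

    def laplacian_times(vec):
        # (L v)_n = degree(n) * v_n - sum of v over the neighbours of n
        return {n: len(adjacency[n]) * vec[n] - sum(vec[nb] for nb in adjacency[n])
                for n in nodes}

    term = {n: float(diff_expr.get(n, 0.0)) for n in nodes}
    total = dict(term)
    for k in range(1, terms + 1):
        lv = laplacian_times(term)
        term = {n: -beta / k * lv[n] for n in nodes}  # k-th Taylor term
        for n in nodes:
            total[n] += term[n]
    return total


def rank_candidates(candidates, scores):
    """Order candidate genes from highest to lowest diffused score."""
    return sorted(candidates, key=lambda gene: scores[gene], reverse=True)
```

With this construction, a candidate that is not differentially expressed itself can still rank highly if its network neighbourhood is, which is the behaviour the abstract describes. The truncated series is only adequate for small networks and moderate `beta`.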
Protein Networks as Logic Functions in Development and Cancer
Many biological and clinical outcomes are based not on single proteins, but on modules of proteins embedded in protein networks. A fundamental question is how the proteins within each module contribute to the overall module activity. Here, we study the modules underlying three representative biological programs related to tissue development, breast cancer metastasis, or progression of brain cancer, respectively. For each case we apply a new method, called Network-Guided Forests, to identify predictive modules together with logic functions which tie the activity of each module to the activity of its component genes. The resulting modules implement a diverse repertoire of decision logic which cannot be captured using the simple approximations suggested in previous work such as gene summation or subtraction. We show that in cancer, certain combinations of oncogenes and tumor suppressors exert competing forces on the system, suggesting that medical genetics should move beyond cataloguing individual cancer genes to cataloguing their combinatorial logic.
Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes
Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, the state-of-the-art methods used so far have often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state-of-the-art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploiting these data for personalized cancer therapies in clinical practice.
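A PageRank-style ranking that mixes each gene's own prognostic relevance with the relevance of its network neighbours can be sketched as below. This is not the authors' exact algorithm, which the abstract does not spell out. The update rule, the `damping` value, and the use of a precomputed per-gene relevance score (for example, the absolute correlation of expression with survival time) are assumptions.

```python
def network_rank(adjacency, relevance, damping=0.3, tol=1e-10, max_iter=1000):
    """Iterate rank = (1 - damping) * relevance + damping * (rank spread over neighbours).

    adjacency: dict mapping gene -> list of neighbouring genes
    relevance: dict mapping gene -> non-negative prognostic relevance score
    """
    genes = list(adjacency)
    rank = {g: float(relevance[g]) for g in genes}
    for _ in range(max_iter):
        nxt = {g: (1.0 - damping) * relevance[g] for g in genes}
        for g in genes:
            neighbours = adjacency[g]
            if neighbours:
                # Each gene passes a damped share of its rank to every neighbour.
                share = damping * rank[g] / len(neighbours)
                for nb in neighbours:
                    nxt[nb] += share
        if max(abs(nxt[g] - rank[g]) for g in genes) < tol:
            return nxt
        rank = nxt
    return rank
```

With `damping=0` the ranking reduces to the expression-only relevance scores. Larger values give more weight to the network, so a gene with modest relevance can rise if it is connected to highly relevant genes.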
Cell cycle and aging, morphogenesis, and response to stimuli genes are individualized biomarkers of glioblastoma progression and survival
Background: Glioblastoma is a complex multifactorial disorder that has swift and devastating consequences. Few genes have been consistently identified as prognostic biomarkers of glioblastoma survival. The goal of this study was to identify general and clinical-dependent biomarker genes and biological processes of three complementary events: lifetime, overall and progression-free glioblastoma survival.
Methods: A novel analytical strategy was developed to identify general associations between the biomarkers and glioblastoma, and associations that depend on cohort groups, such as race, gender, and therapy. Gene network inference, cross-validation and functional analyses further supported the identified biomarkers.
Results: A total of 61, 47 and 60 gene expression profiles were significantly associated with lifetime, overall, and progression-free survival, respectively. The vast majority of these genes have been previously reported to be associated with glioblastoma (35, 24, and 35 genes, respectively) or with other cancers (10, 19, and 15 genes, respectively) and the rest (16, 4, and 10 genes, respectively) are novel associations. Pik3r1, E2f3, Akr1c3, Csf1, Jag2, Plcg1, Rpl37a, Sod2, Topors, Hras, Mdm2, Camk2g, Fstl1, Il13ra1, Mtap and Tp53 were associated with multiple survival events. Most genes (from 90 to 96%) were associated with survival in a general or cohort-independent manner and thus the same trend is observed across all clinical levels studied. The most extreme associations between profiles and survival were observed for Syne1, Pdcd4, Ighg1, Tgfa, Pla2g7, and Paics. Several genes were found to have a cohort-dependent association with survival and these associations are the basis for individualized prognostic tools and gene-based therapies. C2, Egfr, Prkcb, Igf2bp3, and Gdf10 had gender-dependent associations; Sox10, Rps20, Rab31, and Vav3 had race-dependent associations; Chi3l1, Prkcb, Polr2d, and Apool had therapy-dependent associations. Biological processes associated with glioblastoma survival included morphogenesis, cell cycle, aging, response to stimuli, and programmed cell death.
Conclusions: Known biomarkers of glioblastoma survival were confirmed, and new general and clinical-dependent gene profiles were uncovered. The comparison of biomarkers across glioblastoma phases and functional analyses offered insights into the role of genes. These findings support the development of more accurate and personalized prognostic tools and gene-based therapies that improve the survival and quality of life of individuals afflicted by glioblastoma multiforme.