thesis

The integration of gene and miRNA expression using pathway topology: a case study on Epithelial Ovarian Cancer

Abstract

Pathways are formal descriptions of the biological processes involving finely regulated structures by which a cell converts molecules or processes signals. The study of gene expression in terms of pathways is defined as pathway analysis and aims at identifying groups of functionally related genes that show coordinated expression changes. Recently, pathway analysis has moved from algorithms relying merely on gene lists to ones exploiting the topology that defines gene connections. A crucial, and unfortunately limiting, step for these novel methods is the availability of pathways as gene networks in which nodes are genes and edges are the relations between two elements. To this aim, we developed a pathway data interpreter, called graphite, able to uniformly store, process and convert pathway information into gene networks. graphite has been made publicly available as an R package within the Bioconductor platform. In the field of topological pathway analysis, graphite fills the existing gap between technical and methodological aspects. graphite i) allows more informative analyses to be performed on omics data and ii) allows new methods to be developed thanks to the increased accessibility of biological knowledge. However, the pathways of the four main public resources integrated into graphite (KEGG, Reactome, Biocarta and PID) still lack crucial interactors: the microRNAs. MicroRNAs are small non-coding RNAs that post-transcriptionally regulate gene expression. Their function on the messenger target is repressive, but their effect on transcription depends on the topology of the pathway in which the miRNA is involved. In the last decade, many targets have been discovered and experimentally validated, and dedicated databases now provide this information.
Thus, I worked on an extension of the graphite package able to integrate microRNAs into pathway topology by i) linking the non-coding RNAs to their validated target genes and ii) providing integrated networks suitable for topological pathway analyses. The feasibility of this approach has been validated in a specific biological context, the early stage of Epithelial Ovarian Cancer (EOC). EOC has long been considered a single disease. The emerging opinion, however, sees ovarian cancer as a general term that encompasses a group of histo-pathological subtypes sharing a common anatomic location. In collaboration with the Mario Negri Institute, 257 stage I EOC tumour biopsies were collected and stratified into training and validation sets. miRNA microarray data were used to generate the most highly reproducible signatures for each histotype through a dedicated resampling inferential strategy. qRT-PCR was used to validate the results in both the training and validation sets. The results indicate that the clear cell histotype is characterized by high expression levels of miR-30a and miR-30a*, while the mucinous histotype is characterized by high levels of miR-192 and miR-194, a pattern that, interestingly, is also found in mucinous non-ovarian tissues. Then, the integrative approach that combines mRNA and miRNA profiles using graphite was applied to identify the mucinous-specific regulatory circuits. Taken together, our findings demonstrate that EOC histotypes have discriminant regulatory circuits that drive the differentiation of the tumour environment. Our approach successfully guides us towards important biological results with interesting therapeutic implications in EOC.
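As a purely illustrative aid, the idea of linking miRNAs to their validated targets within a pathway gene network can be sketched in a few lines. The sketch below is written in Python for brevity and is not graphite's R interface; the function name `add_mirnas`, the edge representation, and all gene and miRNA identifiers are hypothetical placeholders, and the "repression" label simply reflects the repressive action of miRNAs on their targets described above.

```python
from typing import Dict, List, Set, Tuple

# An edge is (source node, target node, relation type).
Edge = Tuple[str, str, str]


def add_mirnas(pathway_edges: List[Edge],
               validated_targets: Dict[str, Set[str]]) -> List[Edge]:
    """Return the pathway edges plus one miRNA -> gene edge for every
    validated target that is already a node of the pathway.

    Hypothetical helper for illustration only; not part of graphite.
    """
    # Collect every gene that appears as a node in the pathway.
    genes = {node for src, dst, _ in pathway_edges for node in (src, dst)}

    extended = list(pathway_edges)
    for mirna in sorted(validated_targets):
        # Keep only targets that belong to this pathway.
        for target in sorted(validated_targets[mirna] & genes):
            extended.append((mirna, target, "repression"))
    return extended


# Toy pathway and toy target table (placeholder identifiers, not real data).
toy_pathway = [
    ("GENE_A", "GENE_B", "activation"),
    ("GENE_B", "GENE_C", "inhibition"),
]
toy_targets = {
    "miR-X": {"GENE_B", "GENE_Z"},  # GENE_Z is not in the pathway
    "miR-Y": {"GENE_C"},
}

integrated = add_mirnas(toy_pathway, toy_targets)
for edge in integrated:
    print(edge)
```

In this toy case the integrated network keeps the two original gene-gene edges and gains two miRNA-gene edges; the target that lies outside the pathway is ignored, so the miRNAs are added only where they can affect the pathway topology.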
