According to the International Human Genome Sequencing Consortium (2004) [43], less than 2% of the human genome codes for proteins. This finding surprised the scientific community. Since then, many researchers have turned their attention to the noncoding part of the genome, and there has been an explosion of research addressing the role of the remaining 98% of the human genome that is not translated into protein. This research shows that transcription is not limited to the protein-coding regions of the genome; rather, more than 90% of the genome is likely to be transcribed [43]. This results in the transcription of tens of thousands of long noncoding RNAs (lncRNAs) with little or no coding potential. However, the molecular mechanisms and functions of lncRNAs remain an open research topic. Although the functions of a limited number of lncRNAs have been identified, there is still a gap in identifying the functions of novel lncRNAs.
This project implements several computational methods to predict the function of novel lncRNAs identified from TCGA glioblastoma multiforme samples. The methods used for functional prediction include both expression-based and sequence-based analysis approaches. In the expression-based analysis, genes that are co-expressed with lncRNAs are used to predict possible functional relationships. In the sequence-based analysis, gene-protein and lncRNA-protein interactions, together with miRNA-lncRNA interactions, are considered in making the functional predictions.
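The expression-based step can be illustrated with a minimal sketch. It assumes Pearson correlation as the co-expression measure and a cutoff of 0.5; the function name `coexpressed_genes`, the gene labels, and the expression values are hypothetical and are not taken from the project's data or code.

```python
import numpy as np

def coexpressed_genes(lncrna_expr, gene_expr, gene_names, threshold=0.5):
    """Return (gene, r) pairs whose Pearson r with the lncRNA exceeds threshold.

    lncrna_expr: expression of one lncRNA across samples, shape (n_samples,)
    gene_expr:   expression of genes across the same samples, shape (n_genes, n_samples)
    """
    x = np.asarray(lncrna_expr, dtype=float)
    y = np.asarray(gene_expr, dtype=float)
    xc = x - x.mean()                          # centre the lncRNA profile
    yc = y - y.mean(axis=1, keepdims=True)     # centre each gene profile
    denom = np.sqrt((xc ** 2).sum()) * np.sqrt((yc ** 2).sum(axis=1))
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (yc @ xc) / denom
    r = np.nan_to_num(r)                       # constant profiles get r = 0
    hits = [(g, float(v)) for g, v in zip(gene_names, r) if v > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy data (hypothetical): one lncRNA and four genes measured in five samples.
lncrna = [1, 2, 3, 4, 5]
genes = [
    [2, 4, 6, 8, 10],   # GENE_A: rises with the lncRNA
    [5, 4, 3, 2, 1],    # GENE_B: falls as the lncRNA rises
    [1, 1, 1, 1, 1],    # GENE_C: constant expression
    [1, 3, 2, 5, 4],    # GENE_D: partially correlated
]
names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
hits = coexpressed_genes(lncrna, genes, names)
```

Under this sketch, genes passing the cutoff would then be taken forward for functional annotation; anti-correlated and constant genes are excluded because only positive correlations above the threshold are kept.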
The integrated functional prediction shows that the novel lncRNA TCGA_gbm3-153501, which is co-expressed with the THBS1 gene with a correlation coefficient of more than 0.5, is predicted to function in cell-to-cell and cell-to-matrix interactions, platelet aggregation, angiogenesis, and tumorigenesis [202]. MSI1, RBM3 and RBM8A are RNA-binding proteins (RBPs) that have binding sites both on lncRNAs among the top five differentially expressed lncRNAs (TCGA_gbm-2-104096501, TCGA_gbm-3-153501, TCGA_gbm-5-63687001 and TCGA_gbm-17-10671251) and on IGF2, which is among the top 10 differentially expressed genes. Therefore, these lncRNAs are predicted to have a functional role in cell proliferation and the maintenance of stem cells in the central nervous system.
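The sequence-based reasoning, in which an lncRNA and a gene are linked when the same RBPs have binding sites on both, can be sketched as a set intersection. This is only an illustration: the function name `shared_binders` and the layout of the binding-site table are assumptions, the entries `RBP_X` and `RBP_Y` are placeholders, and the toy table is not the project's actual prediction output.

```python
def shared_binders(binding_sites, gene, lncrnas):
    """Map each lncRNA to the sorted list of RBPs that also bind `gene`.

    binding_sites: dict from transcript name to the set of RBPs with a
                   predicted binding site on that transcript.
    """
    gene_rbps = binding_sites.get(gene, set())
    return {
        lnc: sorted(binding_sites.get(lnc, set()) & gene_rbps)
        for lnc in lncrnas
    }

# Toy binding-site table (hypothetical layout; RBP_X and RBP_Y are placeholders).
binding_sites = {
    "IGF2": {"MSI1", "RBM3", "RBM8A", "RBP_X"},
    "TCGA_gbm-2-104096501": {"MSI1", "RBM3", "RBM8A"},
    "TCGA_gbm-3-153501": {"MSI1", "RBM3", "RBM8A", "RBP_Y"},
}

shared = shared_binders(
    binding_sites,
    gene="IGF2",
    lncrnas=["TCGA_gbm-2-104096501", "TCGA_gbm-3-153501"],
)
for lnc, rbps in shared.items():
    print(lnc, rbps)
```

An lncRNA with a non-empty list shares at least one RBP with the gene, which is the kind of evidence used above to suggest a related function.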