6 research outputs found
Machine Learning Gene Signature to Metastatic ccRCC Based on ceRNA Network
Clear-cell renal-cell carcinoma (ccRCC) is a silently developing pathology with a high rate of metastasis. The activity of coding genes in metastatic progression is well known. Newer studies evaluate the association with non-coding genes, such as competing endogenous RNA (ceRNA). This study aims to build a ceRNA network and a gene signature for ccRCC associated with metastatic development and to analyze their biological functions. Using data from The Cancer Genome Atlas (TCGA), we constructed the ceRNA network with differentially expressed genes, assembled nine preliminary gene signatures from eight feature selection techniques, and evaluated the classification metrics to choose a final signature. We then performed a genomic analysis, a risk analysis, and a functional annotation analysis. We present an 11-gene signature: SNHG15, AF117829.1, hsa-miR-130a-3p, hsa-miR-381-3p, BTBD11, INSR, HECW2, RFLNB, PTTG1, HMMR, and RASD1. We assessed the generalization of the signature on an external dataset from the International Cancer Genome Consortium (ICGC-RECA), obtaining an Area Under the Curve (AUC) of 81.5%. The genomic analysis located the signature genes on chromosomes with highly mutated regions. hsa-miR-130a-3p, AF117829.1, hsa-miR-381-3p, and PTTG1 were significantly related to patient survival and metastatic development. Additionally, functional annotation identified pathways relevant to tumor development and cell cycle control, such as RNA polymerase II transcription regulation and cell control. The gene signature analysis within the ceRNA network, together with literature evidence, suggests that the lncRNAs act as “sponges” for the microRNAs (miRNAs). This gene signature therefore combines coding and non-coding genes that could serve as potential biomarkers for a better understanding of ccRCC.
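As an illustration of the Area Under the Curve metric reported above, the following minimal sketch computes ROC AUC as the fraction of (positive, negative) sample pairs that a score ranks correctly. The function name `roc_auc` and the toy scores are assumptions made for this example; they are not the study's code or data.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise ranking: the probability that a randomly chosen
    positive sample receives a higher score than a randomly chosen negative
    one (ties count as half a win)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for an external cohort (not study data).
example_scores = [0.10, 0.40, 0.35, 0.80]
example_labels = [0, 0, 1, 1]  # 1 = metastatic, 0 = non-metastatic
example_auc = roc_auc(example_scores, example_labels)
print(f"AUC = {example_auc:.2f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; library implementations typically use a sort-based computation.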
An Integrated Data Analysis Using Bioinformatics and Random Forest to Predict Prognosis of Patients With Squamous Cell Lung Cancer
Lung cancer is the leading cause of cancer death worldwide, regardless of gender. Among the types of lung cancer, Lung Squamous Cell Carcinoma (LUSC) is the second most common, characterized by diagnosis in advanced stages, a poor prognosis, and a high association with smoking. Given the severity of lung cancer, it is essential to understand its molecular mechanisms. In this context, this study uses transcriptomic and clinical data to implement bioinformatics pipelines and machine learning, through random forest models, to predict patients’ overall survival and obtain a gene signature of LUSC for tumor progression. We analyzed clinical and molecular data from the LUSC-TCGA project and performed differential expression analyses (DEA) comparing normal tissues against tumor tissues. Based on DEA-selected genes, the patients were divided into three clusters, followed by feature selection and classification. The classification reached an accuracy close to 70% for the three clusters. We also performed a functional enrichment analysis. The clustering analysis revealed enriched genes in cluster 2, such as CDT1, CENPI, and NLGN1, associated with the epithelial-to-mesenchymal transition (EMT) process. Our approach facilitated the identification of genes that are biologically relevant to the LUSC development process, including significant genes for predicting patient survival, such as ALDH3B1, C7, FAM83A, FOSB, GCGR, BMP7, PPP1R27, and AQP1, and putative therapeutic targets for LUSC, such as FAM83A, CAV1, TNS4, EIF4G1, TFAP2A, GCGR, and PPP1R27.
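The differential expression step above (tumor versus normal tissue) can be sketched in a very simplified form as a log2 fold-change filter. This is only an illustration under stated assumptions: the abstract does not name the DEA tool or thresholds used, real pipelines also apply statistical testing, and the gene names, pseudocount, and threshold below are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(tumor, normal, pseudocount=1.0):
    """log2 ratio of mean tumor to mean normal expression.
    The pseudocount (an assumption) avoids division by zero."""
    return math.log2((mean(tumor) + pseudocount) / (mean(normal) + pseudocount))


def select_de_genes(expression, threshold=1.0):
    """Keep genes whose absolute log2 fold change meets the threshold.
    `expression` maps gene name -> (tumor values, normal values)."""
    selected = {}
    for gene, (tumor, normal) in expression.items():
        lfc = log2_fold_change(tumor, normal)
        if abs(lfc) >= threshold:
            selected[gene] = lfc
    return selected


# Hypothetical expression values (not study data).
toy_expression = {
    "GENE_UP": ([7.0, 7.0], [1.0, 1.0]),    # higher in tumor
    "GENE_FLAT": ([3.0, 3.0], [3.0, 3.0]),  # unchanged
    "GENE_DOWN": ([0.0, 0.0], [3.0, 3.0]),  # lower in tumor
}
toy_selected = select_de_genes(toy_expression)
print(toy_selected)
```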
A Novel Machine Learning 13-Gene Signature: Improving Risk Analysis and Survival Prediction for Clear Cell Renal Cell Carcinoma Patients
Patients with clear cell renal cell carcinoma (ccRCC) have poor survival outcomes, especially once the disease has metastasized. It is of paramount importance to identify biomarkers in genomic data that could help predict the aggressiveness of ccRCC and its resistance to drugs. We therefore conducted a study to evaluate existing gene signatures and to propose a novel one with higher predictive power and better generalization than the former signatures. Using ccRCC cohorts of The Cancer Genome Atlas (TCGA-KIRC) and the International Cancer Genome Consortium (ICGC-RECA), we evaluated linear survival models of Cox regression with 14 signatures and six methods of feature selection, and performed functional analysis and differential gene expression approaches. In this study, we established a 13-gene signature (AR, AL353637.1, DPP6, FOXJ1, GNB3, HHLA2, IL4, LIMCH1, LINC01732, OTX1, SAA1, SEMA3G, ZIC2) whose expression levels can predict distinct outcomes of patients with ccRCC. Moreover, we compared our signature with others from the literature. The best-performing gene signature was achieved using the ensemble method Min-Redundancy and Max-Relevance (mRMR). This signature has unique features in comparison to the others, such as generalization across different cohorts and functional enrichment in significant pathways: urothelial carcinoma, chronic kidney disease, transitional cell carcinoma, and nephrolithiasis. Of the 13 genes in our signature, eight are known to be correlated with ccRCC patient survival and four are immune-related. Our model showed a performance of 0.82 using the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) metric, and it generalized well between the cohorts. Our findings revealed two clusters of genes, one with high expression (SAA1, OTX1, ZIC2, LINC01732, GNB3, and IL4) and one with low expression (AL353637.1, AR, HHLA2, LIMCH1, SEMA3G, DPP6, and FOXJ1), both of which are correlated with poor prognosis. This signature can potentially be used in clinical practice to support patient treatment care and follow-up.
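To make the Min-Redundancy and Max-Relevance idea mentioned above concrete, the sketch below performs a greedy selection that rewards features associated with the target and penalizes features similar to those already chosen. It is an assumption-laden illustration: absolute Pearson correlation stands in for the mutual information used in the original mRMR formulation, the study's actual implementation is not described in the abstract, and the names `pearson`, `mrmr_select`, and the toy genes are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def mrmr_select(features, target, k):
    """Greedy mRMR (difference form): at each step pick the feature that
    maximizes relevance minus mean redundancy with the selected features."""
    relevance = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: gene_a_copy is nearly identical to gene_a, while gene_b
# is less relevant on its own but carries independent information.
toy_features = {
    "gene_a": [1.0, -1.0, 1.0, -1.0],
    "gene_a_copy": [0.9, -1.1, 1.1, -0.9],
    "gene_b": [1.0, 1.0, -1.0, -1.0],
}
toy_target = [3.0, -1.0, 1.0, -3.0]
toy_signature = mrmr_select(toy_features, toy_target, k=2)
print(toy_signature)
```

Ranking by relevance alone would pick `gene_a` and `gene_a_copy` in this toy case; the redundancy penalty is what moves `gene_b` into the second slot.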