Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees
BACKGROUND: Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and logistic regression (LR), and two composite models, DT-ANN and DT-LR. Four breast cancer datasets from the Gene Expression Omnibus were pooled to predict five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann–Whitney U test and 20-fold cross-validation were used to identify the 100 candidate genes with the most significant p-values. The predictive power of the DT, LR and ANN models was assessed using accuracy and the area under the ROC curve. The associated genes were evaluated using Cox regression. RESULTS: The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and showed the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression with a 3.53-fold (95% CI: 2.24-5.58) increased risk of breast cancer five-year recurrence… CONCLUSIONS: The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists can share this genetic information with patients when interpreting gene expression profiles related to breast cancer recurrence
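The gene-screening step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it omits the bootstrap and the 20-fold cross-validation, and it ranks genes by the single-gene area under the ROC curve derived from the Mann–Whitney U statistic (AUC = U / (n1 × n2)) rather than by p-values. The gene names, expression values and relapse labels are hypothetical toy data.

```python
from typing import Dict, List, Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for x versus y: each pair with x > y counts 1, each tie 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def auc_from_u(x: Sequence[float], y: Sequence[float]) -> float:
    """Single-gene AUC, using the identity AUC = U / (n1 * n2)."""
    return mann_whitney_u(x, y) / (len(x) * len(y))


def rank_genes(expr: Dict[str, List[float]], relapse: List[int], top_k: int) -> List[str]:
    """Return the top_k genes whose AUC is farthest from 0.5 (no separation)."""
    scores = {}
    for gene, values in expr.items():
        relapsed = [v for v, r in zip(values, relapse) if r == 1]
        relapse_free = [v for v, r in zip(values, relapse) if r == 0]
        scores[gene] = abs(auc_from_u(relapsed, relapse_free) - 0.5)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: six subjects, 1 = relapse within five years.
relapse = [1, 1, 1, 0, 0, 0]
expr = {
    "GENE_A": [5.1, 4.8, 5.5, 2.0, 2.2, 1.9],
    "GENE_B": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],
    "GENE_C": [1.0, 1.2, 0.9, 4.0, 4.1, 1.1],
}
top_genes = rank_genes(expr, relapse, top_k=2)
print(top_genes)
```

Ranking by distance from 0.5 treats over- and under-expression in relapsed subjects symmetrically; a real analysis would use a tested library implementation and correct for multiple comparisons.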
Gene-Function-Based Clusters Explore Intricate Networks of Gene Expression of Circulating Tumor Cells in Patients with Colorectal Cancer
Colorectal cancer (CRC) is a complex disease characterized by dynamically deregulated gene expression and crosstalk between signaling pathways. In this study, a new approach based on gene-function-based clusters was introduced to explore the CRC-associated networks of gene expression. Each cluster contained genes involved in coordinated regulatory activity, such as RAS signaling, the cell cycle process, transcription, or translation. A retrospective case–control study was conducted with the inclusion of 119 patients with histologically confirmed colorectal cancer and 308 controls. The quantitative expression data of 15 genes were obtained from the peripheral blood samples of all participants to investigate cluster–gene and gene–gene interactions. DUSP6, MDM2, and EIF2S3 were consistently selected as CRC-associated factors with high significance in all logistic models. CPEB4 became an insignificant factor only when combined with the clusters for cell cycle processes and for transcription. The CPEB4/DUSP6 complex was a prerequisite for the significance of MMD, whereas EXT2, RNF4, ZNF264, WEE1, and MCM4 were affected by more than two clusters. Intricate networks among MMD, RAS signaling factors (DUSP6, GRB2, and NF1), and translation factors (EIF2S3, CPEB4, and EXT2) were also revealed. Our results suggest that limited G1/S transition, uncontrolled DNA replication, and the cap-independent initiation of translation may be dominant and concurrent scenarios in circulating tumor cells derived from colorectal cancer. This gene-function-based cluster approach is simple and useful for revealing intricate CRC-associated gene expression networks. These findings may provide clues to the metastatic mechanisms of circulating tumor cells in patients with colorectal cancer
CPEB4 and IRF4 expression in peripheral mononuclear cells are potential prognostic factors for advanced lung cancer
Lung cancer is a heterogeneous disease with varied outcomes. Molecular markers are eagerly investigated to predict a patient's treatment response or outcome. Previous studies used frozen biopsy tissues to identify crucial genes as prognostic markers. We explored the prognostic value of peripheral blood (PB) molecular signatures in patients with advanced non-small cell lung cancer (NSCLC).
Methods: Peripheral blood mononuclear cell (PBMC) fractions from patients with advanced NSCLC were used for RNA extraction, cDNA synthesis, and real-time polymerase chain reaction (PCR) to profile the expression of eight genes: DUSP6, MMD, CPEB4, RNF4, STAT2, NF1, IRF4, and ZNF264. Proportional hazard (PH) models were constructed to evaluate the association of the expression of these eight genes and multiple clinical factors [e.g., sex, smoking status, and Charlson comorbidity index (CCI)] with overall survival.
Results: One hundred and forty-one patients with advanced NSCLC were enrolled. They included 109 (77.30%) patients with adenocarcinoma, 12 (8.51%) patients with squamous cell carcinoma, and 20 (14.18%) patients with other pathological lung cancer types. A PH model containing two significant survival-associated genes, CPEB4 and IRF4, could help in predicting the overall survival of patients with advanced-stage NSCLC [hazard ratio (HR) = 0.48, p < 0.0001]. Adding multiple clinical factors further improved the prognostic prediction power (HR = 0.33; p < 0.0001).
Conclusion: Molecular signatures in PB can stratify the prognosis in patients with advanced NSCLC. Further prospective, interventional clinical trials should be performed to test if gene profiling also predicts resistance to chemotherapy
Gene Expression Profiling of Colorectal Tumors and Normal Mucosa by Microarrays Meta-Analysis Using Prediction Analysis of Microarray, Artificial Neural Network, Classification, and Regression Trees
Background. Microarray technology shows great potential, but previous colorectal cancer (CRC) studies were limited by small sample sizes. This study aimed to investigate the gene expression profile of CRCs by pooling cDNA microarrays using PAM, ANN, and decision trees (CART and C5.0). Methods. Sixteen pooled datasets contained 88 normal mucosal tissues and 1186 CRCs. PAM was performed to identify significantly expressed genes in CRCs, and PAM, ANN, CART, and C5.0 models were constructed to screen candidate genes by ranking them in order of significance.
Results. The first screening identified 55 genes. The average test accuracy of each model exceeded 0.97. Fewer than eight genes achieved excellent classification accuracy. Combining the results of the four models, we found the top eight differentially expressed genes in CRCs: the suppressor genes CA7, SPIB, GUCA2B, AQP8, IL6R and CWH43, and the oncogenes SPP1 and TCN1. Genes of higher significance showed lower variation in rank ordering across methods. Conclusion. We adopted a two-tier genetic screen, which not only reduced the number of candidate genes but also yielded good accuracy (nearly 100%). This method can be applied to future studies. Among the top eight genes, CA7, TCN1, and CWH43 have not been reported to be related to CRC
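The idea of combining gene rankings from several models and checking how much each gene's rank varies can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the four rankings were combined, so the mean rank and the population standard deviation used here are stand-ins, and the gene labels (G1 to G4) and their orderings are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def aggregate_ranks(rankings: Dict[str, List[str]]) -> List[Tuple[str, float, float]]:
    """Return (gene, mean rank, rank spread) rows sorted by mean rank.

    Each ranking lists genes from most to least significant. Only genes
    present in every ranking are kept; ties are broken by gene name.
    """
    shared = set.intersection(*(set(order) for order in rankings.values()))
    rows = []
    for gene in shared:
        ranks = [order.index(gene) + 1 for order in rankings.values()]
        rows.append((gene, mean(ranks), pstdev(ranks)))
    return sorted(rows, key=lambda row: (row[1], row[0]))


# Hypothetical orderings from the four model types named in the abstract.
rankings = {
    "PAM": ["G1", "G2", "G3", "G4"],
    "ANN": ["G1", "G3", "G2", "G4"],
    "CART": ["G1", "G2", "G4", "G3"],
    "C5.0": ["G1", "G4", "G2", "G3"],
}
combined = aggregate_ranks(rankings)
for gene, avg_rank, spread in combined:
    print(f"{gene}: mean rank {avg_rank:.2f}, spread {spread:.2f}")
```

In this toy input the top-ranked gene has zero spread across models, mirroring the abstract's observation that more significant genes vary less in rank order.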