4 research outputs found

    Prediction of epigenetically regulated genes in breast cancer cell lines

    Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis. Our two-step dimensionality reduction compressed 90% of the original data, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes. Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.
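
    The permutation analysis mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the sample values, and the number of permutations are hypothetical. Pearson correlation also stands in for the logistic regression score used in the paper, purely to keep the example short. The sketch shuffles expression values across samples and counts how often a shuffled association is at least as negative as the observed one.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def permutation_p_value(methylation, expression, n_perm=2000, seed=0):
    """One-sided permutation test for a negative methylation-expression association.

    Returns the observed correlation and the fraction of shuffles (with the usual
    +1 correction) whose correlation is at least as negative as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = pearson(methylation, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between samples
        if pearson(methylation, shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical values for one methylation cluster and one gene across 8 cell lines.
methylation = [0.05, 0.10, 0.20, 0.35, 0.50, 0.65, 0.80, 0.95]
expression = [9.1, 8.7, 8.9, 6.5, 5.2, 4.8, 3.1, 2.5]

observed, p_value = permutation_p_value(methylation, expression)
print(f"correlation = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

    In a genome-wide setting the same test would be repeated for each candidate methylation-expression pair, so some form of multiple-testing correction would presumably be needed; the abstract does not say which one was used.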

    Cross-Platform Array Screening Identifies COL1A2, THBS1, TNFRSF10D and UCHL1 as Genes Frequently Silenced by Methylation in Melanoma

    Epigenetic regulation of tumor suppressor genes (TSGs) has been shown to play a central role in melanomagenesis. By integrating gene expression and methylation array analysis, we identified novel candidate genes frequently methylated in melanoma. We validated the methylation status of the most promising genes using highly sensitive Sequenom Epityper assays in a large panel of melanoma cell lines and resected melanomas, and compared the findings with those from cultured melanocytes. We found that transcript levels of UCHL1, COL1A2, THBS1 and TNFRSF10D were inversely correlated with promoter methylation. For THBS1 and UCHL1, the effect of this methylation on expression was confirmed at the protein level. Identification of these candidate TSGs and future research designed to understand how their silencing is related to melanoma development will increase our understanding of the etiology of this cancer and may provide tools for its early diagnosis.

    Identification of epilepsy related pathways from methylome analysis

    Epilepsy is a disorder which affects approximately 1% of the world’s population. The underlying mechanisms of epilepsy can be explained not only by alterations in genetic structure and environmental factors, but also by epigenetic mechanisms. As a significant epigenetic modification, DNA methylation may play a crucial role in understanding biological pathways related to epilepsy. Even though the relationship between CpG islands, DNA methylation and gene expression is quite complex, in this thesis we aimed to determine significant biological pathways related to epilepsy, using the difference of methylation levels at CpG loci in family trios. The dataset was gathered from 10 family trios (30 individuals) and includes methylation level information for each CpG locus, determined by the Illumina HumanMethylation450 BeadChip, which is a microarray-based analysis tool. Considering the difference of methylation levels at each CpG locus, we prepared lists of genes that we marked as significant. Applying several filters to the dataset allowed us to gather more reliable information from it. We identified a number of significant pathways that may be related to epilepsy; immune system related pathways in particular stand out as a significant and novel finding.

    Computational Intelligence Based Classifier Fusion Models for Biomedical Classification Applications

    The generalization abilities of machine learning algorithms often depend on the algorithms’ initialization, parameter settings, training sets, or feature selections. For instance, SVM classifier performance largely relies on whether the selected kernel functions are suitable for real application data. To enhance the performance of individual classifiers, this dissertation proposes classifier fusion models using computational intelligence knowledge to combine different classifiers. The first fusion model, called T1FFSVM, combines multiple SVM classifiers by constructing a fuzzy logic system. T1FFSVM can be improved by tuning the fuzzy membership functions (MFs) of linguistic variables using genetic algorithms. The improved model is called GFFSVM. To better handle uncertainties existing in fuzzy MFs and in classification data, T1FFSVM can also be improved by applying type-2 fuzzy logic to construct a type-2 fuzzy classifier fusion model (T2FFSVM). T1FFSVM, GFFSVM, and T2FFSVM use accuracy as a classifier performance measure. AUC (the area under an ROC curve) is proved to be a better classifier performance metric. As a comparison study, AUC-based classifier fusion models are also proposed in the dissertation. The experiments on biomedical datasets demonstrate promising performance of the proposed classifier fusion models compared with the individual composing classifiers. The proposed classifier fusion models also demonstrate better performance than many existing classifier fusion methods. The dissertation also studies one interesting phenomenon in the biology domain using machine learning and classifier fusion methods, namely how protein structures and sequences are related to each other. The experiments show that protein segments with similar structures also share similar sequences, which adds new insight into the existing knowledge on the relation between protein sequences and structures: similar sequences share high structure similarity, but similar structures may not share high sequence similarity.