104 research outputs found
Regulators Associated with Clinical Outcomes Revealed by DNA Methylation Data in Breast Cancer
The regulatory architecture of breast cancer is extraordinarily complex, and gene misregulation can occur at many levels, with transcriptional malfunction being a major cause. This dysfunctional process typically involves additional regulatory modulators, including DNA methylation. Transcription factor (TF) binding and DNA methylation are thus two components of a cancer regulatory interactome that are presumed to display correlated signals. As a proof of concept, we performed a systematic motif-based in silico analysis to infer the TFs that may be involved in breast cancer prognosis through an association with DNA methylation changes. Using breast cancer DNA methylation and clinical data derived from The Cancer Genome Atlas (TCGA), we carried out a systematic inference of TFs whose misregulation underlies different clinical subtypes of breast cancer. Our analysis identified TFs known to be associated with clinical outcomes of p53 and ER (estrogen receptor) subtypes of breast cancer, and predicted new TFs that may also be involved. Furthermore, our results suggest that misregulation in breast cancer can be caused by the binding of alternative factors to the binding sites of TFs whose activity has been ablated. Overall, this study provides a comprehensive analysis that links DNA methylation to TF binding and to patient prognosis.
Transcription Factor Binding Profiles Reveal Cyclic Expression of Human Protein-coding Genes and Non-coding RNAs
The cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite their wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We use ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle genes from non-cell cycle genes. Our results show that both the trans-acting TF features and the cis-acting motif features are predictive of cell cycle genes, and that combining the two types of features further improves prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs and protein-coding genes are cell cycle regulated, suggesting the importance of non-coding RNAs in cell division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights into cell cycle regulation by TFs and cis-regulatory elements.
Integrative Analysis of Survival-Associated Gene Sets in Breast Cancer
Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a patient’s cancer. Identifying robust gene sets that are consistently predictive of a patient’s clinical outcome has become one of the main challenges in the field. We applied our previously established BASE algorithm to patient gene expression data and gene sets from MSigDB to develop the gene set activity score (GSAS), a metric that quantitatively assesses a gene set’s activity level in a given patient. We used this metric, along with patient time-to-event data, to perform survival analyses and identify the gene sets that were significantly correlated with patient survival. We then performed cross-dataset analyses to identify robust prognostic gene sets and to classify patients by metastasis status. Additionally, we created a gene set network based on component gene overlap to explore the relationships between gene sets derived from MSigDB. We developed a novel gene set based on this network’s topology and applied the GSAS metric to characterize its role in patient survival.
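One common way to relate a per-patient score to time-to-event data is to split patients at the median score and compare Kaplan-Meier survival curves for the two groups. The sketch below illustrates that idea only; the abstract does not say how GSAS values were tested against survival, so the median split, the function names, and the toy numbers are all assumptions, and the scores are taken as given rather than computed with BASE.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: [(time, survival probability)] at each event time.

    `events` holds 1 for an observed event (e.g. death) and 0 for censoring.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def split_by_score(scores, times, events):
    """Split patients into high and low groups at the median score (assumed cutoff)."""
    cutoff = median(scores)
    rows = list(zip(scores, times, events))
    high = [(t, e) for s, t, e in rows if s > cutoff]
    low = [(t, e) for s, t, e in rows if s <= cutoff]
    return high, low


# Hypothetical toy cohort: activity score, follow-up time, event flag.
scores = [0.1, 0.9, 0.2, 0.8]
times = [5, 1, 6, 2]
events = [0, 1, 1, 1]

high, low = split_by_score(scores, times, events)
high_curve = kaplan_meier([t for t, _ in high], [e for _, e in high])
low_curve = kaplan_meier([t for t, _ in low], [e for _, e in low])
```

In practice the two curves would be compared with a formal test such as the log-rank test, and a continuous score could instead be entered into a Cox proportional hazards model.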
E2F4 Regulatory Program Predicts Patient Survival Prognosis in Breast Cancer
Genetic and molecular signatures have been incorporated into cancer prognosis prediction and treatment decisions with good success over the past decade. Clinically, these signatures are usually used in early-stage cancers to evaluate whether patients require adjuvant therapy following surgical resection. A molecular signature that is prognostic across more clinical contexts would be a useful addition to current signatures. We defined a signature for the ubiquitous tissue factor, E2F4, based on its shared target genes in multiple tissues. These target genes were identified from chromatin immunoprecipitation sequencing (ChIP-seq) experiments using a probabilistic method. We then calculated the regulatory activity score (RAS) of E2F4 in cancer tissues and examined how E2F4 RAS correlates with patient survival.
An Approach for Determining and Measuring Network Hierarchy Applied to Comparing the Phosphorylome and the Regulome
Many biological networks naturally form a hierarchy with a preponderance of downward information flow. In this study, we define a score to quantify the degree of hierarchy in a network and develop a simulated-annealing algorithm to maximize the hierarchical score globally over a network. We apply our algorithm to determine the hierarchical structure of the phosphorylome in detail and investigate the correlation between its hierarchy and kinase properties. We also compare it to the regulatory network, finding that the phosphorylome is more hierarchical than the regulome.
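The general idea of scoring hierarchy and optimizing it by simulated annealing can be sketched as follows. The abstract does not define the score, so this example assumes a simple one: the fraction of directed edges that point from a higher level to a lower level, with a fixed number of levels. The function names, the cooling schedule, and the toy networks are likewise illustrative assumptions, not the authors' method.

```python
import math
import random


def hierarchy_score(edges, level):
    """Assumed score: fraction of edges pointing from a higher to a lower level."""
    down = sum(1 for u, v in edges if level[u] > level[v])
    return down / len(edges)


def anneal_levels(edges, n_levels=3, steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Search for a level assignment that maximizes hierarchy_score."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    level = {n: rng.randrange(n_levels) for n in nodes}
    current = hierarchy_score(edges, level)
    best, best_level = current, dict(level)
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        node = rng.choice(nodes)
        old = level[node]
        level[node] = rng.randrange(n_levels)
        proposed = hierarchy_score(edges, level)
        delta = proposed - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = proposed
            if current > best:
                best, best_level = current, dict(level)
        else:
            level[node] = old  # reject the move
    return best, best_level


# Hypothetical toy networks: a strict chain and a three-node feedback loop.
chain_score, chain_levels = anneal_levels([("a", "b"), ("b", "c")])
loop_score, _ = anneal_levels([("a", "b"), ("b", "c"), ("c", "a")])
```

Under this assumed score, a strict chain can be arranged so that every edge points downward, whereas a feedback loop always leaves at least one edge pointing upward, so the chain counts as more hierarchical.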
DPRP: a database of phenotype-specific regulatory programs derived from transcription factor binding data
Gene expression profiling has been used extensively in recent decades, resulting in an enormous amount of expression data available in public databases. These data sets are informative in elucidating the transcriptional regulation of genes underlying various biological and clinical conditions. However, it is usually difficult to identify the transcription factors (TFs) responsible for gene expression changes directly from their own expression, as TF activity is often regulated at the posttranscriptional level. In recent years, technical advances have made it possible to systematically determine the target genes of TFs by ChIP-seq experiments. To identify the regulatory programs underlying gene expression profiles, we constructed a database of phenotype-specific regulatory programs (DPRP, http://syslab.nchu.edu.tw/DPRP/) derived from the integrative analysis of TF binding data and gene expression data. DPRP provides three methods (Fisher's exact test, the Kolmogorov-Smirnov test, and the BASE algorithm) to facilitate the application of gene expression data for generating new hypotheses on transcriptional regulatory programs in biological and clinical studies.
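Of the three methods named, Fisher's exact test is the simplest to illustrate: it asks whether the target genes of a TF are over-represented among the genes that change expression. The following is a minimal sketch of the one-sided test written from the hypergeometric distribution; it is not DPRP's implementation, and the function name, argument names, and gene counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(overlap, n_targets, n_changed, n_genes):
    """One-sided Fisher's exact test (upper tail of the hypergeometric distribution).

    Probability of observing at least `overlap` TF targets among `n_changed`
    differentially expressed genes, when `n_targets` of the `n_genes`
    background genes are targets of the TF.
    """
    max_overlap = min(n_targets, n_changed)
    total = comb(n_genes, n_changed)
    tail = sum(
        comb(n_targets, k) * comb(n_genes - n_targets, n_changed - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 5 TF targets, 5 changed genes,
# and all 5 changed genes are targets of the TF.
p_value = fisher_enrichment_p(overlap=5, n_targets=5, n_changed=5, n_genes=10)
```

A small p-value suggests that the TF's targets are enriched among the changed genes. The Kolmogorov-Smirnov test plays a similar role when genes are ranked by expression change rather than split into changed and unchanged sets.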
Pan-cancer systematic identification of lncRNAs associated with cancer prognosis
Background: The “dark matter” of the genome harbors several non-coding RNA species, including long non-coding RNAs (lncRNAs), which have been implicated in neoplasia but remain understudied. RNA-seq has provided deep insights into the nature of lncRNAs in cancer, but current RNA-seq data are rarely accompanied by longitudinal patient survival information. In contrast, a plethora of microarray studies have collected clinical metadata that can be leveraged to identify novel associations between gene expression and clinical phenotypes. Methods: In this study, we developed an analysis framework that computationally integrates RNA-seq and microarray data to systematically screen 9,463 lncRNAs for association with mortality risk across 20 cancer types. Results: In total, we identified a comprehensive list of associations between lncRNAs and patient survival, and demonstrated that these prognostic lncRNAs are under selective pressure and may be functional. Our results provide valuable insights that facilitate further exploration of lncRNAs and their potential as cancer biomarkers and drug targets.
Application of RNAi-Induced Gene Expression Profiles for Prognostic Prediction in Breast Cancer
Homologous recombination (HR) is the primary pathway for repairing double-strand DNA breaks, which are implicated in the development of cancer. RNAi-based knockdowns of BRCA1 and RAD51 in this pathway have been performed to investigate the resulting transcriptomic profiles. Here we propose a computational framework that uses these profiles to calculate a score, named the RNA-Interference derived Proliferation Score (RIPS), which reflects cell proliferation ability in individual breast tumors. RIPS is predictive of breast cancer classes, prognosis, genome instability, and neoadjuvant chemosensitivity. This framework directly translates the readout of knockdown experiments into potential clinical applications and generates a robust biomarker in breast cancer.
Preferences for HIV prevention strategies among newly arrived Asian-born men who have sex with men living in Australia: A discrete choice experiment.
The HIV epidemic in Australia is changing, with a higher risk of HIV among newly arrived Asian-born men who have sex with men (MSM) than among Australian-born MSM. We evaluated the preferences for HIV prevention strategies among 286 Asian-born MSM living in Australia for <5 years. A latent class analysis uncovered three classes of respondents defined by their preferences: "PrEP" (52%), "Consistent condoms" (31%), and "No strategy" (17%). Compared to the "No strategy" class, men in the "PrEP" class were less likely to be a student or to ask their partner for their HIV status. Men in the "Consistent condoms" class were more likely to get information about HIV online, and less likely to ask their partner for their HIV status. Overall, PrEP was the preferred HIV prevention strategy for newly arrived migrants. Removing structural barriers to accessing PrEP can accelerate progress toward ending HIV transmission.
- …