107 research outputs found

    A motif-independent metric for DNA sequence specificity

    Background: Genome-wide mapping of protein-DNA interactions has been widely used to investigate biological functions of the genome. An important question is to what extent such interactions are regulated at the DNA sequence level. However, current investigation is hampered by the lack of computational methods for systematically evaluating sequence specificity. Results: We present a simple, unbiased quantitative measure for DNA sequence specificity called the Motif Independent Measure (MIM). By analyzing both simulated and real experimental data, we found that the MIM can be used to detect sequence specificity independent of the presence of transcription factor (TF) binding motifs. We also found that the level of specificity associated with H3K4me1 target sequences is highly cell-type specific and highest in embryonic stem (ES) cells. We predicted H3K4me1 target sequences by using the N-score model and found that the prediction accuracy is indeed high in ES cells. The software to compute the MIM is freely available at: https://github.com/lucapinello/mim. Conclusions: Our method provides a unified framework for quantifying DNA sequence specificity and serves as a guide for the development of sequence-based prediction models.

    TRANSCRIPTIONAL ADAPTATION TO TARGETED INHIBITORS VIA BET BROMODOMAIN PROTEINS IN TRIPLE-NEGATIVE BREAST CANCER

    Targeted kinase inhibitors have displayed limited efficacy in treating breast cancer due to the ability of tumor cells to upregulate bypass signaling networks in response to treatment. Triple-negative breast cancers (TNBCs) often present with dysregulation of the BRaf-MEK-ERK pathway, making them sensitive to MEK inhibitor (MEKi) treatment. Despite initial clinical responses, drug resistance often develops involving non-genomic adaptive bypass mechanisms. Inhibition of MEK1/2 by trametinib in TNBC patients induced dramatic transcriptional responses, including upregulation of receptor tyrosine kinases (RTKs) when comparing tumor samples before and after one week of treatment. In preclinical models, MEK inhibition induced genome-wide enhancer formation involving the seeding of BRD4, MED1, H3K27 acetylation and p300 that drove transcriptional adaptation. Inhibition of P-TEFb associated proteins arrested enhancer seeding and RTK upregulation. BRD4 bromodomain inhibitors or RNAi knockdown of BRD4 overcame trametinib resistance, producing sustained growth inhibition in cells, xenografts and syngeneic mouse TNBC models. These data highlight pharmacological targeting of P-TEFb members, including BET bromodomains, in conjunction with MEK inhibition as an effective strategy to durably inhibit epigenomic remodeling required for adaptive resistance. The ability of BET bromodomain inhibitors to block the adaptive response of TNBC to MEKi made us question whether BET bromodomain inhibitors could block adaptation to targeted inhibition of additional kinases or signaling pathways. Screening of an inhibitor library targeting kinases and epigenetic regulators identified a series of molecules which displayed anti-proliferative synergy with BET bromodomain inhibitors (JQ1, OTX015) in TNBC. 
GSK2801, an inhibitor of the BAZ2A/B bromodomains (of the imitation switch chromatin remodeling complexes) and of BRD9 (of the SWI/SNF complex), demonstrated unique synergy independent of BRD4 control of P-TEFb-mediated pause-release of RNA polymerase II. GSK2801, or RNAi knockdown of BAZ2A/B, in combination with JQ1 selectively displaced BRD2 at promoters/enhancers of ETS-regulated genes. Additional displacement of BRD2 from ribosomal DNA in the nucleolus coincided with decreased 45S rRNA, revealing a function of BRD2 in regulating RNA polymerase I transcription. In 2D cultures, enhanced displacement of BRD2 from chromatin by combination drug treatment induced senescence. In spheroid cultures, combination treatment induced cleaved caspase-3 characteristic of apoptosis in tumor cells but not in co-cultured mammary fibroblasts. Thus, GSK2801 blocks BRD2-driven transcription in combination with BET inhibition and induces apoptosis of TNBC. Cumulatively, the data presented in this thesis provide a series of synergistic drug combinations that effectively inhibit growth and survival of TNBC in combination with targeted inhibitors via inhibition of P-TEFb transcriptional elongation by BRD4, or regulation of ETS target genes and rRNA transcription by BRD2. Doctor of Philosophy

    Meta-analysis framework for peak calling by combining multiple ChIP-seq algorithms and gene clustering by combining multiple transcriptomic studies

    With the availability of a large number of genomic studies, integrating information from multiple sources improves knowledge discovery. To address the complexity of the genome and its numerous genetic features, meta-analysis aggregates information to achieve higher statistical power for the measure of interest and to identify patterns among study results as well as sources of disagreement among them. As Next-Generation Sequencing (NGS) technologies are becoming affordable and can provide per-base resolution, NGS data serve as an appealing tool for analyzing genomic features. Among various applications of NGS technologies, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is primarily used to provide quantitative, genome-wide mapping of target protein and DNA interaction events. Signal peak calling algorithms identify target regions of interest enriched in vitro. Despite the existing programs for previous ChIP-Chip platforms, peak calling of putative protein binding sites from large, sequencing-based datasets presents a bioinformatic challenge that has required considerable computational innovation. Popular peak calling algorithms, such as MACS, SPP, CisGenome, SISSRs, USeq, and PeakSeq, are widely applied, but each places a different emphasis on sensitivity or specificity, or selects peaks of different size and shape. In the first project of this dissertation, we propose a meta-analysis framework, ChIP-MetaCaller, to combine multiple top-performing algorithms to identify and reprioritize the peaks. We provide a forward selection algorithm that decides the best combination of algorithm outputs for the meta-analysis and show that the result improves motif enrichment and sensitivity. The results are more trackable by biologists for further validation and hypothesis generation.
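The forward selection step mentioned above can be illustrated with a minimal sketch. The abstract does not state the selection criterion used by ChIP-MetaCaller, so everything here is an assumption for illustration: the caller names, the peak identifiers, the `VALIDATED` set, and the `toy_score` function are all hypothetical, and peaks are treated as simple labels rather than genomic intervals.

```python
from typing import Callable, Dict, List, Set


def forward_select(
    outputs: Dict[str, Set[str]],
    score: Callable[[List[Set[str]]], float],
) -> List[str]:
    """Greedily add the caller whose output most improves `score`.

    Stops as soon as no remaining caller improves the current best score.
    """
    selected: List[str] = []
    best = float("-inf")
    remaining = set(outputs)
    while remaining:
        # Score every candidate addition; sorted() keeps tie-breaking deterministic.
        trials = [
            (score([outputs[n] for n in selected + [name]]), name)
            for name in sorted(remaining)
        ]
        top_score, top_name = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break
        selected.append(top_name)
        remaining.remove(top_name)
        best = top_score
    return selected


# Hypothetical toy data: peak labels reported by three made-up callers.
VALIDATED = {"p1", "p2", "p3", "p4"}


def toy_score(peak_sets: List[Set[str]]) -> float:
    """Assumed criterion: reward validated peaks, penalize the rest."""
    union = set().union(*peak_sets)
    return len(union & VALIDATED) - 0.5 * len(union - VALIDATED)


outputs = {
    "caller_a": {"p1", "p2", "x1"},
    "caller_b": {"p3", "p4"},
    "caller_c": {"p1", "x2", "x3"},
}
chosen = forward_select(outputs, toy_score)
print(chosen)
```

In this toy setting the third caller is left out because adding it only contributes unvalidated peaks; a real criterion would more plausibly be based on motif enrichment, as the abstract suggests.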
The mechanisms of complex diseases like cancers involve changes in multiple genes, each conferring small and incremental risk that potentially converge in deregulated biological pathways, cellular functions and local circuit changes. Understanding this complex network requires the discovery of co-expression gene modules. Pilot studies in the literature show that meta-analysis can improve the performance of machine learning techniques in identifying these modules. In the second project of this dissertation, we propose an approach based on the clustering results of each individual study. Combining standardized distances from genes to the medoids yields an integrated distance matrix, on which the meta-clustering is performed. We compared the performance of the proposed approach and Meta Clustering combining distance under three simulation settings and three real data sets, and provide guidance for practitioners. The two projects included in this dissertation tackle different biological questions based on genomics data. Both improve on the performance of existing methods by integrating information through meta-analysis frameworks, and provide comprehensive biomarker detection. This work could improve public health by providing more effective methodologies for biomarker detection in the integration of multiple genomic studies.
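One possible reading of the medoid-based integration described above is sketched below. The abstract does not give the exact formulas, so the standardization (a z-score over all gene-to-medoid distances within a study), the Euclidean metric, and the toy expression values are assumptions, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def standardized_medoid_distances(expr, medoids):
    """For one study: distance from every gene to every medoid gene,
    z-scored over all distances in that study (assumed standardization)."""
    raw = {g: [euclid(v, expr[m]) for m in medoids] for g, v in expr.items()}
    flat = [d for row in raw.values() for d in row]
    mu, sd = mean(flat), pstdev(flat)
    return {g: [(d - mu) / sd for d in row] for g, row in raw.items()}


def integrated_distance(studies):
    """Concatenate each gene's standardized profile across studies,
    then compute pairwise gene-gene distances on the combined profiles."""
    genes = sorted(studies[0][0])
    profile = {g: [] for g in genes}
    for expr, medoids in studies:
        z = standardized_medoid_distances(expr, medoids)
        for g in genes:
            profile[g].extend(z[g])
    matrix = [[euclid(profile[a], profile[b]) for b in genes] for a in genes]
    return genes, matrix


# Hypothetical toy input: two studies, each with per-gene expression vectors
# and the medoid genes found by clustering that study on its own.
studies = [
    ({"g1": [0, 0], "g2": [0, 1], "g3": [5, 5], "g4": [5, 6]}, ["g1", "g3"]),
    ({"g1": [1, 1], "g2": [1, 2], "g3": [9, 9], "g4": [8, 9]}, ["g2", "g4"]),
]
genes, D = integrated_distance(studies)
```

The matrix `D` could then be passed to any distance-based clustering method (for example a medoid-based one) to obtain the meta-clusters; that final step is left out of the sketch.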

    Translating lung function genome-wide association study (GWAS) findings: new insights for lung biology

    Chronic respiratory diseases are a major cause of worldwide mortality and morbidity. Although hereditary severe deficiency of α1 antitrypsin (A1AD) has been established to cause emphysema, A1AD accounts for only ∼1% of Chronic Obstructive Pulmonary Disease (COPD) cases. Genome-wide association studies (GWAS) have been successful at detecting multiple loci harboring variants predicting the variation in lung function measures and risk of COPD. However, GWAS are incapable of distinguishing causal from noncausal variants. Several approaches can be used for functional translation of genetic findings. These approaches have the scope to identify underlying alleles and pathways that are important in lung function and COPD. Computational methods aim at effective functional variant prediction by combining experimentally generated regulatory information with associated regions of the human genome. Classically, GWAS association follow-up concentrated on manipulation of a single gene. However, association data have identified genetic variants in >50 loci predicting disease risk or lung function. Therefore, there is a clear precedent for experiments that interrogate multiple candidate genes in parallel, which is now possible with genome editing technology. Gene expression profiling can be used for effective discovery of biological pathways underpinning gene function. This information may be used for informed decisions about cellular assays post genetic manipulation. Investigating respiratory phenotypes in human lung tissue and specific gene knockout mice is a valuable in vivo approach that can complement in vitro work. Herein, we review state-of-the-art in silico, in vivo, and in vitro approaches that may be used to accelerate functional translation of genetic findings.