
    Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks

    <div><p>Regulatory networks that control gene expression are important in diverse biological contexts including stress response and development. Each gene's regulatory program is determined by module-level regulation (e.g. co-regulation via the same signaling system), as well as gene-specific determinants that can fine-tune expression. We present a novel approach, <i>Modular regulatory network learning with per gene information</i> (MERLIN), that infers regulatory programs for individual genes while probabilistically constraining these programs to reveal module-level organization of regulatory networks. Using edge-, regulator- and module-based comparisons of simulated networks of known ground truth, we find MERLIN reconstructs regulatory programs of individual genes as well as or better than existing approaches of network reconstruction, while additionally identifying modular organization of the regulatory networks. We use MERLIN to dissect global transcriptional behavior in two biological contexts: yeast stress response and human embryonic stem cell differentiation. Regulatory modules inferred by MERLIN capture co-regulatory relationships between signaling proteins and downstream transcription factors, thereby revealing the upstream signaling systems controlling transcriptional responses. The inferred networks are enriched for regulators with genetic or physical interactions, supporting the inference, and identify modules of functionally related genes bound by the same transcriptional regulators. Our method combines the strengths of per-gene and per-module methods to reveal new insights into transcriptional regulation in stress and development.</p></div>

    Interplay of transcription factors and signaling proteins in specifying the regulatory programs of modules.

    <p><b>A.</b> Shown is the fraction of modules that are regulated by TFs alone, by signaling proteins alone, or by both. <b>B.</b> Shown are the co-regulatory, genetic and protein-protein interactions between regulators of HOG1-associated modules. HOG1 is a protein kinase involved in osmotic stress and cell wall organization. HOG1 is predicted to be a regulator for Modules 2 and 37, and is known to be directly upstream of SKO1, which is predicted to regulate genes in Module 19. A co-regulatory relation is inferred between two regulators if they share common targets. Genetic and protein-protein interactions are obtained from BioGRID <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003252#pcbi.1003252-Chatraryamontri1" target="_blank">[66]</a>.</p>
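
The shared-target rule for co-regulatory relations can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `coregulatory_pairs`, the `min_shared` threshold, and the toy regulator and gene names are all assumptions introduced here.

```python
from itertools import combinations


def coregulatory_pairs(targets_by_regulator, min_shared=1):
    """Return regulator pairs that share at least `min_shared` targets.

    `targets_by_regulator` maps a regulator name to an iterable of its
    predicted target genes. The threshold is a hypothetical parameter;
    the caption only says the two regulators "share common targets".
    """
    pairs = {}
    # Sorting the names makes the key order of each pair deterministic.
    for reg_a, reg_b in combinations(sorted(targets_by_regulator), 2):
        shared = set(targets_by_regulator[reg_a]) & set(targets_by_regulator[reg_b])
        if len(shared) >= min_shared:
            pairs[(reg_a, reg_b)] = shared
    return pairs


# Toy input with made-up names (not data from the paper).
toy_targets = {
    "regA": {"gene1", "gene2"},
    "regB": {"gene2", "gene3"},
    "regC": {"gene4"},
}
print(coregulatory_pairs(toy_targets))
```

In the toy input only `regA` and `regB` share a target, so they form the single co-regulatory pair.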

    Application of MERLIN to differentiation time course of human ES to neural precursor cells identifies two large modules with opposite patterns of expression.

    <p><b>A.</b> Shown are the two modules, Modules 1 and 7, that exhibit characteristic temporal patterns of expression, together with their predicted regulators from MERLIN and regulators whose ChIP-seq targets are enriched in the module. Known pluripotency maintenance regulators (POU5F1) and predicted neural fate driver genes are shown in a larger font. <b>B.</b> Predicted targets of POU5F1 using MERLIN. <b>*</b> denotes membership in Module 1, which we associate with maintenance of ES state. MERLIN can infer both repressive and activating relationships between TFs and target genes, e.g. CCDC11 and POU5F1. We also show ChIP-seq (Red column, NANOG-ChIP, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003252#pcbi.1003252-Gerstein1" target="_blank">[44]</a>) and ChIP-chip datasets (Magenta columns, SOX2-OCT4-NANOG targets from Boyer et al., <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003252#pcbi.1003252-Boyer1" target="_blank">[71]</a>). <b>C.</b> Predicted targets of DUSP5 using MERLIN. Some DUSP5 targets are also occupied by the NANOG transcription factor.</p>

    Per-gene and per-module regulatory network inference approaches.

    <p><b>A.</b> Modeling transcriptional regulatory networks as a probabilistic graphical model. Shown is a cartoon of the promoter of the gene HSP12, with two regulators that bind to the promoter to regulate the gene's expression level. Regulatory networks are represented using a directed graph specifying who regulates whom, with arrows from regulators to target genes. The network logic of how the regulator levels predict the target gene expression level is modeled through conditional probability distributions in a probabilistic graphical model. <b>B.</b> Per-gene regulatory network learning. Regulators for each gene are inferred independently. <b>C.</b> Per-module network inference. Regulators are inferred for each module. All genes in the same module have the same parameters. <b>D.</b> Per-gene, module-constrained network learning used in MERLIN. Gene-specific regulatory programs are inferred while imposing module constraints to enable genes in the same module to share regulators. <b>E.</b> MERLIN learning framework for inferring regulatory module networks. The algorithm starts with an initial set of expression clusters and candidate regulators and iterates between learning regulatory programs for each gene and revisiting the module membership. The final inferred network is the output of MERLIN, comprising the per-gene regulatory programs and the module membership of each gene.</p>

    Yeast regulatory modules with regulators enriched with physical or genetic interactions.

    <p>Each row corresponds to a module. The first column specifies the Module ID, the second column has the module genes, the third column has the TF regulators predicted by MERLIN, the fourth column has the signaling proteins predicted by MERLIN, the fifth column has regulators predicted based on ChIP-chip, and the last column has a summary of the Gene Ontology terms associated with each module.</p>

    Module expression patterns inferred by MERLIN on the human ES cell differentiation time course into neural progenitor cells.

    <p><b>A.</b> Each row corresponds to the mean expression profile of one module. The rows are ordered by hierarchical clustering of the module means followed by an optimal leaf ordering, so that the rows most similar to each other are closest. This ordering reveals a gradual change in the different temporal dynamics captured in the MERLIN modules. <b>B–G.</b> Selected modules associated with complex temporal patterns. Cyan represents targets (rows) of regulators (columns) predicted by MERLIN, teal represents ChIP-seq targets, and purple represents the presence of a motif instance of a transcription factor within ±2 kb of the Transcription Start Site (TSS) of a gene. <b>B–E.</b> Modules associated with upregulation at the beginning and end of the time course whose members are also associated with oncogenes such as KLF4, MYC, JUND, and CTB2. <b>F–G.</b> Modules associated with initial and late downregulation. <b>G.</b> Shown are the genes of module 144, which is enriched for genes exhibiting high CpG density in promoters in Neural Progenitor cells as annotated in MSigDB <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003252#pcbi.1003252-Liberzon1" target="_blank">[43]</a>.</p>

    Amino acid starvation modules associated with Met32 and other amino acid bio-synthesis regulators.

    <p>Modules predicted by MERLIN to be associated with Met32 exhibit distinct patterns of expression. Shown are four modules that each have Met32 as an inferred regulator based on gene expression. Cyan represents expression-based regulators, teal represents ChIP-chip targets of regulators whose ChIP-chip targets are enriched in the module, and purple represents targets that have a motif sequence of a regulator. Only regulators that are enriched in a given module are shown. For each module, the heatmap is separated into the expression of the genes in the module and the expression of the regulators selected by MERLIN.</p>

    Global organization of yeast stress response network revealed by MERLIN.

    <p><b>A.</b> Major patterns of expression in each module inferred by MERLIN. Each row represents the mean expression profile of a module. The rows are ordered by hierarchical clustering of the module means followed by an optimal leaf ordering, so that the rows most similar to each other are closest. <b>B.</b> Shown are the regulators and modules, with edges from regulators to target modules (squares). The size of each module node indicates the number of genes in the module. The color and the ordering of the module nodes reflect the number of regulators associated with each module. <b>C.</b> Histogram of the average Pearson's correlation between each pair of genes assigned to a module. The majority of the modules have an average correlation greater than 0.5, suggesting that genes in a module are co-expressed. <b>D.</b> Histogram of the regulatory modularity of a module, which measures the extent to which genes from the same module share predicted regulators compared with genes from different modules. High regulatory modularity suggests that genes in the same module share more regulators than genes that are not in the same module. <b>E.</b> The distribution of the number of regulators per module. <b>F.</b> Scatter plot of module size (number of genes assigned to a module) versus the number of regulators associated with a module based on enrichment of their predicted targets in the module. Module indegree and module size are linearly related (). Outlier modules with more regulators than expected from a linear fit to the module size are indicated on the plot. <b>G.</b> Distribution of the number of modules associated with a regulator. <b>H.</b> Scatter plot of the number of modules associated with a regulator, based on enrichment of its predicted target set, versus the number of target genes predicted to be regulated by the regulator.</p>
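
The per-module statistic in panel C, the average Pearson's correlation over all gene pairs in a module, can be computed as in the sketch below. It is an illustration under stated assumptions: the function names and the toy expression profiles are invented here, and it assumes every gene has a non-constant profile of the same length.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_pairwise_correlation(profiles):
    """Average Pearson's correlation over all gene pairs in one module.

    `profiles` maps gene name -> list of expression values; the module
    must contain at least two genes.
    """
    pairs = list(combinations(profiles.values(), 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Toy module with made-up profiles (not data from the paper).
toy_module = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.0, 6.0, 8.0],
    "gene3": [4.0, 3.0, 2.0, 1.0],
}
print(mean_pairwise_correlation(toy_module))
```

Here `gene1` and `gene2` are perfectly correlated while `gene3` is anti-correlated with both, so the module average is negative; a tightly co-expressed module would score close to 1.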

    Comparison of MERLIN against per-gene and per-module network inference algorithms using simulated data.

    <p><b>A.</b> Comparison based on fold enrichment of true edges in the inferred network. The cartoon illustrates that this metric compares each edge in isolation. The fold enrichment is positive, and the higher it is, the better the inferred network in terms of true edges recovered and false edges not inferred. <b>B.</b> Fraction of regulators whose targets in the true network overlap significantly with their targets in the inferred network (higher is better). The cartoon shows that this metric compares a set of genes, namely, the targets of a regulator. Each network had a different total number of regulators (NET100: 11, NET200: 22, NET300: 33, NET400: 44, NET500: 55, NET1000: 111), which we used to obtain the fraction of regulators in each network. <b>C.</b> Overlap, as measured by F-score, between regulator-module relationships in the true network and regulator-module relationships from the inferred networks. The cartoon shows that this metric compares networks based on the regulators associated with known modules. The F-score ranges from 0 to 1, and the closer it is to 1, the better the performance.</p>
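
The edge-level and module-level metrics in panels A and C can be illustrated with a short sketch. The caption does not give the exact formulas, so the definitions here are assumptions: fold enrichment is taken as the precision of the inferred edges divided by the density of true edges among all possible edges, and the F-score is the usual harmonic mean of precision and recall. Function names and toy edges are hypothetical.

```python
def fold_enrichment(true_edges, inferred_edges, n_possible_edges):
    """Precision of inferred edges relative to the background edge density.

    Assumed definition: (true positives / inferred edges) divided by
    (true edges / all possible regulator-target pairs).
    """
    true_edges, inferred_edges = set(true_edges), set(inferred_edges)
    if not true_edges or not inferred_edges:
        raise ValueError("both edge sets must be non-empty")
    precision = len(true_edges & inferred_edges) / len(inferred_edges)
    background = len(true_edges) / n_possible_edges
    return precision / background


def f_score(true_pairs, inferred_pairs):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    true_pairs, inferred_pairs = set(true_pairs), set(inferred_pairs)
    hits = len(true_pairs & inferred_pairs)
    if hits == 0:
        return 0.0
    precision = hits / len(inferred_pairs)
    recall = hits / len(true_pairs)
    return 2 * precision * recall / (precision + recall)


# Toy networks with made-up names: 2 regulators x 4 genes = 8 possible edges.
true_net = {("regA", "gene1"), ("regA", "gene2"), ("regB", "gene3")}
inferred_net = {("regA", "gene1"), ("regB", "gene3"), ("regB", "gene4")}
print(fold_enrichment(true_net, inferred_net, n_possible_edges=8))
print(f_score(true_net, inferred_net))
```

The same `f_score` function applies to regulator-module relationships if each pair is a (regulator, module) tuple instead of a (regulator, gene) edge.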

    Multiplexed Sequence-Specific Capture of Chromatin and Mass Spectrometric Discovery of Associated Proteins

    Comprehensive understanding of a gene’s expression and regulation at the molecular level requires identification of all proteins interacting with the gene. HyCCAPP (Hybridization Capture of Chromatin Associated Proteins for Proteomics) is an approach that uses single-stranded DNA oligonucleotides to capture specific genomic sequences in cross-linked chromatin fragments and identify associated proteins by mass spectrometry. Previous studies have shown that HyCCAPP provides useful information on protein–DNA interactions, revealing the proteins associated with the <i>GAL1-10</i> region in yeast. We present here a multiplexed version of HyCCAPP. Utilizing a toehold-mediated capture/release strategy, HyCCAPP is targeted to multiple genomic loci in parallel, and the protein binders at each locus are eluted in a programmable and selective fashion. Multiplexed HyCCAPP was applied to four genes (25S rDNA, <i>ARX1</i>, <i>CTT1</i>, and <i>RPL30</i>) in <i>S. cerevisiae</i> under normal and stressed conditions. Capture and release efficiencies and specificities were comparable to those obtained without multiplexing. Using mass spectrometry-based bottom-up proteomics, hundreds of proteins were discovered at each locus in each condition. Statistical analysis revealed 34–88 enriched proteins in each gene capture. Many of these proteins had expected functions, including DNA-related and ribosome biogenesis-associated activities. Multiplexed HyCCAPP provides a useful strategy for the identification of proteins interacting with specific chromatin regions.