18 research outputs found

    Additional file 1 of Transcription factor-binding k-mer analysis clarifies the cell type dependency of binding specificities and cis-regulatory SNPs in humans

    Additional file 1:
    Figure S1 Filtering of ChIP-seq samples. A: Schematic overview of ChIP-seq sample filtering. B: Violin plot showing the AUROC of the prediction of the top 10% PWM-supported k-mers based on the MOCCS2score. The red violin plot represents all CTCF ChIP-seq samples, the green plot represents soft-filtered CTCF ChIP-seq samples, and the blue plot represents hard-filtered CTCF ChIP-seq samples. High-quality ChIP-seq samples with high AUROC scores were retained after hard filtering. C: Distribution of each quality control metric of ChIP-seq sample filtering for samples that passed the hard filter (pink) and others (blue). D: Bar plots showing the number of ChIP-seq samples that passed the soft and hard filters. Bars are colored according to cell type classes or TFs.
    Figure S2 Simulation of significant k-mer detection. A: The procedure for generating simulated datasets. Simulated data, generated by embedding a “true significant k-mer” within random sequences, were applied to MOCCS2, and the q-values of the MOCCS2score were calculated for each k-mer. B: Parameters for each simulation condition from #1 to #5. α is the percentage of input sequences containing embedded “true significant k-mers”, N is the number of peaks in a ChIP-seq sample, and σ is the standard deviation of the embedded “true significant k-mers” from the center of the peak. C: Simulation results for significant k-mer detection. The sensitivity, specificity, and FDR for detecting “true significant k-mers” are shown for different parameter settings.
    Figure S3 Number of peaks and significant k-mers in MOCCS profiles. A: Number of peaks in MOCCS profiles. The x-axis represents the base-10 logarithm of the number of peaks and the y-axis represents the number of ChIP-seq samples. B: Relationship between the number of peaks and significant k-mers in MOCCS profiles (left, q < 0.05; right, q < 0.01).
    Figure S4 Similarities in MOCCS profiles and peak locations for sample pairs of same or different TFs. A: Comparison of k-sim Jaccard, Pearson, and peak overlap indices (a-c: groups of the same cell types). B: Two-dimensional density plot of k-sim Jaccard or Pearson with the peak overlap index (a-c: groups of the same cell types). C: Correlation coefficient of k-sim Jaccard or Pearson with the peak overlap index in each group. The y-axis indicates Spearman's correlation coefficient. Red and blue indicate k-sim Pearson and Jaccard values, respectively (a-c: groups of the same cell types).
    Figure S5 Similarities in MOCCS profiles and peak locations for sample pairs of same or different cell types. A: Comparison of the k-sim Jaccard, Pearson, and peak overlap indices (a, d, and e: groups of the same TFs). B: Two-dimensional density plot of k-sim Jaccard or Pearson with the peak overlap index (a, d, and e: groups of the same TFs). C: Correlation coefficient of k-sim Jaccard or Pearson with the peak overlap index in each group. The y-axis indicates Spearman's correlation coefficient. Red and blue indicate k-sim Pearson and Jaccard values, respectively (a, d, and e: groups of the same TFs).
    Figure S6 Heat maps of cell type-dependent TFs. The heat map color indicates the k-sim Jaccard value for the 33 cell type-dependent TFs. The color labels of the heat maps indicate the cell type classes. Cell type classes with only a single ChIP-seq sample were excluded from the visualization. Asterisks indicate the statistical significance of ChIP-seq samples with the same and different cell type classes (Mann–Whitney U test, p < 0.05).
    Figure S7 Violin plots of all cell type-dependent TFs. The y-axis indicates the k-sim Jaccard value. The same and different groups are arranged along the x-axis. Asterisks indicate the statistical significance of ChIP-seq samples with the same and different cell type classes (Mann–Whitney U test, p < 0.05).
    Figure S8 Simulation of differential k-mer detection. A: Simulated data processing. Simulated data with an embedded “true differential k-mer” and “true significant k-mer” were prepared by embedding a “true” k-mer within α% of a randomly generated sample of 2W + 1 bp (W = 350) DNA sequences and applied to MOCCS2. “True significant k-mers” were embedded following a normal distribution whose mean was W + 1 and whose standard deviation was σ. “True differential k-mers” were embedded in S1 (or S2), similar to “true significant k-mers,” and were embedded in S2 (or S1) following a uniform distribution whose mean was 1 and whose standard deviation was (2 × W + 1) − (k − 1). Note that k was set to 6. B: Parameters for each simulation condition from #1 to #5. L is the number of differential k-mers and m is the number of significant k-mers.
    Figure S9 ΔMOCCS2score profiles were consistent with the in vitro SNP-SELEX and PWM motif fold change. A: Spearman's correlation coefficient between PBS (SNP-SELEX) and ΔMOCCS2score in each TF for the original and permuted data. Red points indicate the original Spearman's correlation coefficient, and blue points indicate the permuted data. B: Difference in ΔMOCCS2score profile consistency among the positions of SNPs in k-mers. The kth SNP position indicates the kth allele on the left side of the k-mer. C: The ΔMOCCS2score is consistent with the PWM motif fold change.
    Figure S10 Number of peak-overlapping GWAS-SNPs with significant ΔMOCCS2scores. Number of peak-overlapping GWAS-SNPs in each ChIP-seq sample. Each bar represents a ChIP-seq sample, and the y-axis represents the number of peak-overlapping GWAS-SNPs. The red fraction represents the number of peak-overlapping GWAS-SNPs with significant ΔMOCCS2scores (q < 0.05), and the gray fraction represents the number of GWAS-SNPs with non-significant ΔMOCCS2scores.
    Figure S11 Prediction of SNP-affected TFs and cell type classes using ΔMOCCS2score profiles. Top ChIP-seq samples with high ΔMOCCS2scores in each phenotype (IBD, inflammatory bowel disease; CD, Crohn's disease; MS, multiple sclerosis; SLE, systemic lupus erythematosus). The ΔMOCCS2score was calculated for each SNP and ChIP-seq sample. Bar graph colors represent TFs or cell type classes.
    Figure S12 Association between the allele frequency and ΔMOCCS2score. Association between the allele frequency and (A) the absolute values of the ΔMOCCS2score or (B) the ratio of SNPs with significant ΔMOCCS2scores in each phenotype (IBD, inflammatory bowel disease; CD, Crohn's disease; MS, multiple sclerosis; SLE, systemic lupus erythematosus).
    Figure S13 Accuracy of detecting canonical motifs using the MOCCS2score for different k. AUROC for detecting canonical PWM motifs using the MOCCS2score for different values of k. The x-axis represents the ratio of PWM-supported k-mers among all k-mers and the y-axis represents the AUROC. The colors of the violin plots represent the different k values.
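
    The simulated-data procedure described for Figures S2 and S8 (random sequences of 2W + 1 bp, with a “true” k-mer embedded in a fraction of them at normally distributed positions around the center) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: the function name `simulate_sequences`, passing α as a fraction rather than a percentage, treating the drawn position as the center of the k-mer, and clipping positions to the sequence bounds are all assumptions.

```python
import random


def simulate_sequences(n_seq, kmer, alpha, sigma, w=350, seed=0):
    """Return n_seq random DNA sequences of length 2*w + 1.

    The first round(alpha * n_seq) sequences carry an embedded `kmer`
    whose position is drawn from a normal distribution around the centre.
    (Hypothetical helper; alpha is a fraction here, not a percentage.)
    """
    rng = random.Random(seed)
    length = 2 * w + 1
    k = len(kmer)
    n_embed = round(alpha * n_seq)
    sequences = []
    for i in range(n_seq):
        seq = [rng.choice("ACGT") for _ in range(length)]
        if i < n_embed:
            # 0-based index w corresponds to the 1-based mean position W + 1.
            # Assumption: the drawn position is the centre of the k-mer.
            start = round(rng.gauss(w, sigma)) - k // 2
            # Assumption: clip so the k-mer always fits inside the sequence.
            start = max(0, min(length - k, start))
            seq[start:start + k] = kmer
        sequences.append("".join(seq))
    return sequences
```

    For example, `simulate_sequences(1000, "ACGTAC", 0.1, 20.0)` would give 1,000 sequences of 701 bp, 100 of which contain the embedded 6-mer near the center.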

    The nonlinear ARX model of the IEGs.

    (A) The simulation result of the nonlinear ARX model (solid lines) together with the experimental results in Figure 1B (dots). The colour codes are the same as in Figure 1B. The experimental data in Figure 1B and Figure S2 were used for parameter estimation of the nonlinear ARX model. (B) The systems identified by the nonlinear ARX model. The upstream dependency (selected inputs), Hill functions, and frequency response curves of the nonlinear ARX model are shown. The selected inputs, pERK (solid line), pCREB (dotted line), pJNK (dashed line), and c-FOS (dashed and dotted line), are numbered.

    System identification by the nonlinear ARX model.

    (A) The modeling scheme of the nonlinear ARX model. Upstream dependency was determined by the lag order number, m. For example, if m = 0, the upstream signal is not transmitted downstream; otherwise, it is. The signals of the selected upstream molecules were transformed successively by a Hill function and a linear ARX model. The Hill function characterises a system with a switch-like (solid line) or graded (dotted line) dose response, and the linear ARX model characterises temporal filters such as a low-pass filter (dotted line) or a filter with an inverse notch (solid line) (see Materials and methods). (B) Temporal signal transformation in the nonlinear ARX model, shown for the nonlinear ARX model of c-FOS as an example. pERK and pCREB were selected as upstream molecules, but pp38 and pJNK were not (m = 0). The signals of pERK and pCREB were transformed by the Hill equations. The Hill-transformed signals were then transformed temporally by the linear ARX model. The sum of the signals transformed by the linear ARX model was c-FOS, the final output of the nonlinear ARX model of c-FOS.
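
    The two-stage transformation described in this caption (a Hill function per selected input, followed by a linear ARX filter, with the filtered signals summed to give the output) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameter layout (`k`, `n`, `a`, `b`), and the convention that an unselected input (m = 0) simply has no parameter entry are assumptions made for illustration.

```python
def hill(x, k, n):
    """Hill function x^n / (k^n + x^n); a large n gives a switch-like response."""
    xn = x ** n
    return xn / (k ** n + xn)


def linear_arx(u, a, b):
    """Linear ARX filter: y[t] = sum_i a[i]*y[t-1-i] + sum_j b[j]*u[t-j]."""
    y = []
    for t in range(len(u)):
        ar = sum(a[i] * y[t - 1 - i] for i in range(len(a)) if t - 1 - i >= 0)
        ex = sum(b[j] * u[t - j] for j in range(len(b)) if t - j >= 0)
        y.append(ar + ex)
    return y


def narx_output(inputs, params):
    """Sum the Hill-then-ARX transformed signals of the selected inputs.

    Assumption: an input without an entry in `params` has lag order m = 0
    and is therefore not transmitted downstream.
    """
    length = len(next(iter(inputs.values())))
    total = [0.0] * length
    for name, series in inputs.items():
        p = params.get(name)
        if p is None:  # m = 0: input not selected
            continue
        transformed = [hill(x, p["k"], p["n"]) for x in series]
        filtered = linear_arx(transformed, p["a"], p["b"])
        total = [s + f for s, f in zip(total, filtered)]
    return total
```

    With made-up time series for pERK and pCREB and hypothetical parameters, `narx_output({"pERK": [...], "pCREB": [...], "pJNK": [...]}, {"pERK": {...}, "pCREB": {...}})` returns a c-FOS-like output in which pJNK plays no part, mirroring the m = 0 case in panel B.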

    The selected inputs and parameters of the Hill function and frequency response curves of the nonlinear ARX model.

    The selected inputs and parameters of the Hill function and frequency response curves of the nonlinear ARX model.

    The selective expression of EGR1 in response to pulsatile ERK phosphorylation.

    (A) Step (5 ng/ml, red), pulse (5 ng/ml, 6 min, blue), and pulsatile (0.5 ng/ml, 6-min pulses at 12-min intervals, four times, green) NGF stimulation were given as indicated by the bars (top), and pERK, pCREB, EGR1, and c-FOS were measured in experiments (dots). Using the experimental data of pERK and pCREB as the selected inputs, the outputs (c-FOS and EGR1) were simulated by the nonlinear ARX model (solid lines). (B) Interval dependency of EGR1 and c-FOS expression. Pulsatile NGF stimulation (0.5 ng/ml, 15-min duration for each pulse) was given at the indicated intervals, and pERK, EGR1, and c-FOS expression were measured in experiments. The area under the curve (AUC) (0–480 min) of EGR1 and c-FOS is shown as bars. The intervals are indicated by the colour codes. Bars represent means ± S.D. (n = 4). Note that a 15-min pulse duration was used in Figure 4B because of the technical limitation on the number of probes of the automated liquid-handling robots; pulsatile stimulation with a 6-min pulse duration and 12-min intervals could be applied at most four times (Figure 4A).

    Temporal Decoding of MAP Kinase and CREB Phosphorylation by Selective Immediate Early Gene Expression

    A wide range of growth factors encode information into specific temporal patterns of MAP kinase (MAPK) and CREB phosphorylation, which are further decoded by expression of immediate early gene products (IEGs) to exert biological functions. However, the IEG decoding system remains unknown. We built a data-driven model based on time courses of MAPK and CREB phosphorylation and IEG expression in response to various growth factors to identify how the signal is processed. We found that IEG expression uses common decoding systems regardless of growth factor, and that the expression of each IEG differs in upstream dependency, switch-like response, and linear temporal filters. Pulsatile ERK phosphorylation was selectively decoded by expression of EGR1 rather than c-FOS. Conjunctive NGF and PACAP stimulation was selectively decoded by synergistic JUNB expression through a switch-like response to c-FOS. Thus, specific temporal patterns and combinations of MAPK and CREB phosphorylation can be decoded by selective IEG expression via distinct temporal filters and switch-like responses. Data-driven modeling is versatile for the analysis of signal processing and does not require detailed prior knowledge of pathways.

    Conjunctive stimulation of NGF and PACAP induced synergistic JUNB expression through switch-like response to c-FOS.

    Step stimulation with NGF alone (5 ng/ml, red), PACAP alone (100 nM, blue), or both NGF and PACAP (violet) was given, and pERK, pCREB, c-FOS, JUNB, and FOSB were measured in experiments (dots). The simulation results of the nonlinear ARX model are shown (solid lines). Black dots indicate the sum of the IEG responses to NGF alone and to PACAP alone, and arrows indicate the difference from the sum.

    System identification reveals temporal decoding systems of MAP kinase and CREB phosphorylation by selective IEG expression.

    We performed system identification of the temporal decoding of MAP kinase and CREB phosphorylation by selective expression of immediate early genes (IEGs) such as c-FOS, EGR1, c-JUN, and JUNB, using time series data and the nonlinear ARX model. We found that the expression of each IEG has a distinct upstream dependency, and that there are distinct switch-like responses and temporal filters for decoding upstream signals. For example, pulsatile ERK phosphorylation was decoded by selective expression of EGR1 rather than c-FOS, and conjunctive NGF and PACAP stimulation was decoded by synergistic JUNB expression through a switch-like response to c-FOS.

    Prediction and validation of the identified system by pharmacological perturbation.

    (A) The predictive simulation result and experimental result for PACAP stimulation in the presence (black) or absence (blue) of trametinib. Lines, simulation; dots, experimental and recovered data. Experimental and recovered data of pERK and pCREB, and the simulated data of c-Jun, c-Fos, Egr1, FosB, and JunB, were given as Inputs, and the simulation was performed using the NARX model in Fig 5 (see the “Simulation of the integrated NARX model” section in Materials and methods). In the experiment, PC12 cells were treated in the absence (blue dots) or presence (black dots) of trametinib (10 μM), added 30 min before stimulation with PACAP (100 nM). Note that the PACAP stimulation data are the same as those used in Fig 4. (B) Simulation using experimental and recovered data as Inputs. For each set of Inputs (left panel for each) and Outputs (right panel for each), the unequally spaced time series data were recovered (pluses) (right panel for each), and the responses of the Outputs were simulated by the NARX model identified in Fig 5A–5C (solid lines) (right panel for each).