7 research outputs found

    Additional file 1 of Transcription factor-binding k-mer analysis clarifies the cell type dependency of binding specificities and cis-regulatory SNPs in humans

    No full text
    Additional file 1: Figure S1 Filtering of ChIP-seq samples. A: Schematic overview of ChIP-seq sample filtering. B: Violin plot showing the AUROC of the prediction of the top 10% PWM-supported k-mers based on the MOCCS2score. The red violin plot represents all CTCF ChIP-seq samples, the green plot represents soft-filtered CTCF ChIP-seq samples, and the blue plot represents hard-filtered CTCF ChIP-seq samples. High-quality ChIP-seq samples with high AUROC scores were retained after hard filtering. C: Distribution of each quality control metric of ChIP-seq sample filtering for samples that passed the hard filter (pink) and others (blue). D: Bar plots display the number of ChIP-seq samples that passed through the soft and hard filters. Bars are colored according to cell type classes or TFs. Figure S2 Simulation of significant k-mer detection. A: The procedure for generating simulated datasets. Simulated data generated by embedding a “true significant k-mer” within random sequences was applied to MOCCS2 and the q-values of the MOCCS2score were calculated for each k-mer. B: Parameters for each simulation condition from #1 to #5. α is the percentage of input sequences containing embedded “true significant k-mers,” N is the number of peaks in a ChIP-seq sample, and σ is the standard deviation of the embedded “true significant k-mers” from the center of the peak. C: Simulation results for significant k-mer detection. The sensitivity, specificity, and FDR for detecting “true significant k-mers” are shown for different parameter settings. Figure S3 Number of peaks and significant k-mers in MOCCS profiles. A: Number of peaks in MOCCS profiles. The x-axis represents the log-transformed number of peaks with a base of 10 and the y-axis represents the number of ChIP-seq samples. B: Relationship between the number of peaks and significant k-mers in MOCCS profiles (left, q < 0.05; right, q < 0.01).
Figure S4 Similarities in MOCCS profiles and peak locations for sample pairs of same or different TFs. A: Comparison of k-sim Jaccard, Pearson and peak overlap indices (a-c: groups of the same cell types). B: Two-dimensional density plot of k-sim Jaccard or Pearson with the peak overlap index (a-c: groups of the same cell types). C: Correlation coefficient of k-sim Jaccard or Pearson with the peak overlap index in each group. The y-axis indicates Spearman’s correlation coefficient. Red and blue indicate k-sim Pearson and Jaccard values, respectively (a-c: groups of the same cell types). Figure S5 Similarities in MOCCS profiles and peak locations for sample pairs of same/different cell types. A: Comparison of the k-sim Jaccard, Pearson, and peak overlap indices (a, d, and e: groups of the same TFs). B: Two-dimensional density plot of k-sim Jaccard or Pearson with the peak overlap index (a, d, and e: groups of the same TFs). C: Correlation coefficient of k-sim Jaccard or Pearson with the peak overlap index in each group. The y-axis indicates Spearman’s correlation coefficient. Red and blue indicate k-sim Pearson and Jaccard values, respectively (a, d, and e: groups of the same TFs). Figure S6 Heat maps of cell type-dependent TFs. The heat map color indicates the k-sim Jaccard value for the 33 cell type-dependent TFs. The color labels of the heat maps indicate the cell type classes. Cell type classes with only a single ChIP-seq sample were excluded from the visualization. Asterisks indicate the statistical significance of ChIP-seq samples with the same and different cell type classes (Mann–Whitney U test, p < 0.05). Figure S7 Violin plots of all cell type-dependent TFs. The y-axis indicates the k-sim Jaccard value. The same and different groups were arranged along the x-axis. Asterisks indicate the statistical significance of ChIP-seq samples with the same and different cell type classes (Mann–Whitney U test, p < 0.05).
Figure S8 Simulation of differential k-mer detection. A: Simulated data processing. Simulated data with an embedded “true differential k-mer” and “true significant k-mer” was prepared by embedding a “true” k-mer within α% of a randomly generated sample of 2W + 1 bp (W = 350) DNA sequences and applied to MOCCS2. “True significant k-mers” were embedded following a normal distribution whose mean was W + 1 and whose standard deviation was σ. “True differential k-mers” were embedded in S1 (or S2), similar to “true significant k-mers,” and were embedded in S2 (or S1) following a uniform distribution whose mean was 1 and whose standard deviation was (2 × W + 1) − (k − 1). Note that k was set to 6. B: Parameters for each simulation condition from #1 to #5. L is the number of differential k-mers and m is the number of significant k-mers. Figure S9 ΔMOCCS2score profiles were consistent with the in vitro SNP-SELEX and PWM motif fold change. A: Spearman’s correlation coefficient between PBS (SNP-SELEX) and ΔMOCCS2score in each TF for the original and permuted data. Red points indicate the original Spearman’s correlation coefficient, and blue points indicate the permuted data. B: Difference in ΔMOCCS2score profile consistency among the positions of SNPs in k-mers. The kth SNP position indicates the kth allele on the left side of the k-mer. C: The ΔMOCCS2score is consistent with the PWM motif fold change. Figure S10 Number of peak-overlapping GWAS-SNPs with significant ΔMOCCS2scores. Number of peak-overlapping GWAS-SNPs in each ChIP-seq sample. Each bar represents a ChIP-seq sample, and the y-axis represents the number of peak-overlapping GWAS-SNPs. The red fraction represents the number of peak-overlapping GWAS-SNPs with significant ΔMOCCS2scores (q < 0.05), and the gray fraction represents the number of GWAS-SNPs with non-significant ΔMOCCS2scores. Figure S11 Prediction of SNP-affected TFs and cell type classes using ΔMOCCS2score profiles.
Top ChIP-seq samples with high ΔMOCCS2scores in each phenotype (IBD, inflammatory bowel disease; CD, Crohn’s disease; MS, multiple sclerosis; SLE, systemic lupus erythematosus). The ΔMOCCS2score was calculated for each SNP and ChIP-seq sample. Bar graph colors represent TFs or cell type classes. Figure S12 Association between the allele frequency and ΔMOCCS2score. Association between the allele frequency and (A) the absolute values of the ΔMOCCS2score or (B) the ratio of SNPs with significant ΔMOCCS2scores in each phenotype (IBD, inflammatory bowel disease; CD, Crohn’s disease; MS, multiple sclerosis; SLE, systemic lupus erythematosus). Figure S13 Accuracy of detecting canonical motifs using MOCCS2score for different k. AUROC for detecting canonical PWM motifs using the MOCCS2score for different values of k. The x-axis represents the ratio of PWM-supported k-mers in all k-mers and the y-axis represents the AUROC. The colors of the violin plots represent the different k values.

    Detection of the protease genes of <i>A. trota</i> by colony hybridization using specific probes.

    No full text
    <p>Each strain was inoculated on an NA plate and incubated at 37°C for 24 h. Bacterial colonies were transferred onto a Hybond-N<sup>+</sup> membrane, and the bacterial DNA was baked. The colonies were reacted with specific probes for either the serine protease gene (A) or the metalloprotease gene (B), as described in the text. The table on the right side of the figure shows the names of the strains used.</p>

    Enterotoxic activity of <i>A. trota</i> 701.

    No full text
    <p>Enterotoxic activity was examined using the mouse intestinal loop test. A total of 3×10<sup>7</sup> cells of <i>A. trota</i> 701 was mixed with either pre-immunization serum (white bar) or anti-ALH serum (black bar) and injected into the mouse intestinal loop. When the accumulation of fluid induced by the sample was more than 0.2 g/cm, the sample was considered to be enterotoxigenic. One group consisted of four mice.</p>

    Immunodetection of hemolysins produced by <i>A. sobria</i> and <i>A. trota</i>.

    No full text
    <p><i>A. sobria</i> 288, <i>A. trota</i> 701, and <i>A. trota</i> ATCC49657 were cultivated in NB medium at 37°C. A portion of the culture was collected at the times indicated, and the culture supernatant and cell lysate were prepared as described in the text. Each SDS sample, equivalent to 50 µL of culture, was applied to a lane (A). After SDS-PAGE, hemolysin was detected in each lane using anti-ALH antiserum, as described in the text. Larger amounts of the SDS samples, equivalent to 250 µL of culture, were applied to the lanes for clearer detection (B). The sample containing pre-ALH and ALH was applied to lane P as a positive control. The upper band in lane P is pre-ALH and the lower band is mature ALH.</p>

    Deduced amino acid sequence of serine protease and <i>A. trota</i> chaperone.

    No full text
    <p>The sequence (A) is the deduced amino acid sequence of serine protease and the sequence (B) is that of <i>A. trota</i> chaperone. Amino acid sequences are numbered as 1 from the initiator Met, and the numbers on the right side of the amino acid sequences indicate the number of amino acid residues from A of the initiation Met. The nucleotide sequence encoding these genes was deposited in GenBank (Accession No. KF914659). Underlined sections represent the amino acid sequence used to make anti-serine protease peptide antiserum.</p>

    Immunodetection of serine protease in the culture supernatants of <i>A. sobria</i> and <i>A. trota</i>.

    No full text
    <p><i>A. sobria</i> 288, <i>A. trota</i> 701, and <i>A. trota</i> ATCC49657 were cultivated in NB medium at 37°C. A portion of the culture was collected at the times indicated, and the culture supernatant was prepared as described in the text. Each SDS sample, equivalent to 50 µL of culture, was applied to a lane. After SDS-PAGE, serine protease was detected in each lane using anti-serine protease peptide antiserum, as described in the text. The sample containing <i>A. sobria</i> serine protease was applied to lane P as a positive control. The band marked by the arrow indicates the <i>A. sobria</i> serine protease, with a molecular size of 64 kDa.</p>

    Cultivation of <i>A. trota</i> on agar medium containing erythrocytes or skim milk.

    No full text
    <p>Each strain was cultivated in nutrient broth, and 20 µL of each culture was dropped onto each agar medium. After inoculation, the plates were incubated at 37°C for 24 h. Hemolytic activity (A) and proteolytic activity (B) were assessed by the appearance of a transparent zone around the bacteria on the respective plates. The table on the right side of the figure shows the names of the strains used.</p>