31 research outputs found

    Systematic identification and characterization of regulatory elements derived from human endogenous retroviruses

    No full text
    <div><p>Human endogenous retroviruses (HERVs) and other long terminal repeat (LTR)-type retrotransposons (HERV/LTRs) have regulatory elements that possibly influence the transcription of host genes. We systematically identified and characterized these regulatory elements based on publicly available datasets of ChIP-Seq of 97 transcription factors (TFs) provided by ENCODE and Roadmap Epigenomics projects. We determined transcription factor-binding sites (TFBSs) using the ChIP-Seq datasets and identified TFBSs observed on HERV/LTR sequences (HERV-TFBSs). Overall, 794,972 HERV-TFBSs were identified. Subsequently, we identified “HERV/LTR-shared regulatory elements (HSREs),” defined as TF-binding motifs in HERV-TFBSs that are shared within a substantial fraction of a HERV/LTR type. HSREs could be an indication that the regulatory elements of HERV/LTRs were present before their insertions. We identified 2,201 HSREs, comprising specific associations of 354 HERV/LTRs and 84 TFs. Clustering analysis showed that HERV/LTRs can be grouped according to their TF binding patterns; HERV/LTR groups bound by pluripotent TFs (e.g., SOX2, POU5F1, and NANOG), embryonic endoderm/mesendoderm TFs (e.g., GATA4/6, SOX17, and FOXA1/2), hematopoietic TFs (e.g., SPI1 (PU1), GATA1/2, and TAL1), and CTCF were identified. Regulatory elements of HERV/LTRs tended to be located near and/or to interact three-dimensionally with genes involved in immune responses, indicating that the regulatory elements play an important role in controlling the immune regulatory network. Further, we demonstrated subgroup-specific TF binding within LTR7, LTR5B, and LTR5_Hs, indicating that gains or losses of the regulatory elements occurred during genomic invasions of the HERV/LTRs. Finally, we constructed dbHERV-REs, an interactive database of HERV/LTR regulatory elements (<a href="http://herv-tfbs.com/" target="_blank">http://herv-tfbs.com/</a>). This study provides fundamental information for understanding the impact of HERV/LTRs on host transcription, and offers insights into the transcriptional modulation systems of HERV/LTRs and ancestral HERVs.</p></div>

    Pedigrees of the seven large PC families.

    No full text
    <p>Solid black rectangles represent affected patients with PC. Patients with PC analyzed by exome-seq were numbered (from 01 to 22). PC, prostate cancer.</p>

    Germline Variants of Prostate Cancer in Japanese Families

    No full text
    <div><p>Prostate cancer (PC) is the second most common cancer in men. Family history is the major risk factor for PC. Only two susceptibility genes have been identified for PC, <i>BRCA2</i> and <i>HOXB13</i>. A comprehensive search of germline variants for patients with PC has not been reported in Japanese families. In this study, we conducted exome sequencing followed by Sanger sequencing to explore responsible germline variants in 140 Japanese patients with PC from 66 families. In addition to known susceptibility genes, <i>BRCA2</i> and <i>HOXB13</i>, we identified <i>TRRAP</i> variants in a mutually exclusive manner in seven large PC families (three or four patients per family). We also found shared variants of <i>BRCA2</i>, <i>HOXB13</i>, and <i>TRRAP</i> from 59 additional small PC families (two patients per family). We identified two deleterious <i>HOXB13</i> variants (F127C and G132E). Further exploration of the shared variants in the rest of the families revealed deleterious variants of the so-called cancer genes (<i>ATP1A1</i>, <i>BRIP1</i>, <i>FANCA</i>, <i>FGFR3</i>, <i>FLT3</i>, <i>HOXD11</i>, <i>MUTYH</i>, <i>PDGFRA</i>, <i>SMARCA4</i>, and <i>TCF3</i>). The germline variant profile provides new insight into the genetic etiology and heterogeneity of PC among Japanese men.</p></div>

    Shared genes with variants.

    No full text
    <p>(A) Heat map of the shared genes with variants. Each column shows the family identification of large PC families or PC pairs of the small PC families. Each row shows the gene names and shared variants are filled with red (deleterious) or orange (nondeleterious) color. (B) Deleterious variants of the Cancer Gene Census genes. The variant status is shown. ExAC_all, MAF of all subjects in the ExAC; iJGVD, MAF in the iJGVD; HGVD, MAF in the HGVD; NA, Not applicable; PC, prostate cancer.</p>

    Long-range interactions between HERV-TFBSs/HSREs and promoters of host genes.

    No full text
    <p>The interactions were extracted using the pcHi-C dataset in GM12878 cells [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006883#pgen.1006883.ref054" target="_blank">54</a>, <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006883#pgen.1006883.ref055" target="_blank">55</a>]. Results from unique-read TFBSs are shown. A) Proportion of HERV/LTR copies overlapping with promoter-interacting regions. Proportions of total HERV/LTRs, HERV/LTRs with HERV-TFBSs, and HERV/LTRs with HSREs are shown separately. B) Transcription levels (log<sub>10</sub> (RPKM+1)) of protein-coding genes and number of HERV-TFBSs interacting with the genes. Genes were divided into five categories based on the number of HERV-TFBSs interacting with the genes (0, 1, 2–5, 6–10, and >10). The 0, 1, 2–5, 6–10, and >10 categories contained 13,265, 1,179, 1,946, 822, and 1,639 genes, respectively. P values were calculated using the Mann-Whitney U test with adjustment for multiple tests using the BH method. C) The word cloud indicating HERV/LTR types enriched in the interacting regions. Word sizes are proportional to the −log<sub>10</sub> (p value) calculated using Fisher’s exact test. The word colors indicate HERV/LTR families. D) Hi-C-based GO enrichment analysis. A set of all HERV-TFBSs in GM12878 cells was used. HERV-TFBSs identified in cells treated with special conditions (e.g., supplement of interferon) were excluded. GO terms were summarized by REVIGO [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006883#pgen.1006883.ref073" target="_blank">73</a>]. GO terms with fold enrichment scores of >2 are shown.</p>
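
    Panel B of the caption above groups genes by the number of interacting HERV-TFBSs and adjusts the resulting p values with the BH (Benjamini-Hochberg) method. The following is a minimal sketch of those two steps only; the function names `interaction_category` and `bh_adjust` are hypothetical and not taken from the study, and the Mann-Whitney U test itself is not implemented here.

```python
def interaction_category(n_tfbs):
    """Assign a gene to one of the five bins named in the caption.

    n_tfbs is the number of HERV-TFBSs interacting with the gene.
    """
    if n_tfbs == 0:
        return "0"
    if n_tfbs == 1:
        return "1"
    if n_tfbs <= 5:
        return "2-5"
    if n_tfbs <= 10:
        return "6-10"
    return ">10"


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in input order."""
    m = len(p_values)
    # Walk from the largest p value to the smallest so that a running
    # minimum enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

    For example, `bh_adjust` could be applied to the p values from the pairwise category comparisons before deciding which differences to report as significant.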

    Variants of <i>HOXB13</i> and <i>TRRAP</i> in the 59 small PC families.

    No full text
    <p>(A) Variant status of <i>HOXB13</i> and <i>TRRAP</i>. ExAC_all, MAF of all subjects in the ExAC; iJGVD, MAF in the iJGVD; HGVD, MAF in the HGVD; NA, Not applicable. (B) Results of Sanger-seq for shared variants of <i>HOXB13</i> and <i>TRRAP</i>. (i) Heterozygous variant of <i>HOXB13</i> G132E (c.G395A) in GFPC024. (ii) Homozygous variant of <i>HOXB13</i> G132E (c.G395A) in GFPC079. (iii) Heterozygous variant of <i>TRRAP</i> C1217R (c.T3649C) in GFPC072. The positions of variants are indicated by red arrows. PC, prostate cancer.</p>

    Shared genes with variants in the large PC families.

    No full text
    <p>(A) Twenty-two genes in the seven large families remained after filtering and prioritizing. Known susceptibility genes (<i>BRCA2</i> and <i>HOXB13</i>) and one novel gene (<i>TRRAP</i>) are shown by green rectangles. The combined scores of Exomiser are shown on the right side of the gene names. (B) Variant status of <i>BRCA2</i>, <i>HOXB13</i>, and <i>TRRAP</i>. ExAC_all, MAF of all subjects in the ExAC; iJGVD, MAF in the iJGVD; HGVD, MAF in the HGVD; NA, Not applicable. PC, prostate cancer.</p>

    Statistical enrichment of respective TFBSs in each type of HERV/LTRs.

    No full text
    <p>Results from unique-read TFBSs are shown. A) The heatmap with hierarchical clustering, which shows statistical enrichment of respective TFBSs in each type of HERV/LTRs. Color in the heatmap (from blue to red) indicates enrichment significance (z score) relative to random expectation. Each row indicates TFBSs from a ChIP-Seq analysis. Each column indicates a HERV/LTR type. The dendrograms were cut at heights denoted by broken lines. Fourteen clusters were identified for HERV/LTRs and TFBSs. Of these, characteristic clusters of TFBSs (TF_1–8) and HERV/LTRs (HERV_1–9) are shown. The cut heights and the characteristic clusters were manually chosen according to dendrograms and color patterns in the heatmap. The number of HERV/LTR types highly enriched in each TFBS dataset (z score >5) is shown on the right side of the heatmap. B) Characteristic clusters of TFBSs (TF_1–8). Ectoderm, endoderm, mesoderm, and mesendoderm were differentiated from HUES64 cells. C) Characteristic clusters of HERV/LTRs (HERV_1–9). Classification of the HERV/LTR family is based on RepeatMasker (20-Mar-2009) (<a href="http://www.repeatmasker.org/" target="_blank">http://www.repeatmasker.org/</a>).</p>
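
    The heatmap above reports enrichment as a z score relative to random expectation. The caption does not say how the random expectation was generated, so the sketch below assumes one common approach: compare the observed number of TFBS overlaps with counts obtained from randomized placements. The function name `enrichment_z`, the randomized counts, and the use of the sample standard deviation are all assumptions for illustration.

```python
import statistics


def enrichment_z(observed, random_counts):
    """z score of an observed overlap count against randomized counts.

    observed      -- number of TFBSs overlapping a HERV/LTR type
    random_counts -- overlap counts from randomized trials (assumed input)
    """
    mean = statistics.mean(random_counts)
    sd = statistics.stdev(random_counts)  # sample standard deviation
    if sd == 0:
        raise ValueError("randomized counts have zero variance")
    return (observed - mean) / sd
```

    Under this sketch, a HERV/LTR type would count as "highly enriched" in a TFBS dataset when `enrichment_z(...)` exceeds 5, matching the threshold stated in the caption.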

    Changes in regulatory elements in LTR5 group.

    No full text
    <p>Results from all-read TFBSs are shown. A) The unrooted phylogenetic tree of LTR5A (red), LTR5B (green), and LTR5_Hs (blue) copies constructed using the maximum likelihood method. LTR5 was divided into five groups (I–V) based on the tree and their TFBSs (shown in (C)). Fragmented and outlier copies were excluded from the analysis. In total, 233, 300, and 532 copies belonging to LTR5A, LTR5B, and LTR5_Hs, respectively, were included in the tree (out of 265, 431, and 645, respectively). Representative bootstrap values are shown at the corresponding nodes. B) Orthologous copies in the reference genomes of primates. The order of LTR5 copies is the same as in (A). C) TFBSs present on each copy; representative TFBSs are shown. TFBSs of SPI1, TAL1, and GATA1/2 were from the ENCODE dataset, and others were from the Roadmap dataset. The order of LTR5 copies is the same as in (A). D) TF-binding motifs at positions corresponding to HSREs on each LTR5 copy. The order of LTR5 copies is the same as in (A). Black and gray colors respectively indicate the presence of motifs with p values of <0.0001 and <0.001, as identified by FIMO [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006883#pgen.1006883.ref064" target="_blank">64</a>]. E) Enrichment of sequence reads mapped to LTR5 copies belonging to respective subgroups. The Y-axis shows RPM relative to that of the input control. F) Relative number of HERV-DHSs mapped on each consensus position. The X-axis indicates nucleotide position in the consensus sequence of LTR5_Hs. The Y-axis indicates the proportion of HERV/LTR copies harboring HERV-DHSs at each position.</p>
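
    Panel E above plots RPM relative to the input control. As a rough illustration of that quantity, the sketch below computes reads per million (RPM) for a ChIP sample and an input sample and takes their ratio. The function names and the read counts in the usage note are hypothetical; the study's exact normalization may differ.

```python
def rpm(mapped_reads, total_reads):
    """Reads per million: reads mapped to a region, scaled by library size."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads * 1_000_000 / total_reads


def relative_enrichment(chip_mapped, chip_total, input_mapped, input_total):
    """ChIP RPM divided by input-control RPM for the same set of copies."""
    input_rpm = rpm(input_mapped, input_total)
    if input_rpm == 0:
        raise ValueError("input control has no reads in this region")
    return rpm(chip_mapped, chip_total) / input_rpm
```

    For instance, with 50 of 10,000,000 ChIP reads and 10 of 20,000,000 input reads mapped to a subgroup, `relative_enrichment(50, 10_000_000, 10, 20_000_000)` gives a ratio of 10.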