
    Increased fractional use of the most frequently used TSS of a gene and decreased fractional use of each other TSS when gene expression level rises.

    (A) Spearman's correlation (ρ) between the expression level of a gene and the fractional uses of its TSSs in the human universal sample. TSSs are ranked on the basis of their fractional uses in the sample concerned, with rank #1 being the most frequently used one (major TSS). Each dot represents a gene. Gray and black ρ and P are based on the original and down-sampled data, respectively. (B) Spearman's rank correlation between the expression level of a gene and the fractional uses of its TSSs in each human cell line or tissue sample examined. P < 10^−39 in all cases. Squares and triangles show the correlations on the basis of the original and down-sampled data, respectively. In both panels, the correlation for TSSs with a particular rank is calculated using the genes that have at least that particular number of TSSs. Sample IDs listed on the x-axis of (B) refer to those in S1 Table. Data are available at https://github.com/ZhixuanXu/Nonadaptive-alternative-TSSs. ID, identifier; RPM, reads mapped to the gene per million reads; TSS, transcription start site.

    Evolutionary conservations of cis-elements of human core promoters.

    (A) The typical structure of a core promoter and consensus sequences of cis-elements. The most likely positions in nts relative to the TSS (+1) are given for core promoter cis-elements. (B–D) Mean PhastCons scores of cis-elements of global major TSSs, cis-elements of global minor TSSs, and pseudoelements for INR (B), BRE (C), and TATA (D). In (B)–(D), the mean PhastCons score is significantly different (U test) between any pair of the three bins. Error bars show the standard error. Degenerate nucleotide symbols used are as follows. N: A, G, C, or T; H: A, T, or C; W: A or T; R: A or G; Y: C or T; M: A or C; K: G or T; S: G or C. Data are available at https://github.com/ZhixuanXu/Nonadaptive-alternative-TSSs. BRE, TFIIB recognition element; DPE, downstream promoter element; INR, initiator; nt, nucleotide; TATA, TATA box; TSS, transcription start site.

    TSS usages of human–mouse orthologous genes in each of six tissue samples.

    (A) Spearman's correlations between the mean expression level of a gene in the two species and its interspecific distance in TSS usage. All correlations are negative; those significant at P = 0.05 are indicated by an asterisk. The scatter plot of the human–mouse comparison of the universal sample is presented as an example. (B) The fraction of genes for which the Simpson or Shannon index of TSS diversity is lower in the species where the gene expression level is higher. All fractions significantly exceed the random expectation of 50%. Data are available at https://github.com/ZhixuanXu/Nonadaptive-alternative-TSSs. TPM, transcripts per million; TSS, transcription start site.

    The TSS diversity of a gene generally decreases with the gene expression level.

    (A) The Simpson index of TSS diversity of a gene in the human universal sample declines with the expression level of the gene in the sample. (B) Spearman's correlations between gene expression level and Simpson index of TSS diversity in each of five human cell lines and 11 human tissue samples examined. (C) The Shannon index of TSS diversity of a gene in the human universal sample declines with the expression level of the gene in the sample. (D) Spearman's correlations between gene expression level and Shannon index of TSS diversity in each human cell line and tissue sample examined. In (A) and (C), each black dot represents a gene. Spearman's rank correlation coefficient (ρ) and associated P-value are presented for the original unbinned data (gray) and down-sampled data (black), respectively. Each red dot shows the mean X-value and mean Y-value of the genes in each of 10 equal-interval bins (i.e., all bins have the same log10 RPM interval), while the error bars show standard errors (error bar is absent when a bin contains only one gene). In (B) and (D), gray squares and black triangles show the correlations on the basis of the original unbinned data and down-sampled data, respectively. P < 10^−3 for all correlations. Sample IDs listed on the x-axis refer to those in S1 Table. Data are available at https://github.com/ZhixuanXu/Nonadaptive-alternative-TSSs. ID, identifier; RPM, reads mapped to the gene per million reads; TSS, transcription start site.

    Variation in TSS usage among five human cell lines.

    (A) Spearman's correlations between the mean expression level of a gene in two cell lines and the between-cell–line distance in TSS usage. Above and below the diagonal are results obtained from the original and down-sampled data, respectively. All correlations are negative; those significant at P = 0.05 are indicated by an asterisk. The scatter plot for the comparison between K562 and HeLa S3 is presented as an example. (B) Fraction of genes with a negative among-cell–line Spearman's correlation between the Simpson or Shannon index of TSS diversity and expression level. (C) Fraction of genes with a positive among-cell–line Spearman's correlation between the gene expression level and fractional use of a ranked TSS. In (B) and (C), results are based on down-sampled data and P < 10^−4 in all cases (binomial test). (D) The maximum number (M) of different major TSSs that a gene can have (given its observed TSSs) in the five human cell lines is greater than the observed number (N) of different major sites for almost all genes with M ≥ 2. The area of a circle is proportional to the indicated number of genes in the circle. (E) Only in a minority of human genes is the number (N) of observed major TSSs significantly greater than that (n) expected under no differential use of TSSs among five human cell lines. Each dot represents a gene, with red dots denoting genes whose N significantly exceeds n. [...] genes with greater N than n (not necessarily significantly; red) and the rest of the genes (black). In this panel, N and n have been re-estimated using down-sampled data to equalize the sampling error among genes. Data are available at https://github.com/ZhixuanXu/Nonadaptive-alternative-TSSs. HepG2, human liver cancer cell line Hep G2; MCF7, human breast cancer cell line MCF-7; RPM, reads mapped to the gene per million reads; TSS, transcription start site.

    DataSheet_1_Complete mitochondrial genomes of the “Acmaeidae” limpets provide new insights into the internal phylogeny of the Patellogastropoda (Mollusca: Gastropoda).docx

    The subclass Patellogastropoda (called “true limpets”) is one of the most primitive groups of the Gastropoda and contains approximately 350 species worldwide. Within this subclass, internal phylogeny among family members, including relationships of the “Acmaeidae” with other patellogastropod families, remains incompletely clarified. Here, we newly determined two complete mitochondrial genome sequences of “Acmaeidae” (Acmaea mitra and Niveotectura pallida) and one sequence from a Lottiidae species (Discurria insessa) and combined them with mitochondrial genome sequences of 20 other published limpet species for phylogenetic analysis of the sequence dataset (nucleotides and amino acids) of 13 protein-coding genes using maximum likelihood and Bayesian inference methods. The resulting phylogenetic trees showed monophyly of Patellogastropoda species that were subsequently subdivided into two clades [clade I (Nacellidae, Pectinodontidae, Acmaeidae, and Patellidae) and clade II (Eoacmaeidae and Lottiidae)]. The sister relationship between the Acmaeidae and Pectinodontidae species revealed by phylogenetic analysis was also supported by their similar gene arrangement patterns, which differ substantially from those of clade II members including the Lottiidae species. The polyphyletic relationship between Acmaeidae (grouped with Pectinodontidae as a sister taxon in clade I) and Lottiidae species (grouped with Eoacmaeidae in clade II) corroborates that they are phylogenetically distinct from each other. This mitochondrial genome phylogeny contradicts previous morphology-based hypotheses, yet highlights that Acmaeidae and Pectinodontidae are the most closely related.
    Further in-depth analysis of complete mitochondrial genome sequences, based on a broad range of samples including those from relatively unstudied and/or underrepresented taxa, is required to fully understand mitochondrial genome evolution and to establish a more comprehensive phylogeny among the major groups of the Patellogastropoda.