62 research outputs found

    Inferring adaptive codon preference to understand sources of selection shaping codon usage bias.

    Alternative synonymous codons are often used at unequal frequencies. Classically, studies of such codon usage bias (CUB) attempted to separate the impact of neutral from selective forces by assuming that deviations from a predicted neutral equilibrium capture selection. However, GC-biased gene conversion (gBGC) can also cause deviation from a neutral null. Alternatively, selection has been inferred from CUB in highly expressed genes, but the accuracy of this approach has not been extensively tested, and gBGC can interfere with such extrapolations (e.g., if expression and gene conversion rates covary). It is therefore critical to examine deviations from a mutational null in a species with no gBGC. To achieve this goal, we implement such an analysis in the highly AT rich genome of Dictyostelium discoideum, where we find no evidence of gBGC. We infer neutral CUB under mutational equilibrium to quantify “adaptive codon preference,” a nontautologous genome wide quantitative measure of the relative selection strength driving CUB. We observe signatures of purifying selection consistent with selection favoring adaptive codon preference. Preferred codons are not GC rich, underscoring the independence from gBGC. Expression-associated “preference” largely matches adaptive codon preference but does not wholly capture the influence of selection shaping patterns across all genes, suggesting selective constraints associated specifically with high expression. We observe patterns consistent with effects on mRNA translation and stability shaping adaptive codon preference. Thus, our approach to quantifying adaptive codon preference provides a framework for inferring the sources of selection that shape CUB across different contexts within the genome

    Lineage-specific sequence evolution and exon edge conservation partially explain the relationship between evolutionary rate and expression level in A. thaliana

    Rapidly evolving proteins can aid the identification of genes underlying phenotypic adaptation across taxa, but functional and structural elements of genes can also affect evolutionary rates. In plants, the ‘edges’ of exons, flanking intron junctions, are known to contain splice enhancers and to have a higher degree of conservation compared to the remainder of the coding region. However, the extent to which these regions may be masking indicators of positive selection or account for the relationship between dN/dS and other genomic parameters is unclear. We investigate the effects of exon edge conservation on the relationship of dN/dS to various sequence characteristics and gene expression parameters in the model plant Arabidopsis thaliana. We also obtain lineage-specific dN/dS estimates, making use of the recently sequenced genome of Thellungiella parvula, the second closest sequenced relative after the sister species Arabidopsis lyrata. Overall, we find that the effect of exon edge conservation, as well as the use of lineage-specific substitution estimates, upon dN/dS ratios partly explains the relationship between the rates of protein evolution and expression level. Furthermore, the removal of exon edges shifts dN/dS estimates upwards, increasing the proportion of genes potentially under adaptive selection. We conclude that lineage-specific substitutions and exon edge conservation have an important effect on dN/dS ratios and should be considered when assessing their relationship with other genomic parameters

    Varying efficacy of artesunate+amodiaquine and artesunate+sulphadoxine-pyrimethamine for the treatment of uncomplicated falciparum malaria in the Democratic Republic of Congo: a report of two in-vivo studies

    BACKGROUND: Very few data on anti-malarial efficacy are available from the Democratic Republic of Congo (DRC). DRC changed its anti-malarial treatment policy to amodiaquine (AQ) and artesunate (AS) in 2005. METHODS: The results of two in vivo efficacy studies, which tested AQ and sulphadoxine-pyrimethamine (SP) monotherapies and AS+SP and AS+AQ combinations in Boende (Equatorial province), and AS+SP, AS+AQ and SP in Kabalo (Katanga province), between 2003 and 2004 are presented. The methodology followed the WHO 2003 protocol for assessing the efficacy of anti-malarials in areas of high transmission. RESULTS: Out of 394 included patients in Boende, the failure rates on day 28 after PCR-genotyping adjustment of AS+SP and AS+AQ were estimated as 24.6% [95% CI: 16.6-35.5] and 15.1% [95% CI: 8.6-25.7], respectively. For the monotherapies, failure rates were 35.9% [95% CI: 27.0-46.7] for SP and 18.3% [95% CI: 11.6-28.1] for AQ. Out of 207 patients enrolled in Kabalo, the failure rate on day 28 after PCR-genotyping adjustment was 0 [1-sided 95% CI: 5.8] for AS+SP and AS+AQ [1-sided 95% CI: 6.2]. It was 19.6% [95% CI: 11.4-32.7] for SP monotherapy. CONCLUSION: The finding of varying efficacy of the same combinations at two sites in one country highlights one difficulty of implementing a uniform national treatment policy in a large country. The poor efficacy of AS+AQ in Boende should alert the national programme to foci of resistance and emphasizes the need for systems for the prospective monitoring of treatment efficacy at sentinel sites in the country

    The Effect of Transposable Element Insertions on Gene Expression Evolution in Rodents

    Background: Many genomes contain a substantial number of transposable elements (TEs), a few of which are known to be involved in regulating gene expression. However, recent observations suggest that TEs may have played a very important role in the evolution of gene expression because many conserved non-genic sequences, some of which are known to be involved in gene regulation, resemble TEs. Results: Here we investigate whether new TE insertions affect gene expression profiles by testing whether gene expression divergence between mouse and rat is correlated to the numbers of new transposable elements inserted near genes. We show that expression divergence is significantly correlated to the number of new LTR and SINE elements, but not to the numbers of LINEs. We also show that expression divergence is not significantly correlated to the numbers of ancestral TEs in most cases, which suggests that the correlations between expression divergence and the numbers of new TEs are causal in nature. We quantify the effect and estimate that TE insertion has accounted for ~20% (95% confidence interval: 12% to 26%) of all expression profile divergence in rodents. Conclusions: We conclude that TE insertions may have had a major impact on the evolution of gene expression levels in rodents

    MAGE I Transcription Factors Regulate KAP1 and KRAB Domain Zinc Finger Transcription Factor Mediated Gene Repression

    Class I MAGE proteins (MAGE I) are normally expressed only in developing germ cells but are aberrantly expressed in many cancers. They have been shown to promote tumor survival, aggressive growth, and chemoresistance but the underlying mechanisms and MAGE I functions have not been fully elucidated. KRAB domain zinc finger transcription factors (KZNFs) are the largest group of vertebrate transcription factors and regulate neoplastic transformation, tumor suppression, cellular proliferation, and apoptosis. KZNFs bind the KAP1 protein and direct KAP1 to specific DNA sequences where it suppresses gene expression by inducing localized heterochromatin characterized by histone 3 lysine 9 trimethylation (H3me3K9). Discovery that MAGE I proteins also bind to KAP1 prompted us to investigate whether MAGE I can affect KZNF and KAP1 mediated gene regulation. We found that expression of MAGE I proteins, MAGE-A3 or MAGE-C2, relieved repression of a reporter gene by ZNF382, a KZNF with tumor suppressor activity. ChIP of MAGE I (-) HEK293T cells showed KAP1 and H3me3K9 are normally bound to the ID1 gene, a target of ZNF382, but that binding is greatly reduced in the presence of MAGE I proteins. MAGE I expression relieved KAP1 mediated ID1 repression, causing increased expression of ID1 mRNA and ID1 chromatin relaxation characterized by loss of H3me3K9. MAGE I binding to KAP1 also induced ZNF382 poly-ubiquitination and degradation, consistent with loss of ZNF382 leading to decreased KAP1 binding to ID1. In contrast, MAGE I expression caused increased KAP1 binding to Ki67, another KAP1 target gene, with increased H3me3K9 and decreased Ki67 mRNA expression. Since KZNFs are required to direct KAP1 to specific genes, these results show that MAGE I proteins can differentially regulate members of the KZNF family and KAP1 mediated gene repression

    Conditional expression explains molecular evolution of social genes in a microbe

    Conflict is thought to play a critical role in the evolution of social interactions by promoting diversity or driving accelerated evolution. However, despite our sophisticated understanding of how conflict shapes social traits, we have limited knowledge of how it impacts molecular evolution across the underlying social genes. Here we address this problem by analyzing the genome-wide impact of social interactions using genome sequences from 67 Dictyostelium discoideum strains. We find that social genes tend to exhibit enhanced polymorphism and accelerated evolution. However, these patterns are not consistent with conflict driven processes, but instead reflect relaxed purifying selection. This pattern is most likely explained by the conditional nature of social interactions, whereby selection on genes expressed only in social interactions is diluted by generations of inactivity. This dilution of selection by inactivity enhances the role of drift, leading to increased polymorphism and accelerated evolution, which we call the Red King process

    Regular Patterns for Proteome-Wide Distribution of Protein Abundance across Species

    A proteome of the bio-entity, including cell, tissue, organ, and organism, consists of proteins of diverse abundance. The principle that determines the abundance of different proteins in a proteome is of fundamental significance for an understanding of the building blocks of the bio-entity. Here, we report three regular patterns in the proteome-wide distribution of protein abundance across species such as human, mouse, fly, worm, yeast, and bacteria: in most cases, protein abundance is positively correlated with the protein's origination time or sequence conservation during evolution; it is negatively correlated with the protein's domain number and positively correlated with domain coverage in protein structure, and the correlations became stronger during the course of evolution; protein abundance can be further stratified by the function of the protein, whereby proteins that act on material conversion and transportation (mass category) are more abundant than those that act on information modulation (information category). Thus, protein abundance is intrinsically related to the protein's inherent characters of evolution, structure, and function

    Mutational Biases and Selective Forces Shaping the Structure of Arabidopsis Genes

    Recently, features of gene expression profiles have been associated with structural parameters of gene sequences in organisms representing a diverse set of taxa. The emerging picture indicates that natural selection, mediated by gene expression profiles, has a significant role in determining genic structures. However, the current situation is less clear in plants, as the available data indicate that the effect of natural selection mediated by gene expression is very weak. Moreover, the direction of the patterns in plants appears to contradict those observed in animal genomes. In the present work we analyzed expression data for >18000 Arabidopsis genes retrieved from public datasets obtained with different technologies (MPSS and high density chip arrays) and compared them with gene parameters. Our results show that the impact of natural selection mediated by expression on gene sequences is significant and distinguishable from the effects of regional mutational biases. In addition, we provide evidence that the level and the breadth of gene expression are related in opposite ways to many structural parameters of gene sequences. Higher levels of expression abundance are associated with smaller transcripts, consistent with the need to reduce costs of both transcription and translation. Expression breadth, however, shows a contrasting pattern, i.e. longer genes have higher breadth of expression, possibly to ensure those structural features associated with gene plasticity. Based on these results, we propose that the specific balance between these two selective forces plays a significant role in shaping the structure of Arabidopsis genes

    Relationship between amino acid composition and gene expression in the mouse genome

    Background: Codon bias is a phenomenon that refers to the differences in the frequencies of synonymous codons among different genes. In many organisms, natural selection is considered to be a cause of codon bias because codon usage in highly expressed genes is biased toward optimal codons. Methods have previously been developed to predict the expression level of genes from their nucleotide sequences, which is based on the observation that synonymous codon usage shows an overall bias toward a few codons called major codons. However, the relationship between codon bias and gene expression level, as proposed by the translation-selection model, is less evident in mammals. Findings: We investigated the correlations between the expression levels of 1,182 mouse genes and amino acid composition, as well as between gene expression and codon preference. We found that a weak but significant correlation exists between gene expression levels and amino acid composition in mouse. In total, less than 10% of variation of expression levels is explained by amino acid components. We found the effect of codon preference on gene expression was weaker than the effect of amino acid composition, because no significant correlations were observed with respect to codon preference. Conclusion: These results suggest that it is difficult to predict expression level from amino acid components or from codon bias in mouse.

    Intergenic and Genic Sequence Lengths Have Opposite Relationships with Respect to Gene Expression

    Eukaryotic genomes are mostly composed of noncoding DNA whose role is still poorly understood. Studies in several organisms have shown correlations between the length of the intergenic and genic sequences of a gene and the expression of its corresponding mRNA transcript. Some studies have found a positive relationship between intergenic sequence length and expression diversity between tissues, and concluded that genes under greater regulatory control require more regulatory information in their intergenic sequences. Other reports found a negative relationship between expression level and gene length and the interpretation was that there is selection pressure for highly expressed genes to remain small. However, a correlation between gene sequence length and expression diversity, opposite to that observed for intergenic sequences, has also been reported, and to date there is no testable explanation for this observation. To shed light on these varied and sometimes conflicting results, we performed a thorough study of the relationships between sequence length and gene expression using cell-type (tissue) specific microarray data in Arabidopsis thaliana. We measured median gene expression across tissues (expression level), expression variability between tissues (expression pattern uniformity), and expression variability between replicates (expression noise). We found that intergenic (upstream and downstream) and genic (coding and noncoding) sequences have generally opposite relationships with respect to expression, whether it is tissue variability, median, or expression noise. To explain these results we propose a model, in which the lengths of the intergenic and genic sequences have opposite effects on the ability of the transcribed region of the gene to be epigenetically regulated for differential expression. These findings could shed light on the role and influence of noncoding sequences on gene expression