10,021 research outputs found

    Codon Bias Patterns of E. coli's Interacting Proteins

    Synonymous codons, i.e., DNA nucleotide triplets coding for the same amino acid, are used differently across the variety of living organisms. The biological meaning of this phenomenon, known as codon usage bias, is still controversial. In order to shed light on this point, we propose a new codon bias index, CompAI, that is based on the competition between cognate and near-cognate tRNAs during translation, without being tuned to the usage bias of highly expressed genes. We perform a genome-wide evaluation of codon bias for E. coli, comparing CompAI with other widely used indices: tAI, CAI, and Nc. We show that CompAI and tAI capture similar information by being positively correlated with gene conservation, measured by ERI, and essentiality, whereas CAI and Nc appear to be less sensitive to evolutionary-functional parameters. Notably, the rate of variation of tAI and CompAI with ERI makes it possible to obtain sets of genes that consistently belong to specific clusters of orthologous genes (COGs). We also investigate the correlation of codon bias at the genomic level with the network features of protein-protein interactions in E. coli. We find that the most densely connected communities of the network share a similar level of codon bias (as measured by CompAI and tAI). Conversely, a small difference in codon bias between two genes is, statistically, a prerequisite for the corresponding proteins to interact. Importantly, among all codon bias indices, CompAI turns out to have the most coherent distribution over the communities of the interactome, pointing to the significance of competition among cognate and near-cognate tRNAs for explaining codon usage adaptation.

    PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets

    Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs which play a significant role in several biological processes. RNA-seq based transcriptome sequencing has been extensively used for identification of lncRNAs. However, accurate identification of lncRNAs in RNA-seq datasets is crucial for exploring their characteristic functions in the genome, as most coding potential computation (CPC) tools fail to accurately identify them in transcriptomic data. Well-known CPC tools such as CPC2, lncScore, and CPAT are primarily designed for prediction of lncRNAs based on the GENCODE, NONCODE and CANTATAdb databases. The prediction accuracy of these tools often drops when tested on transcriptomic datasets. This leads to higher false positive results and inaccuracy in the function annotation process. In this study, we present a novel tool, PLIT, for the identification of lncRNAs in plant RNA-seq datasets. PLIT implements a feature selection method based on L1 regularization and iterative Random Forests (iRF) classification for selection of optimal features. Based on sequence and codon-bias features, it classifies the RNA-seq derived FASTA sequences into coding or long non-coding transcripts. Using L1 regularization, 31 optimal features were obtained based on lncRNA and protein-coding transcripts from 8 plant species. The performance of the tool was evaluated on 7 plant RNA-seq datasets using 10-fold cross-validation. The analysis exhibited superior accuracy when evaluated against currently available state-of-the-art CPC tools.

    Conspiracy in bacterial genomes

    The rank-ordered distribution of the codon usage frequencies for 123 bacteria is best fitted by a three-parameter function that is the sum of a constant, an exponential and a linear term in the rank n. The parameters depend on the total GC content (two of them parabolically). The rank-ordered distribution of the amino acids is fitted by a straight line. The Shannon entropy computed over all the codons is well fitted by a parabola in the GC content, while the partial entropies computed over subsets of the codons show peculiar, different behavior, thereby exhibiting a first conspiracy effect. Moreover, the sum of the codon usage frequencies over particular sets, e.g. with C and A (respectively G and U) as i-th nucleotide, shows a clear linear dependence on the GC content, exhibiting another conspiracy effect. Comment: revised version: introduction and conclusion enhanced, references added, figures added, some tables removed
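
    The quantities named in this abstract (rank-ordered codon usage frequencies and the Shannon entropy over codons) can be illustrated with a minimal sketch. The function names below are hypothetical, the base-2 logarithm is an assumption (the abstract does not state which base is used), and the toy sequence is illustrative only; none of this is taken from the paper itself.

```python
import math
from collections import Counter


def codon_frequencies(seq):
    """Relative frequencies of in-frame codons (hypothetical helper).

    The sequence is read in frame from position 0; trailing bases that
    do not complete a codon are ignored. T is rewritten as U so DNA and
    RNA input give the same codon labels.
    """
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def rank_ordered(freqs):
    """Frequencies sorted from most to least used (rank 1 first)."""
    return sorted(freqs.values(), reverse=True)


def shannon_entropy(freqs):
    """Shannon entropy in bits (base 2 is an assumption of this sketch)."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


if __name__ == "__main__":
    toy = "ATGGCCATGGCC"  # toy input, not real genome data
    freqs = codon_frequencies(toy)
    print(rank_ordered(freqs))
    print(shannon_entropy(freqs))
```

    A partial entropy over a subset of codons could be obtained by filtering the dictionary before calling shannon_entropy; whether to renormalize within the subset is a choice the abstract does not specify.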

    Transcription, signaling receptor activity, oxidative phosphorylation, and fatty acid metabolism mediate the presence of closely related species in distinct intertidal and cold-seep habitats

    Bathyal cold seeps are isolated extreme deep-sea environments characterized by low species diversity, while biomass can be high. The Hakon Mosby mud volcano (Barents Sea, 1,280 m) is a rather stable chemosynthetically driven habitat characterized by prominent surface bacterial mats with high sulfide concentrations and low oxygen levels. Here, the nematode Halomonhystera hermesi thrives in high abundances (11,000 individuals per 10 cm^2). Halomonhystera hermesi is a member of the intertidal Halomonhystera disjuncta species complex that includes five cryptic species (GD1-5). GD1-5's common habitat is characterized by strong environmental fluctuations. Here, we compared the transcriptomes of H. hermesi and GD1, H. hermesi's closest relative. Genes encoding proteins involved in oxidative phosphorylation are more strongly expressed in H. hermesi than in GD1, and many genes were only observed in H. hermesi while being completely absent in GD1. Both observations could in part be attributed to high sulfide concentrations and low oxygen levels. Additionally, fatty acid elongation was also prominent in H. hermesi, confirming the importance of highly unsaturated fatty acids in this species. Significantly higher amounts of transcription factors and genes involved in signaling receptor activity were observed in GD1 (many of which were completely absent in H. hermesi), allowing fast signaling and transcriptional reprogramming, which can mediate survival in dynamic intertidal environments. GC content was approximately 8% higher in H. hermesi coding unigenes, resulting in differential codon usage between both species and a higher proportion of amino acids with GC-rich codons in H. hermesi. In general, our results showed that most pathways were active in both environments and that only three genes are under natural selection. This indicates that plasticity should also be taken into consideration in the evolutionary history of Halomonhystera species. Such plasticity, as well as possible preadaptation to low oxygen and high sulfide levels, might have played an important role in the establishment of a cold-seep Halomonhystera population.
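
    The GC content comparison mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the toy sequences are hypothetical, and the abstract does not say whether "8% higher" means percentage points or a relative increase, so the sketch reports both.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_difference(seq_a, seq_b):
    """Return (percentage-point difference, relative difference) of A vs B.

    Both are computed because the abstract leaves the convention open.
    The relative difference is undefined if seq_b contains no G or C.
    """
    gc_a, gc_b = gc_content(seq_a), gc_content(seq_b)
    points = (gc_a - gc_b) * 100
    relative = (gc_a - gc_b) / gc_b
    return points, relative


if __name__ == "__main__":
    # Toy sequences for illustration only, not real unigenes.
    print(gc_difference("GGCCAT", "GCATAT"))
```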

    A Minimum Principle in Codon-Anticodon Interaction

    Imposing a minimum principle in the framework of the so-called crystal basis model of the genetic code, we determine the structure of the minimum set of anticodons which allows the translational-transcription for the animal mitochondrial code. The results are in very good agreement with the observed anticodons. Comment: 13 pages, 6 Tables, to appear in Biosystems

    Recovering complete and draft population genomes from metagenome datasets.

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost of application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins containing sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on genome-wide evolution.

    Integrating Horizontal Gene Transfer and Common Descent to Depict Evolution and Contrast It with “Common Design”

    Horizontal gene transfer (HGT) and common descent interact in space and time. Because events of HGT co-occur with phylogenetic evolution, it is difficult to depict evolutionary patterns graphically. Tree-like representations of life’s diversification are useful, but they ignore the significance of HGT in evolutionary history, particularly of unicellular organisms, ancestors of multicellular life. Here we integrate the reticulated-tree model, ring of life, symbiogenesis whole-organism model, and eliminative pattern pluralism to represent evolution. Using Entamoeba histolytica alcohol dehydrogenase 2 (EhADH2), a bifunctional enzyme in the glycolytic pathway of amoeba, we illustrate how EhADH2 could be the product of both horizontally acquired features from ancestral prokaryotes (i.e. aldehyde dehydrogenase [ALDH] and alcohol dehydrogenase [ADH]), and subsequent functional integration of these enzymes into EhADH2, which is now inherited by amoeba via common descent. Natural selection has driven the evolution of EhADH2 active sites, which require specific amino acids (cysteine 252 in the ALDH domain; histidine 754 in the ADH domain), iron and NAD+ as cofactors, and the substrates acetyl-CoA for ALDH and acetaldehyde for ADH. Alternative views invoking “common design” (i.e. the non-naturalistic emergence of major taxa independent from ancestry) to explain the interaction between horizontal and vertical evolution are unfounded.