Codon Bias Patterns of 's Interacting Proteins
Synonymous codons, i.e., DNA nucleotide triplets coding for the same amino
acid, are used differently across the variety of living organisms. The
biological meaning of this phenomenon, known as codon usage bias, is still
controversial. In order to shed light on this point, we propose a new codon
bias index, , that is based on the competition between cognate and
near-cognate tRNAs during translation, without being tuned to the usage bias of
highly expressed genes. We perform a genome-wide evaluation of codon bias for
, comparing with other widely used indices: , , and
. We show that and capture similar information by being
positively correlated with gene conservation, measured by ERI, and
essentiality, whereas, and appear to be less sensitive to
evolutionary-functional parameters. Notably, the rate of variation of and
with ERI allows one to obtain sets of genes that consistently belong to
specific clusters of orthologous genes (COGs). We also investigate the
correlation of codon bias at the genomic level with the network features of
protein-protein interactions in . We find that the most densely
connected communities of the network share a similar level of codon bias (as
measured by and ). Conversely, a small difference in codon bias
between two genes is, statistically, a prerequisite for the corresponding
proteins to interact. Importantly, among all codon bias indices, turns
out to have the most coherent distribution over the communities of the
interactome, pointing to the significance of competition among cognate and
near-cognate tRNAs for explaining codon usage adaptation.
PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets
Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs that play a significant role in several biological processes. RNA-seq based transcriptome sequencing has been used extensively to identify lncRNAs. Accurate identification of lncRNAs in RNA-seq datasets is crucial for exploring their characteristic functions in the genome, yet most coding potential computation (CPC) tools fail to identify them accurately in transcriptomic data. Well-known CPC tools such as CPC2, lncScore, and CPAT are primarily designed to predict lncRNAs based on the GENCODE, NONCODE, and CANTATAdb databases. The prediction accuracy of these tools often drops when they are tested on transcriptomic datasets, which leads to more false positives and to inaccuracy in functional annotation. In this study, we present a novel tool, PLIT, for the identification of lncRNAs in plant RNA-seq datasets. PLIT implements a feature selection method based on L1 regularization, and iterative Random Forests (iRF) classification for selection of optimal features. Based on sequence and codon-bias features, it classifies RNA-seq derived FASTA sequences into coding or long non-coding transcripts. Using L1 regularization, 31 optimal features were obtained from lncRNA and protein-coding transcripts of 8 plant species. The performance of the tool was evaluated on 7 plant RNA-seq datasets using 10-fold cross-validation. The analysis exhibited superior accuracy when evaluated against currently available state-of-the-art CPC tools.
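To make "sequence and codon-bias features" concrete, here is a minimal sketch of how a transcript could be turned into a fixed-length numeric vector. The function names and the choice of features (length, GC fraction, 64 codon frequencies) are illustrative assumptions; they are not PLIT's actual 31 selected features, and the feature selection and classification steps are not shown.

```python
from collections import Counter
from itertools import product

# All 64 DNA codons in a fixed order, so every transcript yields a vector
# of the same length. (Illustrative choice, not taken from PLIT.)
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(seq, frame=0):
    """Relative frequency of each codon when reading `seq` in one frame."""
    seq = seq.upper().replace("U", "T")
    # Step through the sequence three bases at a time, dropping a trailing
    # partial codon and any triplet with ambiguous bases (e.g. N).
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CODONS}


def feature_vector(seq):
    """Hypothetical feature vector: [length, GC fraction, 64 codon freqs]."""
    freqs = codon_frequencies(seq)
    gc = sum(seq.upper().count(b) for b in "GC") / len(seq) if seq else 0.0
    return [len(seq), gc] + [freqs[c] for c in CODONS]


if __name__ == "__main__":
    vec = feature_vector("ATGGCCGCCTAA")
    print(len(vec), vec[:2])
```

A vector like this could then be passed to any feature-selection or classification routine; which features survive selection depends entirely on the training data.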
Conspiracy in bacterial genomes
The rank-ordered distribution of the codon usage frequencies for 123
bacteria is best fitted by a three-parameter function that is the sum of a
constant, an exponential and a linear term in the rank n. The parameters depend
(two of them parabolically) on the total GC content. The rank-ordered
distribution of the amino acids is fitted by a straight line. The Shannon
entropy computed over all the codons is well fitted by a parabola in the GC
content, while the partial entropies computed over subsets of the codons
behave differently from one another, thereby exhibiting a first conspiracy
effect. Moreover, the sum of the codon usage frequencies over particular sets,
e.g. with C and A (respectively G and U) as i-th nucleotide, shows a clear
linear dependence on the GC content, exhibiting another conspiracy effect.
Comment: revised version: introduction and conclusion enhanced, references
added, figures added, some tables removed
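The quantities named in this abstract (rank-ordered codon usage, Shannon entropy over codons, GC content) can be computed from a coding sequence as in the sketch below. It is a minimal illustration only: the function names are hypothetical, the base-2 logarithm is an assumption (the abstract does not state the base), and the fitting of the rank distribution is not reproduced.

```python
import math
from collections import Counter


def codon_usage(seq):
    """Relative usage frequency of each codon observed in `seq` (frame 0)."""
    seq = seq.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    # Keep only unambiguous RNA codons.
    counts = Counter(c for c in codons if set(c) <= set("ACGU"))
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def rank_ordered(freqs):
    """Frequencies sorted from most to least used; index + 1 is the rank n."""
    return sorted(freqs.values(), reverse=True)


def shannon_entropy(freqs):
    """Shannon entropy in bits (base 2 is an assumption of this sketch)."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def gc_content(seq):
    """Fraction of G and C among all bases of `seq`."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


if __name__ == "__main__":
    s = "AUGGCCGCCUAA"
    usage = codon_usage(s)
    print(rank_ordered(usage), shannon_entropy(usage), gc_content(s))
```

A partial entropy over a subset of codons could be obtained by passing only that subset of `usage` to `shannon_entropy`; whether to renormalize the subset first is a modelling choice that the abstract does not specify.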
Transcription, signaling receptor activity, oxidative phosphorylation, and fatty acid metabolism mediate the presence of closely related species in distinct intertidal and cold-seep habitats
Bathyal cold seeps are isolated, extreme deep-sea environments characterized by low species diversity, although biomass can be high. The Hakon Mosby mud volcano (Barents Sea, 1,280 m) is a rather stable, chemosynthetically driven habitat characterized by prominent surface bacterial mats with high sulfide concentrations and low oxygen levels. Here, the nematode Halomonhystera hermesi thrives in high abundances (11,000 individuals per 10 cm²). Halomonhystera hermesi is a member of the intertidal Halomonhystera disjuncta species complex that includes five cryptic species (GD 1-5). GD1-5's common habitat is characterized by strong environmental fluctuations. Here, we compared the transcriptomes of H. hermesi and GD1, H. hermesi's closest relative. Genes encoding proteins involved in oxidative phosphorylation are more strongly expressed in H. hermesi than in GD1, and many genes were observed only in H. hermesi while being completely absent in GD1. Both observations could in part be attributed to high sulfide concentrations and low oxygen levels. Additionally, fatty acid elongation was prominent in H. hermesi, confirming the importance of highly unsaturated fatty acids in this species. Significantly higher numbers of transcription factors and genes involved in signaling receptor activity were observed in GD1 (many of which were completely absent in H. hermesi), allowing fast signaling and transcriptional reprogramming, which can mediate survival in dynamic intertidal environments. GC content was approximately 8% higher in H. hermesi coding unigenes, resulting in differential codon usage between the two species and a higher proportion of amino acids with GC-rich codons in H. hermesi. In general, our results showed that most pathways were active in both environments and that only three genes are under natural selection. This indicates that plasticity should also be taken into consideration in the evolutionary history of Halomonhystera species. Such plasticity, as well as possible preadaptation to low oxygen and high sulfide levels, might have played an important role in the establishment of a cold-seep Halomonhystera population.
A Minimum Principle in Codon-Anticodon Interaction
Imposing a minimum principle in the framework of the so-called crystal basis
model of the genetic code, we determine the structure of the minimum set of
anticodons which allows the translational-transcription for the animal
mitochondrial code. The results are in very good agreement with the observed
anticodons.
Comment: 13 pages, 6 tables, to appear in Biosystems
Recovering complete and draft population genomes from metagenome datasets.
Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost of applying them to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins containing sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.
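The idea that contigs from the same population covary in coverage across datasets can be illustrated with a toy example. The sketch below greedily groups contigs whose per-sample coverage profiles are strongly Pearson-correlated; the function names, the threshold of 0.95, and the greedy strategy are assumptions made for illustration and do not correspond to any specific binning tool, which would typically combine coverage with sequence composition and a more robust clustering method.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no covariation signal
    return cov / (sx * sy)


def bin_by_coverage(coverage, threshold=0.95):
    """Greedily group contigs by coverage covariation.

    `coverage` maps contig name -> list of mean coverages, one per sample.
    Each contig joins the first bin whose founding contig it correlates
    with at or above `threshold`; otherwise it founds a new bin.
    """
    bins = []
    for name, profile in coverage.items():
        for members in bins:
            if pearson(profile, coverage[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            bins.append([name])
    return bins


if __name__ == "__main__":
    toy = {
        "c1": [10, 20, 5],
        "c2": [11, 19, 6],
        "c3": [3, 2, 30],
        "c4": [4, 1, 28],
    }
    print(bin_by_coverage(toy))
```

With only one sample there is no covariation to exploit, which is why coverage from multiple datasets matters; the cost is that every additional dataset has to be mapped against the assembly.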
Integrating Horizontal Gene Transfer and Common Descent to Depict Evolution and Contrast It with “Common Design”
Horizontal gene transfer (HGT) and common descent interact in space and time. Because events of HGT co-occur with phylogenetic evolution, it is difficult to depict evolutionary patterns graphically. Tree-like representations of life’s diversification are useful, but they ignore the significance of HGT in evolutionary history, particularly of unicellular organisms, ancestors of multicellular life. Here we integrate the reticulated-tree model, ring of life, symbiogenesis whole-organism model, and eliminative pattern pluralism to represent evolution. Using Entamoeba histolytica alcohol dehydrogenase 2 (EhADH2), a bifunctional enzyme in the glycolytic pathway of amoeba, we illustrate how EhADH2 could be the product of both horizontally acquired features from ancestral prokaryotes (i.e., aldehyde dehydrogenase [ALDH] and alcohol dehydrogenase [ADH]) and subsequent functional integration of these enzymes into EhADH2, which is now inherited by amoeba via common descent. Natural selection has driven the evolution of EhADH2 active sites, which require specific amino acids (cysteine 252 in the ALDH domain; histidine 754 in the ADH domain), iron and NAD+ as cofactors, and the substrates acetyl-CoA for ALDH and acetaldehyde for ADH. Alternative views invoking “common design” (i.e., the non-naturalistic emergence of major taxa independent from ancestry) to explain the interaction between horizontal and vertical evolution are unfounded.