29 research outputs found
The distinctive signatures of promoter regions and operon junctions across prokaryotes
Here we show that regions upstream of first transcribed genes have oligonucleotide signatures that distinguish them from regions upstream of genes in the middle of operons. Databases of experimentally confirmed transcription units do not exist for most genomes. Thus, to expand the analyses into genomes with no experimentally confirmed data, we used genes whose adjacency is conserved across evolutionarily distant genomes as representatives of genes inside operons. Likewise, we used divergently transcribed genes as representative examples of first transcribed genes. In model organisms, the trinucleotide signatures of regions upstream of these representative genes allow for operon predictions with accuracies close to those obtained with known operon data (0.8). Signature-based operon predictions have more similar phylogenetic profiles and higher proportions of genes in the same pathways than predicted transcription unit boundaries (TUBs). These results confirm that we are separating genes with related functions, as expected for operons, from genes not necessarily related, as expected for genes in different transcription units. We also test the quality of the predictions using microarray data in six genomes and show that the signature-predicted operons tend to have high correlations of expression. Oligonucleotide signatures should expand the number of tools available to identify operons even in poorly characterized genomes.
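As a rough illustration of what a trinucleotide signature of an upstream region could look like, the sketch below counts overlapping trinucleotides and normalizes them to relative frequencies. The function name, the example sequence, and the choice of overlapping windows are assumptions made for illustration; the abstract does not specify how the signatures were computed or scored.

```python
from collections import Counter
from itertools import product


def trinucleotide_signature(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 64 trinucleotides in `seq`.

    Hypothetical helper: uses overlapping windows and skips any window
    containing a character other than A, C, G, or T.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= alphabet
    )
    total = sum(counts.values())
    return {
        "".join(k): (counts["".join(k)] / total if total else 0.0)
        for k in product("ACGT", repeat=3)
    }


# Made-up upstream sequence, not taken from any genome in the study.
upstream = "TTGACATATAATGCTAGC"
signature = trinucleotide_signature(upstream)
```

Signatures of this kind for regions upstream of operon-internal genes and of first transcribed genes could then be compared with any standard classifier; that step is left out here because the abstract does not describe it.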
Structural correlations in bacterial metabolic networks
Background: Evolution of metabolism occurs through the acquisition and loss of genes whose products act as enzymes in metabolic reactions, and from a presumably simple primordial metabolism the organisms living today have evolved complex and highly variable metabolisms. We have studied this phenomenon by comparing the metabolic networks of 134 bacterial species with known phylogenetic relationships, and by studying a neutral model of metabolic network evolution. Results: We consider the 'union-network' of 134 bacterial metabolisms, and also the union of two smaller subsets of closely related species. Each reaction-node is tagged with the number of organisms it belongs to, which we denote organism degree (OD), a key concept in our study. Network analysis shows that common reactions are found at the centre of the network and that the average OD decreases as we move to the periphery. Nodes of the same OD are also more likely to be connected to each other compared to a random OD relabelling based on their occurrence in the real data. This trend persists up to a distance of around five reactions. A simple growth model of metabolic networks is used to investigate the biochemical constraints put on metabolic-network evolution. Despite this seemingly drastic simplification, a 'union-network' of a collection of unrelated model networks, free of any selective pressure, still exhibits structural features similar to those of its bacterial counterpart. Conclusions: The OD distribution quantifies topological properties of the evolutionary history of bacterial metabolic networks, and lends additional support to the importance of horizontal gene transfer during bacterial metabolic evolution, where new reactions are attached at the periphery of the network. The neutral model of metabolic network growth can reproduce the main features of real networks, but we observe that the real networks contain a smaller common core, while they are more similar at the periphery of the network. This suggests that natural selection and biochemical correlations can act both to diversify and to narrow down metabolic evolution.
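The organism degree (OD) defined in this abstract, the number of organisms a reaction belongs to, can be sketched in a few lines. The organism names and reaction identifiers below are invented toy data, and the function name is an assumption; only the definition of OD comes from the abstract.

```python
from collections import Counter


def organism_degree(networks: dict[str, set[str]]) -> Counter:
    """Map each reaction in the union-network to its organism degree.

    `networks` maps an organism name to the set of reactions in its
    metabolism. The keys of the returned Counter form the union-network.
    """
    od = Counter()
    for reactions in networks.values():
        # Each organism contributes at most 1 to a reaction's OD,
        # because its reactions are stored as a set.
        od.update(reactions)
    return od


# Toy example with made-up organisms and reaction IDs.
toy_networks = {
    "organism_A": {"R1", "R2", "R3"},
    "organism_B": {"R1", "R2"},
    "organism_C": {"R1", "R4"},
}
od = organism_degree(toy_networks)
```

In this toy data, `R1` is shared by all three organisms and would sit in the common core, while `R3` and `R4` each occur in a single organism.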
A Screen for RNA-Binding Proteins in Yeast Indicates Dual Functions for Many Enzymes
Hundreds of RNA-binding proteins (RBPs) control diverse aspects of post-transcriptional gene regulation. To identify novel and unconventional RBPs, we probed high-density protein microarrays with fluorescently labeled RNA and selected 200 proteins that reproducibly interacted with different types of RNA from the budding yeast Saccharomyces cerevisiae. Surprisingly, more than half of these proteins represent previously known enzymes, many of them acting in metabolism, providing opportunities to directly connect intermediary metabolism with post-transcriptional gene regulation. We mapped the RNA targets for 13 proteins identified in this screen and found that they were associated with distinct groups of mRNAs, some of them coding for functionally related proteins. We also found that overexpression of the enzyme Map1 negatively affects the expression of experimentally defined mRNA targets. Our results suggest that many proteins may associate with mRNAs and possibly control their fates, providing dense connections between different layers of cellular regulation.
Decoupling Environment-Dependent and Independent Genetic Robustness across Bacterial Species
The evolutionary origins of genetic robustness are still under debate: it may arise as a consequence of requirements imposed by varying environmental conditions, due to intrinsic factors such as metabolic requirements, or directly due to an adaptive selection in favor of genes that allow a species to endure genetic perturbations. Stratifying the individual effects of each origin requires one to study the pertaining evolutionary forces across many species under diverse conditions. Here we conduct the first large-scale computational study charting the level of robustness of metabolic networks of hundreds of bacterial species across many simulated growth environments. We provide evidence that variations among species in their level of robustness reflect ecological adaptations. We decouple metabolic robustness into two components and quantify the extent of each. The first, environment-dependent, is responsible for at least 20% of the non-essential reactions, and its extent is associated with the species' lifestyle (specialized/generalist). The second, environment-independent, is associated (correlation = ∼0.6) with the intrinsic metabolic capacities of a species, with higher robustness observed in fast growers or in organisms with an extensive production of secondary metabolites. Finally, we identify reactions that are uniquely susceptible to perturbations in human pathogens, potentially serving as novel drug targets.
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
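For readers unfamiliar with one of the baselines named above, the following is a minimal sketch of Term Frequency–Inverse Document Frequency similarity between short texts. It uses whitespace tokenization, raw term counts, and a smoothed IDF; these choices, the function names, and the toy documents are assumptions for illustration and are not the configuration used by the consortium.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Sparse TF-IDF vectors (term -> weight) for a small corpus."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            # Smoothed IDF (an assumed variant) keeps weights positive.
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy titles, invented for this example.
docs = [
    "operon prediction in bacteria",
    "operon prediction using signatures",
    "yeast rna binding proteins",
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
```

Ranking candidate articles by their similarity to a seed article is the general idea behind such a baseline; Okapi BM25 differs mainly in how term frequency and document length are weighted.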
Conservation of adjacency as evidence of paralogous operons
Most of the analyses of the conservation of gene order are limited to orthologous genes. However, the organization of genes into operons might also result in the conservation of gene order of paralogous genes. Thus, we sought computational evidence that conservation of gene order of paralogous genes represents another level of conservation of genes in operons. We found that pairs of genes within experimentally characterized operons of Escherichia coli K12 and Bacillus subtilis tend to have more adjacently conserved paralogs than pairs of genes at transcription unit boundaries. The fraction of same-strand gene pairs corresponding to conserved paralogs averages 0.07 with a maximum of 0.22 in Borrelia burgdorferi. The use of evidence from the conservation of adjacency of paralogous genes can improve the prediction of operons in E. coli K12 by ∼0.27 over predictions using conservation of adjacency of orthologous genes alone.