The Synergizer service for translating gene, protein and other biological identifiers
Summary: The Synergizer is a database and web service that provides translations of biological database identifiers. It is accessible both programmatically and interactively.
Genome Analysis Reveals Interplay between 5′UTR Introns and Nuclear mRNA Export for Secretory and Mitochondrial Genes
In higher eukaryotes, messenger RNAs (mRNAs) are exported from the nucleus to the cytoplasm via factors deposited near the 5′ end of the transcript during splicing. The signal sequence coding region (SSCR) can support an alternative mRNA export (ALREX) pathway that does not require splicing. However, most SSCR-containing genes also have introns, so the interplay between these export mechanisms remains unclear. Here we support a model in which the furthest upstream element in a given transcript, be it an intron or an ALREX-promoting SSCR, dictates the mRNA export pathway used. We also experimentally demonstrate that nuclear-encoded mitochondrial genes can use the ALREX pathway. Thus, ALREX can also be supported by nucleotide signals within mitochondrial-targeting sequence coding regions (MSCRs). Finally, we identified and experimentally verified novel motifs associated with the ALREX pathway that are shared by both SSCRs and MSCRs. Our results show a strong correlation between 5′ untranslated region (5′UTR) intron presence/absence and sequence features at the beginning of the coding region. They also suggest that genes encoding secretory and mitochondrial proteins share a common regulatory mechanism at the level of mRNA export.
Network-based functional enrichment
Background: Many methods have been developed to infer and reason about molecular interaction networks. These approaches often yield networks with hundreds or thousands of nodes and up to an order of magnitude more edges. It is often desirable to summarize the biological information in such networks. A very common approach is to use gene function enrichment analysis for this task. A major drawback of this method is that it ignores information about the edges in the network being analyzed, i.e., it treats the network simply as a set of genes. In this paper, we introduce a novel method for functional enrichment that explicitly takes network interactions into account.
Results: Our approach naturally generalizes Fisher’s exact test, a gene set-based technique. Given a function of interest, we compute the subgraph of the network induced by genes annotated to this function. We use the sequence of sizes of the connected components of this sub-network to estimate its connectivity. We estimate the statistical significance of the connectivity empirically by a permutation test. We present three applications of our method: i) determine which functions are enriched in a given network, ii) given a network and an interesting sub-network of genes within that network, determine which functions are enriched in the sub-network, and iii) given two networks, determine the functions for which the connectivity improves when we merge the second network into the first. Through these applications, we show that our approach is a natural alternative to network clustering algorithms.
Conclusions: We presented a novel approach to functional enrichment that takes into account the pairwise relationships among genes annotated by a particular function. Each of the three applications discovers highly relevant functions. We used our methods to study biological data from three different organisms. Our results demonstrate the wide applicability of our methods. Our algorithms are implemented in C++ and are freely available under the GNU General Public License at our supplementary website. Additionally, all our input data and results are available at http://bioinformatics.cs.vt.edu/~murali/supplements/2011-incob-nbe/.
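The procedure described in this abstract (induce the subgraph on annotated genes, summarize its connected-component sizes, then assess significance by permutation) can be illustrated with a minimal Python sketch. This is not the authors' C++ implementation. The connectivity statistic used here (fraction of annotated genes in the largest component) and the null model (uniform random gene sets of the same size) are assumptions chosen for illustration, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def component_sizes(edges, nodes):
    """Sizes (descending) of connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    adj = defaultdict(set)
    for u, v in edges:
        # Keep only edges whose two endpoints are both in the node set.
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    seen, sizes = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # iterative depth-first search over one component
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        sizes.append(size)
    return sorted(sizes, reverse=True)


def connectivity(edges, nodes):
    """ASSUMED statistic: fraction of the gene set lying in its largest component."""
    sizes = component_sizes(edges, nodes)
    return sizes[0] / sum(sizes) if sizes else 0.0


def permutation_p_value(edges, annotated, n_perm=1000, seed=0):
    """Empirical p-value of the observed connectivity.

    ASSUMED null model: gene sets of the same size drawn uniformly at random
    from the genes present in the network.
    """
    rng = random.Random(seed)
    all_nodes = sorted({n for edge in edges for n in edge})
    in_network = set(all_nodes)
    annotated = [g for g in annotated if g in in_network]
    observed = connectivity(edges, annotated)
    hits = sum(
        connectivity(edges, rng.sample(all_nodes, len(annotated))) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

For a toy network with edges `a-b`, `b-c`, `d-e`, `f-g` and the gene set `{a, b, c, d}`, the induced subgraph has components of sizes 3 and 1, so the assumed statistic is 0.75.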
A critical assessment of Mus musculus gene function prediction using integrated genomic evidence
Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated.
Results: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%.
Conclusion: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allow predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized.
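The Results above report precision at a fixed recall of 20%. As a minimal sketch of how such a figure can be computed for a single GO term, the hypothetical Python function below ranks genes by prediction score and reports precision at the first cutoff where recall reaches the target. The function name and the handling of tied scores (input order is kept) are assumptions, not the evaluation code used in the study.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the shortest score-ranked prefix whose recall reaches `target_recall`.

    scores: prediction scores, higher means more confident.
    labels: 1 if the gene truly has the annotation, else 0.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive labels")
    # Rank genes from most to least confident prediction.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            return true_pos / k
    # Only reached if target_recall exceeds 1.
    return true_pos / len(ranked)
```

With scores `[0.9, 0.8, 0.7, 0.6, 0.5]` and labels `[1, 0, 1, 0, 1]`, a target recall of 0.6 is first reached after three predictions, two of which are correct, giving a precision of 2/3.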
An update on the strategies in multicomponent activity monitoring within the phytopharmaceutical field
Background: To date, modern drug research has focused on the discovery and synthesis of single active substances. However, multicomponent preparations are gaining increasing importance in the phytopharmaceutical field by demonstrating beneficial properties with respect to efficacy and toxicity.
Discussion: In contrast to single drug combinations, a botanical multicomponent therapeutic possesses a complex repertoire of chemicals that belong to a variety of substance classes. This may explain the frequently observed pleiotropic bioactivity spectra of these compounds, which may also suggest that they possess novel therapeutic opportunities. Interestingly, considerable bioactivity properties are exhibited not only by remedies that contain high doses of phytochemicals with prominent pharmaceutical efficacy, but also by preparations that lack a sole active principle component. Although each individual substance within these multicomponents has a low molar fraction, the therapeutic activity of these substances is established via a potentiation of their effects through combined and simultaneous attacks on multiple molecular targets. Although beneficial properties may emerge from such a broad range of perturbations on cellular machinery, validation and/or prediction of their activity profiles is accompanied by a variety of difficulties in generic risk-benefit assessments. Thus, it is recommended that a comprehensive strategy be implemented to cover the entirety of multicomponent-multitarget effects, so as to address the limitations of conventional approaches.
Summary: An integration of standard toxicological methods with selected pathway-focused bioassays and unbiased data acquisition strategies (such as gene expression analysis) would be advantageous in building an interaction network model to consider all of the effects, whether they were intended or adverse reactions.
Network-Based Prediction and Analysis of HIV Dependency Factors
HIV Dependency Factors (HDFs) are a class of human proteins that are essential for HIV replication, but are not lethal to the host cell when silenced. Three previous genome-wide RNAi experiments identified HDF sets with little overlap. We combine data from these three studies with a human protein interaction network to predict new HDFs, using an intuitive algorithm called SinkSource and four other algorithms published in the literature. Our algorithm achieves high precision and recall upon cross validation, as do the other methods. A number of HDFs that we predict are known to interact with HIV proteins. They belong to multiple protein complexes and biological processes that are known to be manipulated by HIV. We also demonstrate that many predicted HDF genes show significantly different programs of expression in early response to SIV infection in two non-human primate species that differ in AIDS progression. Our results suggest that many HDFs are yet to be discovered and that they have potential value as prognostic markers to determine pathological outcome and the likelihood of AIDS development. More generally, if multiple genome-wide gene-level studies have been performed at independent labs to study the same biological system or phenomenon, our methodology is applicable to interpret these studies simultaneously in the context of molecular interaction networks and to ask if they reinforce or contradict each other.
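The abstract does not spell out how SinkSource works, so the sketch below should not be read as that algorithm. It is a generic label-propagation example, written under stated assumptions, of how known positives and negatives can be combined with an interaction network to score unlabeled proteins: known positives are held at 1, known negatives at 0, and every other node is repeatedly set to the unweighted mean of its neighbors' scores. The function name, the starting value of 0.5, and the fixed iteration count are all hypothetical choices.

```python
from collections import defaultdict


def propagate_scores(edges, positives, negatives, n_iter=100):
    """Score unlabeled nodes by iterated neighbor averaging (generic sketch).

    ASSUMPTIONS: undirected, unweighted edges; positives fixed at 1.0,
    negatives fixed at 0.0; unlabeled nodes start at 0.5.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    fixed = {node: 0.0 for node in negatives}
    fixed.update({node: 1.0 for node in positives})
    scores = {node: fixed.get(node, 0.5) for node in adj}
    for _ in range(n_iter):
        updated = dict(scores)
        for node in adj:
            if node not in fixed:
                # Each unlabeled node takes the mean score of its neighbors.
                updated[node] = sum(scores[nb] for nb in adj[node]) / len(adj[node])
        scores = updated
    return scores
```

On a path `p - x - y - n` with `p` positive and `n` negative, the scores converge to 2/3 for `x` and 1/3 for `y`, so the node closer to the known positive ranks higher.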
Ten Years of Pathway Analysis: Current Approaches and Outstanding Challenges
Pathway analysis has become the first choice for gaining insight into the underlying biology of differentially expressed genes and proteins, as it reduces complexity and has increased explanatory power. We discuss the evolution of knowledge base–driven pathway analysis over its first decade, distinctly divided into three generations. We also discuss the limitations that are specific to each generation, and how they are addressed by successive generations of methods. We identify a number of annotation challenges that must be addressed to enable development of the next generation of pathway analysis methods. Furthermore, we identify a number of methodological challenges that the next generation of methods must tackle to take advantage of the technological advances in genomics and proteomics in order to improve specificity, sensitivity, and relevance of pathway analysis.
Cold adaptation drives population genomic divergence in the ecological specialist, Drosophila montana
Funding: UK Natural Environment Research Council (Grant Number(s): NE/L501852/1, NE/P000592/1); Academy of Finland (Grant Number(s): 267244, 268214, 322980); Ella ja Georg Ehrnroothin Säätiö.
Detecting signatures of ecological adaptation in comparative genomics is challenging, but analysing population samples with characterised geographic distributions, such as clinal variation, can help identify genes showing covariation with important ecological variation. Here, we analysed patterns of geographic variation in the cold-adapted species Drosophila montana across phenotypes, genotypes and environmental conditions and tested for signatures of cold adaptation in population genomic divergence. We first derived the climatic variables associated with the geographic distribution of 24 populations across two continents to trace the scale of environmental variation experienced by the species, and measured variation in the cold tolerance of the flies of six populations from different geographic contexts. We then performed pooled whole genome sequencing of these six populations, and used Bayesian methods to identify SNPs where genetic differentiation is associated with both climatic variables and the population phenotypic measurements, while controlling for effects of demography and population structure. The top candidate SNPs were enriched on the X and fourth chromosomes, and they also lay near genes implicated in other studies of cold tolerance and population divergence in this species and its close relatives. We conclude that ecological adaptation has contributed to the divergence of D. montana populations throughout the genome and in particular on the X and fourth chromosomes, which also showed highest interpopulation FST. This study demonstrates that ecological selection can drive genomic divergence at different scales, from candidate genes to chromosome-wide effects.
Defining the Specificity of Cotranslationally Acting Chaperones by Systematic Analysis of mRNAs Associated with Ribosome-Nascent Chain Complexes
Polypeptides exiting the ribosome must fold and assemble in the crowded environment of the cell. Chaperones and other protein homeostasis factors interact with newly translated polypeptides to facilitate their folding and correct localization. Despite extensive efforts, little is known about the specificity of the chaperones and other factors that bind nascent polypeptides. To address this question, we present an approach that systematically identifies cotranslational chaperone substrates through the mRNAs associated with ribosome-nascent chain-chaperone complexes. Here we focused on two Saccharomyces cerevisiae chaperones: the Signal Recognition Particle (SRP), which acts cotranslationally to target proteins to the ER, and the Nascent chain Associated Complex (NAC), whose function has been elusive. Our results provide new insights into SRP selectivity and reveal that NAC is a general cotranslational chaperone. We found surprising differential substrate specificity for the three subunits of NAC, which appear to recognize distinct features within nascent chains. Our results also revealed a partial overlap between the sets of nascent polypeptides that interact with NAC and SRP, respectively, and showed that NAC modulates SRP specificity and fidelity in vivo. These findings give us new insight into the dynamic interplay of chaperones acting on nascent chains. The strategy we used should be generally applicable to mapping the specificity, interplay, and dynamics of the cotranslational protein homeostasis network.