2,354 research outputs found

    The Ashbya Genome Database (AGD)—a tool for the yeast community and genome biologists

    Get PDF
    The Ashbya Genome Database (AGD) is a comprehensive online source of information covering genes from the filamentous fungus Ashbya gossypii. The database content is based upon comparative genome annotation between A.gossypii and the closely related budding yeast Saccharomyces cerevisiae taking both sequence similarity and synteny (conserved order and orientation) into account. Release 2 of AGD contains 4718 protein-encoding loci located across seven chromosomes. Information can be retrieved using systematic or standard locus names from A.gossypii as well as budding and fission yeast. Approximately 90% of the genes in the genome of A.gossypii are homologous and syntenic to loci of budding yeast. Therefore, AGD is a useful tool not only for the various yeast communities in general but also for biologists who are interested in evolutionary aspects of genome research and comparative genome annotation. The database provides scientists with a convenient graphical user interface that includes various locus search and genome browsing options, data download and export functionalities and numerous reciprocal links to external databases including SGD, MIPS, GeneDB, KEGG, GermOnline and Swiss-Prot/TrEMBL. AGD is accessible at http://agd.unibas.c

    Applicability of tandem affinity purification MudPIT to pathway proteomics in yeast

    Get PDF
    A combined multidimensional chromatography-mass spectrometry approach known as "MudPIT" enables rapid identification of proteins that interact with a tagged bait while bypassing some of the problems associated with analysis of polypeptides excised from SDS-polyacrylamide gels. However, the reproducibility, success rate, and applicability of MudPIT to the rapid characterization of dozens of proteins have not been reported. We show here that MudPIT reproducibly identified bona fide partners for budding yeast Gcn5p. Additionally, we successfully applied MudPIT to rapidly screen through a collection of tagged polypeptides to identify new protein interactions. Twenty-five proteins involved in transcription and progression through mitosis were modified with a new tandem affinity purification (TAP) tag. TAP-MudPIT analysis of 22 yeast strains that expressed these tagged proteins uncovered known or likely interacting partners for 21 of the baits, a figure that compares favorably with traditional approaches. The proteins identified here comprised 102 previously known and 279 potential physical interactions. Even for the intensively studied Swi2p/Snf2p, the catalytic subunit of the Swi/Snf chromatin remodeling complex, our analysis uncovered a new interacting protein, Rtt102p. Reciprocal tagging and TAP-MudPIT analysis of Rtt102p revealed subunits of both the Swi/Snf and RSC complexes, identifying Rtt102p as a common interactor with, and possible integral component of, these chromatin remodeling machines. Our experience indicates it is feasible for an investigator working with a single ion trap instrument in a conventional molecular/cellular biology laboratory to carry out proteomic characterization of a pathway, organelle, or process (i.e. "pathway proteomics") by systematic application of TAP-MudPIT

    How long is co-operation in genomics sustainable?

    Get PDF
    Publications on the 16 yeast chromosome sequences group together over 400 different authors from Europe, Japan, Australia and the USA. When research is not organised in networks, it is carried out in large sequencing centres such as the Sanger Centre in Britain, the Helix Institute in Japan or Saint Louis University in the USA. Both cases illustrate the collective nature of knowledge creation. Other examples of co-operation between numerous researchers in various countries, more closely related to innovation, might also be mentioned, such as the development of software for comparing proteins or DNA sequences. Collective publications reveal the collective nature of research, whether it is carried out by major consortia (the case of yeast) or around large research facilities (such as the synchrotron or major genome sequencing centres). This collective nature stems from two factors: (1) the advantages of co-ordinating efforts on major projects (e.g. economies of scale and of collection) and (2) very strong interdependency in the creation and utilisation of knowledge (related to cumulativeness).

    Ranking relations using analogies in biological and information networks

    Get PDF
    Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects S={A(1):B(1),A(2):B(2),,A(N):B(N)}\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B ^{(N)}\}, measures how well other pairs A:B fit in with the set S\mathbf{S}. Our work addresses the following question: is the relation between objects A and B analogous to those relations found in S\mathbf{S}? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided.Comment: Published in at http://dx.doi.org/10.1214/09-AOAS321 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Length, Protein-Protein Interactions, and Complexity

    Full text link
    The evolutionary reason for the increase in gene length from archaea to prokaryotes to eukaryotes observed in large scale genome sequencing efforts has been unclear. We propose here that the increasing complexity of protein-protein interactions has driven the selection of longer proteins, as longer proteins are more able to distinguish among a larger number of distinct interactions due to their greater average surface area. Annotated protein sequences available from the SWISS-PROT database were analyzed for thirteen eukaryotes, eight bacteria, and two archaea species. The number of subcellular locations to which each protein is associated is used as a measure of the number of interactions to which a protein participates. Two databases of yeast protein-protein interactions were used as another measure of the number of interactions to which each \emph{S. cerevisiae} protein participates. Protein length is shown to correlate with both number of subcellular locations to which a protein is associated and number of interactions as measured by yeast two-hybrid experiments. Protein length is also shown to correlate with the probability that the protein is encoded by an essential gene. Interestingly, average protein length and number of subcellular locations are not significantly different between all human proteins and protein targets of known, marketed drugs. Increased protein length appears to be a significant mechanism by which the increasing complexity of protein-protein interaction networks is accommodated within the natural evolution of species. Consideration of protein length may be a valuable tool in drug design, one that predicts different strategies for inhibiting interactions in aberrant and normal pathways.Comment: 13 pages, 5 figures, 2 tables, to appear in Physica

    PROPHECY—a database for high-resolution phenomics

    Get PDF
    The rapid recent evolution of the field phenomics—the genome-wide study of gene dispensability by quantitative analysis of phenotypes—has resulted in an increasing demand for new data analysis and visualization tools. Following the introduction of a novel approach for precise, genome-wide quantification of gene dispensability in Saccharomyces cerevisiae we here announce a public resource for mining, filtering and visualizing phenotypic data—the PROPHECY database. PROPHECY is designed to allow easy and flexible access to physiologically relevant quantitative data for the growth behaviour of mutant strains in the yeast deletion collection during conditions of environmental challenges. PROPHECY is publicly accessible at http://prophecy.lundberg.gu.se

    Yeast Protein Interactome Topology Provides Framework for Coordinated-Functionality

    Get PDF
    The architecture of the network of protein-protein physical interactions in Saccharomyces cerevisiae is exposed through the combination of two complementary theoretical network measures, betweenness centrality and `Q-modularity'. The yeast interactome is characterized by well-defined topological modules connected via a small number of inter-module protein interactions. Should such topological inter-module connections turn out to constitute a form of functional coordination between the modules, we speculate that this coordination is occurring typically in a pair-wise fashion, rather than by way of high-degree hub proteins responsible for coordinating multiple modules. The unique non-hub-centric hierarchical organization of the interactome is not reproduced by gene duplication-and-divergence stochastic growth models that disregard global selective pressures.Comment: Final, revised version. 13 pages. Please see Nucleic Acids open access article for higher resolution figure

    Diffusion Component Analysis: Unraveling Functional Topology in Biological Networks

    Full text link
    Complex biological systems have been successfully modeled by biochemical and genetic interaction networks, typically gathered from high-throughput (HTP) data. These networks can be used to infer functional relationships between genes or proteins. Using the intuition that the topological role of a gene in a network relates to its biological function, local or diffusion based "guilt-by-association" and graph-theoretic methods have had success in inferring gene functions. Here we seek to improve function prediction by integrating diffusion-based methods with a novel dimensionality reduction technique to overcome the incomplete and noisy nature of network data. In this paper, we introduce diffusion component analysis (DCA), a framework that plugs in a diffusion model and learns a low-dimensional vector representation of each node to encode the topological properties of a network. As a proof of concept, we demonstrate DCA's substantial improvement over state-of-the-art diffusion-based approaches in predicting protein function from molecular interaction networks. Moreover, our DCA framework can integrate multiple networks from heterogeneous sources, consisting of genomic information, biochemical experiments and other resources, to even further improve function prediction. Yet another layer of performance gain is achieved by integrating the DCA framework with support vector machines that take our node vector representations as features. Overall, our DCA framework provides a novel representation of nodes in a network that can be used as a plug-in architecture to other machine learning algorithms to decipher topological properties of and obtain novel insights into interactomes.Comment: RECOMB 201
    corecore