10 research outputs found

    New insights into protein-protein interaction data lead to increased estimates of the S. cerevisiae interactome size

    Background: As protein interactions mediate most cellular mechanisms, protein-protein interaction networks are essential in the study of cellular processes. Consequently, several large-scale interactome mapping projects have been undertaken, and protein-protein interactions are being distilled into databases through literature curation; yet protein-protein interaction data are still far from comprehensive, even in the model organism Saccharomyces cerevisiae. Estimating the interactome size is important for evaluating the completeness of current datasets, in order to measure the remaining efforts that are required.
    Results: We examined the yeast interactome from a new perspective, by taking into account how thoroughly proteins have been studied. We discovered that the set of literature-curated protein-protein interactions is qualitatively different when restricted to proteins that have received extensive attention from the scientific community. In particular, these interactions are less often supported by yeast two-hybrid, and more often by more complex experiments such as biochemical activity assays. Our analysis showed that high-throughput and literature-curated interactome datasets are more correlated than commonly assumed, but that this bias can be corrected for by focusing on well-studied proteins. We thus propose a simple and reliable method to estimate the size of an interactome, combining literature-curated data involving well-studied proteins with high-throughput data. It yields an estimate of at least 37,600 direct physical protein-protein interactions in S. cerevisiae.
    Conclusions: Our method leads to higher and more accurate estimates of the interactome size, as it accounts for interactions that are genuine yet difficult to detect with commonly-used experimental assays. This shows that we are even further from completing the yeast interactome map than previously expected.

    Mycobacterium tuberculosis and Clostridium difficille interactomes: demonstration of rapid development of computational system for bacterial interactome prediction

    Background: Protein-protein interaction (PPI) networks (interactomes) of most organisms, except for some model organisms, are largely unknown. Experimental methods, including high-throughput techniques, are highly resource intensive. Therefore, computational discovery of PPIs can accelerate biological discovery by presenting the "most-promising" pairs of proteins that are likely to interact. For many bacteria, the genome sequence, and thereby the genomic context of the proteome, is readily available; for some of these proteomes, localization and functional annotations are also available, but interactomes are not. We present here a method for rapid development of a computational system to predict the interactome of a bacterial proteome. While other studies have presented methods to transfer interologs across species, we propose the transfer of computational models to benefit from cross-species annotations, thereby predicting many more novel interactions even in the absence of interologs. Mycobacterium tuberculosis (Mtb) and Clostridium difficile (CD) have been used to demonstrate the work.
    Results: We developed a random forest classifier over features derived from Gene Ontology annotations and genetic context scores provided by the STRING database for predicting Mtb and CD interactions independently. The Mtb classifier gave a precision of 94% and a recall of 23% on a held-out test set. The Mtb model was then run on all 8 million protein pairs of the Mtb proteome, resulting in 708 new interactions at 94% expected precision, or 1,595 new interactions at 80% expected precision. The CD classifier gave a precision of 90% and a recall of 16% on a held-out test set. The CD model was run on all 8 million protein pairs of the CD proteome, resulting in 143 new interactions at 90% expected precision, or 580 new interactions at 80% expected precision. We also compared the overlap of the predictions of our method with STRING database interactions for CD and Mtb, and with interactions identified recently by a bacterial 2-hybrid system for Mtb. To demonstrate the utility of transferring computational models, we used the developed Mtb model to predict CD protein pairs. The resulting cross-species model yielded a precision of 88% at a recall of 8%. To demonstrate the transfer of features from other organisms in the absence of feature-based and interaction-based information, we transferred missing feature values from Mtb orthologs into the CD data. In transferring this data from orthologs (not interologs), we showed that a large number of interactions can be predicted.
    Conclusions: Rapid discovery of a (partial) bacterial interactome can be achieved by using the existing set of GO and STRING features associated with the organisms. Cross-species interactome development can be used when there are not even sufficient known interactions to develop a computational prediction system. A computational model of one or more well-studied organisms can be employed to make the initial interactome prediction for the target organism. We have also demonstrated that annotations can be transferred from orthologs in well-studied organisms, enabling accurate predictions for organisms with no annotations. These approaches can serve as building blocks to address the challenges of feature coverage and missing interactions on the way to rapid interactome discovery for bacterial organisms.
    Availability: The predictions for all Mtb and CD proteins are made available at http://severus.dbmi.pitt.edu/TB and http://severus.dbmi.pitt.edu/CD respectively, for browsing as well as for download.
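As a rough reading of the figures quoted in this abstract, the expected number of correct predictions in a set is the set size multiplied by its expected precision. The sketch below applies that arithmetic to the four reported prediction sets. The function name is hypothetical, and the calculation assumes that the precision measured on the held-out test set carries over to the proteome-wide predictions, which the abstract does not guarantee.

```python
def expected_true_interactions(n_predicted: int, precision: float) -> float:
    """Expected number of correct predictions: set size times precision."""
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must be between 0 and 1")
    return n_predicted * precision


# Figures quoted in the abstract: (organism, predictions, expected precision).
REPORTED = [
    ("Mtb", 708, 0.94),
    ("Mtb", 1595, 0.80),
    ("CD", 143, 0.90),
    ("CD", 580, 0.80),
]

for organism, n_predicted, precision in REPORTED:
    expected = expected_true_interactions(n_predicted, precision)
    print(f"{organism}: {n_predicted} predictions at {precision:.0%} "
          f"-> about {expected:.0f} expected to be true")
```

Lowering the precision threshold enlarges the prediction set faster than it dilutes it, which is why the 80% sets are expected to contain more true interactions than the stricter sets despite the higher error rate.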

    Protein–protein interactions and genetic diseases: The interactome

    Protein–protein interactions mediate essentially all biological processes. Despite the quality of these data being widely questioned a decade ago, the reproducibility of large-scale protein interaction data is now much improved and there is little question that the latest screens are of high quality. Moreover, common data standards and coordinated curation practices between the databases that collect the interactions have made these valuable data available to a wide group of researchers. Here, I will review how protein–protein interactions are measured, collected and quality controlled. I discuss how the architecture of molecular protein networks has informed disease biology, and how these data are now being computationally integrated with the newest genomic technologies, in particular genome-wide association studies and exome-sequencing projects, to improve our understanding of molecular processes perturbed by genetics in human diseases. This article is part of a Special Issue entitled: From Genome to Function.

    A reference map of the human binary protein interactome.

    Global insights into cellular organization and genome function require comprehensive understanding of the interactome networks that mediate genotype-phenotype relationships (1,2). Here we present a human 'all-by-all' reference interactome map of human binary protein interactions, or 'HuRI'. With approximately 53,000 protein-protein interactions, HuRI has approximately four times as many such interactions as there are high-quality curated interactions from small-scale studies. The integration of HuRI with genome (3), transcriptome (4) and proteome (5) data enables cellular function to be studied within most physiological or pathological cellular contexts. We demonstrate the utility of HuRI in identifying the specific subcellular roles of protein-protein interactions. Inferred tissue-specific networks reveal general principles for the formation of cellular context-specific functions and elucidate potential molecular mechanisms that might underlie tissue-specific phenotypes of Mendelian diseases. HuRI is a systematic proteome-wide reference that links genomic variation to phenotypic outcomes.

    Cross-species network and transcript transfer

    Metabolic processes, signal transduction, gene regulation, as well as gene and protein expression are largely controlled by biological networks. High-throughput experiments allow the measurement of a wide range of cellular states and interactions. However, networks are often not known in detail for specific biological systems and conditions. Gene and protein annotations are often transferred from model organisms to the species of interest. Therefore, the question arises whether biological networks can be transferred between species or whether they are specific to individual contexts. In this thesis, the following aspects are investigated: (i) the conservation and (ii) the cross-species transfer of eukaryotic protein-interaction and gene regulatory (transcription factor-target) networks, as well as (iii) the conservation of alternatively spliced variants. In the simplest case, interactions can be transferred between species based solely on the sequence similarity of the orthologous genes. However, such a transfer often results either in the transfer of only a few interactions (medium/high sequence similarity threshold) or in the transfer of many speculative interactions (low sequence similarity threshold). Thus, advanced network transfer approaches also consider the annotations of the orthologous genes involved in the interaction transfer, as well as features derived from the network structure, in order to enable a reliable interaction transfer, even between phylogenetically very distant species. In this work, such an approach for the transfer of protein interactions is presented (COIN). COIN uses a machine-learning model to label transferred interactions as either correctly transferred (conserved) or incorrectly transferred (not conserved).
    The comparison and cross-species transfer of regulatory networks is more difficult than the transfer of protein interaction networks, as a large fraction of the known regulations is described only in the (not machine-readable) scientific literature. In addition, compared to protein interactions, only a few conserved regulations are known, and regulatory elements appear to be strongly context-specific. In this work, the cross-species analysis of regulatory interaction networks is enabled with software tools and databases for global (ConReg) and thousands of context-specific (CroCo) regulatory interactions that are derived and integrated from the scientific literature, binding site predictions and experimental data. Genes and their protein products are the main players in biological networks. To date, however, most network models neglect the fact that a gene can encode different proteins. These alternative proteins can differ strongly from each other with respect to their molecular structure, function and role in networks. The identification of conserved and species-specific splice variants and the integration of variants in network models will allow a more complete cross-species transfer and comparison of biological networks. With ISAR we support the cross-species transfer and comparison of alternative variants by introducing a gene-structure aware (i.e. exon-intron structure aware) multiple sequence alignment approach for variants from orthologous and paralogous genes. The methods presented here and the accompanying databases allow the cross-species transfer of biological networks, the comparison of thousands of context-specific networks, and the cross-species comparison of alternatively spliced variants. Thus, they can be used as a starting point for understanding regulatory and signaling mechanisms in many biological systems.

    Beyond hairballs: depicting complexity of a kinase-phosphatase network in the budding yeast

    Kinases and phosphatases (KP) form the largest family of enzymes in living cells. They regulate each other and 60% of the proteome, forming complex kinase-phosphatase networks (KP-Net) essential for cell signaling. Such networks, having a command-execution aspect, tend to have a hierarchical structure. Despite the extensive study of the KP-Net in the budding yeast, the hierarchical structure as well as the functional principles of this network are still not known. In this context, this thesis aims to perform an integrative analysis of multi-omics data with the hierarchical structure of a bona fide KP-Net in the budding yeast Saccharomyces cerevisiae, in order to generate hypotheses about the functional principles of each layer in the KP-Net hierarchy.
    Based on a literature curation effort accomplished in this and in other studies, the largest bona fide KP-Net of S. cerevisiae known to date was assembled in this thesis. By assessing the hierarchical level of the KP-Net using the global reaching centrality and by elucidating its hierarchical structure using the vertex-sort (VS) algorithm, we found that the KP-Net has a moderate hierarchical structure made of three disjoint layers (top, core and bottom) resembling a bow tie shape. The top layer, which is large, was found enriched for signaling regulation; the core layer, made of a few strongly connected KPs, was found enriched mostly for cell cycle regulation; and the bottom layer, which is also large, was found enriched for diverse biological processes. On overlaying a wide range of KP biological properties on top of the KP-Net hierarchical structure, the top layer was found enriched for phosphatases and the bottom layer depleted of them, suggesting that phosphatases are less regulated by phosphorylation and dephosphorylation interactions (PDI) than kinases. Moreover, the core layer was found enriched for KPs representing bottlenecks, pathway-shared components and essential genes, and for the KPs most tightly regulated in time and space, implying that KPs playing an essential role in the KP-Net should be firmly controlled. Interestingly, KP proteins in the top layer were found to be more abundant and less noisy than those of the bottom layer, suggesting that the availability of enzymes at an invariable protein expression level at the top of the network might be important to ensure robust signaling.
    Analysis of the VS algorithm showed that node degrees affect the classification of nodes into the different layers of a hierarchical network structure without biasing the biological results of the sorted network. Robustness analysis of the KP-Net showed that KP-Net layers are moderately stable in noisy networks generated by adding edges to the KP-Net. However, the layers of these noisy networks overlap significantly with those of the KP-Net. Moreover, topological and biological properties of the KP-Net were retained in the noisy networks to different levels. These findings indicate that, despite the observed partial robustness of our results, they mostly represent our current knowledge about KP-Nets.
    Finally, improving the techniques dedicated to identifying KP substrates will enhance our understanding of how KP-Nets function. As an example, I describe here a strategy that we devised to help determine KP-substrate interactions and the regulatory subunits on which these interactions depend. The strategy is based on a protein-fragment complementation assay that uses the optimized yeast cytosine deaminase (OyCD PCA). The OyCD PCA is a large-scale in vivo screen that promises a substantial improvement in delineating complex KP-Nets. We applied the strategy to determine substrates of the cyclin-dependent kinase 1 (Cdk1; also called Cdc28) and the cyclins implicated in phosphorylation of these substrates by Cdk1 in S. cerevisiae. The OyCD PCA showed a wide compensatory behavior of cyclins for most of the substrates and the phosphorylation of γ-tubulin specifically by Clb3-Cdk1, thus establishing the timing of the latter event in controlling assembly of the mitotic spindle.
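The abstract names global reaching centrality as its measure of hierarchy but does not spell out the computation. As a minimal sketch, assuming the standard unweighted, directed definition (a node's local reaching centrality is the fraction of other nodes reachable from it, and the global value is the summed gap to the maximum local value divided by N - 1), it could look as follows. The function names and the toy edge lists are illustrative only and are not taken from the thesis.

```python
from collections import deque


def local_reaching_centrality(adj, node):
    """Fraction of the other nodes reachable from `node` along directed edges."""
    seen = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbour in adj[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return (len(seen) - 1) / (len(adj) - 1)


def global_reaching_centrality(edges):
    """Unweighted global reaching centrality of a directed edge list."""
    adj = {}
    for source, target in edges:
        adj.setdefault(source, set()).add(target)
        adj.setdefault(target, set())  # make sure sink nodes are counted
    if len(adj) < 2:
        raise ValueError("need at least two nodes")
    local = [local_reaching_centrality(adj, node) for node in adj]
    highest = max(local)
    return sum(highest - value for value in local) / (len(adj) - 1)


# Toy networks (hypothetical): a star is fully hierarchical, a cycle is flat.
star = [("A", "B"), ("A", "C"), ("A", "D")]
cycle = [("A", "B"), ("B", "C"), ("C", "A")]
chain = [("A", "B"), ("B", "C")]

for name, edges in [("star", star), ("cycle", cycle), ("chain", chain)]:
    print(name, global_reaching_centrality(edges))
```

Values near 1 indicate a strongly hierarchical network and values near 0 a flat one, so a "moderate" hierarchy as described for the KP-Net would fall somewhere in between.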

    Interactomics-Based Functional Analysis: Using Interaction Conservation To Probe Bacterial Protein Functions

    The emergence of genomics as a discrete field of biology has changed humanity’s understanding of our relationship with bacteria. Sequencing the genome of each newly-discovered bacterial species can reveal novel gene sequences, though the genome may contain genes coding for hundreds or thousands of proteins of unknown function (PUFs). In some cases, these coding sequences appear to be conserved across nearly all bacteria. Exploring the functional roles of these cases ideally requires an integrative, cross-species approach involving not only gene sequences but knowledge of interactions among their products. Protein interactions, studied at genome scale, extend genomics into the field of interactomics. I have employed novel computational methods to provide context for bacterial PUFs and to leverage the rich genomic, proteomic, and interactomic data available for hundreds of bacterial species. The methods employed in this study began with sets of protein complexes. I initially hypothesized that, if protein interactions reveal protein functions and interactions are frequently conserved through protein complexes, then conserved protein functions should be revealed through the extent of conservation of protein complexes and their components. The subsequent analyses revealed how partial protein complex conservation may, unexpectedly, be the rule rather than the exception. Next, I expanded the analysis by combining sets of thousands of experimental protein-protein interactions. Progressing beyond the scope of protein complexes into interactions across full proteomes revealed novel evolutionary consistencies across bacteria but also exposed deficiencies among interactomics-based approaches. I have concluded this study with an expansion beyond bacterial protein interactions and into those involving bacteriophage-encoded proteins. This work concerns emergent evolutionary properties among bacterial proteins. 
It is primarily intended to serve as a resource for microbiologists but is relevant to any research into evolutionary biology. As microbiomes and their occupants become increasingly critical to human health, similar approaches may become increasingly necessary.

    Searching for novel gene functions in yeast : identification of thousands of novel molecular interactions by protein-fragment complementation assay followed by automated gene function prediction and high-throughput lipidomics

    Understanding complex biological processes requires sophisticated experimental and computational approaches. Advances in functional genomics strategies provide powerful tools for collecting diverse types of information on the interconnectivity of genes, proteins and small molecules, for studying the organizational principles of cellular networks. Integrating that knowledge into a systems biology framework enables the prediction of novel functions for uncharacterized genes. To perform such predictions on a genome-wide scale in the yeast Saccharomyces cerevisiae, we have developed a novel strategy that combines a high-throughput interactomics screen for protein-protein interactions, in silico gene function prediction, and validation of the predictions with high-throughput lipidomics.
    We started by performing a large-scale screen for protein-protein interactions using a protein-fragment complementation assay. The method allowed us to monitor interactions in vivo between proteins expressed from their natural promoters. Furthermore, the method did not suffer from bias against membrane interactions, compared with established genome-wide techniques for detecting protein interactions. As a result, we detected many novel interactions and increased the coverage of an interactome of lipid homeostasis that has not yet been comprehensively explored. Next, we applied a machine learning algorithm to identify eight previously uncharacterized genes with a potential role in lipid metabolism. Finally, we investigated whether these genes and a set of distinct transcriptional regulators, not previously implicated with lipids, have a role in lipid homeostasis. For that purpose, we analyzed the lipidomes of deletion mutants of the selected genes. In order to probe a large number of strains, we developed a high-throughput platform for high-content lipidomic screening of yeast mutant libraries. It consists of high-resolution Orbitrap mass spectrometry and a dedicated data processing framework to support lipid phenotyping across hundreds of Saccharomyces cerevisiae mutants.
    Lipidomics experiments confirmed the functional predictions by demonstrating differences in the lipid metabolic phenotypes of deletion mutants lacking the YBR141C and YJR015W genes, which were predicted to be involved in lipid metabolism. An altered lipid phenotype was also observed for a deletion mutant of the transcription factor KAR4, which had not previously been linked with lipid metabolism. These results demonstrate that a workflow integrating the acquisition of novel molecular interactions, computational gene function prediction and a novel high-throughput shotgun lipidomics platform is a valuable contribution to the arsenal of methods for systems biology. Developments in functional genomic methods and lipidomics technologies provide means to study the biological networks of higher eukaryotes, including mammals. Therefore, the presented workflow has the potential to find applications in more complex organisms.