
Texas ScholarWorks

Network rewiring is an important mechanism of gene essentiality change.

Author: Bowie James U
Han Seong Kyu
Kim Inhae
Kim Jinho
Kim Sanguk
Publication venue: eScholarship, University of California
Publication date: 01/01/2012

Gene essentiality changes are crucial for organismal evolution. However, it is unclear how the essentiality of orthologs varies across species. We investigated the underlying mechanism of gene essentiality changes between yeast and mouse based on the framework of network evolution and comparative genomic analysis. We found that yeast nonessential genes become essential in mouse when their network connections rapidly increase through engagement in protein complexes. The increased interactions allowed the previously nonessential genes to become members of vital pathways. By accounting for changes in gene essentiality, we firmly reestablished the centrality-lethality rule, which proposes a relationship between essential genes and network hubs. Furthermore, we discovered that the number of connections associated with essential and nonessential genes depends on whether they were essential in ancestral species. Our study describes for the first time how network evolution occurs to change gene essentiality.
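As a rough illustration of the centrality-lethality comparison mentioned in this abstract, the sketch below computes the mean number of interaction partners for essential and nonessential genes. It is not the authors' pipeline: the function name `mean_degree_by_class`, the edge-list input and the toy gene labels are all hypothetical, and each interaction is assumed to be listed once.

```python
from collections import Counter


def mean_degree_by_class(edges, essential):
    """Return the mean interaction degree of essential and nonessential genes.

    edges     -- iterable of (gene_a, gene_b) pairs, each interaction listed once
    essential -- set of gene names considered essential (hypothetical labels)
    """
    degree = Counter()
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        degree[a] += 1
        degree[b] += 1

    groups = {"essential": [], "nonessential": []}
    for gene, k in degree.items():
        key = "essential" if gene in essential else "nonessential"
        groups[key].append(k)

    # An empty class yields NaN rather than raising.
    return {
        name: (sum(values) / len(values) if values else float("nan"))
        for name, values in groups.items()
    }


# Toy network (made-up data): g1 is a hub and the only essential gene.
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
toy_result = mean_degree_by_class(toy_edges, essential={"g1"})
```

A higher mean degree in the essential class would be consistent with the rule; a real analysis would also need a significance test and a curated interaction network.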

eScholarship - University of California

Family-specific scaling laws in bacterial genomes

Author: de Lazzari Eleonora
Grilli Jacopo
Lagomarsino Marco Cosentino
Maslov Sergei
Publication date: 01/01/2017

Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes. Comment: 41 pages, 16 figures.
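A power-law scaling of family size with genome size, as described in this abstract, can be estimated with a straight-line fit in log-log space. The sketch below is an illustration only: the abstract does not state the fitting procedure, so ordinary least squares on logarithms is an assumption, and `fit_power_law` and the synthetic numbers are hypothetical. Genomes in which the family is absent are simply dropped here, which is a simplification.

```python
import math


def fit_power_law(genome_sizes, family_counts):
    """Fit count = prefactor * size**exponent by least squares on logs.

    Returns (exponent, prefactor). Points with a non-positive size or count
    are skipped because their logarithm is undefined.
    """
    points = [
        (math.log(n), math.log(c))
        for n, c in zip(genome_sizes, family_counts)
        if n > 0 and c > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two positive data points")

    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all genome sizes are identical")

    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


# Synthetic example (made-up data): counts grow quadratically with genome size.
sizes = [1000, 2000, 4000, 8000]
counts = [2, 8, 32, 128]
exponent, prefactor = fit_power_law(sizes, counts)
```

On this noise-free synthetic input the fitted exponent is 2 up to floating-point error. Real count data are discrete and noisy, so a count-aware regression may be preferable to a log-log fit.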

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

The SUPERFAMILY database in 2007: families and functions

Author: Chothia Cyrus
Gough Julian
Madera Martin
Vogel Christine
Wilson Derek
Publication venue: Oxford University Press
Publication date: 10/11/2006

The SUPERFAMILY database provides protein domain assignments, at the SCOP ‘superfamily’ level, for the predicted protein sequences in over 400 completed genomes. A superfamily groups together domains of different families which have a common evolutionary ancestor based on structural, functional and sequence data. SUPERFAMILY domain assignments are generated using an expert-curated set of profile hidden Markov models. All models and structural assignments are available for browsing and download from . The web interface includes services such as domain architectures and alignment details for all protein assignments, searchable domain combinations, domain occurrence network visualization, detection of over- or under-represented superfamilies for a given genome by comparison with other genomes, assignment of manually submitted sequences and keyword searches. In this update we describe the SUPERFAMILY database and outline two major developments: (i) incorporation of family-level assignments and (ii) a superfamily-level functional annotation. The SUPERFAMILY database can be used for general protein evolution and superfamily-specific studies, genomic annotation, and structural genomics target suggestion and assessment.

CiteSeerX

Explore Bristol Research

SUPERFAMILY—sophisticated comparative genomics, data mining, visualization and phylogeny

Author: Altschul
Andreeva
Ashburner
Attwood
Benson
Berman
Brinkrolf
Bru
Chandonia
Charles Talbot
Chothia
Christine Vogel
Cyrus Chothia
Derek Wilson
Dowell
Eddy
Eichinger
Finn
Haft
Hubbard
Hulo
Julian Gough
Karplus
Letunic
Loewenstein
Madera
Martin Madera
Mi
Mulder
Pereira-Leal
Ralph Pethica
Ranea
Rasteiro
Rost
Stein
Swarbreck
Virel
Vogel
Wang
Wilson
Wu
Yang
Yeats
Yiduo Zhou
Publication venue: Oxford University Press
Publication date: 01/11/2008

SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family-level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes; and build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site.

CiteSeerX

Public Library of Science (PLOS)

Explore Bristol Research

Inferring PDZ Domain Multi-Mutant Binding Preferences from Single-Mutant Data

Author: A Ceol
A Ernst
BZ Harris
C Nourry
C Vogel
Chen Yanover
E Beitz
E Kim
Elena Zaslavsky
G Stolovitzky
JR Chen
M Venkatarajan
MA Stiffler
Mark Isalan
N Habib
P Beltrao
Philip Bradley
R Tonikian
T Beuming
T Hertz
T Pawson
Publication venue: Public Library of Science
Publication date: 30/09/2010

Many important cellular protein interactions are mediated by peptide recognition domains. The ability to predict a domain's binding specificity directly from its primary sequence is essential to understanding the complexity of protein-protein interaction networks. One such recognition domain is the PDZ domain, which functions in scaffold proteins that facilitate formation of signaling networks. Predicting the PDZ domain's binding specificity was part of the DREAM4 Peptide Recognition Domain challenge, the goal of which was to describe, as position weight matrices, the specificity profiles of five multi-mutant ERBB2IP-1 domains. We developed a method that derives multi-mutant binding preferences by generalizing the effects of single point mutations on the wild-type domain's binding specificities. Our approach, trained on publicly available ERBB2IP-1 single-mutant phage display data, combined linear regression-based prediction for ligand positions whose specificity is determined by few PDZ positions, and single-mutant position weight matrix averaging for all other ligand columns. The success of our method as the winning entry of the DREAM4 competition, as well as its superior performance over a general PDZ-ligand binding model, demonstrates the advantages of training a model on a well-selected domain-specific data set.
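The position weight matrix averaging step mentioned in this abstract can be pictured as follows. This is a minimal sketch under stated assumptions, not the authors' method: it takes an unweighted mean of the corresponding column from each single-mutant matrix and renormalises it, whereas the paper may weight the mutants differently. The function name `average_pwm_column` and the residue frequencies are hypothetical.

```python
def average_pwm_column(columns):
    """Average one ligand column across several single-mutant PWMs.

    columns -- list of dicts mapping residue letter to probability, one dict
               per single mutant; residues missing from a dict count as 0.
    Returns a dict of averaged probabilities that sums to 1.
    """
    if not columns:
        raise ValueError("need at least one column to average")

    # set().union accepts the dicts directly and collects their keys.
    residues = sorted(set().union(*columns))
    mean = {
        r: sum(col.get(r, 0.0) for col in columns) / len(columns)
        for r in residues
    }

    total = sum(mean.values())
    if total == 0:
        raise ValueError("columns carry no probability mass")
    return {r: p / total for r, p in mean.items()}


# Made-up frequencies for the same ligand position in two single mutants.
mutant_a = {"V": 0.8, "L": 0.2}
mutant_b = {"V": 0.4, "I": 0.6}
averaged = average_pwm_column([mutant_a, mutant_b])
```

For a multi-mutant domain, the same averaging would be repeated for each ligand column that is not handled by the regression model.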

Genetic identity and differential gene expression between Trichomonas vaginalis and Trichomonas tenax

Author: Alderete JF
Kucknoor Ashwini S
Mundodi Vasanthakrishna
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Trichomonas vaginalis is a human urogenital pathogen responsible for trichomonosis, the number-one non-viral sexually transmitted disease (STD) worldwide, while T. tenax is a commensal of the human oral cavity, found particularly in patients with poor oral hygiene and advanced periodontal disease. The extent of genetic identity between T. vaginalis and its oral commensal counterpart is unknown. Results: Genes that were differentially expressed in T. vaginalis were identified by screening three independent subtraction cDNA libraries enriched for T. vaginalis genes. The same thirty randomly selected cDNA clones encoding proteins with specific functions associated with colonization were identified from each of the subtraction cDNA libraries. In addition, a T. vaginalis cDNA expression library was screened with patient sera that were first pre-adsorbed with an extract of T. tenax antigens, and seven specific cDNA clones were identified from this cDNA library. Interestingly, some of the clones identified by the subtraction cDNA screening were also obtained from the cDNA expression library with the pre-adsorbed sera. Notably, clones identified by both procedures were found to be up-regulated in T. vaginalis upon contact with vaginal epithelial cells, suggesting a role for these gene products in host colonization. Semi-quantitative RT-PCR analysis of select clones showed that the genes were not unique to T. vaginalis and that these genes were also present in T. tenax, albeit at very low levels of expression. Conclusion: These results suggest that T. vaginalis and T. tenax have remarkable genetic identity and that T. vaginalis has higher levels of gene expression than T. tenax. The data may suggest that T. tenax could be a variant of T. vaginalis.

Springer - Publisher Connector

Public Library of Science (PLOS)

Neural immunoglobulin superfamily interaction networks

Author: Zinn Kai
Özkan Engin
Publication venue: Elsevier BV
Publication date: 01/08/2017

The immunoglobulin superfamily (IgSF) encompasses hundreds of cell surface proteins containing multiple immunoglobulin-like (Ig) domains. Among these are neural IgCAMs, which are cell adhesion molecules that mediate interactions between cells in the nervous system. IgCAMs in some vertebrate IgSF subfamilies bind to each other homophilically and heterophilically, forming small interaction networks. In Drosophila, a global ‘interactome’ screen identified two larger networks in which proteins in one IgSF subfamily selectively interact with proteins in a different subfamily. One of these networks, the ‘Dpr-ome’, includes 30 IgSF proteins, each of which is expressed in a unique subset of neurons. Recent evidence shows that one interacting protein pair within the Dpr-ome network is required for development of the brain and neuromuscular system.

Caltech Authors

Structural Disorder in Eukaryotes

Author: Pancsa Rita
Tompa Peter
Publication venue: Public Library of Science
Publication date: 01/01/2012

Based on early bioinformatic studies on a handful of species, the frequency of structural disorder of proteins is generally thought to be much higher in eukaryotes than in prokaryotes. To refine this view, we present here a comparative prediction study and analysis of 194 fully described eukaryotic proteomes and 87 reference prokaryotes for structural disorder. We found that structural disorder does distinguish eukaryotes from prokaryotes, but its frequency spans a very wide range in both superkingdoms, and the two ranges largely overlap. The number of disordered binding regions and the number of different Pfam domain types also help to distinguish eukaryotes from prokaryotes. Unexpectedly, the highest levels, and the highest variability, of predicted disorder are found in protists, i.e. single-celled eukaryotes, often surpassing more complex eukaryotic organisms, plants and animals. This trend contrasts with that of the number of domain types, which increases rather monotonically toward more complex organisms. The level of structural disorder appears to be strongly correlated with lifestyle, because some obligate intracellular parasites and endosymbionts have the lowest levels, whereas host-changing parasites have the highest level of predicted disorder. We conclude that protists have been the evolutionary hot-bed of experimentation with structural disorder, in a period when structural disorder was actively invented and the major functional classes of disordered proteins established.

CiteSeerX