41 research outputs found

    CODA: Accurate Detection of Functional Associations between Proteins in Eukaryotic Genomes Using Domain Fusion

    Background: In order to understand how biological systems function it is necessary to determine the interactions and associations between proteins. Gene fusion prediction is one approach to detection of such functional relationships. Its use is, however, known to be problematic in higher eukaryotic genomes due to the presence of large homologous domain families. Here we introduce CODA (Co-Occurrence of Domains Analysis), a method to predict functional associations based on the gene fusion idiom. Methodology/Principal Findings: We apply a novel scoring scheme which takes account of the genome-specific size of homologous domain families involved in fusion to improve accuracy in predicting functional associations. We show that CODA is able to accurately predict functional similarities in human in comparison with state-of-the-art methods and show that different methods can be complementary. CODA is used to produce evidence that a currently uncharacterised human protein may be involved in pathways related to depression and that another is involved in DNA replication. Conclusions/Significance: The relative performance of different gene fusion methodologies has not previously been explored. We find that they are largely complementary, with different methods being more or less appropriate in different genomes. Our method is the only one currently available for download and can be run on an arbitrary dataset by the user. The CODA software and datasets are freely available from ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/v6.1.0/CODA/. Predictions are also available via web services from http://funcnet.eu/

    Fusion and Fission of Genes Define a Metric between Fungal Genomes

    Gene fusion and fission events are key mechanisms in the evolution of gene architecture, whose effects are visible in protein architecture when they occur in coding sequences. Until now, the detection of fusion and fission events has been performed at the level of protein sequences with a post facto removal of supernumerary links due to paralogy, and often did not include looking for events defined only in single genomes. We propose a method for the detection of these events, defined on groups of paralogs to compensate for the gene redundancy of eukaryotic genomes, and apply it to the proteomes of 12 fungal species. We collected an inventory of 1,680 elementary fusion and fission events. In half the cases, both composite and element genes are found in the same species. Per-species counts of events correlate with the species genome size, suggesting a random mechanism of occurrence. Some biological functions of the genes involved in fusion and fission events are slightly over- or under-represented. As already noted in previous studies, the genes involved in an event tend to belong to the same functional category. We inferred the position of each event in the evolution tree of the 12 fungal species. The event localization counts for all the segments of the tree provide a metric that depicts the “recombinational” phylogeny among fungi. A possible interpretation of this metric as distance in adaptation space is proposed.

    Just how versatile are domains?

    Background: Creating new protein domain arrangements is a frequent mechanism of evolutionary innovation. While some domains always form the same combinations, others form many different arrangements. This ability, which is often referred to as versatility or promiscuity of domains, fits a random evolutionary model in which a domain's promiscuity is based on its relative frequency. Results: We show that there is a clear relationship across genomes between the promiscuity of a given domain and its frequency. However, the strength of this relationship differs for different domains. We thus redefine domain promiscuity by defining a new index, DVI ("domain versatility index"), which eliminates the effect of domain frequency. We explore links between a domain's versatility, when unlinked from abundance, and its biological properties. Conclusion: Our results indicate that domains occurring as single domain proteins and domains appearing frequently at protein termini have a higher DVI. This is consistent with previous observations that the evolution of domain re-arrangements is primarily driven by fusion of pre-existing arrangements and single domains as well as loss of domains at protein termini. Furthermore, we studied the link between domain age, defined as the first appearance of a domain in the species tree, and the DVI. Contrary to previous studies based on domain promiscuity, it seems as if the DVI is age independent. Finally, we find that contrary to previously reported findings, versatility is lower in Eukaryotes. In summary, our measure of domain versatility indicates that a random attachment process is sufficient to explain the observed distribution of domain arrangements and that several views on domain promiscuity need to be revised.

    Evolution of protein domain architectures

    This chapter reviews current research on how protein domain architectures evolve. We begin by summarizing work on the phylogenetic distribution of proteins, as this will directly impact which domain architectures can be formed in different species. Studies relating domain family size to occurrence have shown that they generally follow power law distributions, both within genomes and larger evolutionary groups. These findings were subsequently extended to multi-domain architectures. Genome evolution models that have been suggested to explain the shape of these distributions are reviewed, as well as evidence for selective pressure to expand certain domain families more than others. Each domain has an intrinsic combinatorial propensity, and the effects of this have been studied using measures of domain versatility or promiscuity. Next, we study the principles of protein domain architecture evolution and how these have been inferred from distributions of extant domain arrangements. Following this, we review inferences of ancestral domain architecture and the conclusions concerning domain architecture evolution mechanisms that can be drawn from these. Finally, we examine whether all known cases of a given domain architecture can be assumed to have a single common origin (monophyly) or have evolved convergently (polyphyly). We end with a discussion of some available tools for computational analysis or exploitation of protein domain architectures and their evolution.

    A Phenotypic Profile of the Candida albicans Regulatory Network

    Candida albicans is a normal resident of the gastrointestinal tract and also the most prevalent fungal pathogen of humans. It last shared a common ancestor with the model yeast Saccharomyces cerevisiae over 300 million years ago. We describe a collection of 143 genetically matched strains of C. albicans, each of which has been deleted for a specific transcriptional regulator. This collection represents a large fraction of the non-essential transcription circuitry. A phenotypic profile for each mutant was developed using a screen of 55 growth conditions. The results identify the biological roles of many individual transcriptional regulators; for many, this work represents the first description of their functions. For example, a quarter of the strains showed altered colony formation, a phenotype reflecting transitions among yeast, pseudohyphal, and hyphal cell forms. These transitions, which have been closely linked to pathogenesis, have been extensively studied, yet our work nearly doubles the number of transcriptional regulators known to influence them. As a second example, nearly a quarter of the knockout strains affected sensitivity to commonly used antifungal drugs; although a few transcriptional regulators have previously been implicated in susceptibility to these drugs, our work indicates many additional mechanisms of sensitivity and resistance. Finally, our results inform how transcriptional networks evolve. Comparison with the existing S. cerevisiae data (supplemented by additional S. cerevisiae experiments reported here) allows the first systematic analysis of phenotypic conservation by orthologous transcriptional regulators over a large evolutionary distance. We find that, despite the many specific wiring changes documented between these species, the general phenotypes of orthologous transcriptional regulator knockouts are largely conserved. These observations support the idea that many wiring changes affect the detailed architecture of the circuit, but not its overall output.

    Molecular evolution of the LNX gene family

    Background: LNX (Ligand of Numb Protein-X) proteins typically contain an amino-terminal RING domain adjacent to either two or four PDZ domains - a domain architecture that is unique to the LNX family. LNX proteins function as E3 ubiquitin ligases and their domain organisation suggests that their ubiquitin ligase activity may be targeted to specific substrates or subcellular locations by PDZ domain-mediated interactions. Indeed, numerous interaction partners for LNX proteins have been identified, but the in vivo functions of most family members remain largely unclear. Results: To gain insights into their function we examined the phylogenetic origins and evolution of the LNX gene family. We find that a LNX1/LNX2-like gene arose in an early metazoan lineage by gene duplication and fusion events that combined a RING domain with four PDZ domains. These PDZ domains are closely related to the four carboxy-terminal domains from multiple PDZ domain containing protein-1 (MUPP1). Duplication of the LNX1/LNX2-like gene and subsequent loss of PDZ domains appears to have generated a gene encoding a LNX3/LNX4-like protein, with just two PDZ domains. This protein has novel carboxy-terminal sequences that include a potential modular LNX3 homology domain. The two ancestral LNX genes are present in some, but not all, invertebrate lineages. They were, however, maintained in the vertebrate lineage, with further duplication events giving rise to five LNX family members in most mammals. In addition, we identify novel interactions of LNX1 and LNX2 with three known MUPP1 ligands using yeast two-hybrid assays. This demonstrates conservation of binding specificity between LNX and MUPP1 PDZ domains. Conclusions: The LNX gene family has an early metazoan origin with a LNX1/LNX2-like protein likely giving rise to a LNX3/LNX4-like protein through the loss of PDZ domains. The absence of LNX orthologs in some lineages indicates that LNX proteins are not essential in invertebrates. In contrast, the maintenance of both ancestral LNX genes in the vertebrate lineage suggests the acquisition of essential vertebrate specific functions. The revelation that the LNX PDZ domains are phylogenetically related to domains in MUPP1, and have common binding specificities, suggests that LNX and MUPP1 may have similarities in their cellular functions.