124 research outputs found

    Florigen and its homologs of FT/CETS/PEBP/RKIP/YbhB family may be the enzymes of small molecule metabolism: review of the evidence.

    BACKGROUND: Flowering signals are sensed in plant leaves and transmitted to the shoot apical meristems, where the formation of flowers is initiated. Searches for a diffusible hormone-like signaling entity ("florigen") went on for many decades, until a product of plant gene FT was identified as the key component of florigen in the 1990s, based on the analysis of mutants, genetic complementation evidence, and protein and RNA localization studies. Sequence homologs of FT protein are found throughout prokaryotes and eukaryotes; some eukaryotic family members appear to bind phospholipids or interact with components of signal transduction cascades. Most FT homologs are known to share a constellation of five charged residues, three of which, i.e., two histidines and an aspartic acid, are located at the rim of a well-defined cavity on the protein surface. RESULTS: We studied molecular features of the FT homologs in prokaryotes and analyzed their genome context, to find tentative evidence connecting the bacterial FT homologs with small molecule metabolism, often involving substrates that contain sugar or ribonucleoside moieties. We argue that the unifying feature of this protein family, i.e., a set of charged residues conserved at the sequence and structural levels, is more likely to be an enzymatic active center than a catalytically inert ligand-binding site. CONCLUSIONS: We propose that most FT-related proteins are enzymes operating on small diffusible molecules. Those metabolites may constitute an overlooked essential ingredient of the florigen signal.

    Measuring gene expression divergence: the distance to keep

    BACKGROUND: Gene expression divergence is a phenotypic trait reflecting evolution of gene regulation and characterizing dissimilarity between species and between cells and tissues within the same species. Several distance measures, such as Euclidean and correlation-based distances have been proposed for measuring expression divergence. RESULTS: We show that different distance measures identify different trends in gene expression patterns. When comparing orthologous genes in eight rat and human tissues, the Euclidean distance identified genes uniformly expressed in all tissues near the expression background as genes with the most conserved expression pattern. In contrast, correlation-based distance and generalized-average distance identified genes with concerted changes among homologous tissues as those most conserved. On the other hand, correlation-based distance, Euclidean distance and generalized-average distance highlight quite well the relatively high similarity of gene expression patterns in homologous tissues between species, compared to non-homologous tissues within species. CONCLUSIONS: Different trends exist in the high-dimensional numeric data, and to highlight a particular trend an appropriate distance measure needs to be chosen. The choice of the distance measure for measuring expression divergence can be dictated by the expression patterns that are of interest in a particular study. REVIEWERS: This article was reviewed by Mikhail Gelfand, Eugene Koonin and Subhajyoti De (nominated by Sarah Teichmann).
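The contrast between Euclidean and correlation-based distance described in this abstract can be illustrated with a minimal sketch. The expression values below are invented toy numbers for eight tissues, not data from the study, and the function names are hypothetical. The correlation-based distance is taken here as 1 minus the Pearson correlation, which is one common definition and an assumption on our part; the generalized-average distance is not defined in the abstract and is left out.

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def correlation_distance(x, y):
    """1 - Pearson r (assumed definition); 0 means perfectly concerted profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1 - cov / (norm_x * norm_y)

# Toy gene 1: nearly flat, low expression in both species (invented values).
flat_rat = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 1.0]
flat_human = [1.0, 0.9, 1.1, 1.0, 0.9, 1.1, 1.0, 1.0]

# Toy gene 2: large, concerted tissue-to-tissue changes (invented values).
concerted_rat = [1, 8, 2, 9, 1, 7, 3, 8]
concerted_human = [2, 10, 3, 11, 2, 9, 4, 10]

for label, rat, human in [("flat", flat_rat, flat_human),
                          ("concerted", concerted_rat, concerted_human)]:
    print(label,
          round(euclidean_distance(rat, human), 3),
          round(correlation_distance(rat, human), 3))
```

In this toy case the flat gene looks most conserved by Euclidean distance, while the concerted gene looks most conserved by correlation distance, which mirrors the trend reported in the abstract.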

    A topological algorithm for identification of structural domains of proteins

    BACKGROUND: Identification of the structural domains of proteins is important for our understanding of the organizational principles and mechanisms of protein folding, and for insights into protein function and evolution. Algorithmic methods of dissecting proteins of known structure into domains developed so far are based on an examination of multiple geometrical, physical and topological features. Successful as many of these approaches are, they employ a lot of heuristics, and it is not clear whether they illuminate any deep underlying principles of protein domain organization. Other well-performing domain dissection methods rely on comparative sequence analysis. These methods are applicable to sequences with known and unknown structure alike, and their success highlights a fundamental principle of protein modularity, but this does not directly improve our understanding of protein spatial structure. RESULTS: We present a novel graph-theoretical algorithm for the identification of domains in proteins with known three-dimensional structure. We represent the protein structure as an undirected, unweighted and unlabeled graph whose nodes correspond to the secondary structure elements and edges represent physical proximity of at least one pair of alpha carbon atoms from two elements. Domains are identified as constrained partitions of the graph, corresponding to sets of vertices obtained by the maximization of the cycle distributions found in the graph. The decision to accept or reject a tentative cut position is based on a specific classifier. When a partition is found, the algorithm is applied iteratively to each of the resulting subgraphs, and it terminates automatically when partitions are no longer accepted. The distribution of cycles is the only type of information on which the decision about protein dissection is based. Despite the barebone simplicity of the approach, our algorithm approaches the best heuristic algorithms in accuracy. CONCLUSION: Our graph-theoretical algorithm uses only topological information present in the protein structure itself to find the domains and does not rely on any geometrical or physical information about the protein molecule. Perhaps unexpectedly, these drastic constraints on resources, which result in a seemingly approximate description of protein structures and leave only a handful of parameters available for analysis, do not lead to any significant deterioration of algorithm accuracy. It appears that protein structures can be rigorously treated as topological rather than geometrical objects and that the majority of information about protein domains can be inferred from the coarse-grained measure of pairwise proximity between elements of secondary structure.
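The graph representation described in this abstract (nodes are secondary structure elements, edges mark proximity of at least one pair of alpha carbons) can be sketched as follows. This covers only the graph construction step; the cycle-based partitioning and the classifier are not specified in the abstract and are not attempted. The function name, the toy coordinates and the 7.0 Å cutoff are all assumptions made for illustration.

```python
import math
from itertools import combinations

def build_sse_graph(sse_coords, cutoff=7.0):
    """Build an undirected, unweighted graph over secondary structure elements.

    sse_coords maps an element name to a list of (x, y, z) alpha-carbon
    coordinates. Two elements are linked if any pair of their alpha carbons
    lies within `cutoff` (a hypothetical threshold, not from the paper).
    """
    graph = {name: set() for name in sse_coords}
    for a, b in combinations(sse_coords, 2):
        close = any(math.dist(p, q) <= cutoff
                    for p in sse_coords[a] for q in sse_coords[b])
        if close:
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Invented toy coordinates: H1 and E1 are near each other, H2 is far away.
toy_sses = {
    "H1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "E1": [(5.0, 0.0, 0.0), (6.5, 0.0, 0.0)],
    "H2": [(30.0, 0.0, 0.0), (31.5, 0.0, 0.0)],
}

graph = build_sse_graph(toy_sses)
print(graph)
```

An adjacency-set dictionary keeps the graph unweighted and unlabeled beyond node identity, which matches the minimal information the abstract says the method relies on.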

    Similarity searches in genome-wide numerical data sets

    We present psi-square, a program for searching the space of gene vectors. The program starts with a gene vector, i.e., the set of measurements associated with a gene, and finds similar vectors, derives a probabilistic model of these vectors, then repeats the search using this model as a query, and continues to update the model and search again, until convergence. When applied to three different pathway-discovery problems, psi-square was generally more sensitive and sometimes more specific than the ad hoc methods developed for solving each of these problems before. REVIEWERS: This article was reviewed by King Jordan, Mikhail Gelfand, Nicolas Galtier and Sarah Teichmann.
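The search, model, search-again loop described above can be sketched in a few lines. This is not the psi-square implementation: the abstract does not describe its probabilistic model, so the sketch swaps in a plain centroid as the "model" and a fixed Euclidean radius as the similarity criterion. The function name, the radius and the toy gene vectors are hypothetical.

```python
import math

def iterative_search(vectors, query_name, radius, max_rounds=50):
    """Iterate: find vectors near the model, rebuild the model, repeat.

    The centroid stands in for the probabilistic model (an assumption);
    the loop stops when the hit set no longer changes.
    """
    model = vectors[query_name]
    members = {query_name}
    for _ in range(max_rounds):
        hits = {name for name, v in vectors.items()
                if math.dist(v, model) <= radius}
        if not hits or hits == members:
            break  # converged, or the model drifted away from all vectors
        members = hits
        # Update the model: coordinate-wise mean of the current hits.
        model = tuple(sum(col) / len(hits)
                      for col in zip(*(vectors[n] for n in hits)))
    return sorted(members)

# Invented toy gene vectors (two measurements per gene).
vectors = {
    "g1": (0.0, 0.0),
    "g2": (1.0, 0.0),
    "g3": (2.0, 0.0),
    "g4": (10.0, 10.0),
}

print(iterative_search(vectors, "g1", radius=1.6))
```

In this toy run, g3 is farther than the radius from the original query g1, yet it is recovered once the model has been updated with g2. That gain in sensitivity from iterating is the behaviour the abstract describes, here only in caricature.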

    Evolutionary history of bacteriophages with double-stranded DNA genomes

    BACKGROUND: Reconstruction of evolutionary history of bacteriophages is a difficult problem because of fast sequence drift and lack of omnipresent genes in phage genomes. Moreover, losses and recombinational exchanges of genes are so pervasive in phages that the plausibility of phylogenetic inference in phage kingdom has been questioned. RESULTS: We compiled the profiles of presence and absence of 803 orthologous genes in 158 completely sequenced phages with double-stranded DNA genomes and used these gene content vectors to infer the evolutionary history of phages. There were 18 well-supported clades, mostly corresponding to accepted genera, but in some cases appearing to define new taxonomic groups. Conflicts between this phylogeny and trees constructed from sequence alignments of phage proteins were exploited to infer 294 specific acts of intergenome gene transfer. CONCLUSION: A notoriously reticulate evolutionary history of fast-evolving phages can be reconstructed in considerable detail by quantitative comparative genomics. OPEN PEER REVIEW: This article was reviewed by Eugene Koonin, Nicholas Galtier and Martijn Huynen.
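The gene content vectors mentioned in this abstract can be illustrated with a small sketch that turns gene-family membership into presence/absence profiles and compares them. The abstract does not say which distance or tree-building method the authors used; Jaccard distance is swapped in here purely as an assumption, and the phage and family names are invented.

```python
def gene_content_vector(genome_families, all_families):
    """Presence (1) / absence (0) of each gene family in one genome."""
    return [1 if family in genome_families else 0 for family in all_families]

def jaccard_distance(u, v):
    """1 - shared / union of present families (assumed distance measure)."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    union = sum(1 for a, b in zip(u, v) if a or b)
    return 1 - shared / union if union else 0.0

# Invented toy genomes: sets of orthologous gene family identifiers.
genomes = {
    "phageA": {"fam1", "fam2", "fam3"},
    "phageB": {"fam1", "fam2", "fam4"},
    "phageC": {"fam5"},
}

all_families = sorted(set().union(*genomes.values()))
profiles = {name: gene_content_vector(fams, all_families)
            for name, fams in genomes.items()}

print(profiles["phageA"])  # [1, 1, 1, 0, 0]
print(jaccard_distance(profiles["phageA"], profiles["phageB"]))  # 0.5
print(jaccard_distance(profiles["phageA"], profiles["phageC"]))  # 1.0
```

A pairwise distance matrix built this way could be passed to any standard tree-building procedure; which one is appropriate for phage data is a question the abstract itself does not settle.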

    Is Protein Folding a Thermodynamically Unfavorable, Active, Energy-Dependent Process?

    The prevailing current view of protein folding is the thermodynamic hypothesis, under which the native folded conformation of a protein corresponds to the global minimum of Gibbs free energy G. We question this concept and show that the empirical evidence behind the thermodynamic hypothesis of folding is far from strong. Furthermore, physical theory-based approaches to the prediction of protein folds and their folding pathways so far have invariably failed except for some very small proteins, despite decades of intensive theory development and the enormous increase of computer power. The recent spectacular successes in protein structure prediction are due to evolutionary modeling of amino acid sequence substitutions enhanced by deep learning methods, but even these breakthroughs provide no information on the protein folding mechanisms and pathways. We discuss an alternative view of protein folding, under which the native state of most proteins does not occupy the global free energy minimum, but rather, a local minimum on a fluctuating free energy landscape. We further argue that ΔG of folding is likely to be positive for the majority of proteins, which therefore fold into their native conformations only through interactions with the energy-dependent molecular machinery of living cells, in particular, the translation system and chaperones. Accordingly, protein folding should be modeled as it occurs in vivo, that is, as a non-equilibrium, active, energy-dependent process.

    The Origin and Evolution of G Protein-Coupled Receptor Kinases

    G protein-coupled receptor (GPCR) kinases (GRKs) play a key role in homologous desensitization of GPCRs. GRKs phosphorylate activated receptors, promoting high affinity binding of arrestins, which precludes G protein coupling. Direct binding to active GPCRs activates GRKs, so that they selectively phosphorylate only the activated form of the receptor regardless of the accessibility of the substrate peptides within it and their Ser/Thr-containing sequence. Mammalian GRKs were classified into three main lineages, but earlier GRK evolution has not been studied. Here we show that GRKs emerged at the early stages of eukaryotic evolution via an insertion of a kinase similar to ribosomal protein S6 kinase into a loop in the RGS domain. GRKs in Metazoa fall into two clades, one including GRK2 and GRK3, and the other consisting of all remaining GRKs, split into the GRK1-GRK7 lineage and the GRK4-GRK5-GRK6 lineage in vertebrates. One representative of each of the two ancient clades is found as early as the placozoan Trichoplax adhaerens. Several protists, two oomycetes and unicellular brown algae have one GRK-like protein, suggesting that the insertion of a kinase domain into the RGS domain preceded the origin of Metazoa. The two GRK families acquired distinct structural units in the N- and C-termini responsible for membrane recruitment and receptor association. Thus, GRKs apparently emerged before animals and rapidly expanded in true Metazoa, most likely due to the need for rapid signalling adjustments in fast-moving animals.

    The evolution of Runx genes I. A comparative study of sequences from phylogenetically diverse model organisms

    BACKGROUND: Runx genes encode proteins defined by the highly conserved Runt DNA-binding domain. Studies of Runx genes and proteins in model organisms indicate that they are key transcriptional regulators of animal development. However, little is known about Runx gene evolution. RESULTS: A phylogenetically broad sampling of publicly available Runx gene sequences was collected. In addition to the published sequences from mouse, sea urchin, Drosophila melanogaster and Caenorhabditis elegans, we collected several previously uncharacterised Runx sequences from public genome sequence databases. Among deuterostomes, mouse and pufferfish each contain three Runx genes, while the tunicate Ciona intestinalis and the sea urchin Strongylocentrotus purpuratus were each found to have only one Runx gene. Among protostomes, C. elegans has a single Runx gene, while Anopheles gambiae has three and D. melanogaster has four, including two genes that have not been previously described. Comparative sequence analysis reveals two highly conserved introns, one within and one just downstream of the Runt domain. All vertebrate Runx genes utilize two alternative promoters. CONCLUSIONS: In the current public sequence database, the Runt domain is found only in bilaterians, suggesting that it may be a metazoan invention. Bilaterians appear to ancestrally contain a single Runx gene, suggesting that the multiple Runx genes in vertebrates and insects arose by independent duplication events within those respective lineages. At least two introns were present in the primordial bilaterian Runx gene. Alternative promoter usage arose prior to the duplication events that gave rise to three Runx genes in vertebrates.