4,589 research outputs found

    Structure and functional motifs of GCR1, the only plant protein with a GPCR fold?

    Whether GPCRs exist in plants is a fundamental biological question. Interest in deorphanizing new G protein-coupled receptors (GPCRs) arises because of their importance in signaling. Within plants, this is controversial, as genome analysis has identified 56 putative GPCRs, including GCR1, which is reportedly a remote homologue to class A, B and E GPCRs. Of these, GCR2 is not a GPCR; more recently it has been proposed that none are, not even GCR1. We have addressed this disparity between genome analysis and biological evidence through a structural bioinformatics study, involving fold recognition methods, from which only GCR1 emerges as a strong candidate. To further probe GCR1, we have developed a novel helix alignment method, which has been benchmarked against the class A – class B – class F GPCR alignments. In addition, we have presented a mutually consistent set of alignments of GCR1 homologues to class A, class B and class F GPCRs, and shown that GCR1 is closer to class A and/or class B GPCRs than class A, class B or class F GPCRs are to each other. To further probe GCR1, we have aligned transmembrane helix 3 of GCR1 to each of the 6 GPCR classes. Variability comparisons provide additional evidence that GCR1 homologues have the GPCR fold. From the alignments and a GCR1 comparative model we have identified motifs that are common to GCR1, class A, B and E GPCRs. We discuss the possibilities that emerge from this controversial evidence that GCR1 has a GPCR fold

    Data-mining the FlyAtlas online resource to identify core functional motifs across transporting epithelia

    Background: Comparative analysis of tissue-specific transcriptomes is a powerful technique to uncover tissue functions. Our FlyAtlas.org provides authoritative gene expression levels for multiple tissues of Drosophila melanogaster (1). Although the main use of such resources is single gene lookup, there is the potential for powerful meta-analysis to address questions that could not easily be framed otherwise. Here, we illustrate the power of data-mining of FlyAtlas data by comparing epithelial transcriptomes to identify a core set of highly expressed genes across the four major epithelial tissues (salivary glands, Malpighian tubules, midgut and hindgut) of both adults and larvae. Method: Parallel hypothesis-led and hypothesis-free approaches were adopted to identify core genes that underpin insect epithelial function. In the former, gene lists were created from transport processes identified in the literature, and their expression profiles mapped from the flyatlas.org online dataset. In the latter, gene enrichment lists were prepared for each epithelium, and genes (both transport related and unrelated) consistently enriched in transporting epithelia identified. Results: A key set of transport genes, comprising V-ATPases, cation exchangers, aquaporins, potassium and chloride channels, and carbonic anhydrase, was found to be highly enriched across the epithelial tissues, compared with the whole fly. Additionally, a further set of genes that had not been predicted to have epithelial roles were co-expressed with the core transporters, extending our view of what makes a transporting epithelium work. Further insights were obtained by studying the genes uniquely overexpressed in each epithelium; for example, the salivary gland expresses lipases, the midgut organic solute transporters, the tubules specialize for purine metabolism and the hindgut overexpresses still-unknown genes. Conclusion: Taken together, these data provide a unique insight into epithelial function in this key model insect, and a framework for comparison with other species. They also provide a methodology for function-led datamining of FlyAtlas.org and other multi-tissue expression datasets

    Development of a Novel Algorithm to Remove Spurious Edges from Biological Networks Through Functional Enrichment

    The field of systems biology has facilitated the modelling of large and complex biological networks. These networks, generated from prior knowledge contained in the corpus of medical and scientific literature or from experimental data, are being used to model differing macromolecule networks associated with distinct disease states. While these networks are vital in understanding disease pathology and possible treatment options, they are rife with spurious interactions. These interactions arise from the methods used to create such networks, where the ability to discriminate between direct and indirect relationships is a challenge. To combat these spurious interactions, an algorithm that leverages functional enrichment in biological networks was developed. Here, functional enrichment refers to two- or three-node functional motifs that are ubiquitous in biological networks. The algorithm removes edges from an existing network based on each edge’s involvement in functional motifs relative to every other edge’s involvement. In this work, the application of this algorithm was explored using real-world clinical disease networks. Furthermore, a software package was developed to identify an edge’s membership in functional motifs with respect to the network being explored. The tools developed in this work are the first to critically analyze an edge’s relationship to functional motifs in terms of network inclusion. Therefore, the principles outlined in this work can be employed in future works aimed at removing spurious edges. These principles will also produce higher quality biological networks for the understanding of disease pathology and the development of more effective treatment options

    Recognition of short functional motifs in protein sequences

    The main goal of this study was to develop a method for computational de novo prediction of short linear motifs (SLiMs) in protein sequences that would offer users advantages over existing solutions. The users are typically biological laboratory researchers who want to elucidate a protein function that is possibly mediated by a short motif. Such a function can be subcellular localization, secretion, post-translational modification or degradation of proteins. Conducting such studies only with experimental techniques is often associated with high costs and risks of uncertainty. Preliminary prediction of putative motifs with computational methods, which are fast and much less expensive, makes it possible to generate hypotheses and therefore to plan experiments in a more directed and efficient way. To meet this goal, I have developed HH-MOTiF, a web-based tool for de novo discovery of SLiMs in a set of protein sequences. While working on the project, I have also detected patterns in sequence properties of certain SLiMs that make their de novo prediction easier. As some of these patterns are not yet described in the literature, I am sharing them in this thesis. While evaluating and comparing motif prediction results, I have identified conceptual gaps in theoretical studies, as well as in existing practical solutions, for comparing two sets of positional data annotating the same set of biological sequences. To close these gaps and to be able to carry out in-depth performance analyses of HH-MOTiF in comparison to other predictors, I have developed a corresponding statistical method, SLALOM (for StatisticaL Analysis of Locus Overlap Method). It is currently available as a standalone command line tool

    Salmonella Pathogenesis and Processing of Secreted Effectors by Caspase-3

    The enteric pathogen Salmonella enterica serovar Typhimurium causes food poisoning resulting in gastroenteritis. The S. Typhimurium effector Salmonella invasion protein A (SipA) promotes gastroenteritis by functional motifs that trigger either mechanisms of inflammation or bacterial entry. During infection of intestinal epithelial cells, SipA was found to be responsible for the early activation of caspase-3, an enzyme that is required for SipA cleavage at a specific recognition motif that divided the protein into its two functional domains and activated SipA in a manner necessary for pathogenicity. Other caspase-3 cleavage sites identified in S. Typhimurium appeared to be restricted to secreted effector proteins, which indicates that this may be a general strategy used by this pathogen for processing of its secreted effectors

    Evolution and diversity of secretome genes in the apicomplexan parasite Theileria annulata

    BACKGROUND: Little is known about how apicomplexan parasites have evolved to infect different host species and cell types. Theileria annulata and Theileria parva invade and transform bovine leukocytes but each species favours a different host cell lineage. Parasite-encoded proteins secreted from the intracellular macroschizont stage within the leukocyte represent a critical interface between host and pathogen systems. Genome sequencing has revealed that several Theileria-specific gene families encoding secreted proteins are positively selected at the inter-species level, indicating diversification between the species. We extend this analysis to the intra-species level, focusing on allelic diversity of two major secretome families. These families represent a well-characterised group of genes implicated in control of the host cell phenotype and a gene family of unknown function. To gain further insight into their evolution and function, this study investigates whether representative genes of these two families are diversifying or constrained within the T. annulata population. RESULTS: Strong evidence is provided that the sub-telomerically encoded SVSP family and the host-nucleus targeted TashAT family have evolved under contrasting pressures within natural T. annulata populations. SVSP genes were found to possess atypical codon usage and be evolving neutrally, with high levels of nucleotide substitutions and multiple indels. No evidence of geographical sub-structuring of allelic sequences was found. In contrast, TashAT family genes, implicated in control of host cell gene expression, are strongly conserved at the protein level and geographically sub-structured allelic sequences were identified among Tunisian and Turkish isolates. Although different copy numbers of DNA binding motifs were identified in alleles of TashAT proteins, motif periodicity was strongly maintained, implying conserved functional activity of these sites. CONCLUSIONS: This analysis provides evidence that two distinct secretome gene families have evolved under contrasting selective pressures. The data support current hypotheses regarding the biological role of TashAT family proteins in the management of host cell phenotype that may have evolved to allow adaptation of T. annulata to a specific host cell lineage. We provide new evidence of extensive allelic diversity in representative members of the enigmatic SVSP gene family, which supports a putative role for the encoded products in subversion of the host immune response

    Finding functional motifs in protein sequences with deep learning and natural language models

    Recently, prediction of structural/functional motifs in protein sequences has taken advantage of powerful machine learning based approaches. Protein encoding now adopts protein language models, which surpass standard procedures. Different combinations of machine learning and encoding schemas are available for predicting different structural/functional motifs. Particularly interesting is the adoption of protein language models to encode proteins in addition to evolutionary information and physicochemical parameters. A thorough analysis of recent predictors developed for annotating transmembrane regions, sorting signals, lipidation and phosphorylation sites allows investigation of the state of the art, focusing on the relevance of protein language models for the different tasks. This highlights that more experimental data are necessary to exploit available powerful machine learning methods

    Sparse approaches for the exact distribution of patterns in long state sequences generated by a Markov source

    We present two novel approaches for the computation of the exact distribution of a pattern in a long sequence. Both approaches take into account the sparse structure of the problem and are two-part algorithms. The first approach relies on a partial recursion after a fast computation of the second largest eigenvalue of the transition matrix of a Markov chain embedding. The second approach uses fast Taylor expansions of an exact bivariate rational reconstruction of the distribution. We illustrate the interest of both approaches on a simple toy example and two biological applications: the transcription factors of the Human Chromosome 5 and the PROSITE signatures of functional motifs in proteins. On these examples our methods demonstrate their complementarity and their ability to extend the domain of feasibility for exact computations in pattern problems to a new level

    Functional Motifs in SIAMESE, a Plant Cyclin-Dependent Kinase Inhibitor

    SIAMESE (SIM) and SIAMESE-RELATED-PROTEIN1 (SMR1), the founding members of the SIM/SMRs gene family, suppress mitosis and onset of endoreplication in Arabidopsis trichome and sepal development, respectively, and hence have been suggested to be CDK inhibitors. In this study, I have investigated the exact role of SIM and SMRs and their evolutionarily conserved function throughout land plant evolution. Using split luciferase complementation (SLC), I have shown that both SIM and a distantly related SMR from the bryophyte Physcomitrella patens (pSMR1) interact with multiple types of Cyclin Dependent Kinases (CDKs). I have multiple lines of evidence that establish SIM and SMRs as CDK inhibitors and demonstrate that their function is evolutionarily conserved. Almost all SIAMESE-RELATED PROTEINS (SMRs) of Arabidopsis as well as an SMR from the bryophyte Physcomitrella patens complement the sim mutant phenotype strongly. Genetic studies of sim mutants in combination with cyclind and cdkb1 mutants also support the conclusion that SIM inhibits the activity of both CDKA;1 and CDKB1;1-containing complexes. In an in vitro kinase assay, SIM inhibits CDK kinase activity; moreover, the Physcomitrella SMR also inhibits the same set of CYC/CDK complexes as SIM. These results indicate that SIM and other SMRs inhibit multiple CDK complexes and share a molecular mechanism that is conserved among all land plants. Finally, we have investigated the functional role of conserved protein sequence motifs in SIM. Two motifs, termed Motif-1 and Motif-2, play important roles in SIM function. Surprisingly, a motif previously thought to be a putative cyclin-binding motif is not essential for function of SIM. We have also identified a putative CDK phosphorylation site in Motif-1, and two nuclear localization sequences that are essential for SIM function. The work described here gives new insights into the biochemical role of SIM in regulating the cell cycle. The conserved function of widely divergent SMRs indicates that this protein family plays important roles in all land plants. These studies will provide a foundation for future work on the biochemical functions of SIM in the cell cycle, as well as for understanding the roles of individual SMRs in plant growth and development