55,589 research outputs found

    Tracking repeats using significance and transitivity.

    Keywords: transitivity; extreme value distribution. Motivation: Internal repeats in coding sequences correspond to structural and functional units of proteins. Moreover, duplication of fragments of coding sequences is known to be a mechanism that facilitates evolution. Identification of repeats is crucial to shed light on the function and structure of proteins and to explain their evolutionary past. The task is difficult because many repeats have diverged beyond recognition during the course of evolution. Results: We introduce a new method, TRUST, for ab initio determination of internal repeats in proteins. It improves prediction quality compared to alternative state-of-the-art methods. The increased sensitivity and accuracy of the method are achieved by exploiting the concept of transitivity of alignments. Starting from significant local suboptimal alignments, the application of transitivity allows us to: 1) identify distant repeat homologues for which no alignments were found; 2) gain confidence about consistently well-aligned regions; and 3) recognize and reduce the contribution of nonhomologous repeats. This reassessment step enables us to derive a virtually noise-free profile representing a generalized repeat with high fidelity. We also obtained superior specificity by employing rigid statistical testing for self-sequence and profile-sequence alignments. Assessment was done using a database of repeat annotations based on structural superpositioning. The results show that TRUST is a useful and reliable tool for mining tandem and non-tandem repeats in protein sequence databases, able to predict multiple repeat types with varying intervening segments within a single sequence.
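    The transitivity idea in this abstract can be illustrated with a minimal sketch. It is not the TRUST procedure: it assumes alignments are given as plain pairs of residue positions and simply takes their transitive closure with a union-find structure, ignoring the significance testing and consistency weighting the abstract mentions. The function name `transitive_groups` and the input format are hypothetical.

```python
from collections import defaultdict


def transitive_groups(aligned_pairs):
    """Group positions that are linked, directly or indirectly, by pairwise alignments.

    aligned_pairs: iterable of (i, j) position pairs (hypothetical input format).
    Returns a sorted list of sorted position lists.
    """
    parent = {}

    def find(x):
        # Register unseen positions as their own root, then walk up with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in aligned_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return sorted(sorted(g) for g in groups.values())


# Position 3 aligns with 40 and 40 aligns with 77, so 3 and 77 end up in the
# same group even though no direct alignment between them was given.
pairs = [(3, 40), (40, 77), (5, 42)]
groups = transitive_groups(pairs)
```

    In this toy input, positions 3 and 77 are grouped without a direct alignment, which mirrors point 1) of the abstract. A plain closure like this would also propagate spurious links, which is presumably why the abstract pairs transitivity with a reassessment step.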

    Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis.

    Bovine anaplasmosis is caused by cattle infection with the tick-borne bacterium Anaplasma marginale. The major surface protein 1a (MSP1a) has been used as a genetic marker for identifying A. marginale strains based on N-terminal tandem repeats and a 5'-UTR microsatellite located in the msp1a gene. The MSP1a tandem repeats contain immune-relevant elements and functional domains that bind to bovine erythrocytes and tick cells, thus providing information about the evolution of host-pathogen and vector-pathogen interactions. Here we propose one nomenclature for A. marginale strain classification based on MSP1a. All tandem repeats among A. marginale strains were classified and the amino acid variability/frequency in each position was determined. The sequence variation at immunodominant B-cell epitopes was determined and the secondary (2D) structure of the tandem repeats was modeled. A total of 224 different strains of A. marginale were classified, showing 11 genotypes based on the 5'-UTR microsatellite and 193 different tandem repeats with high amino acid variability per position. Our results showed phylogenetic correlation between MSP1a sequence, secondary structure, B-cell epitope composition and tick transmissibility of A. marginale strains. The analysis of MSP1a sequences provides relevant information about the biology of A. marginale to design vaccines with a cross-protective capacity based on MSP1a B-cell epitopes.

    The physicist's guide to one of biotechnology's hottest new topics: CRISPR-Cas

    Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated proteins (Cas) constitute a multi-functional, constantly evolving immune system in bacterial and archaeal cells. A heritable, molecular memory is generated of phage, plasmids, or other mobile genetic elements that attempt to attack the cell. This memory is used to recognize and interfere with subsequent invasions from the same genetic elements. This versatile prokaryotic tool has also been used to advance applications in biotechnology. Here we review a large body of CRISPR-Cas research to explore themes of evolution and selection, population dynamics, horizontal gene transfer, specific and cross-reactive interactions, cost and regulation, non-immunological CRISPR functions that boost host cell robustness, as well as applicable mechanisms for efficient and specific genetic engineering. We offer future directions that can be addressed by the physics community. Physical understanding of the CRISPR-Cas system will advance uses in biotechnology, such as developing cell lines and animal models, cell labeling and information storage, combatting antibiotic resistance, and human therapeutics. Comment: 75 pages, 15 figures, Physical Biology (2018)

    A new census of protein tandem repeats and their relationship with intrinsic disorder

    Protein tandem repeats (TRs) are often associated with immunity-related functions and diseases. Since the last census of protein TRs in 1999, the number of curated proteins has increased more than seven-fold and new TR prediction methods have been published. TRs appear to be enriched with intrinsic disorder and vice versa. The significance and the biological reasons for this association are unknown. Here, we characterize protein TRs across all kingdoms of life and their overlap with intrinsic disorder in unprecedented detail. Using state-of-the-art prediction methods, we estimate that 50.9% of proteins contain at least one TR, often located at the sequence flanks. A positive linear correlation between the proportion of TRs and the protein length was observed universally, with Eukaryotes in general having more TRs, but when the difference in length is taken into account the difference is quite small. TRs were enriched with disorder-promoting amino acids and were inside intrinsically disordered regions. Many such TRs were homorepeats. Our results support that TRs mostly originate by duplication and are involved in essential functions such as transcription processes, structural organization, electron transport and iron-binding. In viruses, TRs are found in proteins essential for virulence.

    Evolution of genes and repeats in the Nimrod superfamily

    The recently identified Nimrod superfamily is characterized by the presence of a special type of EGF repeat, the NIM repeat, located right after a typical CCXGY/W amino acid motif. On the basis of structural features, nimrod genes can be divided into three types. The proteins encoded by Draper-type genes have an EMI domain at the N-terminal part and only one copy of the NIM motif, followed by a variable number of EGF-like repeats. The products of Nimrod B-type and Nimrod C-type genes (including the eater gene) have different kinds of N-terminal domains, and lack EGF-like repeats but contain a variable number of NIM repeats. Draper and Nimrod C-type (but not Nimrod B-type) proteins carry a transmembrane domain. Several members of the superfamily were claimed to function as receptors in phagocytosis and/or binding of bacteria, which indicates an important role in cellular immunity and the elimination of apoptotic cells. In this paper, the evolution of the Nimrod superfamily is studied with various methods at the level of genes and repeats. A hypothesis is presented in which the NIM repeat, along with the EMI domain, emerged by structural reorganizations at the end of an EGF-like repeat chain, suggesting a mechanism for the formation of novel types of repeats. The analyses revealed diverse evolutionary patterns in the sequences containing multiple NIM repeats. Although the Nimrod B and Nimrod C proteins show characteristics of independent evolution, many internal NIM repeats in Eater sequences seem to have undergone concerted evolution. An analysis of the nimrod genes has been performed using phylogenetic and other methods, and an evolutionary scenario of the origin and diversification of the Nimrod superfamily is proposed. Our study presents an intriguing example of how the evolution of multigene families may contribute to the complexity of the innate immune response.

    Structure of the saxiphilin:saxitoxin (STX) complex reveals a convergent molecular recognition strategy for paralytic toxins.

    Dinoflagellates and cyanobacteria produce saxitoxin (STX), a lethal bis-guanidinium neurotoxin causing paralytic shellfish poisoning. A number of metazoans have soluble STX-binding proteins that may prevent STX intoxication. However, their STX molecular recognition mechanisms remain unknown. Here, we present structures of saxiphilin (Sxph), a bullfrog high-affinity STX-binding protein, alone and bound to STX. The structures reveal a novel high-affinity STX-binding site built from a "proto-pocket" on a transferrin scaffold that also bears thyroglobulin domain protease inhibitor repeats. Comparison of Sxph and voltage-gated sodium channel STX-binding sites reveals a convergent toxin recognition strategy comprising a largely rigid binding site where acidic side chains and a cation-π interaction engage STX. These studies reveal molecular rules for STX recognition, outline how a toxin-binding site can be built on a naïve scaffold, and open a path to developing protein sensors for environmental STX monitoring and new biologics for STX intoxication mitigation.

    Structures of Phytophthora RXLR Effector Proteins: a conserved but adaptable fold underpins functional diversity

    Phytopathogens deliver effector proteins inside host plant cells to promote infection. These proteins can also be sensed by the plant immune system, leading to restriction of pathogen growth. Effector genes can display signatures of positive selection and rapid evolution, presumably a consequence of their co-evolutionary arms race with plants. The molecular mechanisms underlying how effectors evolve to gain new virulence functions and/or evade the plant immune system are poorly understood. Here, we report the crystal structures of the effector domains from two oomycete RXLR proteins, Phytophthora capsici AVR3a11 and Phytophthora infestans PexRD2. Despite sharin

    Telomeres in Evolution and Development from Biosemiotic Perspective

    Telomeres mark natural chromosome ends as distinct from broken DNA through differences in their "molecular syntax" (M. Eigen), which determines the functions of reverse transcriptase and its integrated RNA template, telomerase. Although telomeres play a crucial role in the linear chromosome organization of eukaryotic cells, their molecular syntax descended from an ancient retroviral competence. This is an indicator of the early retroviral colonization of large double-stranded DNA viruses, which are putative ancestors of the eukaryotic nucleus. This talk will demonstrate certain advantages of the biosemiotic approach to our evolutionary understanding of telomeres: a focus on genetic/genomic structures as language-like text that follows combinatorial (syntactic), context-sensitive (pragmatic) and content-specific (semantic) semiotic rules. From the biosemiotic perspective, genetic/genomic organization is no longer seen as an object of randomly derived alterations (mutations) but as functional innovation coherent with the broad variety of natural genome editing competences of viruses.

    Identification of the het-r vegetative incompatibility gene of Podospora anserina as a member of the fast evolving HNWD gene family

    In fungi, vegetative incompatibility is a conspecific non-self recognition mechanism that restricts formation of viable heterokaryons when incompatible alleles of specific het loci interact. In Podospora anserina, three non-allelic incompatibility systems have been genetically defined, involving interactions between het-c and het-d, het-c and het-e, and het-r and het-v. het-d and het-e are paralogues belonging to the HNWD gene family that encode proteins of the STAND class. HET-D and HET-E proteins comprise an N-terminal HET effector domain, a central GTP-binding site and a C-terminal WD repeat domain composed of tandem repeats of highly conserved WD40 repeat units that define the specificity of alleles during incompatibility. The WD40 repeat units of the members of this HNWD family are undergoing concerted evolution. By combining genetic analysis and gain-of-function experiments, we demonstrate that an additional member of this family, HNWD2, corresponds to the het-r non-allelic incompatibility gene. As for het-d and het-e, allele specificity at the het-r locus is determined by the WD repeat domain. Natural isolates show allelic variation for het-