670 research outputs found

    Ab initio detection of fuzzy amino acid tandem repeats in protein sequences

    Background Tandem repetitions within protein amino acid sequences often correspond to regular secondary structures and form multi-repeat 3D assemblies of varied size and function. Developing internal repetitions is one of the evolutionary mechanisms that proteins employ to adapt their structure and function under evolutionary pressure. While there is keen interest in understanding such phenomena, detection of repeating structures based only on sequence analysis is considered an arduous task, since structure and function are often preserved even under considerable sequence divergence (fuzzy tandem repeats). Results In this paper we present PTRStalker, a new algorithm for ab initio detection of fuzzy tandem repeats in protein amino acid sequences. In the reported results we show that by feeding PTRStalker with amino acid sequences from the UniProtKB/Swiss-Prot database we detect novel tandemly repeated structures not captured by other state-of-the-art tools. Experiments with membrane proteins indicate that PTRStalker can detect global symmetries in the primary structure which are then reflected in the tertiary structure. Conclusions PTRStalker is able to detect fuzzy tandem repeating structures in protein sequences, with performance beyond the current state of the art. Such a tool may be a valuable support to investigating protein structural properties when tertiary X-ray data is not available.

    Tandem Repeats in Proteins: Prediction Algorithms and Biological Role

    Tandem repetition in protein sequence and structure is a fascinating subject of research which has been a focus of study since the late 1990s. In this survey, we give an overview of the multi-faceted aspects of research on protein tandem repeats (PTR for short), including prediction algorithms, databases, early classification efforts, mechanisms of PTR formation and evolution, and synthetic PTR design. We also touch on the rather open issue of the relationship between PTR and flexibility (or disorder) in proteins. Detection of PTR either from protein sequence or structure data is challenging due to the inherently high (biological) signal-to-noise ratio that is a key feature of this problem. As early in silico analytic tools have been key enablers for starting this field of study, we expect that current and future algorithmic and statistical breakthroughs will have a high impact on the investigations of the biological role of PTR.

    Tandem repeats discovery service (TReaDS) applied to finding novel cis-acting factors in repeat expansion diseases

    Background Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have also been extensively studied for their association with the class of repeat expansion diseases (mostly affecting the nervous system). Comparative studies on the output of different tools for finding tandem repeats highlighted significant differences among the sets of detected tandem repeats, while many authors pointed out how critical the right choice of parameters is. Results In this paper we present TReaDS (Tandem Repeats Discovery Service), a tandem repeat meta search engine. TReaDS forwards user requests to several state-of-the-art tools for finding tandem repeats and merges their outcome into a single report, providing a global, synthetic, and comparative view of the results. In particular, TReaDS allows the user to (i) simultaneously run different algorithms on the same data set, (ii) choose for each algorithm a different setting of parameters, and (iii) obtain a report that can be downloaded for further, off-line, investigations. We used TReaDS to investigate sequences associated with repeat expansion diseases. Conclusions By using the tool TReaDS we discover that, for 27 repeat expansion diseases out of a currently known set of 29, long fuzzy tandem repeats are covering the expansion loci. Tests with control sets confirm the specificity of this association. This finding suggests that long fuzzy tandem repeats can be a new class of cis-acting elements involved in the mechanisms leading to the expansion instability. We strongly believe that biologists may be interested in a tool that not only gives them the possibility of using multiple search algorithms at the same time, with the same effort exerted in using just one of the systems, but also simplifies the burden of comparing and merging the results, thus expanding our capabilities in detecting important phenomena related to tandem repeats.

    A nanny model for intrinsically disordered proteins

    Proteins without a well-defined tertiary structure are intrinsically unstable and are prone to degradation by the 20S proteasome. In my thesis, I investigate the protection mechanisms of intrinsically disordered (ID) protein regions via protein interactions using the AP-1 complex as a model system. AP-1 is composed of c-Fos and c-Jun proteins, out of which c-Fos has a shorter half-life than c-Jun. Interactions by c-Jun were shown to prolong the lifetime of c-Fos, leading to the proposal of the nanny model. This mechanism, where weak protein interactions protect unstructured regions without an induced folding, however, has never been probed directly. Here I investigate the nature of the interactions of c-Fos with c-Jun and how changes in disordered regions contribute to changes in half-life. I use mutational analysis to provide insight into changes in degradation rate as a function of the binding affinity in the bound form with c-Jun. I designed five mutants at the structured regions of c-Fos affecting specific contact sites (L165V, L172V) or charge separation (E175D, E189D, K190R) with c-Jun of which both modulate c-Fos turnover, proportionally to their impact on binding affinity. Interestingly, removal of the disordered region in the complex beyond the structured domain is observed to decrease c-Fos half-life indicating their role in the stability of the complex. The finding suggests that the protein turnover by the 20S proteasome can be fine-tuned by both structured and unstructured regions between c-Fos and c-Jun, consistent with the proposed 'nanny' model. These results highlight a novel aspect of disordered regions present in the bound form (fuzziness) in regulating protein half-life via fine-tuning the association rates between the two proteins. First, it demonstrates that the protection of disordered regions from degradation could be achieved without inducing a stable structure as confirmed by ECD spectroscopy. 
Binding to a partner generates a fuzzy complex, where fuzzy regions in protein complexes can serve as a nonspecific transient anchor. Second, the protection of disordered regions can be achieved with many binding configurations in the bound state without decreasing the conformational entropy. Thus, the protective role of fuzzy interactions from the 20S proteasome could also provide a possible explanation for how low-complexity sequence motifs involved in higher-order protein structures might serve as selective inhibitors of proteolysis.

    TRStalker: an efficient heuristic for finding fuzzy tandem repeats

    Motivation: Genomes in higher eukaryotic organisms contain a substantial amount of repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that originate via phenomena such as replication slippage and are characterized by close spatial contiguity. They play an important role in several molecular regulatory mechanisms, and also in several diseases (e.g. in the group of trinucleotide repeat disorders). While for TRs with a low or medium level of divergence the current methods are rather effective, the problem of detecting TRs with higher divergence (fuzzy TRs) is still open. The detection of fuzzy TRs is propaedeutic to enriching our view of their role in regulatory mechanisms and diseases. Fuzzy TRs are also important as tools to shed light on the evolutionary history of the genome, where higher divergence correlates with more remote duplication events.

    Simple sequence repeats in Neurospora crassa: distribution, polymorphism and evolutionary inference

    Background Simple sequence repeats (SSRs) have been successfully used for various genetic and evolutionary studies in eukaryotic systems. The eukaryotic model organism Neurospora crassa is an excellent system to study evolution and biological function of SSRs. Results We identified and characterized 2749 SSRs of 963 SSR types in the genome of N. crassa. The distribution of tri-nucleotide (nt) SSRs, the most common SSRs in N. crassa, was significantly biased in exons. We further characterized the distribution of 19 abundant SSR types (AST), which account for 71% of total SSRs in the N. crassa genome, using a Poisson log-linear model. We also characterized the size variation of SSRs among natural accessions using Polymorphic Index Content (PIC) and ANOVA analyses and found that there are genome-wide, chromosome-dependent and local-specific variations. Using polymorphic SSRs, we have built linkage maps from three line-cross populations. Conclusion Taking our computational, statistical and experimental data together, we conclude that 1) the distributions of the SSRs in the sequenced N. crassa genome differ systematically between chromosomes as well as between SSR types, 2) the size variation of tri-nt SSRs in exons might be an important mechanism in generating functional variation of proteins in N. crassa, 3) there are different levels of evolutionary forces in variation of amino acid repeats, and 4) SSRs are stable molecular markers for genetic studies in N. crassa.