194 research outputs found

    Graph-representation of oxidative folding pathways

    BACKGROUND: The process of oxidative folding combines the formation of native disulfide bonds with conformational folding, resulting in the native three-dimensional fold. Oxidative folding pathways can be described in terms of disulfide intermediate species (DIS), which can also be isolated and characterized. Each DIS corresponds to a family of folding states (conformations) that the given DIS can adopt in three dimensions. RESULTS: The oxidative folding space can be represented as a network of DIS states interconnected by disulfide interchange reactions that can either create/abolish or rearrange disulfide bridges. We propose a simple 3D representation wherein the states having the same number of disulfide bridges are placed on separate planes. In this representation, the shuffling transitions are within the planes, and the redox edges connect adjacent planes. In a number of experimentally studied cases (bovine pancreatic trypsin inhibitor, insulin-like growth factor and epidermal growth factor), the observed intermediates appear as part of contiguous oxidative folding pathways. CONCLUSIONS: Such networks can be used to visualize folding pathways in terms of the experimentally observed intermediates. A simple visualization template written for the Tulip package can be obtained from V.A
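
    The network described in this abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' Tulip template: the function names are hypothetical, each state is modelled simply as a set of disjoint cysteine pairs, and the rule used for a shuffling edge (two states with the same number of bridges that differ in exactly one bridge, where the old and new bridge share a cysteine) is an assumption that the abstract does not spell out.

```python
from itertools import combinations


def enumerate_states(n_cys):
    """Return every disulfide state as a frozenset of disjoint cysteine pairs."""
    def extend(free):
        if not free:
            yield frozenset()
            return
        first, rest = free[0], free[1:]
        # Case 1: the lowest-numbered cysteine stays a free thiol.
        yield from extend(rest)
        # Case 2: it forms a bridge with one of the remaining cysteines.
        for i, partner in enumerate(rest):
            remaining = rest[:i] + rest[i + 1:]
            for state in extend(remaining):
                yield state | {(first, partner)}

    return list(extend(tuple(range(1, n_cys + 1))))


def classify_edge(a, b):
    """Label the transition between two states, or return None if there is none."""
    if abs(len(a) - len(b)) == 1:
        small, large = sorted((a, b), key=len)
        # Redox edge: exactly one bridge is created or abolished.
        return "redox" if small <= large else None
    if len(a) == len(b):
        lost, gained = a - b, b - a
        if len(lost) == 1 and len(gained) == 1:
            (old,), (new,) = lost, gained
            # Assumed shuffling rule: the rearranged bridge keeps one cysteine.
            if set(old) & set(new):
                return "shuffle"
    return None


def build_network(n_cys):
    """Group states into planes by bridge count and collect labelled edges."""
    states = enumerate_states(n_cys)
    planes = {}
    for state in states:
        planes.setdefault(len(state), []).append(state)
    edges = []
    for a, b in combinations(states, 2):
        kind = classify_edge(a, b)
        if kind is not None:
            edges.append((a, b, kind))
    return planes, edges


planes, edges = build_network(6)  # six cysteines, as in BPTI
print({k: len(v) for k, v in sorted(planes.items())})
```

    Under these assumptions the redox edges always join planes whose bridge counts differ by one, and the shuffling edges stay within a plane, which matches the layout described in the abstract.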

    Application of compression-based distance measures to protein sequence classification: a methodological study

    Motivation: Distance measures built on the notion of text compression have been used for the comparison and classification of entire genomes and mitochondrial genomes. The present study was undertaken in order to explore their utility in the classification of protein sequences. Results: We constructed compression-based distance measures (CBMs) using the Lempel-Ziv and the PPMZ compression algorithms and compared their performance with that of the Smith–Waterman algorithm and BLAST, using nearest neighbour or support vector machine classification schemes. The datasets included a subset of the SCOP protein structure database to test distant protein similarities, 3-phosphoglycerate kinase sequences selected from archaean, bacterial and eukaryotic species, as well as low- and high-complexity sequence segments of the human proteome. CBM values show a dependence on the length and the complexity of the sequences compared. In classification tasks CBMs performed especially well on distantly related proteins where the performance of a combined measure, constructed from a CBM and a BLAST score, approached or even slightly exceeded that of the Smith–Waterman algorithm and two hidden Markov model-based algorithms. Contact: [email protected] Supplementary information
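
    The idea of a compression-based distance combined with nearest neighbour classification can be illustrated with a short Python sketch. Several things here are assumptions rather than the study's method: the formula is the commonly used normalized compression distance, which may differ from the CBMs the authors built; `zlib` (DEFLATE, an LZ77 variant) stands in for the Lempel-Ziv and PPMZ compressors named in the abstract; and the function names and toy sequences are made up for the example.

```python
import zlib


def compressed_size(seq: str) -> int:
    """Length in bytes of the zlib-compressed sequence (stand-in compressor)."""
    return len(zlib.compress(seq.encode("ascii"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: small when x and y share structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def nearest_neighbour(query: str, labelled):
    """Return the label of the labelled sequence closest to the query."""
    closest = min(labelled, key=lambda item: ncd(query, item[0]))
    return closest[1]


# Toy, highly repetitive sequences; real proteins compress far less well.
family_a = "MKTAYIAKQR" * 20
family_b = "GSHWLPEDNV" * 20
training = [(family_a, "A"), (family_b, "B")]
print(nearest_neighbour(family_a[:150], training))
```

    Because compressed sizes depend on sequence length and complexity, as the abstract notes for CBMs, distances computed this way on short or low-complexity sequences should be interpreted with care.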

    Genetic diversity of species Fowl aviadenovirus D and Fowl aviadenovirus E

    Complete genomes of eight reference strains representing different serotypes within species Fowl aviadenovirus D (FAdV-D) and Fowl aviadenovirus E (FAdV-E) were sequenced. The sequenced genomes of FAdV-D and FAdV-E members comprise 43,287 to 44,336 bp, and have a gene organization identical to that of an earlier sequenced FAdV-D member (strain A-2A). The highest diversity was observed in the hexon and fiber genes and ORF19. All genomes sequenced in this study contain one fiber gene. Phylogenetic analyses and G+C content support the division of the genus Aviadenovirus into the currently recognized species. Our data also suggest that the strain SR48 should be considered as FAdV-11 instead of FAdV-2 and, similarly, the strain HG as FAdV-8b. The present results complete the list of genome sequences of reference strains representing all serotypes in species FAdV-D and FAdV-E

    Genetic Diversity of Serine Protease Inhibitors in Myxozoan (Cnidaria, Myxozoa) Fish Parasites

    We studied the genetic variability of serine protease inhibitors (serpins) of Myxozoa, microscopic endoparasites of fish. Myxozoans affect the health of both farmed and wild fish populations, causing diseases and mortalities. Despite their global impact, no effective protection exists against these parasites. Serpins were reported as important factors for host invasion and immune evasion, and as promising targets for the development of antiparasitic therapies. For the first time, we identified and aligned serpin sequences from high-throughput sequencing datasets of ten myxozoan species, and analyzed 146 serpins from this parasite group phylogenetically, together with those of other taxa, to explore their relationships and origins. High intra- and interspecific variability was detected among the examined serpins. The average sequence identity was only 25-30%. The conserved domains (i.e. motif and signature) showed taxon-level differences. Serpins clustered according to taxonomy rather than serpin type, and myxozoan serpins seemed to be highly divergent from those of other taxa. None of them clustered with those of their closest relatives, the free-living cnidarians. The genetic distinction of myxozoan serpins further strengthens the idea of an independent origin of Myxozoa, and may indicate novel protein functions potentially related to parasitism in this animal group

    FreeContact: fast and free software for protein contact prediction from residue co-evolution

    Background: Twenty years of improved technology and growing sequence data now render residue-residue contact constraints, inferred from correlated mutations in large protein families, accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. In addition, EVfold-mfDCA depends on proprietary software. Results: Here, we present FreeContact, a fast, open source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with a negligible performance decrease. EVfold-mfDCA was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library “libfreecontact”, complete with command line tool “freecontact”, as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability. Conclusions: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud)