
    PCI-SS: MISO dynamic nonlinear protein secondary structure prediction

    Background. Since the function of a protein is largely dictated by its three-dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report on a novel approach to determining the one-dimensional secondary structure of proteins (distinguishing α-helices, β-strands, and non-regular structures) from primary sequence data, which makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification. Results. Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied in order to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems, and protein structure classifiers are built using 2 layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, here a web service is developed to provide both human- and machine-readable interfaces to PCI-based protein secondary structure prediction. This server, called PCI-SS, is available at http://bioinf.sce.carleton.ca/PCISS. In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface is added to permit invocation of the PCI-SS service remotely. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. XML is used to represent the input protein sequence data and also to encode the resulting structure prediction in a machine-readable format. To our knowledge, this represents the only publicly available SOAP interface for a protein secondary structure prediction service with a published WSDL interface definition. Conclusion. Relative to the 9 contemporary methods included in the comparison, cascaded PCI classifiers perform well; however, PCI finds greatest application as a consensus classifier. When PCI is used to combine a sequence-to-structure PCI-based classifier with the current leading ANN-based method, PSIPRED, the overall error rate (Q3) is maintained while the rate of occurrence of a particularly detrimental error is reduced by up to 25%. This improvement in BAD score, combined with the machine-readable SOAP web service interface, makes PCI-SS particularly useful for inclusion in a tertiary structure prediction pipeline
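
The decomposition of the three-state problem into three binary sub-problems can be illustrated with a minimal sketch. It assumes each binary classifier returns one score per residue and simply picks the highest-scoring state; this argmax rule is a stand-in chosen for illustration, not the two-layer PCI combination the abstract describes. The function names and the toy scores are hypothetical. Q3 is computed here in its conventional sense, as the fraction of residues whose three-state label (H, E, or C) is predicted correctly.

```python
def combine_binary_scores(scores_h, scores_e, scores_c):
    """Pick, per residue, the state whose binary classifier scored highest.

    Assumption: a plain argmax stands in for the second-layer combiner.
    Ties resolve in the order H, E, C.
    """
    states = "HEC"
    return "".join(
        states[max(range(3), key=lambda k: triple[k])]
        for triple in zip(scores_h, scores_e, scores_c)
    )


def q3(predicted, observed):
    """Fraction of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must have the same non-zero length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical scores for a three-residue fragment (not real PCI output).
prediction = combine_binary_scores(
    [0.90, 0.10, 0.20],  # helix-vs-rest scores
    [0.05, 0.80, 0.30],  # strand-vs-rest scores
    [0.05, 0.10, 0.50],  # coil-vs-rest scores
)
accuracy = q3(prediction, "HEE")
```

With these toy scores the combined prediction is `HEC`, so two of the three residues match the observed labels `HEE`.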

    Risk for human tick-borne encephalitis, borrelioses, and double infection in the pre-Ural region of Russia.

    We assessed the risk for human tick-borne encephalitis (TBE), ixodid tick-borne borrelioses, and double infection from 1994 to 1998 in Perm, which has among the highest rates of reported cases in Russia. We studied 3,473 unfed adult Ixodes persulcatus ticks collected from vegetation in natural foci and 62,816 ticks removed from humans. TBE virus and Borrelia may coexist in ticks

    Optimized multilocus sequence analysis for laboratory identification of pathogens of ixodid tick-borne borreliosis

    Introduction. The most common etiological agents of ixodid tick-borne borreliosis (ITBB) in Russia are Borrelia garinii, B. afzelii, and B. bavariensis. Multilocus sequence typing and multilocus sequence analysis (MLSA) have been used in recent studies for Borrelia species identification. The results of using the MLSA scheme for identification of pathogens causing erythemic forms of ITBB have been presented earlier. The purpose of the study was to explore the possibility of MLSA optimization for laboratory identification of ITBB pathogens. Objectives: comparative analysis of nucleotide sequences of 6 conserved genes (rrs, hbb, fla, groEL, recA, ospA) and the rrfA-rrlB intergenic spacer, which are recommended by the MLSA protocol; identification of the minimum set of genes whose concatenated sequences are essential for species identification of Borrelia isolates. Materials and methods. The sequences of the above loci of 23 reference isolates collected from patients with ITBB and assigned, using MLSA, to B. bavariensis were compared with the sequences of the same genes of other Borrelia species available in international databases. The UPGMA method was used to build and analyze dendrograms based on the obtained data. Results. The sequences of ospA gene loci of reference species demonstrated the greatest difference (not less than 8.5%) from the sequences of this gene in the other analyzed species of Borrelia; comparable species-related differences (not less than 6.7%) were demonstrated by the comparison of recA gene sequences. The sequences of the identified variants of these two genes in B. bavariensis differed from the sequences of the same genes in the most closely related species, B. garinii. The dendrogram built from the concatenated nucleotide sequences of the recA and ospA genes was fully consistent with the results of identification of the isolates based on the MLSA protocol. Conclusion. The optimized approach to MLSA of the B. burgdorferi sensu lato group suggests that species identification should be based on the concatenated analysis of loci of only two genes (recA and ospA) out of the 7 loci recommended by the MLSA protocol
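
The percentage differences reported for ospA and recA, and the idea of analyzing the two loci as one concatenated sequence, can be sketched as follows. This is a minimal illustration assuming the sequences are already aligned to equal length and that every mismatching column counts equally (no special gap handling); the abstract does not state how the differences were computed. The function names and the short toy sequences are hypothetical and are not real Borrelia data.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)


def concatenate_loci(isolate, loci=("recA", "ospA")):
    """Join the aligned sequences of the chosen loci in a fixed order."""
    return "".join(isolate[locus] for locus in loci)


# Toy aligned fragments for two hypothetical isolates.
isolate_1 = {"recA": "ATGGCAAAAG", "ospA": "ATGAAAAAAT"}
isolate_2 = {"recA": "ATGGCTAAAG", "ospA": "ATGAAGAAAT"}

difference = percent_difference(
    concatenate_loci(isolate_1), concatenate_loci(isolate_2)
)
```

Pairwise values computed this way for all isolates form a distance matrix, which is the input an average-linkage (UPGMA) clustering would take to produce a dendrogram.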

    A non-invasive investigation of Limoges enamels using both Optical Coherence Tomography (OCT) and spectral imaging: a pilot study

    This paper investigates the use of Optical Coherence Tomography (OCT) and Short-wave Infrared (SWIR) spectral imaging to study the deterioration of a Limoges enamel panel. Limoges enamels are formed of glass layers applied on a metal substrate and are prone to ‘glass disease’. However, the level of deterioration in Limoges enamels is generally difficult to assess visually. In this study, SWIR was used to produce a hydration level map of the enamel, which was coupled with virtual OCT cross-sections. The study shows a good correlation between levels of hydration and structural damage over the enamel panel. Hydration mapping allows visualisation of structural damage across the entire enamel in one image

    Tick-Borne Encephalitis with Hemorrhagic Syndrome, Novosibirsk Region, Russia, 1999

    Eight fatal cases of tick-borne encephalitis with unusual hemorrhagic syndrome were identified in 1999 in the Novosibirsk Region, Russia. To study these strains, we sequenced cDNA fragments of protein E gene from six archival formalin-fixed brain samples. Phylogenetic analysis showed tick-borne encephalitis variants clustered with a Far Eastern subtype (homology 94.7%) but not with the Siberian subtype (82%)

    Enrichment analysis of Alu elements with different spatial chromatin proximity in the human genome

    Transposable elements (TEs) have for some time no longer been regarded simply as “junk DNA”, owing to the continual discovery of their multifunctional roles in eukaryotic genomes. As one of the most important and abundant TEs still active in the human genome, Alu, a SINE family, has demonstrated indispensable regulatory functions at the sequence level, but its spatial roles remain unclear. Technologies based on 3C (chromosome conformation capture) have revealed the three-dimensional structure of chromatin and made it possible to study distal chromatin interactions in the genome. To investigate the role TEs play in distal regulation in the human genome, we compiled newly released Hi-C data, TE annotation, histone marker annotations, and genome-wide methylation data for correlation analysis. We found that the density of Alu elements showed a strong positive correlation with the level of chromatin interactions (hESC: r = 0.9, P < 2.2 × 10^-16; IMR90 fibroblasts: r = 0.94, P < 2.2 × 10^-16) and a significant positive correlation with some remote functional DNA elements such as enhancers and promoters (Enhancer: hESC: r = 0.997, P = 2.3 × 10^-4; IMR90: r = 0.934, P = 2 × 10^-2; Promoter: hESC: r = 0.995, P = 3.8 × 10^-4; IMR90: r = 0.996, P = 3.2 × 10^-4). Further investigation involving GC content and methylation status showed that the GC content of Alu-covered sequences shared a similar pattern with that of the overall sequence, suggesting that Alu elements also function as providers of GC nucleotides and CpG sites. In all, our results suggest that Alu elements may act as an alternative parameter for evaluating Hi-C data, which is confirmed by the correlation analysis of Alu elements and histone markers. Moreover, GC-rich Alu sequences can bring high GC content and methylation flexibility to regions with more distal chromatin contacts, regulating the transcription of tissue-specific genes
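
The r values quoted above are correlation coefficients between per-region quantities such as Alu density and chromatin interaction level. A minimal sketch of a Pearson coefficient is given below, assuming the two quantities have already been summarized per genomic bin; the abstract does not specify the binning or the exact statistic, so both are assumptions. The function name and the numbers are hypothetical and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-bin values: Alu elements per bin and Hi-C contact counts.
alu_density = [12, 30, 45, 51, 78]
contact_level = [110, 240, 400, 390, 640]

r = pearson_r(alu_density, contact_level)
```

A value of r close to 1 indicates that bins with more Alu elements tend to have more contacts; it says nothing by itself about significance, which requires a separate test.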