65 research outputs found

    The whole and its parts : why and how to disentangle plant communities and synusiae in vegetation classification

    Most plant communities consist of different structural and ecological subsets, ranging from cryptogams to different tree layers. The completeness and approach with which these subsets are sampled have implications for vegetation classification. Non‐vascular plants are often omitted or sometimes treated separately, referring to their assemblages as “synusiae” (e.g. epiphytes on bark, saxicolous species on rocks). The distinction of complete plant communities (phytocoenoses or holocoenoses) from their parts (synusiae or merocoenoses) is crucial to avoid logical problems and inconsistencies of the resulting classification systems. We here describe theoretical differences between the phytocoenosis as a whole and its parts, and outline consequences of this distinction for practice and terminology in vegetation classification. To implement a clearer separation, we call for modifications of the International Code of Phytosociological Nomenclature and the EuroVegChecklist. We believe that these steps will make vegetation classification systems better applicable and raise the recognition of the importance of non‐vascular plants in the vegetation as well as their interplay with vascular plants.

    Prediction of backbone dihedral angles and protein secondary structure using support vector machines

    Background: The prediction of the secondary structure of a protein is a critical step in the prediction of its tertiary structure and, potentially, its function. Moreover, the backbone dihedral angles, highly correlated with secondary structures, provide crucial information about the local three-dimensional structure. Results: We predict independently both the secondary structure and the backbone dihedral angles and combine the results in a loop to enhance each prediction reciprocally. Support vector machines, a state-of-the-art supervised classification technique, achieve secondary structure predictive accuracy of 80% on a non-redundant set of 513 proteins, significantly higher than other methods on the same dataset. The dihedral angle space is divided into a number of regions using two unsupervised clustering techniques in order to predict the region in which a new residue belongs. The performance of our method is comparable to, and in some cases more accurate than, other multi-class dihedral prediction methods. Conclusions: We have created an accurate predictor of backbone dihedral angles and secondary structure. Our method, called DISSPred, is available online at http://comp.chem.nottingham.ac.uk/disspred/.

    Probing Metagenomics by Rapid Cluster Analysis of Very Large Datasets

    BACKGROUND: The scale and diversity of metagenomic sequencing projects challenge both our technical and conceptual approaches in gene and genome annotations. The recent Sorcerer II Global Ocean Sampling (GOS) expedition yielded millions of predicted protein sequences, which significantly altered the landscape of known protein space by more than doubling its size and adding thousands of new families (Yooseph et al., 2007 PLoS Biol 5, e16). Such datasets, not only by their sheer size, but also by many other features, defy conventional analysis and annotation methods. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we describe an approach for rapid analysis of the sequence diversity and the internal structure of such very large datasets by advanced clustering strategies using the newly modified CD-HIT algorithm. We performed a hierarchical clustering analysis on the 17.4 million Open Reading Frames (ORFs) identified from the GOS study and found over 33 thousand large predicted protein clusters comprising nearly 6 million sequences. Twenty percent of these clusters did not match known protein families by sequence similarity search and might represent novel protein families. Distributions of the large clusters were illustrated on organism composition, functional class, and sample locations. CONCLUSION/SIGNIFICANCE: Our clustering took about two orders of magnitude less computational effort than the similar protein family analysis of the original GOS study. This approach will help to analyze other large metagenomic datasets in the future. A Web server with our clustering results and annotations of predicted protein clusters is available online at http://tools.camera.calit2.net/gos under the CAMERA project.

    NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

    β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. CONCLUSION: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
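The two-class measures quoted in the NetTurnP abstract (MCC, Qtotal, sensitivity, PPV) can all be derived from a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code; the function names are illustrative, and the counts in the usage note are hypothetical rather than taken from BT426.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: return 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def sensitivity(tp, fn):
    """Fraction of true positives that are recovered (recall)."""
    return tp / (tp + fn)


def ppv(tp, fp):
    """Positive predictive value: fraction of positive calls that are correct."""
    return tp / (tp + fp)


def q_total(tp, tn, fp, fn):
    """Overall fraction of correctly classified residues."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For example, with hypothetical counts of 75 true positives, 165 true negatives, 35 false positives and 25 false negatives, `q_total(75, 165, 35, 25)` gives 0.8 and `sensitivity(75, 25)` gives 0.75. Unlike Qtotal, MCC stays informative when the classes are imbalanced, as they are here with β-turns making up about a quarter of residues.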

    Automatic structure classification of small proteins using random forest

    Background: Random forest, an ensemble based supervised machine learning algorithm, is used to predict the SCOP structural classification for a target structure, based on the similarity of its structural descriptors to those of a template structure with an equal number of secondary structure elements (SSEs). An initial assessment of random forest is carried out for domains consisting of three SSEs. The usability of random forest in classifying larger domains is demonstrated by applying it to domains consisting of four, five and six SSEs. Results: Random forest, trained on SCOP version 1.69, achieves a predictive accuracy of up to 94% on an independent and non-overlapping test set derived from SCOP version 1.73. For classification to the SCOP Class, Fold, Super-family or Family levels, the predictive quality of the model in terms of Matthews correlation coefficient (MCC) ranged from 0.61 to 0.83. As the number of constituent SSEs increases, the MCC for classification to different structural levels decreases. Conclusions: The utility of random forest in classifying domains from the place-holder classes of SCOP to the true Class, Fold, Super-family or Family levels is demonstrated. Issues such as introduction of a new structural level in SCOP and the merger of singleton levels can also be addressed using random forest. A real-world scenario is mimicked by predicting the classification for those protein structures from the PDB, which are yet to be assigned to the SCOP classification hierarchy.

    NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence

    Binding of peptides to Major Histocompatibility Complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC class I system (HLA-I) is extremely polymorphic. The number of registered HLA-I molecules has now surpassed 1500. Characterizing the specificity of each separately would be a major undertaking. Here, we have drawn on a large database of known peptide-HLA-I interactions to develop a bioinformatics method, which takes both peptide and HLA sequence information into account, and generates quantitative predictions of the affinity of any peptide-HLA-I interaction. Prospective experimental validation of peptides predicted to bind to previously untested HLA-I molecules, cross-validation, and retrospective prediction of known HIV immune epitopes and endogenous presented peptides, all successfully validate this method. We further demonstrate that the method can be applied to perform a clustering analysis of MHC specificities and suggest using this clustering to select particularly informative novel MHC molecules for future biochemical and functional analysis. Encompassing all HLA molecules, this high-throughput computational method lends itself to epitope searches that are not only genome- and pathogen-wide, but also HLA-wide. Thus, it offers a truly global analysis of immune responses supporting rational development of vaccines and immunotherapy. It also promises to provide new basic insights into HLA structure-function relationships. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.

    Intravenous alteplase for stroke with unknown time of onset guided by advanced imaging: systematic review and meta-analysis of individual patient data

    Background: Patients who have had a stroke with unknown time of onset have been previously excluded from thrombolysis. We aimed to establish whether intravenous alteplase is safe and effective in such patients when salvageable tissue has been identified with imaging biomarkers. Methods: We did a systematic review and meta-analysis of individual patient data for trials published before Sept 21, 2020. Randomised trials of intravenous alteplase versus standard of care or placebo in adults with stroke with unknown time of onset with perfusion-diffusion MRI, perfusion CT, or MRI with diffusion weighted imaging-fluid attenuated inversion recovery (DWI-FLAIR) mismatch were eligible. The primary outcome was favourable functional outcome (score of 0–1 on the modified Rankin Scale [mRS]) at 90 days indicating no disability using an unconditional mixed-effect logistic-regression model fitted to estimate the treatment effect. Secondary outcomes were mRS shift towards a better functional outcome and independent outcome (mRS 0–2) at 90 days. Safety outcomes included death, severe disability or death (mRS score 4–6), and symptomatic intracranial haemorrhage. This study is registered with PROSPERO, CRD42020166903. Findings: Of 249 identified abstracts, four trials met our eligibility criteria for inclusion: WAKE-UP, EXTEND, THAWS, and ECASS-4. The four trials provided individual patient data for 843 individuals, of whom 429 (51%) were assigned to alteplase and 414 (49%) to placebo or standard care. A favourable outcome occurred in 199 (47%) of 420 patients with alteplase and in 160 (39%) of 409 patients among controls (adjusted odds ratio [OR] 1·49 [95% CI 1·10–2·03]; p=0·011), with low heterogeneity across studies (I²=27%). Alteplase was associated with a significant shift towards better functional outcome (adjusted common OR 1·38 [95% CI 1·05–1·80]; p=0·019), and a higher odds of independent outcome (adjusted OR 1·50 [1·06–2·12]; p=0·022). In the alteplase group, 90 (21%) patients were severely disabled or died (mRS score 4–6), compared with 102 (25%) patients in the control group (adjusted OR 0·76 [0·52–1·11]; p=0·15). 27 (6%) patients died in the alteplase group and 14 (3%) patients died among controls (adjusted OR 2·06 [1·03–4·09]; p=0·040). The prevalence of symptomatic intracranial haemorrhage was higher in the alteplase group than among controls (11 [3%] vs two [<1%], adjusted OR 5·58 [1·22–25·50]; p=0·024). Interpretation: In patients who have had a stroke with unknown time of onset with a DWI-FLAIR or perfusion mismatch, intravenous alteplase resulted in better functional outcome at 90 days than placebo or standard care. A net benefit was observed for all functional outcomes despite an increased risk of symptomatic intracranial haemorrhage. Although there were more deaths with alteplase than placebo, there were fewer cases of severe disability or death. Funding: None.
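As a rough check on the primary-outcome figures in the alteplase abstract, a crude (unadjusted) odds ratio can be computed directly from the reported counts: 199 of 420 favourable outcomes with alteplase versus 160 of 409 among controls. The sketch below uses the standard 2×2 odds ratio with a Woolf (log-scale) 95% interval. It is not the study's method: the abstract reports an adjusted OR from a mixed-effect logistic-regression model, so the crude value is expected to differ from the published 1·49. The function name is illustrative.

```python
import math


def crude_odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Unadjusted odds ratio (group A vs group B) with a Woolf log-scale CI.

    Returns (odds_ratio, ci_low, ci_high). Assumes all four cells of the
    2x2 table are non-zero.
    """
    a, b = events_a, total_a - events_a   # group A: events, non-events
    c, d = events_b, total_b - events_b   # group B: events, non-events
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, ci_low, ci_high


# Counts taken from the abstract: favourable outcome (mRS 0-1) at 90 days.
or_, lo, hi = crude_odds_ratio(199, 420, 160, 409)
print(f"crude OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

The crude estimate comes out at about 1.40, somewhat below the adjusted 1·49, which is consistent with the adjustment for covariates and trial in the published model changing the estimate.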

    European Red List of Habitats Part 2. Terrestrial and freshwater habitats
