14 research outputs found

    TANGLE: Two-Level Support Vector Regression Approach for Protein Backbone Torsion Angle Prediction from Primary Sequences

    Protein backbone torsion angles (Phi and Psi) are the rotation angles around the Cα-N bond (Phi) and the Cα-C bond (Psi). Because the linked peptide bonds are rigid and planar, these two angles essentially determine the backbone geometry of proteins. Accordingly, accurate prediction of protein backbone torsion angles from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered regions, as well as other global sequence features. When evaluated on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle predictions are 27.8° and 44.6°, respectively, which are 1% and 3% lower, respectively, than those obtained with ANGLOR, one of the state-of-the-art prediction tools. Moreover, TANGLE's predictions are significantly better than those of a random predictor built on an amino acid-specific basis, with p-values < 1.46e-147 and 7.97e-150, respectively, by the Wilcoxon signed rank test. As a complementary approach to current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and in assisting protein fold recognition by applying the predicted torsion angles as restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/
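The mean absolute error reported in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how angular periodicity is handled, so wrapping each difference into the range [-180°, 180°) is an assumption made here, and the function name `angular_mae` is hypothetical rather than part of TANGLE.

```python
def angular_mae(predicted_deg, observed_deg):
    """Mean absolute error between two equal-length lists of angles in degrees.

    Assumption (not stated in the abstract): differences are wrapped into
    [-180, 180), so 179 and -179 count as 2 degrees apart, not 358.
    """
    if len(predicted_deg) != len(observed_deg) or len(predicted_deg) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for predicted, observed in zip(predicted_deg, observed_deg):
        # Shift by 180, reduce modulo 360, shift back: result lies in [-180, 180).
        diff = (predicted - observed + 180.0) % 360.0 - 180.0
        total += abs(diff)
    return total / len(predicted_deg)
```

Without the wrapping step, a prediction of 179° against an observed -179° would be scored as a 358° error, which would overstate the error for angles near the ±180° boundary.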

    Causal Pathways from Enteropathogens to Environmental Enteropathy: Findings from the MAL-ED Birth Cohort Study

    Background: Environmental enteropathy (EE), the adverse impact of frequent and numerous enteric infections on the gut resulting in a state of persistent immune activation and altered permeability, has been proposed as a key determinant of growth failure in children in low- and middle-income populations. A theory-driven systems model to critically evaluate pathways through which enteropathogens, gut permeability, and intestinal and systemic inflammation affect child growth was conducted within the framework of the Etiology, Risk Factors and Interactions of Enteric Infections and Malnutrition and the Consequences for Child Health and Development (MAL-ED) birth cohort study that included children from eight countries. Methods: Non-diarrheal stool samples (N = 22,846) from 1253 children from multiple sites were evaluated for a panel of 40 enteropathogens and fecal concentrations of myeloperoxidase, alpha-1-antitrypsin, and neopterin. Among these same children, urinary lactulose:mannitol (L:M) (N = 6363) and plasma alpha-1-acid glycoprotein (AGP) (N = 2797) were also measured. The temporal sampling design was used to create a directed acyclic graph of proposed mechanistic pathways between enteropathogen detection in non-diarrheal stools, biomarkers of intestinal permeability and inflammation, systemic inflammation and change in length- and weight-for-age in children 0–2 years of age. Findings: Children in these populations had frequent enteric infections and high levels of both intestinal and systemic inflammation. Higher burdens of enteropathogens, especially those categorized as being enteroinvasive or causing mucosal disruption, were associated with elevated biomarker concentrations of gut and systemic inflammation and, via these associations, indirectly associated with both reduced linear and ponderal growth. Evidence for the association with reduced linear growth was stronger for systemic inflammation than for gut inflammation; the opposite was true of reduced ponderal growth. Although Giardia was associated with reduced growth, the association was not mediated by any of the biomarkers evaluated. Interpretation: The large quantity of empirical evidence contributing to this analysis supports the conceptual model of EE. The effects of EE on growth faltering in young children were small, but multiple mechanistic pathways underlying the attribution of growth failure to asymptomatic enteric infections had statistical support in the analysis. The strongest evidence for EE was the association between enteropathogens and linear growth mediated through systemic inflammation.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that covers a variety of research fields, such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for downloading annotation data and for blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
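For readers unfamiliar with Okapi Best Matching 25 (BM25), one of the baseline methods named in the abstract above, a minimal Python sketch of a common BM25 scoring variant follows. The abstract gives no implementation details, so everything here is an assumption for illustration: the function name `bm25_scores` is hypothetical, the parameter values `k1=1.5` and `b=0.75` are conventional defaults rather than the consortium's settings, and the inverse document frequency uses the non-negative `log(1 + ...)` form.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document in corpus_tokens against query_tokens.

    Illustrative BM25 variant; k1 and b are conventional defaults (assumed).
    """
    n_docs = len(corpus_tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc in corpus_tokens:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue  # term absent from this document contributes nothing
            df = doc_freq[term]
            # Non-negative IDF variant (assumed): rarer terms weigh more.
            idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1.0) / denom
        scores.append(score)
    return scores


# Toy usage: rank three tokenized titles against a two-word query.
corpus = [
    ["protein", "torsion", "angle", "prediction"],
    ["protein", "growth", "cohort", "study"],
    ["child", "growth", "cohort", "study"],
]
scores = bm25_scores(["protein", "torsion"], corpus)
```

In this toy corpus the first title matches both query terms, the second matches one, and the third matches none, so the scores decrease in that order.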