9 research outputs found

    Contact map prediction using a large-scale ensemble of rule sets and the fusion of multiple predicted structural features

    Motivation: The prediction of a protein’s contact map has become, in recent years, a crucial stepping stone for the prediction of the complete 3D structure of a protein. In this article, we describe a methodology for this problem that was shown to be successful in CASP8 and CASP9. The methodology is based on (i) the fusion of the prediction of a variety of structural aspects of protein residues, (ii) an ensemble strategy used to facilitate the training process and (iii) a rule-based machine learning system from which we can extract human-readable explanations of the predictor and derive useful information about the contact map representation. Results: The main part of the evaluation is the comparison against the sequence-based contact prediction methods from CASP9, where our method achieved the best rank in five out of the six evaluated metrics. We also assess the impact of the size of the ensemble used in our predictor to show the trade-off between performance and training time of our method. Finally, we also study the rule sets generated by our machine learning system. From this analysis, we are able to estimate the contribution of the attributes in our representation and how these interact to derive contact prediction

    Baseline clinical characteristics of predicted structural and pain progressors in the IMI-APPROACH knee OA cohort

    Objectives: To describe the relations between baseline clinical characteristics of the Applied Public-Private Research enabling OsteoArthritis Clinical Headway (IMI-APPROACH) participants and their predicted probabilities for knee osteoarthritis (OA) structural (S) progression and/or pain (P) progression. Methods: Baseline clinical characteristics of the IMI-APPROACH participants were used for this study. Radiographs were evaluated according to Kellgren and Lawrence (K&L grade) and Knee Image Digital Analysis. Knee Injury and Osteoarthritis Outcome Score (KOOS) and Numeric Rating Scale (NRS) were used to evaluate pain. Predicted progression scores for each individual were determined using machine learning models. Pearson correlation coefficients were used to evaluate correlations between scores for predicted progression and baseline characteristics. T-tests and χ2 tests were used to evaluate differences between participants with high versus low progression scores. Results: Participants with high S progression scores were found to have statistically significantly less structural damage compared with participants with low S progression scores (minimum Joint Space Width, minJSW 3.56 mm vs 1.63 mm; p<0.001, K&L grade; p=0.028). Participants with high P progression scores had statistically significantly more pain compared with participants with low P progression scores (KOOS pain 51.71 vs 82.11; p<0.001, NRS pain 6.7 vs 2.4; p<0.001). Conclusions: The baseline minJSW of the IMI-APPROACH participants contradicts the idea that the (predicted) course of knee OA follows a pattern of inertia, where patients who have progressed previously are more likely to display further progression. In contrast, for pain progressors the pattern of inertia seems valid, since participants with high P scores already have more pain at baseline compared with participants with low P scores

    Predicted and actual 2-year structural and pain progression in the IMI-APPROACH knee osteoarthritis cohort

    Objectives: The IMI-APPROACH knee osteoarthritis study used machine learning (ML) to predict structural and/or pain progression, expressed by a structural (S) and pain (P) predicted-progression score, to select patients from existing cohorts. This study evaluates the actual 2-year progression within the IMI-APPROACH, in relation to the predicted-progression scores. Methods: Actual structural progression was measured using minimum joint space width (minJSW). Actual pain (progression) was evaluated using the Knee injury and Osteoarthritis Outcomes Score (KOOS) pain questionnaire. Progression was presented as actual change (Δ) after 2 years, and as progression over 2 years based on a per patient fitted regression line using 0, 0.5, 1 and 2-year values. Differences in predicted-progression scores between actual progressors and non-progressors were evaluated. Receiver operating characteristic (ROC) curves were constructed and corresponding area under the curve (AUC) reported. Using Youden's index, optimal cut-offs were chosen to enable evaluation of both predicted-progression scores to identify actual progressors. Results: Actual structural progressors were initially assigned higher S predicted-progression scores compared with structural non-progressors. Likewise, actual pain progressors were assigned higher P predicted-progression scores compared with pain non-progressors. The AUC-ROC for the S predicted-progression score to identify actual structural progressors was poor (0.612 and 0.599 for Δ and regression minJSW, respectively). The AUC-ROC for the P predicted-progression score to identify actual pain progressors was good (0.817 and 0.830 for Δ and regression KOOS pain, respectively). Conclusion: The S and P predicted-progression scores as provided by the ML models developed and used for the selection of IMI-APPROACH patients were to some degree able to distinguish between actual progressors and non-progressors. Trial registration: ClinicalTrials.gov, https://clinicaltrials.gov, NCT03883568
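
    The ROC, AUC and Youden's index steps named in this abstract can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names, the toy scores and the rule "score >= cut-off predicts a progressor" are all assumptions made for illustration. Youden's index is J = sensitivity + specificity - 1, and the chosen cut-off is the threshold that maximises J.

```python
from typing import List, Tuple


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (cut-off, J) for the threshold that maximises Youden's index.

    Assumption: a score >= cut-off is read as "predicted progressor".
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0  # sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = t, j
    return best_cutoff, best_j


# Hypothetical predicted-progression scores and actual progressor labels.
toy_scores = [0.2, 0.3, 0.6, 0.7, 0.9]
toy_labels = [0, 0, 1, 0, 1]
cutoff, j = youden_cutoff(toy_scores, toy_labels)
print(f"AUC={auc(toy_scores, toy_labels):.3f}, cut-off={cutoff}, J={j:.3f}")
```

    The pairwise AUC is quadratic in the number of participants, which is acceptable for a small illustration; a rank-based computation would be the usual choice for larger cohorts.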

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed
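
    As a rough illustration of one of the baseline methods named above, Okapi Best Matching 25 (BM25), the following Python sketch scores a few toy documents against a query. It is not the consortium's implementation: the whitespace tokenisation, the parameter values k1 = 1.5 and b = 0.75, the smoothed non-negative IDF variant and the example documents are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every document against the query with a basic Okapi BM25 formula.

    Assumptions: lower-cased whitespace tokens, and the IDF variant
    ln((N - n + 0.5) / (n + 0.5) + 1), which is never negative.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        term_freq = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Length normalisation penalises documents longer than average.
            denom = term_freq[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical titles standing in for candidate articles.
toy_docs = [
    "seed germination in arabidopsis",
    "knee osteoarthritis pain progression",
    "protein contact map prediction",
]
ranking = bm25_scores("knee pain", toy_docs)
print(ranking)
```

    Only the second toy title shares terms with the query, so it is the only one that receives a non-zero score.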

    Transcriptional dynamics of two seed compartments with opposing roles in Arabidopsis seed germination

    Seed germination is a critical stage in the plant life cycle and the first step toward successful plant establishment. Therefore, understanding germination is of important ecological and agronomical relevance. Previous research revealed that different seed compartments (testa, endosperm, and embryo) control germination, but little is known about the underlying spatial and temporal transcriptome changes that lead to seed germination. We analyzed genome-wide expression in germinating Arabidopsis (Arabidopsis thaliana) seeds with both temporal and spatial detail and provide Web-accessible visualizations of the data reported (vseed.nottingham.ac.uk). We show the potential of this high-resolution data set for the construction of meaningful coexpression networks, which provide insight into the genetic control of germination. The data set reveals two transcriptional phases during germination that are separated by testa rupture. The first phase is marked by large transcriptome changes as the seed switches from a dry, quiescent state to a hydrated and active state. At the end of this first transcriptional phase, the number of differentially expressed genes between consecutive time points drops. This increases again at testa rupture, the start of the second transcriptional phase. Transcriptome data indicate a role for mechano-induced signaling at this stage and subsequently highlight the fates of the endosperm and radicle: senescence and growth, respectively. Finally, using a phylotranscriptomic approach, we show that expression levels of evolutionarily young genes drop during the first transcriptional phase and increase during the second phase. Evolutionarily old genes show an opposite pattern, suggesting a more conserved transcriptome prior to the completion of germination

    Predicted and actual 2-year structural and pain progression in the IMI-APPROACH knee osteoarthritis cohort

    Objectives: The IMI-APPROACH knee osteoarthritis study used machine learning (ML) to predict structural and/or pain progression, expressed by a structural (S) and pain (P) predicted-progression score, to select patients from existing cohorts. This study evaluates the actual 2-year progression within the IMI-APPROACH, in relation to the predicted-progression scores. Methods: Actual structural progression was measured using minimum joint space width (minJSW). Actual pain (progression) was evaluated using the Knee injury and Osteoarthritis Outcomes Score (KOOS) pain questionnaire. Progression was presented as actual change (Δ) after 2 years, and as progression over 2 years based on a per patient fitted regression line using 0, 0.5, 1 and 2-year values. Differences in predicted-progression scores between actual progressors and non-progressors were evaluated. Receiver operating characteristic (ROC) curves were constructed and corresponding area under the curve (AUC) reported. Using Youden’s index, optimal cut-offs were chosen to enable evaluation of both predicted-progression scores to identify actual progressors. Results: Actual structural progressors were initially assigned higher S predicted-progression scores compared with structural non-progressors. Likewise, actual pain progressors were assigned higher P predicted-progression scores compared with pain non-progressors. The AUC-ROC for the S predicted-progression score to identify actual structural progressors was poor (0.612 and 0.599 for Δ and regression minJSW, respectively). The AUC-ROC for the P predicted-progression score to identify actual pain progressors was good (0.817 and 0.830 for Δ and regression KOOS pain, respectively). Conclusion: The S and P predicted-progression scores as provided by the ML models developed and used for the selection of IMI-APPROACH patients were to some degree able to distinguish between actual progressors and non-progressors. Trial registration: ClinicalTrials.gov, https://clinicaltrials.gov, NCT03883568

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical science. © The Author(s) 2019. Published by Oxford University Press