35 research outputs found

    RegSNPs-intron: a computational framework for predicting pathogenic impact of intronic single nucleotide variants

    Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron showed excellent performance in evaluating the pathogenic impacts of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.

    Infectivity of Plasmodium falciparum in malaria-naive individuals is related to knob expression and cytoadherence of the parasite

    Plasmodium falciparum is the most virulent human malaria parasite because of its ability to cytoadhere in the microvasculature. Nonhuman primate studies demonstrated relationships among knob expression, cytoadherence, and infectivity. This has not been examined in humans. Cultured clinical-grade P. falciparum parasites (NF54, 7G8, and 3D7B) and ex vivo-derived cell banks were characterized. Knob and knob-associated histidine-rich protein expression, CD36 adhesion, and antibody recognition of parasitized erythrocytes (PEs) were evaluated. Parasites from the cell banks were administered to malaria-naive human volunteers to explore infectivity. For the NF54 and 3D7B cell banks, blood was collected from the study participants for in vitro characterization. All parasites were infective in vivo. However, infectivity of NF54 was dramatically reduced. In vitro characterization revealed that unlike other cell bank parasites, NF54 PEs lacked knobs and did not cytoadhere. Recognition of NF54 PEs by immune sera was observed, suggesting P. falciparum erythrocyte membrane protein 1 expression. Subsequent recovery of knob expression and CD36-mediated adhesion were observed in PEs derived from participants infected with NF54. Knobless cell bank parasites have a dramatic reduction in infectivity and the ability to adhere to CD36. Subsequent infection of malaria-naive volunteers restored knob expression and CD36-mediated cytoadherence, thereby showing that the human environment can modulate virulence.

    Trends in template/fragment-free protein structure prediction

    Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
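As a rough illustration of one of the baseline families named in this abstract (Term Frequency-Inverse Document Frequency), the sketch below ranks candidate texts against a seed text by TF-IDF cosine similarity. It is a minimal sketch, not the consortium's implementation: the whitespace tokenizer, the smoothed IDF formula, and the function names `tfidf_vectors`, `cosine`, and `rank_by_similarity` are all assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build sparse TF-IDF vectors (term -> weight) for a list of texts."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many texts does each term occur?
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed IDF, so terms present in every text keep a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (cnt / len(toks)) * idf[t] for t, cnt in tf.items()})
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(seed: str, candidates: List[str]) -> List[int]:
    """Return candidate indices ordered from most to least similar to the seed."""
    vecs = tfidf_vectors([seed] + candidates)
    scores = [cosine(vecs[0], v) for v in vecs[1:]]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    seed = "protein structure prediction"
    candidates = [
        "malaria parasite infectivity",
        "protein structure prediction methods",
        "document similarity search",
    ]
    print(rank_by_similarity(seed, candidates))
```

Because each baseline weights terms differently, rankings like this one can diverge between methods, which is consistent with the abstract's observation that the methods return distinct sets of recommendations.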

    Critical assessment of protein intrinsic disorder prediction

    Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
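The Fmax values quoted in this abstract are conventionally the maximum F1 score obtained over all decision thresholds applied to a predictor's per-residue scores. The sketch below shows that computation under stated assumptions: the abstract does not describe how residues are pooled or which thresholds are scanned, so the pooling into one flat list, the use of every observed score as a threshold, and the function name `f_max` are choices made for this example only.

```python
from typing import Sequence, Tuple


def f_max(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (best F1, threshold achieving it) over all observed score thresholds.

    scores: predicted disorder propensity per residue (higher = more disordered).
    labels: 1 for a disordered residue, 0 for an ordered one.
    """
    best_f1, best_threshold = 0.0, 0.0
    # Assumption: every distinct score is tried as a cut-off; a residue is
    # predicted disordered when its score is >= the threshold.
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        if tp == 0:
            continue  # F1 is zero (or undefined) without true positives
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_f1, best_threshold = f1, threshold
    return best_f1, best_threshold


if __name__ == "__main__":
    # Toy data, not taken from CAID.
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
    toy_labels = [1, 1, 0, 1, 0]
    print(f_max(toy_scores, toy_labels))
```

Because the threshold is chosen after the fact to maximize F1, Fmax describes the best achievable operating point of a predictor rather than its performance at a default cut-off.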