12 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed.
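
One of the baselines named above, Okapi BM25, can be illustrated with a minimal sketch that scores candidate texts against a seed text. The whitespace tokenizer, the parameter values k1 = 1.5 and b = 0.75, the smoothed IDF variant and the toy texts are all assumptions made for illustration; the abstract does not say how the consortium configured its baselines.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    k1 and b are commonly used defaults (an assumption, not taken from the source).
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy seed and candidate texts (invented for this example).
seed = "protein interaction essentiality cancer"
candidates = [
    "protein protein interaction essentiality in cancer cell lines",
    "document similarity benchmark for literature search",
    "gene knockdown screens reveal cancer dependencies",
]
scores = bm25_scores(seed, candidates)
ranking = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

Here `ranking` lists candidate indices from most to least similar to the seed. A candidate that shares no term with the seed receives a score of zero.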

    MEDICI: Mining Essentiality Data to Identify Critical Interactions for Cancer Drug Target Discovery and Development

    Protein-protein interactions (PPIs) mediate the transmission and regulation of oncogenic signals that are essential to cellular proliferation and survival, and thus represent potential targets for anti-cancer therapeutic discovery. Despite their significance, there is no method to experimentally disrupt and interrogate the essentiality of individual endogenous PPIs. The ability to computationally predict or infer PPI essentiality would help prioritize PPIs for drug discovery and help advance understanding of cancer biology. Here we introduce a computational method (MEDICI) to predict PPI essentiality by combining gene knockdown studies with network models of protein interaction pathways in an analytic framework. Our method uses network topology to model how gene silencing can disrupt PPIs, relating the unknown essentialities of individual PPIs to experimentally observed protein essentialities. This model is then deconvolved to recover the unknown essentialities of individual PPIs. We demonstrate the validity of our approach via prediction of sensitivities to compounds based on PPI essentiality and differences in essentiality based on genetic mutations. We further show that lung cancer patients have improved overall survival when specific PPIs are no longer present, suggesting that these PPIs may be potentially new targets for therapeutic development. Software is freely available at https://github.com/cooperlab/MEDICI. Datasets are available at https://ctd2.nci.nih.gov/dataPortal.

    Correlating interaction essentialities with drug sensitivity measures provides insights into mechanisms of action.

    We correlated drug sensitivity measures from CCLE with interaction essentiality scores to identify critical interactions that predict therapeutic sensitivity. Sensitivity to the MAPK inhibitor AZD6244 is highly correlated with PRKDC-TP53 interaction essentiality, which is consistent with the well-established role of p38-MAPK in cell cycle arrest in response to DNA damage [51–53].
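
The correlation step described in this caption can be sketched as follows. The caption does not name the correlation measure, so Pearson correlation is an assumption here; the function names, the PPI labels and every number are hypothetical and are not CCLE or MEDICI data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def rank_interactions(sensitivity, essentiality_by_ppi):
    """Sort PPIs by absolute correlation with a drug-sensitivity vector."""
    scored = {
        ppi: pearson(sensitivity, values)
        for ppi, values in essentiality_by_ppi.items()
    }
    return sorted(scored.items(), key=lambda item: abs(item[1]), reverse=True)

# Synthetic values for five hypothetical cell lines.
sensitivity = [0.1, 0.4, 0.5, 0.8, 0.9]
essentiality_by_ppi = {
    "PPI_1": [0.2, 0.5, 0.6, 0.9, 1.0],  # tracks sensitivity closely
    "PPI_2": [0.7, 0.1, 0.9, 0.3, 0.5],  # weakly related to sensitivity
}
ranked = rank_interactions(sensitivity, essentiality_by_ppi)
```

The first entry of `ranked` is the interaction whose essentiality is most strongly associated with the drug response across cell lines. A rank-based measure such as Spearman correlation could be substituted without changing the structure.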

    PPI Essentiality association with patient survival.

    (A) QQ plot of observed vs. expected log-rank p-values for LUAD patients split based on 5798 PPIs. (B) Network of PPIs with significant log-rank p-value for discriminating survival for TCGA LUAD patients that is centered on JAK1. (C) Kaplan-Meier curve of TCGA LUAD patients separated based on the presence or absence of the JAK1-PIK3R1 PPI. Patients without the JAK1-PIK3R1 PPI have improved survival compared to patients who retain this PPI.
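
Panel C rests on the Kaplan-Meier (product-limit) estimator, applied separately to each patient group. The sketch below shows the estimator only; the follow-up times and event flags are synthetic, the function name is hypothetical, and the log-rank test used in panels A and B is not implemented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Synthetic follow-up for four hypothetical patients (second one censored).
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

To compare two groups as in the figure, the function would be called once for patients with the interaction and once for patients without it, and the two step curves plotted together.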

    PPI networks associated with genetic mutations.

    (A) Networks of PPIs most increased in essentiality in cells with mutation or loss of the PTEN tumor suppressor gene. The 14 most significant PPIs are shown. Significant differences in PPI essentiality were computed in GenePattern [54]. Networks were visualized with Cytoscape [55]. (B) Networks of PPIs most increased in essentiality in cell lines with mutation or loss of the APC tumor suppressor gene. The 20 most significant PPIs are shown.

    Details of the computational framework of MEDICI.

    Curated pathway descriptions are integrated with novel interactions discovered by PPI screening to generate an interaction superpathway. Gene essentiality measurements are layered onto the nodes of the superpathway, and the network topology is transformed to the dual graph where the genes become network edges and the gene-interactions become network nodes. Gene essentialities are then diffused over their interactions to infer interaction essentiality weights.
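
The dual-graph idea in this caption can be sketched on a toy network. In the sketch, interactions become nodes and a shared gene becomes the edge joining two interactions. The "diffusion" shown is a deliberately simple stand-in (each gene's essentiality is split equally among its interactions), which is an assumption for illustration and not the MEDICI model or its deconvolution step; gene labels A to D and all values are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def dual_graph(interactions):
    """Build the dual (line) graph of a PPI network.

    interactions -- list of (gene_a, gene_b) tuples
    Returns (by_gene, dual_edges): the interactions each gene takes part in,
    and edges (ppi_1, ppi_2, shared_gene) between interactions sharing a gene.
    """
    by_gene = defaultdict(list)
    for ppi in interactions:
        for gene in set(ppi):
            by_gene[gene].append(ppi)
    dual_edges = [
        (first, second, gene)
        for gene, ppis in by_gene.items()
        for first, second in combinations(ppis, 2)
    ]
    return by_gene, dual_edges

def spread_essentiality(interactions, gene_essentiality):
    """Split each gene's essentiality equally over its interactions (assumed rule)."""
    by_gene, _ = dual_graph(interactions)
    weights = {ppi: 0.0 for ppi in interactions}
    for gene, ppis in by_gene.items():
        share = gene_essentiality.get(gene, 0.0) / len(ppis)
        for ppi in ppis:
            weights[ppi] += share
    return weights

# Toy network with hypothetical genes and essentiality scores.
interactions = [("A", "B"), ("A", "C"), ("C", "D")]
gene_essentiality = {"A": 1.0, "B": 0.5, "C": 0.4, "D": 0.0}
by_gene, dual_edges = dual_graph(interactions)
weights = spread_essentiality(interactions, gene_essentiality)
```

With this equal-split rule the interaction weights sum to the total gene essentiality, which makes the toy easy to check by hand. The published method instead relates unknown interaction essentialities to observed gene essentialities and solves for them.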

    Clustering of most essential PPIs in the superpathway.

    (A) Unsupervised Hierarchical Clustering of the 360 most essential PPIs across the 165 cell lines identifies 12 major clusters. The 360 PPIs with an average essentiality score > 0.5 were used to cluster 165 cell lines used in the Achilles shRNA screening study using Cluster and Java Treeview software [29]. PPI essentiality data was median centered and clustered by average correlation. Red indicates higher essentiality and blue indicates lower essentiality. Major hubs for each cluster are indicated on the right. (B) Clustering of 5798 PPI-MPER values across 165 cell lines. Red indicates PPI essentiality is greater than the max protein essentiality, and blue indicates the PPI essentiality is less than the max protein essentiality.
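
The preprocessing and clustering in panel A can be sketched in a few lines. "Clustered by average correlation" is read here as average-linkage agglomerative clustering on a distance of 1 minus the Pearson correlation, which is an interpretation rather than something the caption states. The sketch does not reproduce the Cluster software, and the 4 x 4 matrix is synthetic.

```python
import math
from statistics import median

def median_center(rows):
    """Subtract each row's median from its values (rows are PPIs)."""
    return [[value - median(row) for value in row] for row in rows]

def correlation_distance(x, y):
    """1 - Pearson correlation; undefined (division by zero) for constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)

def average_linkage(vectors, n_clusters):
    """Naive agglomerative clustering that merges the closest pair of clusters."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                total = sum(
                    correlation_distance(vectors[a], vectors[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Synthetic essentiality matrix: rows are PPIs, columns are cell lines.
matrix = [
    [0.9, 0.8, 0.1, 0.2],
    [0.8, 0.9, 0.2, 0.1],
    [0.1, 0.2, 0.9, 0.8],
    [0.2, 0.1, 0.8, 0.9],
]
centered = median_center(matrix)
cell_lines = [list(column) for column in zip(*centered)]  # one vector per cell line
clusters = average_linkage(cell_lines, n_clusters=2)
```

The naive loop recomputes distances at every merge, so it suits only small examples; a matrix of 360 PPIs by 165 cell lines would normally be handled by a dedicated clustering library.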
