11 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
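    For illustration, below is a minimal sketch of the Term Frequency-Inverse Document Frequency baseline mentioned in this abstract: candidate articles are ranked against a seed article by cosine similarity of their title/abstract text. It assumes scikit-learn is available; the toy strings and variable names are illustrative and do not reflect the RELISH data format or the consortium's actual evaluation pipeline.

```python
# Minimal sketch of a TF-IDF baseline for seed-based article recommendation.
# Assumption: scikit-learn is installed; the documents below are toy examples,
# not RELISH records.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

seed = "Benchmarking document similarity detection in biomedical literature search"
candidates = [
    "A gold-standard benchmark of relevant documents for literature search",
    "Break-apart FISH testing for the diagnosis of fibrolamellar carcinoma",
]

# Build one TF-IDF vocabulary over the seed plus all candidates (row 0 = seed).
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([seed] + candidates)

# Cosine similarity of each candidate to the seed; higher = more relevant.
scores = cosine_similarity(matrix[0], matrix[1:]).ravel()
for score, text in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:.3f}  {text}")
```

    Okapi Best Matching 25 scores the same kind of term overlap but with saturated term frequencies and document-length normalization, which is one plausible reason the three baselines retrieve overlapping yet distinct candidate sets.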

    Molecular testing for the clinical diagnosis of fibrolamellar carcinoma

    WOS: 000419766500013 | PubMed ID: 28862261
    Fibrolamellar carcinoma has a distinctive morphology and immunophenotype, including cytokeratin 7 and CD68 co-expression. Despite the distinct findings, accurate diagnosis of fibrolamellar carcinoma continues to be a challenge. Recently, fibrolamellar carcinomas were found to harbor a characteristic somatic gene fusion, DNAJB1-PRKACA. A break-apart fluorescence in situ hybridization (FISH) assay was designed to detect this fusion event and to examine its diagnostic performance in a large, multicenter, multinational study. Cases initially classified as fibrolamellar carcinoma based on histological features were reviewed from 124 patients. Upon central review, 104 of the 124 cases were classified histologically as typical of fibrolamellar carcinoma, 12 cases as 'possible fibrolamellar carcinoma' and 8 cases as 'unlikely to be fibrolamellar carcinoma'. PRKACA FISH was positive for rearrangement in 102 of 103 (99%) typical fibrolamellar carcinomas, 9 of 12 'possible fibrolamellar carcinomas' and 0 of 8 cases 'unlikely to be fibrolamellar carcinomas'. Within the morphologically typical group of fibrolamellar carcinomas, two tumors with unusual FISH patterns were also identified. Both cases had the fusion gene DNAJB1-PRKACA, but one also had amplification of the fusion gene and one had heterozygous deletion of the normal PRKACA locus. In addition, 88 conventional hepatocellular carcinomas were evaluated with PRKACA FISH and all were negative. These findings demonstrate that FISH for the PRKACA rearrangement is a clinically useful tool to confirm the diagnosis of fibrolamellar carcinoma, with high sensitivity and specificity. A diagnosis of fibrolamellar carcinoma is more accurate when based on morphology plus confirmatory testing than when based on morphology alone.
    Funding: NIH National Institute of Diabetes & Digestive & Kidney Diseases (NIDDK) [P30 DK026743]
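    As a quick arithmetic check of the sensitivity and specificity claim, the sketch below recomputes both figures from the counts reported in this abstract (102 of 103 typical fibrolamellar carcinomas FISH-positive; 0 of 88 conventional hepatocellular carcinomas FISH-positive). The variable names are illustrative.

```python
# Recompute sensitivity and specificity from the counts given in the abstract.
# Assumption: only the reported figures are used; names are illustrative.
fish_positive_typical_flc, typical_flc_tested = 102, 103
fish_positive_conventional_hcc, conventional_hcc_tested = 0, 88

sensitivity = fish_positive_typical_flc / typical_flc_tested
specificity = (conventional_hcc_tested - fish_positive_conventional_hcc) / conventional_hcc_tested

print(f"Sensitivity: {sensitivity:.1%}")  # 99.0%
print(f"Specificity: {specificity:.1%}")  # 100.0%
```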
