8 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed

    Loss of the Transforming Growth Factor-β Effector β2-Spectrin Promotes Genomic Instability.

    © 2016 by the American Association for the Study of Liver Diseases. Exposure to genotoxins such as ethanol-derived acetaldehyde leads to DNA damage and liver injury and promotes the development of cancer. We report here a major role for the transforming growth factor β/mothers against decapentaplegic homolog 3 adaptor β2-Spectrin (β2SP, gene Sptbn1) in maintaining genomic stability following alcohol-induced DNA damage. β2SP supports DNA repair through β2SP-dependent activation of Fanconi anemia complementation group D2 (Fancd2), a core component of the Fanconi anemia complex. Loss of β2SP leads to decreased Fancd2 levels and sensitizes β2SP mutants to DNA damage by ethanol treatment, leading to phenotypes that closely resemble those observed in animals lacking both aldehyde dehydrogenase 2 and Fancd2 and resemble human fetal alcohol syndrome. Sptbn1-deficient cells are hypersensitive to DNA crosslinking agents and have defective DNA double-strand break repair that is rescued by ectopic Fancd2 expression. Moreover, Fancd2 transcription in response to DNA damage/transforming growth factor β stimulation is regulated by the β2SP/mothers against decapentaplegic homolog 3 complex. Conclusion: Dysfunctional transforming growth factor β/β2SP signaling impacts the processing of genotoxic metabolites by altering the Fanconi anemia DNA repair pathway. (Hepatology 2017;65:678-693)

    TGF-β/β2-spectrin/CTCF-regulated tumor suppression in human stem cell disorder Beckwith-Wiedemann syndrome.

    Beckwith-Wiedemann syndrome (BWS) is a human stem cell disorder, and individuals with this disease have a substantially increased risk (~800-fold) of developing tumors. Epigenetic silencing of β2-spectrin (β2SP, encoded by SPTBN1), a SMAD adaptor for TGF-β signaling, is causally associated with BWS; however, a role of TGF-β deficiency in BWS-associated neoplastic transformation is unexplored. Here, we have reported that double-heterozygous Sptbn1(+/–) Smad3(+/–) mice, which have defective TGF-β signaling, develop multiple tumors that are phenotypically similar to those of BWS patients. Moreover, tumorigenesis-associated genes IGF2 and telomerase reverse transcriptase (TERT) were overexpressed in fibroblasts from BWS patients and TGF-β–defective mice. We further determined that chromatin insulator CCCTC-binding factor (CTCF) is TGF-β inducible and facilitates TGF-β–mediated repression of TERT transcription via interactions with β2SP and SMAD3. This regulation was abrogated in TGF-β–defective mice and BWS, resulting in TERT overexpression. Imprinting of the IGF2/H19 locus and the CDKN1C/KCNQ1 locus on chromosome 11p15.5 is mediated by CTCF, and this regulation is lost in BWS, leading to aberrant overexpression of growth-promoting genes. Therefore, we propose that loss of CTCF-dependent imprinting of tumor-promoting genes, such as IGF2 and TERT, results from a defective TGF-β pathway and is responsible at least in part for BWS-associated tumorigenesis as well as sporadic human cancers that are frequently associated with SPTBN1 and SMAD3 mutations.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical science. © The Author(s) 2019. Published by Oxford University Press