
    Reproducibility of preclinical animal research improves with heterogeneity of study samples

    Single-laboratory studies conducted under highly standardized conditions are considered the gold standard in preclinical animal research. Using simulations based on 440 preclinical studies across 13 different interventions in animal models of stroke, myocardial infarction, and breast cancer, we compared the accuracy of effect size estimates between single-laboratory and multi-laboratory study designs. Single-laboratory studies generally failed to predict effect size accurately, and larger sample sizes rendered effect size estimates even less accurate. By contrast, multi-laboratory designs including as few as 2 to 4 laboratories increased coverage probability by up to 42 percentage points without a need for larger sample sizes. These findings demonstrate that within-study standardization is a major cause of poor reproducibility. More representative study samples are required to improve the external validity and reproducibility of preclinical animal research and to prevent wasting animals and resources on inconclusive research.
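    The coverage gap described here can be illustrated with a toy simulation. The sketch below is not the authors' simulation pipeline; the true effect, between-laboratory variance, and sample sizes are illustrative assumptions. It shows why a nominal 95% confidence interval from one standardized laboratory undercovers the true effect once laboratories differ systematically, while spreading the same number of animals across several laboratories restores much of the coverage.

```python
import numpy as np

rng = np.random.default_rng(0)

def coverage(n_labs, n_per_lab, mu=1.0, tau=0.5, sigma=1.0, n_sim=10_000):
    """Fraction of nominal 95% CIs that contain the true effect mu.

    Each laboratory adds a shared offset ~ N(0, tau^2) on top of
    animal-level noise ~ N(0, sigma^2); n_labs=1 is a single-lab design.
    All parameter values are illustrative assumptions, not study values.
    """
    hits = 0
    for _ in range(n_sim):
        lab_offsets = rng.normal(0.0, tau, size=n_labs)
        # every animal inherits its lab's offset plus individual noise
        data = (mu + np.repeat(lab_offsets, n_per_lab)
                + rng.normal(0.0, sigma, size=n_labs * n_per_lab))
        se = data.std(ddof=1) / np.sqrt(data.size)
        hits += abs(data.mean() - mu) <= 1.96 * se
    return hits / n_sim

print(coverage(n_labs=1, n_per_lab=12))  # single lab: far below 0.95
print(coverage(n_labs=4, n_per_lab=3))   # same total n over 4 labs: near 0.95
```

    With a single laboratory, the confidence interval is centered on that lab's idiosyncratic effect rather than the population effect, so collecting more animals only makes the interval narrower around the wrong target, matching the finding that larger sample sizes made estimates less accurate.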

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents covering a variety of research fields, against which newly developed literature search techniques could be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium, consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performance. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for downloading annotation data and for blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new, powerful techniques for title- and title/abstract-based search engines for relevant articles in biomedical research.
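    Of the three baselines named above, TF-IDF is the simplest to demonstrate. The sketch below is a generic bag-of-words ranking, not the consortium's evaluation code; the seed text and candidate abstracts are made-up examples. It ranks candidates by cosine similarity to a seed article, mirroring the seed-to-candidate setup the benchmark evaluates.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# hypothetical seed article and candidate abstracts (illustrative only)
seed = "expert-curated benchmark for document similarity in biomedical literature search"
candidates = [
    "relevance annotations of PubMed articles by original authors",
    "deep learning predicts protein folding from sequence",
    "gold-standard benchmarks for evaluating literature recommendation",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform([seed] + candidates)

# cosine similarity of each candidate (rows 1..n) to the seed (row 0)
scores = cosine_similarity(tfidf[0], tfidf[1:]).ravel()
for score, text in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:.3f}  {text}")
```

    Because BM25 and PubMed Related Articles weight terms differently, each baseline surfaces a somewhat different ranked list for the same seed, which is why the abstract suggests a hybrid of the methods to capture all relevant articles.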

    Completeness of reporting and risks of overstating impact in cluster randomised trials: a systematic review

    Acknowledgments: We received no funding specifically for this systematic review. ELT is funded in part by awards R01-AI141444 from the National Institute of Allergy and Infectious Diseases and R01-MH120649 from the US National Institute of Mental Health; both Institutes are part of the National Institutes of Health (NIH). JAG and ACP's support of this project was made possible (in part) by grant number UL1TR002553 from the National Center for Advancing Translational Sciences of the NIH, and the NIH Roadmap for Medical Research. JEM is supported by an Australian National Health and Medical Research Council Career Development Fellowship (APP1143429). SN was supported by an award that is jointly funded by the UK Medical Research Council (MRC) and the UK Department for International Development (DFID) under the MRC/DFID Concordat agreement, and part of the EDCTP2 programme supported by the European Union (grant reference MR/R010161/1). ABF acknowledges funding support from the National Health and Medical Research Council of Australia (grant ID 1183303). KH is funded by a National Institute for Health Research Senior Research Fellowship (SRF-2017-10-002). The contents of the research included in this manuscript are solely the responsibility of the authors and do not necessarily represent the official views of any of the funders. The research contributed by all authors of this manuscript is independent of their funders. Specifically, the funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. We wish to thank the three reviewers for their insightful comments and constructive feedback.

    Results of Applying Probabilistic IR to OCR Text

    Character accuracy of optically recognized text is considered a basic measure for evaluating OCR devices. In a broader sense, another fundamental measure of an OCR device's quality is whether the text it generates is usable for retrieving information. In this study, we evaluate retrieval effectiveness on OCR text databases using a probabilistic IR system and compare the results to retrieval from the manually corrected equivalents. We show there is no statistically significant difference in precision and recall across graded accuracy levels from three OCR devices. However, characteristics of the OCR data have side effects that can cause unstable results with this IR model; in particular, we found that individual queries can be greatly affected. Knowing these qualities of OCR text, we compensate for them by applying an automatic post-processing system that improves retrieval effectiveness.
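    The study's core comparison, retrieval effectiveness on OCR output versus manually corrected text, reduces to computing precision and recall for the same query over two rankings. A minimal sketch follows; the document IDs and rankings are hypothetical stand-ins, not the paper's data or IR system.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k=10):
    """Precision and recall of the top-k retrieved documents."""
    top_k = set(ranked_ids[:k])
    hits = len(top_k & relevant_ids)
    return hits / k, hits / len(relevant_ids)

# hypothetical rankings of one query against OCR vs corrected text
relevant = {"d3", "d7", "d9", "d12"}
ocr_run = ["d3", "d5", "d7", "d1", "d9", "d2", "d4", "d8", "d6", "d10"]
clean_run = ["d3", "d7", "d9", "d12", "d5", "d1", "d2", "d4", "d8", "d6"]

for name, run in [("OCR", ocr_run), ("corrected", clean_run)]:
    p, r = precision_recall_at_k(run, relevant, k=10)
    print(f"{name:>9}: precision@10={p:.2f} recall@10={r:.2f}")
```

    Averaged over many queries, such per-query scores can agree between the two text versions even when individual queries diverge sharply, which is the instability the abstract notes and the motivation for its post-processing step.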