
    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for downloading annotation data and for blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed
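As an illustration of one of the baseline methods named in the abstract above, here is a minimal sketch of Okapi BM25 ranking. It is not the consortium's implementation: the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the three sample strings are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25.

    Assumptions: lowercase whitespace tokenization, commonly used default
    parameters, and the smoothed IDF log((N - n + 0.5) / (n + 0.5) + 1).
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical mini-corpus, for illustration only.
sample_docs = [
    "blood pressure trial outcomes",
    "document similarity benchmark for literature search",
    "kidney injury and blood pressure",
]
sample_scores = bm25_scores("literature search benchmark", sample_docs)
best_index = max(range(len(sample_docs)), key=lambda i: sample_scores[i])
```

In this toy corpus only the second string shares terms with the query, so it is the only one with a non-zero score. A real evaluation would rank candidate articles against a seed article's title or abstract rather than a short query.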

    Concordance between clinical outcomes in the Systolic Blood Pressure Intervention Trial and in the electronic health record

    BACKGROUND: Randomized trials are the gold standard for generating clinical practice evidence, but follow-up and outcome ascertainment are resource-intensive. Electronic health record (EHR) data from routine care can be a cost-effective means of follow-up, but concordance with trial-ascertained outcomes is less well-studied. METHODS: We linked EHR and trial data for participants of the Systolic Blood Pressure Intervention Trial (SPRINT), a randomized trial comparing intensive and standard blood pressure targets. Among participants with available EHR data concurrent with trial-ascertained outcomes, we calculated sensitivity, specificity, positive predictive value, and negative predictive value for EHR-recorded cardiovascular disease (CVD) events, using the gold standard of SPRINT-adjudicated outcomes (myocardial infarction (MI)/acute coronary syndrome (ACS), heart failure, stroke, and composite CVD events). We additionally compared the incidence of non-CVD adverse events (hyponatremia, hypernatremia, hypokalemia, hyperkalemia, bradycardia, and hypotension) in trial versus EHR data. RESULTS: 2468 SPRINT participants were included (mean age 68 (SD 9) years; 26% female). EHR data demonstrated ≥80% sensitivity and specificity, and ≥99% negative predictive value for MI/ACS, heart failure, stroke, and composite CVD events. Positive predictive value ranged from 26% (95% CI; 16%, 38%) for heart failure to 52% (95% CI; 37%, 67%) for MI/ACS. EHR data uniformly identified more non-CVD adverse events and higher incidence rates compared with trial ascertainment. CONCLUSIONS: These results support a role for EHR data collection in clinical trials, particularly for capturing laboratory-based adverse events. EHR data may be an efficient source for CVD outcome ascertainment, though there is clear benefit from adjudication to avoid false positives.
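The four concordance measures named in the abstract above all derive from a 2x2 table of EHR-recorded events against adjudicated events. The sketch below shows the arithmetic. The `Confusion` class and the counts are hypothetical, not the study's data. They are chosen only to show how a rare outcome can combine high sensitivity and specificity with a low positive predictive value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 table: test result (e.g. EHR record) versus gold standard."""
    tp: int  # test positive, gold-standard positive
    fp: int  # test positive, gold-standard negative
    fn: int  # test negative, gold-standard positive
    tn: int  # test negative, gold-standard negative


def diagnostic_metrics(c):
    """Return sensitivity, specificity, PPV, and NPV as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts for a rare event (50 true events among 1000 people).
example = Confusion(tp=40, fp=60, fn=10, tn=890)
metrics = diagnostic_metrics(example)
```

With these made-up counts, sensitivity is 0.80 and specificity about 0.94, yet PPV is only 0.40. True events are so rare that even a small false-positive rate yields more false positives (60) than true positives (40). This pattern matches the abstract's point that adjudication helps avoid false positives.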

    Effect of Intensive versus Standard BP Control on AKI and Subsequent Cardiovascular Outcomes and Mortality: Findings from the SPRINT EHR Study

    Background: Adjudication of inpatient AKI in the Systolic Blood Pressure Intervention Trial (SPRINT) was based on billing codes and admission and discharge notes. The purpose of this study was to evaluate the effect of intensive versus standard BP control on creatinine-based inpatient and outpatient AKI, and whether AKI was associated with cardiovascular disease (CVD) and mortality. Methods: We linked electronic health record (EHR) data from 47 clinic sites with trial data to enable creatinine-based adjudication of AKI. Cox regression was used to evaluate the effect of intensive BP control on the incidence of AKI, and the relationship between incident AKI and CVD and all-cause mortality. Results: A total of 3644 participants had linked EHR data. A greater number of inpatient AKI events were identified using EHR data (187 on intensive versus 155 on standard treatment) as compared with serious adverse event (SAE) adjudication in the trial (95 on intensive versus 61 on standard treatment). Intensive treatment increased risk for SPRINT-adjudicated inpatient AKI (HR, 1.51; 95% CI, 1.09 to 2.08) and for creatinine-based outpatient AKI (HR, 1.40; 95% CI, 1.15 to 1.70), but not for creatinine-based inpatient AKI (HR, 1.20; 95% CI, 0.97 to 1.48). Irrespective of the definition (SAE or creatinine based), AKI was associated with increased risk for all-cause mortality, but only creatinine-based inpatient AKI was associated with increased risk for CVD. Conclusions: Creatinine-based ascertainment of AKI, enabled by EHR data, may be more sensitive and less biased than traditional SAE adjudication. Identifying ways to prevent AKI may reduce mortality further in the setting of intensive BP control.
