92 research outputs found

    Chapter 9: Options for Summarizing Medical Test Performance in the Absence of a “Gold Standard”

    The classical paradigm for evaluating test performance compares the results of an index test with a reference test. When the reference test does not mirror the “truth” adequately well (e.g. is an “imperfect” reference standard), the typical (“naïve”) estimates of sensitivity and specificity are biased. One has at least four options when performing a systematic review of test performance when the reference standard is “imperfect”: (a) to forgo the classical paradigm and assess the index test’s ability to predict patient relevant outcomes instead of test accuracy (i.e., treat the index test as a predictive instrument); (b) to assess whether the results of the two tests (index and reference) agree or disagree (i.e., treat them as two alternative measurement methods); (c) to calculate “naïve” estimates of the index test’s sensitivity and specificity from each study included in the review and discuss in which direction they are biased; (d) mathematically adjust the “naïve” estimates of sensitivity and specificity of the index test to account for the imperfect reference standard. We discuss these options and illustrate some of them through examples
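
As a rough illustration of why the naïve estimates are biased, the sketch below computes the sensitivity and specificity one would expect to observe for an index test when it is scored against an imperfect reference. It is not taken from the chapter: the function name `naive_accuracy` and all numeric inputs are hypothetical, and it assumes that index and reference test errors are conditionally independent given true disease status.

```python
def naive_accuracy(prevalence, se_index, sp_index, se_ref, sp_ref):
    """Expected naive (sensitivity, specificity) of an index test scored
    against an imperfect reference standard.

    Assumption (not from the source): the two tests err independently
    given the true disease status.
    """
    p = prevalence
    # Probability that both tests are positive / the reference is positive.
    both_pos = p * se_index * se_ref + (1 - p) * (1 - sp_index) * (1 - sp_ref)
    ref_pos = p * se_ref + (1 - p) * (1 - sp_ref)
    # Probability that both tests are negative / the reference is negative.
    both_neg = p * (1 - se_index) * (1 - se_ref) + (1 - p) * sp_index * sp_ref
    ref_neg = p * (1 - se_ref) + (1 - p) * sp_ref
    return both_pos / ref_pos, both_neg / ref_neg


if __name__ == "__main__":
    # Hypothetical inputs: true index accuracy 0.90/0.90, reference 0.80/0.95.
    naive_se, naive_sp = naive_accuracy(0.3, 0.9, 0.9, 0.8, 0.95)
    print(f"naive sensitivity = {naive_se:.3f}, naive specificity = {naive_sp:.3f}")
```

With a perfect reference (`se_ref = sp_ref = 1.0`) the function returns the true values. Under the independence assumption and these inputs, both naïve estimates fall below the true 0.90; correlated errors between the two tests could push the bias in the other direction.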

    Family-Based versus Unrelated Case-Control Designs for Genetic Associations

    The most simple and commonly used approach for genetic associations is the case-control study design of unrelated people. This design is susceptible to population stratification. This problem is obviated in family-based studies, but it is usually difficult to accumulate large enough samples of well-characterized families. We addressed empirically whether the two designs give similar estimates of association in 93 investigations where both unrelated case-control and family-based designs had been employed. Estimated odds ratios differed beyond chance between the two designs in only four instances (4%). The summary relative odds ratio (ROR) (the ratio of odds ratios obtained from unrelated case-control and family-based studies) was close to unity (0.96 [95% confidence interval, 0.91–1.01]). There was no heterogeneity in the ROR across studies (amount of heterogeneity beyond chance I² = 0%). Differences on whether results were nominally statistically significant (p < 0.05) or not with the two designs were common (opposite classification rates 14% and 17%); this reflected largely differences in power. Conclusions were largely similar in diverse subgroup analyses. Unrelated case-control and family-based designs give overall similar estimates of association. We cannot rule out rare large biases or common small biases
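
The relative odds ratio described above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the helper names are hypothetical, standard errors are back-calculated from 95% confidence intervals, the two designs are assumed to give independent estimates, and the pooling shown is a simple fixed-effect inverse-variance summary (the abstract does not state which model was used).

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a 95% interval


def se_from_ci(lower, upper):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def relative_odds_ratio(or_cc, ci_cc, or_fb, ci_fb):
    """Log ROR (case-control OR over family-based OR) and its standard error.

    Assumes the two estimates are independent (an assumption of this sketch).
    """
    log_ror = math.log(or_cc) - math.log(or_fb)
    se = math.sqrt(se_from_ci(*ci_cc) ** 2 + se_from_ci(*ci_fb) ** 2)
    return log_ror, se


def pool_fixed(log_rors, ses):
    """Fixed-effect summary ROR, its 95% CI, and the I^2 statistic."""
    weights = [1 / s ** 2 for s in ses]
    pooled = sum(w * y for w, y in zip(weights, log_rors)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    # Cochran's Q and I^2 = max(0, (Q - df) / Q).
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_rors))
    df = len(log_rors) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    ci = (math.exp(pooled - Z95 * pooled_se), math.exp(pooled + Z95 * pooled_se))
    return math.exp(pooled), ci, i2
```

An ROR of 1 means the two designs agree on the strength of association; values above 1 mean the unrelated case-control design gave the larger odds ratio.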

    Electrocardiogram-gated single-photon emission computed tomography versus cardiac magnetic resonance imaging for the assessment of left ventricular volumes and ejection fraction: A meta-analysis

    Objectives: The purpose of this study was to evaluate the accuracy of electrocardiogram (ECG)-gated single-photon emission computed tomography (SPECT) for assessment of left ventricular (LV) end-diastolic volume (EDV), end-systolic volume (ESV) and ejection fraction (EF) compared with the gold standard of cardiac magnetic resonance imaging (MRI). Background: Several comparisons of ECG-gated SPECT with cardiac MRI have been performed for evaluation of LV volumes and EF, but each has considered few subjects, thus leaving uncertainty about the frequency of discrepancies between the two methods. Methods: We performed a meta-analysis of data on 164 subjects from nine studies comparing ECG-gated SPECT versus cardiac MRI. Data were pooled in correlation and regression analyses relating ECG-gated SPECT and cardiac MRI measurements. The frequency of discrepancies of at least 30 ml in EDV, 20 ml in ESV and 5% or 10% in EF and concordance for EF ≤40% versus >40% were determined. Results: There was an overall excellent correlation between ECG-gated SPECT and cardiac MRI for EDV (r = 0.89), ESV (r = 0.92) and EF (r = 0.87). However, rates of discrepancies for individual subjects were considerable (37% [95% confidence interval {CI}, 26% to 50%] for at least 30 ml in EDV; 35% [95% CI, 23% to 49%] for at least 20 ml in ESV; 52% [95% CI, 37% to 63%] for at least 5% in EF; and 23% [95% CI, 11% to 42%] for at least 10% in EF). The misclassification rate for the 40% EF cutoff was 11%. Conclusions: Electrocardiogram-gated SPECT measurements of EDV, ESV and EF show high correlation with cardiac MRI measurements, but substantial errors may occur in individual patients. Electrocardiogram-gated SPECT offers useful functional information, but cardiac MRI should be used when accurate measurement is required
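
The contrast drawn above, high correlation alongside frequent individual discrepancies, can be illustrated with a small sketch. The paired EF values below are invented for illustration and are not data from the meta-analysis; the function names are likewise hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def discrepancy_rate(x, y, threshold):
    """Share of subjects whose paired measurements differ by >= threshold."""
    return sum(abs(a - b) >= threshold for a, b in zip(x, y)) / len(x)


def cutoff_misclassification(x, y, cutoff):
    """Share of subjects placed on opposite sides of the cutoff (<= vs >)."""
    return sum((a <= cutoff) != (b <= cutoff) for a, b in zip(x, y)) / len(x)


if __name__ == "__main__":
    # Hypothetical EF values (%), one pair per subject.
    mri_ef = [20, 30, 40, 50, 60, 70]
    spect_ef = [27, 24, 47, 44, 67, 64]
    print("r =", round(pearson_r(mri_ef, spect_ef), 2))
    print("share with >= 5 points difference:", discrepancy_rate(mri_ef, spect_ef, 5))
    print("share misclassified at 40% cutoff:", cutoff_misclassification(mri_ef, spect_ef, 40))
```

Correlation measures how well the two methods rank subjects along a line, not how closely individual readings agree, which is why both quantities are worth reporting.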

    Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews

    Medical systematic reviews play a vital role in healthcare decision making and policy. However, their production is time-consuming, limiting the availability of high-quality and up-to-date evidence summaries. Recent advancements in large language models (LLMs) offer the potential to automatically generate literature reviews on demand, addressing this issue. However, LLMs sometimes generate inaccurate (and potentially misleading) texts by hallucination or omission. In healthcare, this can make LLMs unusable at best and dangerous at worst. We conducted 16 interviews with international systematic review experts to characterize the perceived utility and risks of LLMs in the specific context of medical evidence reviews. Experts indicated that LLMs can assist in the writing process by drafting summaries, generating templates, distilling information, and crosschecking information. They also raised concerns regarding confidently composed but inaccurate LLM outputs and other potential downstream harms, including decreased accountability and proliferation of low-quality reviews. Informed by this qualitative analysis, we identify criteria for rigorous evaluation of biomedical LLMs aligned with domain expert views. Comment: 18 pages, 2 figures, 8 tables. Accepted as an EMNLP 2023 main paper

    Closing the Gap between Methodologists and End-Users: R as a Computational Back-End

    The R environment provides a natural platform for developing new statistical methods due to the mathematical expressiveness of the language, the large number of existing libraries, and the active developer community. One drawback to R, however, is the learning curve; programming is a deterrent to non-technical users, who typically prefer graphical user interfaces (GUIs) to command line environments. Thus, while statisticians develop new methods in R, practitioners are often behind in terms of the statistical techniques they use as they rely on GUI applications. Meta-analysis is an instructive example; cutting-edge meta-analysis methods are often ignored by the overwhelming majority of practitioners, in part because they have no easy way of applying them. This paper proposes a strategy to close the gap between the statistical state-of-the-science and what is applied in practice. We present open-source meta-analysis software that uses R as the underlying statistical engine, and Python for the GUI. We present a framework that allows methodologists to implement new methods in R that are then automatically integrated into the GUI for use by end-users, so long as the programmer conforms to our interface. Such an approach allows an intuitive interface for non-technical users while leveraging the latest advanced statistical methods implemented by methodologists

    Chapter 10: Deciding Whether to Complement a Systematic Review of Medical Tests with Decision Modeling

    Limited by what is reported in the literature, most systematic reviews of medical tests focus on “test accuracy” (or better, test performance), rather than on the impact of testing on patient outcomes. The link between testing, test results and patient outcomes is typically complex: even when testing has high accuracy, there is no guarantee that physicians will act according to test results, that patients will follow their orders, or that the intervention will yield a beneficial endpoint. Therefore, test performance is typically not sufficient for assessing the usefulness of medical tests. Modeling (in the form of decision or economic analysis) is a natural framework for linking test performance data to clinical outcomes. We propose that (some) modeling should be considered to facilitate the interpretation of summary test performance measures by connecting testing and patient outcomes. We discuss a simple algorithm for helping systematic reviewers think through this possibility, and illustrate it by means of an example

    Local Literature Bias in Genetic Epidemiology: An Empirical Evaluation of the Chinese Literature

    BACKGROUND: Postulated epidemiological associations are subject to several biases. We evaluated whether the Chinese literature on human genome epidemiology may offer insights on the operation of selective reporting and language biases. METHODS AND FINDINGS: We targeted 13 gene-disease associations, each already assessed by meta-analyses, including at least 15 non-Chinese studies. We searched the Chinese Journal Full-Text Database for additional Chinese studies on the same topics. We identified 161 Chinese studies on 12 of these gene-disease associations; only 20 were PubMed-indexed (seven English full-text). Many studies (14–35 per topic) were available for six topics, covering diseases common in China. With one exception, the first Chinese study appeared with a time lag (2–21 y) after the first non-Chinese study on the topic. Chinese studies showed significantly more prominent genetic effects than non-Chinese studies, and 48% were statistically significant per se, despite their smaller sample size (median sample size 146 versus 268, p < 0.001). The largest genetic effects were often seen in PubMed-indexed Chinese studies (65% statistically significant per se). Non-Chinese studies of Asian-descent populations (27% significant per se) also tended to show somewhat more prominent genetic effects than studies of non-Asian descent (17% significant per se). CONCLUSION: Our data provide evidence for the interplay of selective reporting and language biases in human genome epidemiology. These biases may not be limited to the Chinese literature and point to the need for a global, transparent, comprehensive outlook in molecular population genetics and epidemiologic studies in general

    Recommendations for conduct, methodological practices, and reporting of cost-effectiveness analyses: second panel on cost-effectiveness in health and medicine

    Importance: Since publication of the report by the Panel on Cost-Effectiveness in Health and Medicine in 1996, researchers have advanced the methods of cost-effectiveness analysis, and policy makers have experimented with its application. The need to deliver health care efficiently and the importance of using analytic techniques to understand the clinical and economic consequences of strategies to improve health have increased in recent years. Objective: To review the state of the field and provide recommendations to improve the quality of cost-effectiveness analyses. The intended audiences include researchers, government policy makers, public health officials, health care administrators, payers, businesses, clinicians, patients, and consumers. Design: In 2012, the Second Panel on Cost-Effectiveness in Health and Medicine was formed and included 2 co-chairs, 13 members, and 3 additional members of a leadership group. These members were selected on the basis of their experience in the field to provide broad expertise in the design, conduct, and use of cost-effectiveness analyses. Over the next 3.5 years, the panel developed recommendations by consensus. These recommendations were then reviewed by invited external reviewers and through a public posting process. Findings: The concept of a “reference case” and a set of standard methodological practices that all cost-effectiveness analyses should follow to improve quality and comparability are recommended. All cost-effectiveness analyses should report 2 reference case analyses: one based on a health care sector perspective and another based on a societal perspective. The use of an “impact inventory,” which is a structured table that contains consequences (both inside and outside the formal health care sector), intended to clarify the scope and boundaries of the 2 reference case analyses is also recommended. This special communication reviews these recommendations and others concerning the estimation of the consequences of interventions, the valuation of health outcomes, and the reporting of cost-effectiveness analyses. Conclusions and Relevance: The Second Panel reviewed the current status of the field of cost-effectiveness analysis and developed a new set of recommendations. Major changes include the recommendation to perform analyses from 2 reference case perspectives and to provide an impact inventory to clarify included consequences