7 research outputs found

    Patient Health Questionnaire-9 scores do not accurately estimate depression prevalence: individual participant data meta-analysis.

    OBJECTIVES: Depression symptom questionnaires are not for diagnostic classification. Patient Health Questionnaire-9 (PHQ-9) scores ≥10 are nonetheless often used to estimate depression prevalence. We compared PHQ-9 ≥10 prevalence to Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders (SCID) major depression prevalence and assessed whether an alternative PHQ-9 cutoff could more accurately estimate prevalence. STUDY DESIGN AND SETTING: Individual participant data meta-analysis of datasets comparing PHQ-9 scores to SCID major depression status. RESULTS: A total of 9,242 participants (1,389 SCID major depression cases) from 44 primary studies were included. Pooled PHQ-9 ≥10 prevalence was 24.6% (95% confidence interval [CI]: 20.8%, 28.9%); pooled SCID major depression prevalence was 12.1% (95% CI: 9.6%, 15.2%); and pooled difference was 11.9% (95% CI: 9.3%, 14.6%). The mean study-level PHQ-9 ≥10 to SCID-based prevalence ratio was 2.5 times. PHQ-9 ≥14 and the PHQ-9 diagnostic algorithm provided prevalence closest to SCID major depression prevalence, but study-level prevalence differed from SCID-based prevalence by an average absolute difference of 4.8% for PHQ-9 ≥14 (95% prediction interval: -13.6%, 14.5%) and 5.6% for the PHQ-9 diagnostic algorithm (95% prediction interval: -16.4%, 15.0%). CONCLUSION: PHQ-9 ≥10 substantially overestimates depression prevalence. There is too much heterogeneity to correct statistically in individual studies

    Accuracy of Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression: individual participant data meta-analysis

    Objective: To determine the accuracy of the Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression. Design: Individual participant data meta-analysis. Data sources: Medline, Medline In-Process and Other Non-Indexed Citations, PsycINFO, and Web of Science (January 2000-February 2015). Inclusion criteria: Eligible studies compared PHQ-9 scores with major depression diagnoses from validated diagnostic interviews. Primary study data and study level data extracted from primary reports were synthesized. For PHQ-9 cut-off scores 5-15, bivariate random effects meta-analysis was used to estimate pooled sensitivity and specificity, separately, among studies that used semistructured diagnostic interviews, which are designed for administration by clinicians; fully structured interviews, which are designed for lay administration; and the Mini International Neuropsychiatric (MINI) diagnostic interviews, a brief fully structured interview. Sensitivity and specificity were examined among participant subgroups and, separately, using meta-regression, considering all subgroup variables in a single model. Results: Data were obtained for 58 of 72 eligible studies (total n=17 357; major depression cases n=2312). Combined sensitivity and specificity was maximized at a cut-off score of 10 or above among studies using a semistructured interview (29 studies, 6725 participants; sensitivity 0.88, 95% confidence interval 0.83 to 0.92; specificity 0.85, 0.82 to 0.88). Across cut-off scores 5-15, sensitivity with semistructured interviews was 5-22% higher than for fully structured interviews (MINI excluded; 14 studies, 7680 participants) and 2-15% higher than for the MINI (15 studies, 2952 participants). Specificity was similar across diagnostic interviews. The PHQ-9 seems to be similarly sensitive but may be less specific for younger patients than for older patients; a cut-off score of 10 or above can be used regardless of age.
Conclusions: PHQ-9 sensitivity compared with semistructured diagnostic interviews was greater than in previous conventional meta-analyses that combined reference standards. A cut-off score of 10 or above maximized combined sensitivity and specificity overall and for subgroups. Registration: PROSPERO CRD42014010673
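
The criterion named in the abstract above, choosing the cut-off that maximizes combined sensitivity and specificity, can be illustrated on a single dataset. The following is a minimal sketch only: the function names and the toy scores are hypothetical, and it does not reproduce the bivariate random-effects pooling across studies that the authors used.

```python
def sens_spec(scores, is_case, cutoff):
    """Sensitivity and specificity of the rule `score >= cutoff`."""
    tp = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
    fn = sum(1 for s, c in zip(scores, is_case) if c and s < cutoff)
    tn = sum(1 for s, c in zip(scores, is_case) if not c and s < cutoff)
    fp = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, is_case, cutoffs=range(5, 16)):
    """Cut-off in 5..15 with the largest sensitivity + specificity.

    Ties are resolved in favour of the lowest cut-off, because max()
    returns the first maximal element it encounters.
    """
    return max(cutoffs, key=lambda k: sum(sens_spec(scores, is_case, k)))


# Hypothetical toy data (not taken from any study): questionnaire totals
# and reference-standard diagnoses for ten participants.
toy_scores = [2, 4, 6, 8, 9, 11, 12, 14, 16, 20]
toy_cases = [False, False, False, False, False, True, False, True, True, True]

if __name__ == "__main__":
    k = best_cutoff(toy_scores, toy_cases)
    print(k, sens_spec(toy_scores, toy_cases, k))
```

With real data, each primary study would contribute its own sensitivity and specificity at every cut-off, and those pairs would then be pooled with a model that accounts for between-study variation.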

    Erratum: Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: A systematic review and individual participant data meta-analysis (Psychological Medicine (2019) DOI: 10.1017/S0033291719001314)

    This article was published in Psychological Medicine with incorrect author information. MD Inagaki should be Masatoshi Inagaki and MD Stafford should be Lesley Stafford. This has since been updated as per the above

    Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9 : a systematic review and individual participant data meta-analysis

    BACKGROUND: Item 9 of the Patient Health Questionnaire-9 (PHQ-9) queries about thoughts of death and self-harm, but not suicidality. Although it is sometimes used to assess suicide risk, most positive responses are not associated with suicidality. The PHQ-8, which omits Item 9, is thus increasingly used in research. We assessed equivalency of total score correlations and the diagnostic accuracy to detect major depression of the PHQ-8 and PHQ-9. METHODS: We conducted an individual patient data meta-analysis. We fit bivariate random-effects models to assess diagnostic accuracy. RESULTS: 16 742 participants (2097 major depression cases) from 54 studies were included. The correlation between PHQ-8 and PHQ-9 scores was 0.996 (95% confidence interval 0.996 to 0.996). The standard cutoff score of 10 for the PHQ-9 maximized sensitivity + specificity for the PHQ-8 among studies that used a semi-structured diagnostic interview reference standard (N = 27). At cutoff 10, the PHQ-8 was less sensitive by 0.02 (-0.06 to 0.00) and more specific by 0.01 (0.00 to 0.01) among those studies (N = 27), with similar results for studies that used other types of interviews (N = 27). For all 54 primary studies combined, across all cutoffs, the PHQ-8 was less sensitive than the PHQ-9 by 0.00 to 0.05 (0.03 at cutoff 10), and specificity was within 0.01 for all cutoffs (0.00 to 0.01). CONCLUSIONS: PHQ-8 and PHQ-9 total scores were similar. Sensitivity may be minimally reduced with the PHQ-8, but specificity is similar

    The Accuracy of the Patient Health Questionnaire-9 (PHQ-9) Algorithm for Screening to Detect Major Depression : An Individual Participant Data Meta-analysis

    Background: Screening for major depression with the Patient Health Questionnaire-9 (PHQ-9) can be done using a cutoff or the PHQ-9 diagnostic algorithm. Many primary studies publish results for only one approach, and previous meta-analyses of the algorithm approach included only a subset of primary studies that collected data and could have published results. Objective: To use an individual participant data meta-analysis to evaluate the accuracy of two PHQ-9 diagnostic algorithms for detecting major depression and compare accuracy between the algorithms and the standard PHQ-9 cutoff score of ≥10. Methods: Medline, Medline In-Process and Other Non-Indexed Citations, PsycINFO, Web of Science (January 1, 2000, to February 7, 2015). Eligible studies that classified current major depression status using a validated diagnostic interview. Results: Data were included for 54 of 72 identified eligible studies (n participants = 16,688, n cases = 2,091). Among studies that used a semi-structured interview, pooled sensitivity and specificity (95% confidence interval) were 0.57 (0.49, 0.64) and 0.95 (0.94, 0.97) for the original algorithm and 0.61 (0.54, 0.68) and 0.95 (0.93, 0.96) for a modified algorithm. Algorithm sensitivity was 0.22-0.24 lower compared to fully structured interviews and 0.06-0.07 lower compared to the Mini International Neuropsychiatric Interview. Specificity was similar across reference standards. For PHQ-9 cutoff of ≥10 compared to semi-structured interviews, sensitivity and specificity (95% confidence interval) were 0.88 (0.82-0.92) and 0.86 (0.82-0.88). Conclusions: The cutoff score approach appears to be a better option than a PHQ-9 algorithm for detecting major depression

    External validation of a shortened screening tool using individual participant data meta-analysis: A case study of the Patient Health Questionnaire-Dep-4

    Shortened versions of self-reported questionnaires may be used to reduce respondent burden. When shortened screening tools are used, it is desirable to maintain equivalent diagnostic accuracy to full-length forms. This manuscript presents a case study that illustrates how external data and individual participant data meta-analysis can be used to assess the equivalence in diagnostic accuracy between a shortened and full-length form. This case study compares the Patient Health Questionnaire-9 (PHQ-9) and a 4-item shortened version (PHQ-Dep-4) that was previously developed using optimal test assembly methods. Using a large database of 75 primary studies (34,698 participants, 3,392 major depression cases), we evaluated whether the PHQ-Dep-4 cutoff of ≥ 4 maintained equivalent diagnostic accuracy to a PHQ-9 cutoff of ≥ 10. Using this external validation dataset, a PHQ-Dep-4 cutoff of ≥ 4 maximized the sum of sensitivity and specificity, with a sensitivity of 0.88 (95% CI 0.81, 0.93), 0.68 (95% CI 0.56, 0.78), and 0.80 (95% CI 0.73, 0.85) for the semi-structured, fully structured, and MINI reference standard categories, respectively, and a specificity of 0.79 (95% CI 0.74, 0.83), 0.85 (95% CI 0.78, 0.90), and 0.83 (95% CI 0.80, 0.86) for the semi-structured, fully structured, and MINI reference standard categories, respectively. While equivalence with a PHQ-9 cutoff of ≥ 10 was not established, we found the sensitivity of the PHQ-Dep-4 to be non-inferior to that of the PHQ-9, and the specificity of the PHQ-Dep-4 to be marginally smaller than the PHQ-9

    Probability of major depression classification based on the SCID, CIDI, and MINI diagnostic interviews: A synthesis of three individual participant data meta-analyses

    Introduction: Three previous individual participant data meta-analyses (IPDMAs) reported that, compared to the Structured Clinical Interview for the DSM (SCID), alternative reference standards, primarily the Composite International Diagnostic Interview (CIDI) and the Mini International Neuropsychiatric Interview (MINI), tended to misclassify major depression status, when controlling for depression symptom severity. However, there was an important lack of precision in the results. Objective: To compare the odds of the major depression classification based on the SCID, CIDI, and MINI. Methods: We included and standardized data from 3 IPDMA databases. For each IPDMA, separately, we fitted binomial generalized linear mixed models to compare the adjusted odds ratios (aORs) of major depression classification, controlling for symptom severity and characteristics of participants, and the interaction between interview and symptom severity. Next, we synthesized results using a DerSimonian-Laird random-effects meta-analysis. Results: In total, 69,405 participants (7,574 [11%] with major depression) from 212 studies were included. Controlling for symptom severity and participant characteristics, the MINI (74 studies; 25,749 participants) classified major depression more often than the SCID (108 studies; 21,953 participants; aOR 1.46; 95% confidence interval [CI] 1.11-1.92). Classification odds for the CIDI (30 studies; 21,703 participants) and the SCID did not differ overall (aOR 1.19; 95% CI 0.79-1.75); however, as screening scores increased, the aOR increased less for the CIDI than the SCID (interaction aOR 0.64; 95% CI 0.52-0.80). Conclusions: Compared to the SCID, the MINI classified major depression more often. The odds of the depression classification with the CIDI increased less as symptom levels increased. Interpretation of research that uses diagnostic interviews to classify depression should consider the interview characteristics.
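
The synthesis step named in the abstract above, a DerSimonian-Laird random-effects meta-analysis, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the function name is hypothetical, the inputs are assumed to be per-database effect estimates on the log odds ratio scale with their variances, and the example numbers are invented.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool effect estimates with the DerSimonian-Laird estimator.

    effects   -- per-study estimates (e.g. log odds ratios)
    variances -- their within-study variances
    Returns (pooled estimate, its standard error, tau^2).
    """
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances, for illustration only.
    log_or, se, tau2 = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
    low, high = log_or - 1.96 * se, log_or + 1.96 * se
    print(math.exp(log_or), math.exp(low), math.exp(high), tau2)
```

Pooling on the log scale and exponentiating afterwards keeps the confidence interval symmetric on the scale where the normal approximation is applied.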