26 research outputs found

    Are false positives in suicide classification models a risk group? Evidence for “true alarms” in a population-representative longitudinal study of Norwegian adolescents

    Introduction: False positives in retrospective binary suicide attempt classification models are commonly attributed to sheer classification error. However, when machine learning suicide attempt classification models are trained with a multitude of psycho-socio-environmental factors and achieve high accuracy in suicide risk assessment, false positives may turn out to be at high risk of developing suicidal behavior or attempting suicide in the future. Thus, they may be better viewed as “true alarms,” relevant for a suicide prevention program. In this study, using a large population-based longitudinal dataset, we examine three hypotheses: (1) false positives, compared to true negatives, are at higher risk of suicide attempt in the future; (2) the suicide attempt risk for false positives increases as the specificity threshold increases; and (3) as specificity increases, the severity of risk factors becomes more similar between false positives and true positives.
    Methods: Utilizing the Gradient Boosting algorithm, we used a sample of 11,369 Norwegian adolescents, assessed at two time points (1992 and 1994), to classify suicide attempters at the first time point. We then assessed the relative risk of suicide attempt at the second time point for false positives in comparison to true negatives, and in relation to the level of specificity.
    Results: We found that false positives were at significantly higher risk of attempting suicide compared to true negatives. When selecting a higher classification risk threshold by gradually increasing the specificity cutoff from 60% to 97.5%, the relative suicide attempt risk of the false positive group increased, ranging from 2.96 to 7.22 times the risk of true negatives. As the risk threshold increased, the severity of various mental health indicators became significantly more comparable between false positives and true positives.
    Conclusion: We argue that the performance evaluation of machine learning suicide classification models should take clinical relevance into account, rather than focusing solely on classification error metrics. As shown here, the so-called false positives represent a truly at-risk group that should be included in suicide prevention programs. Hence, these findings should be taken into consideration when interpreting machine learning suicide classification models as well as when planning future suicide prevention interventions for adolescents.
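The comparison described in this abstract (risk of a later attempt among false positives versus true negatives, at a chosen specificity) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy scores, and the toy outcomes are all hypothetical, and the cutoff rule is only one simple way to reach a target specificity.

```python
import math

def cutoff_for_specificity(scores, attempt_t1, specificity):
    """Return a score cutoff such that at least `specificity` of the
    time-1 non-attempters score at or below it (and so are not flagged)."""
    negatives = sorted(s for s, y in zip(scores, attempt_t1) if y == 0)
    k = math.ceil(specificity * len(negatives))  # negatives that must stay unflagged
    return negatives[k - 1] if k > 0 else -math.inf

def relative_risk_fp_vs_tn(scores, attempt_t1, attempt_t2, specificity):
    """Risk of a time-2 attempt among false positives divided by the
    same risk among true negatives, at the given specificity."""
    cutoff = cutoff_for_specificity(scores, attempt_t1, specificity)
    fp, tn = [], []
    for s, y1, y2 in zip(scores, attempt_t1, attempt_t2):
        if y1 == 0:  # only time-1 non-attempters can be FP or TN
            (fp if s > cutoff else tn).append(y2)
    if not fp or not tn or sum(tn) == 0:
        raise ValueError("relative risk is undefined for this cutoff")
    return (sum(fp) / len(fp)) / (sum(tn) / len(tn))

# Hypothetical toy data: model scores, attempt at time 1, attempt at time 2.
scores     = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
attempt_t1 = [0,   0,   0,   0,   0,   0,   0,   0,   1,   1]
attempt_t2 = [0,   0,   0,   1,   0,   0,   1,   1,   1,   0]

for spec in (0.5, 0.75):
    rr = relative_risk_fp_vs_tn(scores, attempt_t1, attempt_t2, spec)
    print(f"specificity >= {spec:.2f}: relative risk FP vs TN = {rr:.2f}")
```

In the toy data, raising the specificity leaves fewer but higher-scoring false positives, so the relative risk goes up, which is the pattern the abstract reports on real data.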

    Predicting suicide attempts among Norwegian adolescents without using suicide-related items: a machine learning approach

    Introduction: Research on classification models of suicide attempts has predominantly depended on the collection of sensitive data related to suicide. Gathering this type of information at the population level can be challenging, especially when it pertains to adolescents. We addressed two main objectives: (1) assessing the feasibility of classifying adolescents at high risk of attempting suicide without relying on specific suicide-related survey items such as history of suicide attempts, suicide plan, or suicide ideation, and (2) identifying the most important predictors of suicide attempts among adolescents.
    Methods: Nationwide survey data from 173,664 Norwegian adolescents (ages 13–18) were utilized to train a binary classification model, using 169 questionnaire items. The Extreme Gradient Boosting (XGBoost) algorithm was fine-tuned to classify adolescent suicide attempts, and the most important predictors were identified.
    Results: XGBoost achieved a sensitivity of 77% at a specificity of 90%, with an AUC of 92.1% and an AUPRC of 47.1%. A coherent set of predictors in the domains of internalizing problems, substance use, interpersonal relationships, and victimization were pinpointed as the most important items related to recent suicide attempts.
    Conclusion: This study underscores the potential of machine learning for screening adolescent suicide attempts on a population scale without requiring sensitive suicide-related survey items. Future research investigating the etiology of suicidal behavior may direct particular attention to internalizing problems, interpersonal relationships, victimization, and substance use.
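For readers unfamiliar with the four metrics reported here, the sketch below computes each from scores and labels in plain Python. It is illustrative only: the data are hypothetical, the function names are not from the study, and average precision is just one common estimator of AUPRC (the abstract does not say which was used). It also assumes untied scores for the average-precision ranking.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a
    random negative (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Hypothetical toy data, not from the study.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0,   0,   0,   1,   0,   1,   1,   1]

sens, spec = sensitivity_specificity(scores, labels, cutoff=0.5)
print(sens, spec, roc_auc(scores, labels), average_precision(scores, labels))
```

With a rare outcome, AUPRC is typically far lower than AUC because its chance level equals the outcome prevalence rather than 0.5, which is worth keeping in mind when reading the two figures side by side.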

    Measurement invariance of assessments of depression (PHQ-9) and anxiety (GAD-7) across sex, strata and linguistic backgrounds in a European-wide sample of patients after Traumatic Brain Injury

    Background: The Patient Health Questionnaire-9 (PHQ-9) and the Generalized Anxiety Disorder (GAD-7) are two widely used instruments to screen patients for depression and anxiety. Comparable psychometric properties across different demographic and linguistic groups are necessary for multiple group comparison and international research on depression and anxiety.
    Objectives and Method: We examine measurement invariance for the PHQ-9 and GAD-7 by (a) the sex of the participants, (b) recruitment stratum, and (c) linguistic background. This study is based on non-randomized observational data collected in 18 countries six months after Traumatic Brain Injury (TBI). We used multiple methods to detect Differential Item Functioning (DIF), including Item Response Theory, logistic regression, and the Mantel-Haenszel method.
    Results: At six months post-injury, 2137 participants (738 [34.5%] women) completed the PHQ-9 and GAD-7 questionnaires: 885 [41.4%] patients were primarily admitted to the Intensive Care Unit (ICU), 805 [37.7%] were admitted to a hospital ward, and 447 [20.9%] were evaluated in the Emergency Room and discharged. Results supported the invariance of PHQ-9 and GAD-7 across sex, patient strata, and linguistic background. Three PHQ-9 items and one GAD-7 item (across strata) and only two GAD-7 items (across linguistic groups) were flagged as showing differences in two out of four DIF tests. However, the magnitude of the DIF effect was negligible.
    Limitation: Despite the high number of participants from the ICU, most patients had mild TBI.
    Conclusion: The findings demonstrate adequate psychometric properties for PHQ-9 and GAD-7, allowing direct multigroup comparison across sex, strata, and linguistic background.
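As a rough illustration of the Mantel-Haenszel method named above, the sketch below computes the common odds ratio for a single item across matched score strata. It assumes a dichotomized item (endorsed or not), which is a simplification for PHQ-9 and GAD-7 items scored 0 to 3; the tables, the function names, and the choice to report the ETS delta transform are assumptions for illustration, not details from the study.

```python
import math

def mantel_haenszel_odds_ratio(tables):
    """Common odds ratio across strata.

    Each table is (a, b, c, d) for one total-score stratum:
      a = reference group, item endorsed    b = reference group, not endorsed
      c = focal group, item endorsed        d = focal group, not endorsed
    """
    num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # empty strata carry no information
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("odds ratio is undefined for these tables")
    return num / den

def mh_delta(odds_ratio):
    """ETS delta scale: -2.35 * ln(odds ratio); 0 means no DIF."""
    return -2.35 * math.log(odds_ratio)

# Hypothetical strata, not data from the study.
no_dif_tables = [(10, 10, 10, 10), (5, 15, 5, 15)]
dif_tables    = [(10, 10, 10, 10), (20, 10, 10, 20)]

for name, tables in (("no DIF", no_dif_tables), ("DIF", dif_tables)):
    alpha = mantel_haenszel_odds_ratio(tables)
    print(name, round(alpha, 3), round(mh_delta(alpha), 3))
```

An odds ratio near 1 (delta near 0) means the two groups endorse the item at similar rates once matched on total score. One common convention treats an absolute delta below 1 as negligible; the abstract does not state which criterion the authors applied.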

    Rethinking literate programming in statistics

    Literate programming is becoming increasingly popular for data analysis because it allows the generation of dynamic-analysis reports for communicating data analysis and eliminates untraceable human errors in analysis reports. Traditionally, literate programming includes separate processes for compiling the code and preparing the documentation. While this workflow might be satisfactory for software documentation, it is not ideal for writing statistical analysis reports. Instead, these processes should run in parallel. In this article, I introduce the weaver package, which examines this idea by creating a new log system in HTML or LaTeX that can be used simultaneously with the Stata log system. The new log system provides many features that the Stata log system lacks; for example, it can render mathematical notations, insert figures, create publication-ready dynamic tables, and style text, and it includes a built-in syntax highlighter. The weaver package also produces dynamic PDF documents by converting the HTML log to PDF or by typesetting the LaTeX log and thus provides a real-time preview of the document without recompiling the code. I also discuss potential applications of the weaver package.

    Mental Health, Well-Being, and Extremism: A Machine Learning Study on Norwegian Adolescents

    a repository for the journal articl

    markdoc: Literate programming in Stata

    Rigorous documentation of the analysis plan, procedure, and computer codes enhances the comprehensibility and transparency of data analysis. Documentation is particularly critical when the codes and data are meant to be publicly shared and examined by the scientific community to evaluate the analysis or adapt the results. The popular approach for documenting computer codes is known as literate programming, which requires preparing a trilingual script file that includes a programming language for running the data analysis, a human language for documentation, and a markup language for typesetting the document. In this article, I introduce markdoc, a software package for interactive literate programming and generating dynamic-analysis documents in Stata. markdoc recognizes Markdown, LaTeX, and HTML markup languages and can export documents in several formats, such as PDF, Microsoft Office .docx, OpenOffice and LibreOffice .odt, LaTeX, HTML, ePub, and Markdown.

    Seamless interactive language interfacing between R and Stata

    In this article, I propose a new approach to language interfacing for statistical software by allowing automatic interprocess communication between R and Stata. I advocate interactive language interfacing in statistical software by automating data communication. I introduce the rcall package and provide examples of how the R language can be used interactively within Stata or embedded into Stata programs using the proposed approach to interfacing. Moreover, I discuss the pros and cons of object synchronization in language interfacing.