
    Lessons learned from conducting mental health intervention research in schools in the Global South: Our experiences in South Africa and Kenya

    Most of the world’s young people live in low- and middle-income countries (LMICs; Weine, Horvath Marques, Singh, & Pringle, 2020), and these young people experience heightened rates of known risk factors for developing mental disorders, such as poverty and exposure to trauma (Atwoli, Stein, Koenen, & McLaughlin, 2015). Access to professional psychological treatment is limited in LMICs due to structural barriers (e.g., a shortage of trained professionals) and cultural factors such as stigma and beliefs about mental health and illness. Schools, which are widely attended, may therefore be a good setting for delivering mental health interventions, and it is important to develop and evaluate feasible, acceptable, effective, and scalable interventions for use in this context. Yet fewer than 10% of clinical trials of psychotherapies have been conducted in LMICs (Venturo-Conerly, Eisenman, Wasil, Singla, & Weisz, 2022), and conducting research in schools poses particular challenges, as Moore et al. (2022) have highlighted in the UK context. Building on that commentary, our aim here is to share the lessons we have learned from conducting psychotherapy research in schools in Kenya and South Africa.

    Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

    We describe our early efforts to red team language models in order to simultaneously discover, measure, and attempt to reduce their potentially harmful outputs. We make three main contributions. First, we investigate scaling behaviors for red teaming across 3 model sizes (2.7B, 13B, and 52B parameters) and 4 model types: a plain language model (LM); an LM prompted to be helpful, honest, and harmless; an LM with rejection sampling; and a model trained to be helpful and harmless using reinforcement learning from human feedback (RLHF). We find that the RLHF models become increasingly difficult to red team as they scale, and we find a flat trend with scale for the other model types. Second, we release our dataset of 38,961 red team attacks for others to analyze and learn from. We provide our own analysis of the data and find a variety of harmful outputs, ranging from offensive language to more subtly harmful non-violent unethical outputs. Third, we exhaustively describe our instructions, processes, statistical methodologies, and uncertainty about red teaming. We hope that this transparency accelerates our ability to work together as a community to develop shared norms, practices, and technical standards for how to red team language models.
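The rejection-sampling model type mentioned above can be illustrated with a minimal sketch: draw several completions and keep the one a harmlessness scorer rates highest. The functions `generate` and `harmlessness_score` below are hypothetical stand-ins, not interfaces from the paper; a real implementation would call an actual LM and preference model.

```python
import random

def generate(prompt, k):
    # Hypothetical stand-in: a real LM would return k sampled completions.
    return [f"completion-{i} to: {prompt}" for i in range(k)]

def harmlessness_score(prompt, completion):
    # Hypothetical stand-in: a real preference model would score the pair.
    return random.random()

def rejection_sample(prompt, k=16):
    """Sample k completions and return the one scored most harmless."""
    candidates = generate(prompt, k)
    return max(candidates, key=lambda c: harmlessness_score(prompt, c))

best = rejection_sample("Tell me about chemistry.", k=4)
```

The design choice is simple: rejection sampling improves harmlessness at inference time without retraining, at the cost of k forward passes per response.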

    Language Models (Mostly) Know What They Know

    We study whether language models can evaluate the validity of their own claims and predict which questions they will be able to answer correctly. We first show that larger models are well-calibrated on diverse multiple choice and true/false questions when these are provided in the right format. Thus we can approach self-evaluation on open-ended sampling tasks by asking models to first propose answers, and then to evaluate the probability "P(True)" that their answers are correct. We find encouraging performance, calibration, and scaling for P(True) on a diverse array of tasks. Performance at self-evaluation further improves when we allow models to consider many of their own samples before predicting the validity of one specific possibility. Next, we investigate whether models can be trained to predict "P(IK)", the probability that "I know" the answer to a question, without reference to any particular proposed answer. Models perform well at predicting P(IK) and partially generalize across tasks, though they struggle with calibration of P(IK) on new tasks. The predicted P(IK) probabilities also increase appropriately in the presence of relevant source materials in the context, and in the presence of hints towards the solution of mathematical word problems. We hope these observations lay the groundwork for training more honest models, and for investigating how honesty generalizes to cases where models are trained on objectives other than the imitation of human writing.
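The P(True) self-evaluation procedure described above can be sketched as a two-step prompt: have the model propose an answer, then ask whether that answer is true and read off the normalized probability assigned to the "True" option. The `token_probs` interface below is a hypothetical placeholder, not the paper's code; a real implementation would read these probabilities from an LM's logits.

```python
def token_probs(prompt, candidates):
    # Hypothetical interface: returns a probability for each candidate
    # continuation. This placeholder is uniform; a real LM would supply
    # probabilities derived from its logits.
    uniform = 1.0 / len(candidates)
    return {tok: uniform for tok in candidates}

def p_true(question, proposed_answer):
    """Estimate P(True): the model's probability that its own answer is correct."""
    prompt = (
        f"Question: {question}\n"
        f"Proposed Answer: {proposed_answer}\n"
        "Is the proposed answer:\n"
        " (A) True\n"
        " (B) False\n"
        "The proposed answer is:"
    )
    probs = token_probs(prompt, [" (A)", " (B)"])
    # Normalize over the two options so the estimate is a proper probability.
    return probs[" (A)"] / (probs[" (A)"] + probs[" (B)"])

confidence = p_true("What is 2 + 2?", "4")
```

With a well-calibrated model, `confidence` should track the empirical accuracy of the proposed answers; the placeholder here always returns 0.5.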

    Training and Supervising Lay-providers in Low-income Settings: A Mixed-methods Study of Task-sharing from the Shamiri Randomized Controlled Trial

    Objective: Training lay-providers to deliver mental health interventions is both effective and cost-effective. However, more research is needed to document training and supervision procedures and to collect lay-providers’ feedback. We analyzed the acceptability of a 10-hour lay-provider training and supervision program delivered primarily by undergraduates. We also tested lay-provider fidelity and quality. Methods: This study documents training and supervision from an RCT of the Shamiri intervention, a 4-session, school-based intervention that significantly reduced symptoms of anxiety and depression in Kenyan adolescents. We delivered a 10-hour training to 13 lay-providers (mean age = 21.00 years, SD = 1.95; 61.54% female). We also hosted 30-minute supervision meetings twice weekly. Independent raters coded session recordings for fidelity and quality. We also collected quantitative and qualitative feedback from lay-providers. Results: Reliability and mean ratings for all six of our fidelity and quality measures (delivering required content, adhering to specified details, thoroughness, skillfulness, clarity, and purity) were very good to excellent. Lay-provider quantitative ratings of training were also overwhelmingly positive, with an overall satisfaction rating of 6.46/7.00. We identified central qualitative themes in lay-provider comments: comments about training style, content, and personal interactions were overwhelmingly positive, and many lay-providers reported personal growth, while comments about timing and location were mixed. Conclusions: This study provides preliminary evidence that a very brief training delivered primarily by undergraduates can teach high-school-graduate lay-providers to deliver effective mental health interventions. Additionally, we discuss lessons learned and implications for future research, including the importance of considering local context when planning and of continuously collecting and addressing lay-provider feedback.

    Shamiri Templeton Comparative Effectiveness Trial

    This project page is used to store data, preprints, code, protocols, and other publicly available project materials for the Templeton World Charity Foundation-funded 2021 five-group RCT of Shamiri and its component interventions.

    In their own words: Using open-ended assessment to identify culturally relevant concerns among Kenyan adolescents

    Standardized assessment tools developed in western contexts may systematically miss problems that are considered important in non-western cultures. In this mixed-methods study, we used an open-ended assessment tool (the Top Problem Assessment; TPA) to identify culturally relevant concerns among low-income Kenyan youth. We then a) applied thematic analysis to identify the most frequently reported problems and b) examined the extent to which these problems were reflected in standardized mental health measures. Using the TPA, we identified common social, academic, and economic problems facing Kenyan youths. Specifically, 61% of the sample reported a social problem, 38% an academic problem, and 35% an economic problem. By contrast, the standardized assessments revealed that worrying and difficulty concentrating were the most commonly reported symptoms. However, the emotional and behavioral problems assessed via the standardized measures were reported as top problems by only 17% of the sample. Overall, our findings are consistent with the idea that standardized measures can miss culturally salient concerns that open-ended assessments can capture. We discuss how brief open-ended assessment tools could complement standardized measures, inform the development of culturally relevant standardized measures, and offer rich data about the experiences of people in understudied cultural contexts.

    Depression and Anxiety Symptoms, Social Support, and Demographic Factors Among Kenyan High School Students

    Objectives: Depression and anxiety are leading causes of youth disability worldwide, yet our understanding of these conditions in Sub-Saharan African (SSA) youths is limited. Research in SSA has been sparse, and prevalence rates and correlates of these conditions remain scarcely investigated. To help address these gaps, this cross-sectional study assessed the prevalence of adolescent depression and anxiety symptoms in a community sample of high school students in Kenya. We also examined associations between those symptoms and psychosocial and sociodemographic factors. Methods: We administered self-report measures of depression and anxiety symptoms, social support, gratitude, growth mindsets, and life satisfaction to 658 students (51.37% female) aged 13–19. Results: Only the measures of depression (Patient Health Questionnaire-9), anxiety (Generalized Anxiety Disorder Screen-7), and social support (Multidimensional Scale of Perceived Social Support) showed adequate internal consistency (Cronbach alpha > 0.70) in the study sample. Findings with these measures showed high levels of depression symptoms (45.90% above clinical cutoff) and anxiety symptoms (37.99% above clinical cutoff) among Kenyan youths. Older adolescents reported higher depression and anxiety symptoms, as well as lower social support, than younger adolescents. Females reported more anxiety than males, and members of minority tribes reported more anxiety than members of majority tribes. Conclusions: This study highlights the high prevalence of adolescent internalizing symptoms in Kenyan high school students, identifies important correlates of these symptoms, and illustrates the need for culturally appropriate assessment tools.
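The internal-consistency criterion used above (Cronbach alpha > 0.70) can be computed directly from an items-by-respondents score matrix. A minimal sketch, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), with population variances throughout; the toy data are invented for illustration, not taken from the study.

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents' item-score lists.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores)),
    computed with population (biased) variances consistently.
    """
    k = len(scores[0])  # number of items


    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Toy data (invented): 4 respondents x 3 items with correlated responses,
# so alpha should exceed the 0.70 adequacy threshold.
data = [[3, 3, 4], [2, 2, 2], [4, 4, 5], [1, 2, 1]]
alpha = cronbach_alpha(data)
```

Scales whose items do not covary (respondents answering inconsistently across items) would yield a low or even negative alpha, which is why only three of the study's measures met the 0.70 threshold.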