Mental distress detection and triage in forum posts: the LT3 CLPsych 2016 shared task system
This paper describes the contribution of LT3 to the CLPsych 2016 Shared Task on automatic triage of mental health forum posts. Our systems use multiclass Support Vector Machines (SVMs), cascaded binary SVMs, and ensembles with a rich feature set. The best systems obtain macro-averaged F-scores of 40% on the full task and 80% on the green versus alarming distinction. Multiclass SVMs with all features score best in terms of F-score, whereas feature filtering with bi-normal separation and classifier ensembling are found to improve recall of alarming posts.
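The bi-normal separation (BNS) filtering mentioned above can be sketched from its standard definition: the gap between the inverse-normal-CDF of a feature's true-positive rate and of its false-positive rate. This is a minimal illustration, not the LT3 code, and the counts below are made up.

```python
# Bi-normal separation (BNS) feature scoring: |F^-1(tpr) - F^-1(fpr)|,
# where F^-1 is the inverse standard normal CDF. Rates are clipped away
# from 0 and 1 to avoid infinite scores.
from statistics import NormalDist

def bns_score(tp, fp, pos, neg, eps=0.0005):
    """BNS score for a feature seen in tp of pos positive docs and fp of neg negative docs."""
    inv = NormalDist().inv_cdf
    tpr = min(max(tp / pos, eps), 1 - eps)
    fpr = min(max(fp / neg, eps), 1 - eps)
    return abs(inv(tpr) - inv(fpr))

# A term in 80 of 100 alarming posts but only 5 of 900 green posts scores
# high; a term equally frequent in both classes scores zero.
informative = bns_score(tp=80, fp=5, pos=100, neg=900)
uninformative = bns_score(tp=50, fp=450, pos=100, neg=900)
```

Ranking features by such a score and keeping the top-k is the filtering step; only the scoring function is shown here.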
Depression and Self-Harm Risk Assessment in Online Forums
Users suffering from mental health conditions often turn to online resources
for support, including specialized online support communities or general
communities such as Twitter and Reddit. In this work, we present a neural
framework for supporting and studying users in both types of communities. We
propose methods for identifying posts in support communities that may indicate
a risk of self-harm, and demonstrate that our approach outperforms strong
previously proposed methods for identifying such posts. Self-harm is closely
related to depression, which makes identifying depressed users on general
forums a crucial related task. We introduce a large-scale general forum dataset
("RSDD") consisting of users with self-reported depression diagnoses matched
with control users. We show how our method can be applied to effectively
identify depressed users from their use of language alone. We demonstrate that
our method outperforms strong baselines on this general forum dataset.
Comment: Expanded version of the EMNLP17 paper. Added sections 6.1, 6.2, 6.4, a FastText baseline, and CNN-
Predicting suicide risk from online postings in Reddit: the UGent-IDLab submission to the CLPsych 2019 Shared Task A
This paper describes IDLab's text classification systems submitted to Task A as part of the CLPsych 2019 shared task. The aim of this shared task was to develop automated systems that predict the degree of suicide risk of people based on their posts on Reddit. Bag-of-words features, emotion features and post-level predictions are used to derive user-level predictions. Linear models and ensembles of these models are used to predict final scores. We find that predicting fine-grained risk levels is much more difficult than flagging potentially at-risk users. Furthermore, we do not find clear added value from building richer ensembles compared to simple baselines, given the available training data and the nature of the prediction task.
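The post-to-user aggregation step described above can be sketched as follows. This is our illustration, not the UGent-IDLab code: each user's post-level risk probabilities are summarized into fixed-length features that a user-level linear model or a simple threshold can consume.

```python
# Derive user-level features from post-level classifier probabilities.
# The choice of summary statistics (mean, max, fraction above a cutoff)
# is an assumption for illustration, not the submission's exact recipe.
from statistics import mean

def user_features(post_scores):
    """Summarize one user's post-level risk probabilities."""
    return {
        "mean": mean(post_scores),
        "max": max(post_scores),
        "frac_high": sum(s > 0.5 for s in post_scores) / len(post_scores),
    }

# A user with one high-risk post among several low-risk ones:
feats = user_features([0.1, 0.2, 0.9, 0.4])
```

The "max" feature captures the intuition that a single severe post matters even when the average post is benign.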
Triaging Content Severity in Online Mental Health Forums
Mental health forums are online communities where people express their issues
and seek help from moderators and other users. In such forums, there are often
posts with severe content indicating that the user is in acute distress and
there is a risk of attempted self-harm. Moderators need to respond to these
severe posts in a timely manner to prevent potential self-harm. However, the
large volume of daily posted content makes it difficult for the moderators to
locate and respond to these critical posts. We present a framework for triaging
user content into four severity categories which are defined based on
indications of self-harm ideation. Our models are based on a feature-rich
classification framework which includes lexical, psycholinguistic, contextual
and topic modeling features. Our approaches improve the state of the art in
triaging the content severity in mental health forums by large margins (up to
17% improvement over the F-1 scores). Using the proposed model, we analyze the
mental state of users and we show that overall, long-term users of the forum
demonstrate a decreased severity of risk over time. Our analysis on the
interaction of the moderators with the users further indicates that without an
automatic way to identify critical content, it is indeed challenging for the
moderators to provide timely responses to the users in need.
Comment: Accepted for publication in the Journal of the Association for Information Science and Technology (2017).
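The feature-rich setup above amounts to concatenating several feature groups into one vector for a classifier. The extractors below are toy stand-ins (a token count and a crude first-person-pronoun rate), not the paper's actual lexical, psycholinguistic, contextual, or topic-model features.

```python
# Toy feature-group concatenation in the spirit of a feature-rich
# classification framework. Real systems would use LIWC-style lexicons,
# topic-model proportions, and contextual features instead.
def lexical(post):
    """A single lexical feature: token count."""
    return [len(post.split())]

def psycholing(post):
    """A crude psycholinguistic proxy: first-person pronoun rate."""
    toks = post.lower().split()
    return [sum(t in {"i", "me", "my"} for t in toks) / len(toks)]

def features(post):
    # Concatenate feature groups; contextual and topic features would
    # be appended here in the same way.
    return lexical(post) + psycholing(post)

v = features("I feel like no one hears me")
```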
Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
In human-level NLP tasks, such as predicting mental health, personality, or
demographics, the number of observations is often smaller than the standard
768+ hidden state sizes of each layer within modern transformer-based language
models, limiting the ability to effectively leverage transformers. Here, we
provide a systematic study on the role of dimension reduction methods
(principal components analysis, factorization techniques, or multi-layer
auto-encoders) as well as the dimensionality of embedding vectors and sample
sizes and their effect on predictive performance. We first find that fine-tuning
large models with a limited amount of data poses a significant difficulty which
can be overcome with a pre-trained dimension reduction regime. RoBERTa
consistently achieves top performance in human-level tasks, with PCA giving
benefit over other reduction methods in better handling users who write longer
texts. Finally, we observe that a majority of the tasks achieve results
comparable to the best performance with just a fraction of the embedding
dimensions.
Comment: 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT).
SMHD: a large-scale resource for exploring online language usage for multiple mental health conditions
Mental health is a significant and growing public health concern. As language usage can be leveraged to obtain crucial insights into mental health conditions, there is a need for large-scale, labeled, mental health-related datasets of users who have been diagnosed with one or more of such conditions. In this paper, we investigate the creation of high-precision patterns to identify self-reported diagnoses of nine different mental health conditions, and obtain high-quality labeled
data without the need for manual labelling. We introduce the SMHD (Self-reported Mental Health Diagnoses) dataset and make it available. SMHD is a novel large dataset of social media posts from users with one or multiple mental health conditions along with matched control users. We examine distinctions in users’ language, as measured by linguistic and psychological variables. We further explore text classification methods to identify individuals with mental conditions
through their language.
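One illustrative high-precision pattern of the kind described above, matching explicit self-reports of a diagnosis. The regex and the condition list are hypothetical examples for illustration, not SMHD's actual patterns.

```python
# Match first-person diagnosis statements like "I was diagnosed with X".
# Requiring the first-person subject is what keeps precision high:
# third-party mentions ("my friend was diagnosed...") do not match.
import re

DIAG = re.compile(
    r"\bI (?:was|have been|am) diagnosed with (depression|anxiety|ptsd)\b",
    re.IGNORECASE,
)

m1 = DIAG.search("Last year I was diagnosed with depression.")
m2 = DIAG.search("My friend was diagnosed with depression.")  # not a self-report
```

In practice such patterns would be paired with exclusion rules (negation, quotations, hypotheticals) before labeling a user.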