
    A text classification framework for simple and effective early depression detection over social media streams

    With the rise of the Internet, there is a growing need to build intelligent systems that can efficiently handle early risk detection (ERD) problems on social media, such as early depression detection, early rumor detection, or identification of sexual predators. These systems, nowadays mostly based on machine learning techniques, must be able to deal with data streams, since users provide their data over time. In addition, these systems must be able to decide when the processed data is sufficient to actually classify users. Moreover, since ERD tasks involve risky decisions that could affect people's lives, such systems must also be able to justify their decisions. However, most standard and state-of-the-art supervised machine learning models (such as SVM, MNB, and neural networks) are not well suited to this scenario, because they either act as black boxes or do not support incremental classification/learning. In this paper we introduce SS3, a novel supervised learning model for text classification that naturally supports these aspects. SS3 was designed to be used as a general framework for ERD problems. We evaluated our model on CLEF's eRisk2017 pilot task on early depression detection. Most of the 30 contributions submitted to this competition used state-of-the-art methods. Experimental results show that our classifier was able to outperform these models and standard classifiers, despite being less computationally expensive and having the ability to explain its rationale.
    Affiliation: Burdisso, Sergio Gastón. Universidad Nacional de San Luis; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis; Argentina
    Affiliation: Errecalde, Marcelo Luis. Universidad Nacional de San Luis; Argentina
    Affiliation: Montes y Gómez, Manuel. Instituto Nacional de Astrofísica, Óptica y Electrónica; México
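
The streaming requirements described in this abstract (process posts as they arrive, decide when enough evidence has been seen, and be able to justify the decision) can be illustrated with a minimal sketch. This is not SS3 itself: the word weights, the threshold, and the function name `early_decision` are hypothetical, chosen only to show the general shape of an incremental, explainable early-risk decision.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical per-word evidence weights (an assumption for illustration);
# a real system would learn such values from labelled training data.
WORD_WEIGHTS: Dict[str, float] = {
    "hopeless": 1.0,
    "worthless": 1.0,
    "tired": 0.4,
    "alone": 0.5,
}


def early_decision(
    posts: Iterable[str],
    threshold: float = 2.0,
    weights: Dict[str, float] = WORD_WEIGHTS,
) -> Tuple[bool, Optional[int], Dict[str, float]]:
    """Read posts in chronological order and flag the user as soon as the
    accumulated evidence reaches the threshold.

    Returns (flagged, posts read when flagged, per-word contributions).
    The contributions dictionary serves as a simple justification.
    """
    score = 0.0
    contributions: Dict[str, float] = {}
    for index, post in enumerate(posts, start=1):
        # Naive whitespace tokenization; punctuation is not stripped.
        for word in post.lower().split():
            weight = weights.get(word, 0.0)
            if weight:
                score += weight
                contributions[word] = contributions.get(word, 0.0) + weight
        # Decide after each post, so a decision can be made before the stream ends.
        if score >= threshold:
            return True, index, contributions
    return False, None, contributions


stream = ["so tired today", "i feel alone and hopeless", "everything is worthless"]
flagged, posts_read, evidence = early_decision(stream)
```

Because the score is updated post by post, no retraining or reprocessing of earlier posts is needed, and the `evidence` dictionary shows which words drove the decision.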

    A Big Data Platform for Real Time Analysis of Signs of Depression in Social Media

    In this paper we propose a scalable platform for real-time processing of Social Media data. The platform ingests large volumes of content, such as Social Media posts or comments, and can support Public Health surveillance tasks. The processing and analytical needs of multiple screening tasks can easily be handled by incorporating user-defined execution graphs. The design is modular and supports different processing elements, such as crawlers to extract relevant content or classifiers to categorise Social Media content. We describe here an implementation of a use case built on the platform that monitors Social Media users and detects early signs of depression.
    This work was funded by FEDER/Ministerio de Ciencia, Innovación y Universidades - Agencia Estatal de Investigación/Project (RTI2018-093336-B-C21). Our research also receives financial support from the Consellería de Educación, Universidade e Formación Profesional (accreditation 2019–2022 ED431G-2019/04, ED431C 2018/29, ED431C 2018/19) and the European Regional Development Fund (ERDF), which acknowledges the CiTIUS-Research Center in Intelligent Technologies of the University of Santiago de Compostela as a Research Center of the Galician University System.

    A Novel Text Mining Approach for Mental Health Prediction Using Bi-LSTM and BERT Model

    With recent advances in the Internet, there has been a growing demand for intelligent systems that can efficiently detect health-related problems on social media, such as depression and anxiety. These systems, which depend mainly on machine learning techniques, must be able to capture the semantic and syntactic meaning of texts posted by users on social media. The data generated by users on social media contains unstructured and unpredictable content. Several systems based on machine learning and social media platforms have recently been introduced to identify health-related problems. However, the text representation and deep learning techniques employed provide only limited information and knowledge about the different texts posted by users. This is owing to a lack of long-term dependencies between the words in the entire text and a lack of proper exploitation of recent deep learning schemes. In this paper, we propose a novel framework to efficiently and effectively identify depression- and anxiety-related posts while maintaining the contextual and semantic meaning of the words used in the whole corpus by applying bidirectional encoder representations from transformers (BERT). In addition, we propose a knowledge distillation technique, a recent technique for transferring knowledge from a large pretrained model (BERT) to a smaller model to boost performance and accuracy. We also devised our own data collection framework for Reddit and Twitter, which are the most common social media sites. Finally, we employed word2vec and BERT with Bi-LSTM to effectively analyze and detect signs of depression and anxiety in social media posts. Our system surpasses other state-of-the-art methods and achieves an accuracy of 98% using the knowledge distillation technique.
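
The abstract does not state which distillation objective was used. As a hedged illustration only, the sketch below shows the commonly used soft-target formulation: the student is trained on a weighted sum of the KL divergence between temperature-softened teacher and student distributions and the ordinary cross-entropy on the true label. The function names, the temperature, and the weighting `alpha` are assumptions, not details from the paper.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    true_label: int,
    temperature: float = 2.0,  # assumed value; higher values soften both distributions
    alpha: float = 0.5,        # assumed weight on the soft-target term
) -> float:
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(soft_teacher, soft_student)
        if p > 0
    )
    # Standard cross-entropy of the student against the true label (temperature 1).
    ce = -math.log(softmax(student_logits)[true_label])
    # The temperature**2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


matched = distillation_loss([2.0, 0.0], [2.0, 0.0], true_label=0)
mismatched = distillation_loss([0.0, 2.0], [2.0, 0.0], true_label=0)
```

A student whose logits agree with the teacher incurs only the cross-entropy term, so `matched` is smaller than `mismatched`.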

    Towards Measuring the Severity of Depression in Social Media via Text Classification

    Psychologists have used tests or carefully designed survey questions, such as Beck’s Depression Inventory (BDI), to identify the presence of depression and to assess its severity level. Meanwhile, methods for automatic depression detection have attracted increasing interest, since the information available on social media, such as Twitter and Facebook, enables novel measurement based on language use. These methods learn to characterize depression through natural language use and have shown that language usage can provide strong evidence for detecting depressed people. However, little attention has been paid to measuring finer-grained relationships between the two, such as how language usage is connected with the severity level of depression. The present study is a first step in that direction. First, we train a binary text classifier to detect “depressed” users, and then we use its confidence values to estimate the user’s clinical depression level. To do so, our system has to fill in the standard BDI depression questionnaire on users’ behalf, based only on the text of users’ postings. Our proposal, publicly tested in the eRisk 2019 T3 task, obtained promising results. This offers interesting evidence of the potential of our method to estimate the level of depression directly from users’ posts on social media.
    XVI Workshop Bases de Datos y Minería de Datos. Red de Universidades con Carreras en Informática

    Enhancing Mental Health Awareness through Twitter Analysis: A Comparative Study of Machine Learning and Hybrid Deep Learning Techniques

    This study explores the use of social media data, specifically tweets and comments, to gain insight into individuals' mental health conditions. The objective is to enhance mental health awareness and enable early detection and intervention. Twitter data is collected using depression-related keywords, and two models are employed: a Random Forest model with TF-IDF and a hybrid CNN-LSTM model incorporating word2vec. The CNN-LSTM model outperforms the Random Forest model, achieving an accuracy of 89.4%. Furthermore, a user interface is developed to analyze users' Twitter profiles based on their tweets, allowing for potential intervention through automated reply messages. By harnessing social media data and advanced machine learning techniques, this research contributes to improving mental health awareness and to addressing mental health concerns in a timely manner.
