    Objective: The purpose of this research is to identify how data science is applied in suicide prevention literature, describe the current landscape of this literature and highlight areas where data science may be useful for future injury prevention research.
    Design: We conducted a literature review of injury prevention and data science in April 2020 and January 2021 in three databases.
    Methods: For the included 99 articles, we extracted the following: (1) author(s) and year; (2) title; (3) study approach; (4) reason for applying data science method; (5) data science method type; (6) study description; (7) data source and (8) focus on a disproportionately affected population.
    Results: Results showed the literature on data science and suicide more than doubled from 2019 to 2020, with articles with individual-level approaches more prevalent than population-level approaches. Most population-level articles applied data science methods to describe (n=10) outcomes, while most individual-level articles identified risk factors (n=27). Machine learning was the most common data science method applied in the studies (n=48). A wide array of data sources was used for suicide research, with most articles (n=45) using social media and web-based behaviour data. Eleven studies demonstrated the value of applying data science to suicide prevention literature for disproportionately affected groups.
    Conclusion: Data science techniques proved to be effective tools in describing suicidal thoughts or behaviour, identifying individual risk factors and predicting outcomes. Future research should focus on identifying how data science can be applied in other injury-related topics.

    Suicide Risk Prediction for Users with Depression in Question Answering Communities: A Design Based on Deep Learning

    In the field of public health, suicide risk prediction is a central and urgent problem. Existing research mainly focuses on a user’s current post and overlooks historical posts. In light of users’ psychological characteristics, we argue that it is valuable to consider a user’s historical posts in addition to the current post when predicting suicide risk. Based on this rationale, we propose a deep learning-based suicide risk prediction framework, Dynamic Historical Information based Suicide Risk Prediction (DHISRP), which considers both the user’s current post content and historical post content. To capture the dynamic and complicated information in historical posts, we design a unit based on long short-term memory (LSTM), named RNLSTM. We also conduct experiments comparing our model with a benchmark model to demonstrate its effectiveness, and perform ablation experiments to verify the significance of each component of the prediction framework.
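
    The idea of summarizing historical posts with a recurrent unit and combining that summary with the current post can be sketched as follows. This is a minimal illustration that uses a standard LSTM cell, not the RNLSTM unit, whose internals the abstract does not describe. The post vectors, dimensions, random weights and the helper names (`lstm_step`, `init_params`, `encode_history`) are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_params(input_dim, hidden_dim, seed=0):
    """Random (untrained) weights for the four LSTM gates; illustrative only."""
    rng = random.Random(seed)
    params = {}
    for gate in ("i", "f", "o", "g"):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim + hidden_dim)]
                   for _ in range(hidden_dim)]
        params[gate] = (weights, [0.0] * hidden_dim)
    return params


def lstm_step(x, h, c, params):
    """One step of a standard LSTM cell on plain Python lists."""
    z = x + h  # concatenate the input with the previous hidden state
    gates = {}
    for gate in ("i", "f", "o", "g"):
        weights, bias = params[gate]
        pre = [p + b for p, b in zip(matvec(weights, z), bias)]
        activation = math.tanh if gate == "g" else sigmoid
        gates[gate] = [activation(p) for p in pre]
    c_new = [f * c_prev + i * g
             for f, c_prev, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(c_val) for o, c_val in zip(gates["o"], c_new)]
    return h_new, c_new


def encode_history(history, params, hidden_dim):
    """Run the LSTM over historical post vectors, oldest first."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for post_vector in history:
        h, c = lstm_step(post_vector, h, c, params)
    return h


# Hypothetical 3-dimensional post embeddings (not data from the paper).
history = [[0.1, -0.3, 0.5], [0.0, 0.2, -0.1], [0.7, 0.1, 0.0]]
current = [0.4, -0.2, 0.3]

params = init_params(input_dim=3, hidden_dim=4)
history_summary = encode_history(history, params, hidden_dim=4)

# Feature vector a downstream classifier could consume:
# current post representation followed by the history summary.
features = current + history_summary
```

    In a trained system the weights would be learned and a classifier would sit on top of `features`; the sketch only shows how information from earlier posts flows into the final representation.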

    Characteristics of Multi-Class Suicide Risks Tweets Through Feature Extraction and Machine Learning Techniques

    This paper presents a detailed analysis of the linguistic characteristics connected to specific levels of suicide risk, providing insight into the impact of feature extraction techniques on the effectiveness of predictive models of suicide ideation. Much prior research has addressed the detection of suicide ideation from social media posts through feature extraction and machine learning techniques, but little has addressed the multiclass classification of suicide risk or the impact of linguistic characteristics on predictability. To address this gap, this paper proposes a machine learning framework that performs multiclass classification of suicide risk from social media posts, with an extended analysis of the linguistic characteristics that contribute to suicide risk detection. A total of 552 Twitter posts were manually annotated to form a supervised dataset for suicide risk modeling. Features were extracted through a combination of term frequency-inverse document frequency (TF-IDF), Part-of-Speech (PoS) tagging, and the Valence Aware Dictionary and sEntiment Reasoner (VADER). Training and modeling were conducted with the Random Forest technique. Performance evaluation on 138 test samples, including real-time detection scenarios, yielded 86.23% accuracy, 86.71% precision, and 86.23% recall; the improvement is attributed to the combination of feature extraction techniques rather than to the data modeling technique. An extended analysis of linguistic characteristics showed that a sentence's context is the main contributor to suicide risk classification accuracy, while grammatical tags and strong conclusive terms were not.
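
    The reported accuracy and recall are identical (86.23%), which is what one would expect if recall was averaged over classes with support weighting: the class weights cancel and the weighted recall reduces to overall accuracy. The abstract does not state the averaging scheme, so this is an assumption. The sketch below computes the three metrics from a confusion matrix; the matrix is a hypothetical three-class example, not data from the paper. As a side note, 86.23% accuracy on 138 samples is consistent with 119 correct predictions (119/138 ≈ 0.8623).

```python
def weighted_metrics(confusion):
    """Accuracy plus support-weighted precision and recall.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))
    accuracy = correct / total

    precision = 0.0
    recall = 0.0
    for k in range(n_classes):
        support = sum(confusion[k])                              # true members of class k
        predicted = sum(confusion[i][k] for i in range(n_classes))  # predicted as class k
        true_pos = confusion[k][k]
        weight = support / total
        if predicted:
            precision += weight * (true_pos / predicted)
        if support:
            recall += weight * (true_pos / support)
    return accuracy, precision, recall


# Hypothetical confusion matrix for three risk levels (illustrative only).
toy = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
acc, prec, rec = weighted_metrics(toy)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

    On any confusion matrix, `acc` and `rec` agree up to floating-point rounding, while `prec` generally differs, matching the pattern in the reported figures.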

    Detecting Mental Distresses Using Social Behavior Analysis in the Context of COVID-19: A Survey

    Online social media provides a channel for monitoring people's social behaviors, from which their mental distress can be inferred and detected. During the COVID-19 pandemic, online social networks were increasingly used to express opinions, views, and moods due to the restrictions on physical activities and in-person meetings, leading to a significant amount of diverse user-generated social media content. This offers a unique opportunity to examine how COVID-19 changed global behaviors and what its ramifications were for mental well-being. In this article, we survey the literature on social media analysis for the detection of mental distress, with a special emphasis on the studies published since the COVID-19 outbreak. We analyze relevant research and its characteristics and propose new approaches to organizing the large number of studies arising from this emerging research area, thus drawing new views, insights, and knowledge for interested communities. Specifically, we first classify the studies in terms of feature extraction types, language usage patterns, aesthetic preferences, and online behaviors. We then explore various methods (including machine learning and deep learning techniques) for detecting mental health problems. Building upon the in-depth review, we present our findings and discuss future research directions and niche areas in detecting mental health problems using social media data. We also elaborate on the challenges of this fast-growing research area, such as technical issues in deploying such systems at scale as well as privacy and ethical concerns.

    Suicide risk detection on social media using neural networks

    According to the World Health Organization, approximately 280 million people suffer from depression and over 700,000 people die by suicide each year, while suicide is the fourth leading cause of death among adolescents and young people. Although depression is a treatable condition, most of those affected refuse to accept that they are ill and to seek psychiatric help because of the social stigma associated with mental disorders, especially in middle-income countries. Social media platforms host people from all demographic groups and are especially popular among young people. For most people, social media provide a safe space where they can share thoughts and concerns, especially when they are protected by anonymity. Platforms like Facebook and Instagram have created a reporting option that lets users report a post that suggests suicidal intent. It is very important to automate this procedure, so that no posts with such content are overlooked and depressive tendencies can be predicted as early as possible. To address this problem, this thesis proposes a neural network (NN) model that is trained on Reddit users’ posts and can reliably predict whether a user shows depressive or suicidal signs by examining their posts. The proposed NN is a hybrid model that combines convolutional (CNN) and bidirectional LSTM (Bi-LSTM) networks and also uses an attention mechanism to maximize prediction accuracy.
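
    The attention mechanism mentioned above can be illustrated with a small pooling sketch. The abstract does not say which attention variant the thesis uses, so this example assumes one common form: each hidden state (for instance, a Bi-LSTM output per token) is scored by a dot product with a query vector, the scores are normalized with a softmax, and the weighted sum becomes the post representation. The hidden states and query below are hypothetical values, and in a real model the query would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention pooling over a sequence of hidden states.

    Returns the attention weights and the weighted-sum context vector.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Hypothetical hidden states for a three-token post (illustrative only).
hidden_states = [[0.1, 0.0], [0.9, 0.4], [0.2, -0.3]]
query = [1.0, 0.5]

weights, context = attention_pool(hidden_states, query)
```

    The weights sum to 1 and the second token, whose hidden state aligns best with the query, receives the largest share; inspecting such weights is one way to see which parts of a post drive a prediction.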

    Detection of Suicide Ideation in Social Media Forums Using Deep Learning

    Suicide ideation expressed in social media has an impact on language usage. Many at-risk individuals use social forum platforms to discuss their problems or get access to information on similar tasks. The key objective of our study is to present ongoing work on the automatic recognition of suicidal posts. We address the early detection of suicide ideation through deep learning and machine learning-based classification approaches applied to Reddit social media. For this purpose, we employ a combined LSTM-CNN model and compare it with other classification models. Our experiment shows that the combined neural network architecture with word embedding techniques can achieve the best relevance classification results. Additionally, our results support the strength and ability of deep learning architectures to build an effective model for suicide risk assessment in various text classification tasks.
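
    The convolutional half of a combined LSTM-CNN text classifier can be sketched as a 1D convolution over a sequence of vectors followed by max-over-time pooling. The abstract does not give the layer order, filter sizes or weights, so everything below is an assumption for illustration: the input sequence stands in for word embeddings or LSTM outputs, and the two filters are hand-picked rather than learned.

```python
def conv1d_max_pool(sequence, filters):
    """Apply each filter along the sequence, then keep the maximum activation.

    sequence: list of T vectors, each of dimension D.
    filters:  list of filters, each a list of K vectors of dimension D.
    Returns one feature per filter (ReLU followed by max over time).
    """
    features = []
    for kernel in filters:
        width = len(kernel)
        if len(sequence) < width:
            raise ValueError("sequence is shorter than the filter width")
        activations = []
        for start in range(len(sequence) - width + 1):
            window = sequence[start:start + width]
            total = sum(w * x
                        for k_vec, x_vec in zip(kernel, window)
                        for w, x in zip(k_vec, x_vec))
            activations.append(max(0.0, total))  # ReLU
        features.append(max(activations))       # max-over-time pooling
    return features


# Hypothetical 2-dimensional token vectors and two width-2 filters.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]

pooled = conv1d_max_pool(sequence, filters)
print(pooled)  # [2.0, 0.0]
```

    Each pooled value records how strongly one local pattern appears anywhere in the post, which gives a fixed-size feature vector regardless of post length.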