3,974 research outputs found
Using Linguistic Features to Estimate Suicide Probability of Chinese Microblog Users
If people with high risk of suicide can be identified through social media
like microblog, it is possible to implement an active intervention system to
save their lives. Based on this motivation, the current study administered the
Suicide Probability Scale(SPS) to 1041 weibo users at Sina Weibo, which is a
leading microblog service provider in China. Two NLP (Natural Language
Processing) methods, the Chinese edition of Linguistic Inquiry and Word Count
(LIWC) lexicon and Latent Dirichlet Allocation (LDA), are used to extract
linguistic features from the Sina Weibo data. We trained predicting models by
machine learning algorithm based on these two types of features, to estimate
suicide probability based on linguistic features. The experiment results
indicate that LDA can find topics that relate to suicide probability, and
improve the performance of prediction. Our study adds value in prediction of
suicidal probability of social network users with their behaviors
Triaging Content Severity in Online Mental Health Forums
Mental health forums are online communities where people express their issues
and seek help from moderators and other users. In such forums, there are often
posts with severe content indicating that the user is in acute distress and
there is a risk of attempted self-harm. Moderators need to respond to these
severe posts in a timely manner to prevent potential self-harm. However, the
large volume of daily posted content makes it difficult for the moderators to
locate and respond to these critical posts. We present a framework for triaging
user content into four severity categories which are defined based on
indications of self-harm ideation. Our models are based on a feature-rich
classification framework which includes lexical, psycholinguistic, contextual
and topic modeling features. Our approaches improve the state of the art in
triaging the content severity in mental health forums by large margins (up to
17% improvement over the F-1 scores). Using the proposed model, we analyze the
mental state of users and we show that overall, long-term users of the forum
demonstrate a decreased severity of risk over time. Our analysis on the
interaction of the moderators with the users further indicates that without an
automatic way to identify critical content, it is indeed challenging for the
moderators to provide timely response to the users in need.Comment: Accepted for publication in Journal of the Association for
Information Science and Technology (2017
Regulating Mobile Mental Health Apps
Mobile medical apps (MMAs) are a fastâgrowing category of software typically installed on personal smartphones and wearable devices. A subset of MMAs are aimed at helping consumers identify mental states and/or mental illnesses. Although this is a fledgling domain, there are already enough extant mental health MMAs both to suggest a typology and to detail some of the regulatory issues they pose. As to the former, the current generation of apps includes those that facilitate selfâassessment or selfâhelp, connect patients with online support groups, connect patients with therapists, or predict mental health issues. Regulatory concerns with these apps include their quality, safety, and data protection. Unfortunately, the regulatory frameworks that apply have failed to provide coherent riskâassessment models. As a result, prudent providers will need to progress with caution when it comes to recommending apps to patients or relying on appâgenerated data to guide treatment
Overcoming data scarcity of Twitter: using tweets as bootstrap with application to autism-related topic content analysis
Notwithstanding recent work which has demonstrated the potential of using
Twitter messages for content-specific data mining and analysis, the depth of
such analysis is inherently limited by the scarcity of data imposed by the 140
character tweet limit. In this paper we describe a novel approach for targeted
knowledge exploration which uses tweet content analysis as a preliminary step.
This step is used to bootstrap more sophisticated data collection from directly
related but much richer content sources. In particular we demonstrate that
valuable information can be collected by following URLs included in tweets. We
automatically extract content from the corresponding web pages and treating
each web page as a document linked to the original tweet show how a temporal
topic model based on a hierarchical Dirichlet process can be used to track the
evolution of a complex topic structure of a Twitter community. Using
autism-related tweets we demonstrate that our method is capable of capturing a
much more meaningful picture of information exchange than user-chosen hashtags.Comment: IEEE/ACM International Conference on Advances in Social Networks
Analysis and Mining, 201
Global disease monitoring and forecasting with Wikipedia
Infectious disease is a leading threat to public health, economic stability,
and other key social structures. Efforts to mitigate these impacts depend on
accurate and timely monitoring to measure the risk and progress of disease.
Traditional, biologically-focused monitoring techniques are accurate but costly
and slow; in response, new techniques based on social internet data such as
social media and search queries are emerging. These efforts are promising, but
important challenges in the areas of scientific peer review, breadth of
diseases and countries, and forecasting hamper their operational usefulness.
We examine a freely available, open data source for this use: access logs
from the online encyclopedia Wikipedia. Using linear models, language as a
proxy for location, and a systematic yet simple article selection procedure, we
tested 14 location-disease combinations and demonstrate that these data
feasibly support an approach that overcomes these challenges. Specifically, our
proof-of-concept yields models with up to 0.92, forecasting value up to
the 28 days tested, and several pairs of models similar enough to suggest that
transferring models from one location to another without re-training is
feasible.
Based on these preliminary results, we close with a research agenda designed
to overcome these challenges and produce a disease monitoring and forecasting
system that is significantly more effective, robust, and globally comprehensive
than the current state of the art.Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein
and adjust novelty claims accordingly; revise title; various revisions for
clarit
- âŠ