Social media mining for identification and exploration of health-related information from pregnant women
Widespread use of social media has led to the generation of substantial
amounts of information about individuals, including health-related information.
Social media provides the opportunity to study health-related information about
selected population groups who may be of interest for a particular study. In
this paper, we explore the possibility of utilizing social media to perform
targeted data collection and analysis from a particular population group --
pregnant women. We hypothesize that we can use social media to identify cohorts
of pregnant women and follow them over time to analyze crucial health-related
information. To identify potentially pregnant women, we employ simple
rule-based searches that attempt to detect pregnancy announcements with
moderate precision. To further filter out false positives and noise, we employ
a supervised classifier using a small amount of hand-annotated data. We then
collect their posts over time to create longitudinal health timelines and
attempt to divide the timelines into different pregnancy trimesters. Finally,
we assess the usefulness of the timelines by performing a preliminary analysis
to estimate drug intake patterns of our cohort at different trimesters. Our
rule-based cohort identification technique collected 53,820 users over thirty
months from Twitter. Our pregnancy announcement classification technique
achieved an F-measure of 0.81 for the pregnancy class, resulting in 34,895 user
timelines. Analysis of the timelines revealed that pertinent health-related
information, such as drug intake and adverse reactions, can be mined from the
data. Our approach to using user timelines in this fashion has produced very
encouraging results and can be employed for other important tasks where
cohorts, for which health-related information may not be available from other
sources, are required to be followed over time to derive population-based
estimates.
Comment: 9 page
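The rule-based first stage described in this abstract can be illustrated with a minimal sketch. The abstract does not list the actual search rules, so the regular expressions and the function name `is_candidate_announcement` below are hypothetical, chosen only to show how a moderate-precision filter might flag candidate posts before a supervised classifier removes false positives.

```python
import re

# Hypothetical patterns: the abstract does not publish its rules,
# so these are illustrative assumptions only.
ANNOUNCEMENT_PATTERNS = [
    # "I'm pregnant", "I am 12 weeks pregnant", ...
    re.compile(
        r"\bi(?:'m| am)\s+(?:\d+\s+(?:weeks?|months?)\s+)?pregnant\b",
        re.IGNORECASE,
    ),
    # "my baby is due ...", "my due date is ..."
    re.compile(r"\bmy\s+(?:baby\s+is\s+due|due\s+date\s+is)\b", re.IGNORECASE),
]


def is_candidate_announcement(post: str) -> bool:
    """Return True if any rule matches; matches are candidates, not labels."""
    return any(pattern.search(post) for pattern in ANNOUNCEMENT_PATTERNS)


posts = [
    "I'm 12 weeks pregnant and so tired",
    "My due date is in May",
    "Is my cat pregnant?",
    "I am pregnant with ideas today",  # false positive the rules let through
]
candidates = [p for p in posts if is_candidate_announcement(p)]
```

The last post shows why such rules alone give only moderate precision: it matches a pattern without announcing a pregnancy, which is the kind of noise a second-stage classifier would be expected to filter out.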
A Large-Scale CNN Ensemble for Medication Safety Analysis
Revealing Adverse Drug Reactions (ADR) is an essential part of post-marketing
drug surveillance, and data from health-related forums and medical communities
can be of great significance for estimating such effects. In this paper, we
propose an end-to-end CNN-based method for predicting drug safety on user
comments from healthcare discussion forums. We present an architecture that is
based on a vast ensemble of CNNs with varied structural parameters, where the
prediction is determined by the majority vote. To evaluate the performance of
the proposed solution, we present a large-scale dataset collected from a
medical website that consists of over 50 thousand reviews for more than 4000
drugs. The results demonstrate that our model significantly outperforms
conventional approaches and predicts medicine safety with an accuracy of 87.17%
for binary and 62.88% for multi-classification tasks.
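The majority-vote step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `majority_vote`, the label strings, and the tie-breaking behaviour (the label seen first among tied labels wins) are all assumptions, and the per-model predictions are made-up placeholders standing in for the outputs of individual CNNs.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label predicted by the most ensemble members.

    Assumption: ties go to the tied label that appears first in
    `predictions`; the abstract does not say how ties are resolved.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


# Placeholder outputs from five hypothetical ensemble members for one review.
member_predictions = ["safe", "unsafe", "safe", "safe", "unsafe"]
ensemble_label = majority_vote(member_predictions)
```

Using an odd number of members avoids ties in the binary task; for the multi-class task, ties remain possible and a real system would need an explicit rule for them.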
Adverse Drug Reaction Classification With Deep Neural Networks
We study the problem of detecting sentences describing adverse drug reactions (ADRs) and frame the problem as binary classification. We investigate different neural network (NN) architectures for ADR classification. In particular, we propose two new neural network models: Convolutional Recurrent Neural Network (CRNN), which concatenates convolutional neural networks with recurrent neural networks, and Convolutional Neural Network with Attention (CNNA), which adds attention weights to convolutional neural networks. We evaluate various NN architectures on a Twitter dataset containing informal language and an Adverse Drug Effects (ADE) dataset constructed by sampling from MEDLINE case reports. Experimental results show that, on both datasets, all the NN architectures considerably outperform traditional maximum entropy classifiers trained on n-grams with different weighting strategies. On the Twitter dataset, all the NN architectures perform similarly, but on the ADE dataset, CNN performs better than the more complex CNN variants. Nevertheless, CNNA allows the visualisation of attention weights of words when making classification decisions and hence is more appropriate for the extraction of word subsequences describing ADRs.
Mining social media data for biomedical signals and health-related behavior
Social media data has been increasingly used to study biomedical and
health-related phenomena. From cohort level discussions of a condition to
planetary level analyses of sentiment, social media has provided scientists
with unprecedented amounts of data to study human behavior and response
associated with a variety of health conditions and medical treatments. Here we
review recent work in mining social media for biomedical, epidemiological, and
social phenomena information relevant to the multilevel complexity of human
health. We pay particular attention to topics where social media data analysis
has shown the most progress, including pharmacovigilance, sentiment analysis
especially for mental health, and other areas. We also discuss a variety of
innovative uses of social media data for health-related applications and
important limitations in social media data access and use.
Comment: To appear in the Annual Review of Biomedical Data Scienc