1,844 research outputs found

    Social media mental health analysis framework through applied computational approaches

    Studies have shown that mental illness burdens not only public health and productivity but also established market economies throughout the world. However, mental disorders are difficult to diagnose and monitor through traditional methods, which rely heavily on interviews, questionnaires and surveys, resulting in high under-diagnosis and under-treatment rates. The use of online social media, such as Facebook and Twitter, is now a common part of people’s everyday life. The continuous and real-time user-generated content often reflects the feelings, opinions, social status and behaviours of individuals, creating an unprecedented wealth of person-specific information. With advances in data science, social media has been increasingly employed in population health monitoring and, more recently, in mental health applications to understand mental disorders as well as to develop online screening and intervention tools. However, existing research efforts are still in their infancy, primarily aimed at highlighting the potential of employing social media in mental health research. The majority of work is developed on ad hoc datasets and lacks a systematic research pipeline. [Continues.]

    A Multi-label Text Classification Framework: Using Supervised and Unsupervised Feature Selection Strategy

    Text classification, the task of assigning metadata to documents, requires significant human time and effort. Since online-generated content is growing explosively, manually annotating large-scale, unstructured data has become a challenge. Recently, various state-of-the-art text mining methods have been applied to the classification process based on keyword extraction. However, when these keywords are used as features in the classification task, the number of feature dimensions is commonly large. In addition, selecting keywords from documents as features in the classification task is itself a major challenge. In particular, traditional machine learning algorithms have very long computation times on big data. Moreover, about 80% of real-world data is unstructured and unlabeled. Conventional supervised feature selection methods cannot be directly used to select entities from such massive data. Usually, statistical strategies are used to extract features from unlabeled data for classification tasks according to their importance scores. We propose a novel method to extract key features effectively before feeding them into the classification task. Another challenge in text classification is the multi-label problem, the assignment of multiple non-exclusive labels to documents. This problem makes text classification more complicated than single-label classification. To address these issues, we develop a framework for extracting data and reducing data dimensionality to solve the multi-label problem on labeled and unlabeled datasets. To reduce data dimensionality, we develop a hybrid feature selection method that extracts meaningful features according to the importance of each feature. Word2Vec is applied to represent each document as a feature vector for document categorization on the big dataset.
The unsupervised approach is used to extract features from real online-generated data for text classification. Our unsupervised feature selection method is applied to extract depression symptoms from social media such as Twitter. In the future, these depression symptoms will be used for depression self-screening and diagnosis.
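
The idea of ranking features from unlabeled text by an importance score can be illustrated with a minimal sketch. The abstract does not say which statistic the authors use, so the summed TF-IDF score below is an assumed stand-in, and the names `tokenize` and `top_k_terms` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def top_k_terms(docs, k):
    """Rank terms by summed TF-IDF over an unlabeled corpus and keep the top k.

    TF-IDF is an assumption made for illustration; it is not taken from
    the abstract above.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Importance score: sum over documents of (term frequency * idf).
    scores = Counter()
    for tokens in tokenized:
        if not tokens:
            continue
        term_counts = Counter(tokens)
        for term, count in term_counts.items():
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += (count / len(tokens)) * idf

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    corpus = [
        "i feel tired and sad",
        "i feel tired today",
        "i feel fine",
    ]
    print(top_k_terms(corpus, 3))
```

Terms that occur in every document receive a score of zero and fall to the bottom of the ranking, which is how a score of this kind reduces the feature dimension without using labels.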

    Content shared on social media for national cancer survivors day 2018.

    BACKGROUND: Studies estimate that the number of cancer survivors will double by 2050 due to improvements in diagnostic accuracy and treatment efficacy. Despite the growing population of cancer survivors, there is a paucity of research regarding how these individuals experience the transition from active treatment to long-term surveillance. While research has explored this transition from more organized venues, such as support groups for cancer survivors, this paper explores the discourses surrounding cancer survivorship on social media, paying particular attention to how individuals who identify as cancer survivors represent their experience. METHODS: We identified social media posts relating to cancer survivorship on Twitter and Instagram in early June 2018, in order to coincide with National Cancer Survivorship Day on June 3, 2018. We used nine pre-selected hashtags to identify content. For each hashtag, we manually collected the 150 most recent posts from Twitter and the 100 most recent plus the top 9 posts from Instagram. Our preliminary sample included 1172 posts; after eliminating posts from one hashtag due to irrelevance, we were left with 1063 posts. We randomly sampled 200 of these to create a subset for analysis; after review for irrelevant posts, 193 posts remained for analysis (118 from Instagram and 75 from Twitter). We utilized a grounded theory approach to analyze the posts, first open-coding a subset to develop a codebook, then applying the codebook to the rest of the sample and finally memo writing to develop themes. RESULTS: Overall, there is a substantial difference in tone and thematic content between Instagram and Twitter posts: Instagram takes on a more narrative form that represents journeys through cancer treatment and subsequent survivorship, whereas Twitter is more factual, leaning towards advocacy, awareness and fundraising.
In terms of content type, 120 posts (62% of the sample) were images, of which 42 (35%) were images of the individual posting and 28 (23%) were images of patients posted by family or friends. Of the remaining images, 14 (12%) were of support groups and 7 (6%) were of family or friends. We identified four salient themes through analysis of the social media posts from Twitter and Instagram: social support, celebrating milestones and honoring survivors, expressing identity, and renewal vs. rebirth. DISCUSSION: We observed a marked relationship between physical appearance, functional status and survivorship. Additionally, our findings suggest the importance of social support for cancer patients and survivors as well as the role social media can play in identity formation. CONCLUSION: Our findings suggest that individuals who identify as survivors on social media define their identity fluidly, incorporating elements of physical, emotional and psychological health as well as autonomy.

    Predictive Analysis on Twitter: Techniques and Applications

    Predictive analysis of social media data has attracted considerable attention from the research community as well as the business world because of the essential and actionable information it can provide. Over the years, extensive experimentation and analysis for insights have been carried out using Twitter data in various domains such as healthcare, public health, politics, social sciences, and demographics. In this chapter, we discuss techniques, approaches and state-of-the-art applications of predictive analysis of Twitter data. Specifically, we present fine-grained analysis involving aspects such as sentiment, emotion, and the use of domain knowledge in the coarse-grained analysis of Twitter data for making decisions and taking actions, and relate a few success stories

    Novel Natural Language Processing Models for Medical Terms and Symptoms Detection in Twitter

    This dissertation focuses on disambiguation of language use on Twitter about drug use, consumption types of drugs, drug legalization, ontology-enhanced approaches, and data-driven prediction analysis, by developing novel NLP models. Three technical aims comprise this work: (a) leveraging pattern recognition techniques to improve the quality and quantity of crawled Twitter posts related to drug abuse; (b) using an expert-curated, domain-specific DsOn ontology model that improves knowledge extraction in the form of drug-to-symptom and drug-to-side effect relations; and (c) modeling the prediction of public perception of the drug’s legalization and the sentiment analysis of drug consumption on Twitter. We collected 7.5 million data points from August 2015 to March 2016. This work leveraged a longstanding, multidisciplinary collaboration between researchers at the Population & Center for Interventions, Treatment, and Addictions Research (CITAR) in the Boonshoft School of Medicine and the Department of Computer Science and Engineering. In addition, we aimed to develop and deploy an innovative prediction analysis algorithm for eDrugTrends, capable of semi-automated processing of Twitter data to identify emerging trends in cannabis and synthetic cannabinoid use in the U.S. In addition, the study included aim four, a use case study defined by tweets content analyzing PLWH, medication patterns, and identifying keyword trends via Twitter-based, user-generated content. This case study leveraged a multidisciplinary collaboration between researchers at the Departments of Family Medicine and Population and Public Health Sciences at Wright State University’s Boonshoft School of Medicine and the Department of Computer Science and Engineering. We collected 65K data points from February 2022 to July 2022 with the U.S.-based HIV knowledge domain recruited via the Twitter API streaming platform.
For knowledge discovery, domain knowledge plays a significant role in powering many intelligent frameworks, such as data analysis, information retrieval, and pattern recognition. Recent NLP and semantic web advances have contributed to extending the domain knowledge of medical terms. These techniques require a bag of seeds for medical knowledge discovery. Various initial seeds introduce irrelevant data as noise and negatively impact the prediction analysis performance. The methodology of aim one, the PatRDis classifier, was applied to noisy and ambiguous issues, and that of aim two, the DsOn Ontology model, was applied for semantic parsing and enriching the online medical to classify the data for HIV care medications engagement and symptom detection from Twitter. By applying the methodology of aims 2 and 3, we solved the challenges of ambiguity and explored more than 1500 cannabis and cannabinoid slang terms. Sentiments measured preceding the election, such as states with high levels of positive sentiment preceding the election who were engaged in enhancing their legalization status. We also used the same dataset for prediction analysis for marijuana legalization and consumption trend analysis (Ohio public polling data). In Aim 4, we applied three experiments, ensemble-learning, the RNN-LSM, the NNBERT-CNN models, and five techniques to determine the tweets associated with medication adherence and HIV symptoms. The long short-term memory (LSTM) model and the CNN for sentence classification produce accurate results and have been recently used in NLP tasks. CNN models use convolutional layers and maximum pooling or max-overtime pooling layers to extract higher-level features, while LSTM models can capture long-term dependencies between word sequences and hence are better suited to text classification. We propose attention-based RNN, MLP, and CNN deep learning models that capitalize on the advantages of LSTM and BERT techniques with an additional attention mechanism.
We trained the model using NNBERT to evaluate the proposed model's performance. The test results showed that the proposed models produce more accurate classification results, and BERT obtained higher recall and F1 scores than MLP or LSTM models. In addition, we developed an intelligent tool capable of automated processing of Twitter data to identify emerging trends in HIV disease, HIV symptoms, and medication adherence.

    Early-stage pregnancy recognition on microblogs: Machine learning and lexicon-based approaches

    Pregnancy carries high medical and psychosocial risks that could lead pregnant women to experience serious health consequences. Providing protective measures for pregnant women is one of the critical tasks during the pregnancy period. This study proposes an emotion-based mechanism to detect the early stage of pregnancy using real-time data from Twitter. Pregnancy-related emotions (e.g., anger, fear, sadness, joy, and surprise) and polarity (positive and negative) were extracted from users' tweets using NRC Affect Intensity Lexicon and SentiStrength techniques. Then, pregnancy-related terms were extracted and mapped with pregnancy-related sentiments using part-of-speech tagging and association rules mining techniques. The results showed that pregnancy tweets contained high positivity, as well as significant amounts of joy, sadness, and fear. The classification results demonstrated the possibility of using users’ sentiments for early-stage pregnancy recognition on microblogs. The proposed mechanism offers valuable insights to healthcare decision-makers, allowing them to develop a comprehensive understanding of users' health status based on social media posts
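
Lexicon-based emotion extraction of the kind described above can be sketched in a few lines. The lexicon here is a hypothetical toy dictionary whose words and intensity values are invented for illustration; it is not the NRC Affect Intensity Lexicon, and the function name `emotion_profile` is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> (emotion, intensity in [0, 1]).
# The entries are made up and are NOT taken from the NRC lexicon.
TOY_LEXICON = {
    "excited": ("joy", 0.9),
    "happy": ("joy", 0.8),
    "scared": ("fear", 0.8),
    "worried": ("fear", 0.6),
    "crying": ("sadness", 0.7),
}


def emotion_profile(tweet, lexicon=TOY_LEXICON):
    """Sum lexicon intensities per emotion over the words of one tweet."""
    totals = defaultdict(float)
    for raw in tweet.lower().split():
        # Drop surrounding punctuation and hashtag/mention markers.
        word = raw.strip(".,!?#@\"'")
        if word in lexicon:
            emotion, intensity = lexicon[word]
            totals[emotion] += intensity
    return dict(totals)


if __name__ == "__main__":
    print(emotion_profile("So excited and happy, but a little scared!"))
```

Per-tweet profiles like this could then be aggregated per user or fed to a classifier as features; how the study combines them with polarity scores and association rules is not specified in enough detail here to reproduce.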

    Digital Pharmacovigilance: the medwatcher system for monitoring adverse events through automated processing of internet social media and crowdsourcing

    Thesis (Ph.D.)--Boston University. Half of Americans take a prescription drug, medical devices are in broad use, and population coverage for many vaccines is over 90%. Nearly all medical products carry risk of adverse events (AEs), sometimes severe. However, pre-approval trials use small populations and exclude participants by specific criteria, making them insufficient to determine the risks of a product as used in the population. Existing post-marketing reporting systems are critical, but suffer from underreporting. Meanwhile, recent years have seen an explosion in adoption of Internet services and smartphones. MedWatcher is a new system that harnesses emerging technologies for pharmacovigilance in the general population. MedWatcher consists of two components, a text-processing module, MedWatcher Social, and a crowdsourcing module, MedWatcher Personal. With the natural language processing component, we acquire public data from the Internet, apply classification algorithms, and extract AE signals. With the crowdsourcing application, we provide software allowing consumers to submit AE reports directly. Our MedWatcher Social algorithm for identifying symptoms performs with 77% precision and 88% recall on a sample of Twitter posts. Our machine learning algorithm for identifying AE-related posts performs with 68% precision and 89% recall on a labeled Twitter corpus. For zolpidem tartrate, certolizumab pegol, and dimethyl fumarate, we compared AE profiles from Twitter with reports from the FDA spontaneous reporting system. We find some concordance (Spearman's rho = 0.85, 0.77, 0.82, respectively, for symptoms at MedDRA System Organ Class level). Where the sources differ, milder effects are overrepresented in Twitter. We also compared post-marketing profiles with trial results and found little concordance. MedWatcher Personal saw substantial user adoption, receiving 550 AE reports in a one-year period, including over 400 for one device, Essure.
We categorized 400 Essure reports by symptom, compared them to 129 reports from the FDA spontaneous reporting system, and found high concordance (rho = 0.65) using MedDRA Preferred Term granularity. We also compared Essure Twitter posts with MedWatcher and FDA reports, and found rho = 0.25 and 0.31 respectively. MedWatcher represents a novel pharmacoepidemiology surveillance informatics system; our analysis is the first to compare AEs across social media, direct reporting, FDA spontaneous reports, and pre-approval trials.
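
The concordance figures above are Spearman rank correlations between adverse-event profiles from two sources. As a minimal sketch of that statistic, the code below ranks two lists of per-category counts (ties receive their average rank) and computes the Pearson correlation of the ranks. The function names and the example counts are hypothetical; the counts are invented and are not data from the thesis.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Invented counts per symptom category for two hypothetical sources.
    source_a_counts = [40, 25, 12, 9, 3]
    source_b_counts = [31, 35, 8, 10, 1]
    print(round(spearman_rho(source_a_counts, source_b_counts), 2))
```

Because only ranks enter the calculation, the two sources can differ greatly in report volume and still show high concordance, as long as they order the symptom categories similarly.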