7 research outputs found

    End-to-end deep framework for disease named entity recognition using social media data

    © 2017 IEEE. A growing interest in natural language processing methods applied to healthcare has been observed in recent years. In particular, new pharmacological properties of drugs can be derived from patient observations shared in social media forums. Developing approaches that automatically retrieve this information is of considerable interest for personalized medicine and wide-scale drug tests. The full potential of exploiting both textual data and published biological data for drug research often goes untapped, mostly because of the lack of tools and focused methodologies to curate and integrate the data and transform it into new, experimentally testable hypotheses. Deep learning architectures have shown promising results for a wide range of tasks. In this work, we propose to address a challenging problem by applying modern deep neural networks to disease named entity recognition. An essential step for this task is the recognition of disease mentions and medical concept normalization, which is highly difficult with simple string matching approaches. We cast the task as an end-to-end problem, solved using two architectures based on recurrent neural networks and pre-trained word embeddings. We show that it is possible to assess the practicability of using social media data to extract representative medical concepts for pharmacovigilance or drug repurposing.

    Interactive attention network for adverse drug reaction classification

    © Springer Nature Switzerland AG 2018. Detection of new adverse drug reactions is intended both to improve the quality of medications and to support drug reprofiling. Social media and electronic clinical reports are becoming increasingly popular as sources of health-related information, such as the identification of adverse drug reactions. One of the tasks in extracting adverse drug reactions from social media is the classification of entities that describe the state of health. In this paper, we investigate the applicability of the Interactive Attention Network to the identification of adverse drug reactions from user reviews. We formulate this problem as a binary classification task. We show the effectiveness of this method on a number of publicly available corpora.

    LCCT: a semisupervised model for sentiment classification

    Conference Theme: Human Language Technologies. Analyzing public opinions towards products, services and social events is an important but challenging task. An accurate sentiment analyzer should take both lexicon-level information and corpus-level information into account. It also needs to exploit domain-specific knowledge and utilize the common knowledge shared across domains. In addition, we want the algorithm to be able to deal with missing labels and to learn from incomplete sentiment lexicons. This paper presents an LCCT (Lexicon-based and Corpus-based, Co-Training) model for semi-supervised sentiment classification. The proposed method combines the ideas of lexicon-based learning and corpus-based learning in a unified co-training framework. It is capable of incorporating both domain-specific and domain-independent knowledge. Extensive experiments show that it achieves very competitive classification accuracy, even with a small portion of labeled data. Compared to state-of-the-art sentiment classification methods, the LCCT approach exhibits significantly better performance on a variety of datasets in both English and Chinese. © 2015 Association for Computational Linguistics.

    Sentiment Analysis in Spanish for Improvement of Products and Services: A Deep Learning Approach

    Sentiment Analysis on Tweets about Diabetes: An Aspect-Level Approach

    In recent years, some methods of sentiment analysis have been developed for the health domain; however, the diabetes domain has not been explored yet. In addition, there is a lack of approaches that analyze the positive or negative orientation of each aspect contained in a document (a review, a piece of news, or a tweet, among others). On this basis, we propose an ontology-based method for aspect-level sentiment analysis in the diabetes domain. The sentiment of each aspect is calculated from the words around it, which are obtained through N-gram methods (N-gram after, N-gram before, and N-gram around). To evaluate the effectiveness of our method, we obtained a corpus from Twitter, which was manually labelled at the aspect level as positive, negative, or neutral. The experimental results show that the best result was obtained with the N-gram around method, with a precision of 81.93%, a recall of 81.13%, and an F-measure of 81.24%.
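
The three N-gram windows named in this abstract (before, after, around) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `context_window` and `aspect_polarity`, the toy lexicon, and the simple summed-score rule are all hypothetical, and the ontology-based aspect identification the abstract mentions is not modeled (the aspect position is passed in directly).

```python
from typing import Dict, List


def context_window(tokens: List[str], aspect_index: int, n: int, mode: str = "around") -> List[str]:
    """Return up to n tokens before, after, or on both sides of the aspect token."""
    before = tokens[max(0, aspect_index - n):aspect_index]
    after = tokens[aspect_index + 1:aspect_index + 1 + n]
    if mode == "before":
        return before
    if mode == "after":
        return after
    if mode == "around":
        return before + after
    raise ValueError(f"unknown mode: {mode}")


def aspect_polarity(tokens: List[str], aspect_index: int, lexicon: Dict[str, int],
                    n: int = 2, mode: str = "around") -> str:
    """Sum the lexicon scores of the window words (assumed scoring rule)."""
    window = context_window(tokens, aspect_index, n, mode)
    score = sum(lexicon.get(word.lower(), 0) for word in window)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy lexicon and sentence, invented purely for illustration.
TOY_LEXICON = {"great": 1, "painful": -1}
TOKENS = "my new insulin pump works great but the sensor is painful".split()

if __name__ == "__main__":
    for aspect in ("pump", "sensor"):
        idx = TOKENS.index(aspect)
        for mode in ("before", "after", "around"):
            print(aspect, mode, aspect_polarity(TOKENS, idx, TOY_LEXICON, n=2, mode=mode))
```

The window size `n` and the choice of mode decide which words contribute to an aspect's polarity, which is presumably why the abstract reports results separately for each N-gram variant.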

    Co-training over Domain-independent and Domain-dependent Features for Sentiment Analysis of an Online Cancer Support Community

    Sentiment analysis has been widely researched in the domain of online review sites, with the aim of summarizing product users' opinions about different aspects of the products. However, there has been little work on identifying the polarity of sentiments expressed by users in online health communities such as cancer support forums. Online health communities act as a medium through which people share their health concerns with fellow members of the community and get social support. Identifying sentiments expressed by members of a health community can help in understanding the dynamics of the community, such as dominant health issues and the emotional impact of interactions on members. In this work, we perform sentiment classification of user posts in an online cancer support community (Cancer Survivors Network). We use domain-dependent and domain-independent sentiment features as two complementary views of a post and use them for post classification in a semi-supervised setting using the co-training algorithm. Experimental results demonstrate the effectiveness of our methods. Keywords: sentiment analysis, co-training, indirect emotional support, direct emotional support, online health community.

    Social media narratives in non-communicable disease: their dynamics and value for patients, communities and health researchers

    Background: Usage of social media is now widespread and growing, as is the number of people living with Non-Communicable Diseases (NCDs) such as diabetes and cancer. This thesis examines how social media are being used to share or discuss NCDs and the benefits, challenges and implications of these trends as a manifestation of digital public health.
    Aim and research questions: The aim of this research is to address the gap in empirical, evidence-based research into the secondary use of data from social media to understand patient health issues and inform public health research into NCDs. To this end, seven research questions, each linked to a sub-project, were defined and tested during the course of the six-year programme:
    1. What is the status of the existing multi-disciplinary research literature based on analysis of data posted on social media for public health research, and where are the gaps in this research?
    2. Can existing systematic review methods be re-purposed and applied to analyse data posted on social media?
    3. How are research sponsors and researchers addressing the ethical challenges of analysing data posted on social media?
    4. To what extent are diabetes-related posts on Twitter relevant to the clinical condition and what topics and intentions are represented in these posts?
    5. In what ways do people affected by Type 1 diabetes use different social media (e.g. for social interaction, support-seeking, information-sharing) and what are the implications for researchers wishing to use these data sources in their studies?
    6. Are these differences in platform usage and associated data types also seen in people affected by lung cancer?
    7. Can characteristic illness trajectories be seen in a cancer patient’s digital narrative and what insights can be gained to inform palliative care services?
    Methods: A range of different qualitative and quantitative methods and frameworks were used to address each of the research questions listed.
Arksey and O’Malley’s five-stage scoping review framework and the PRISMA guidelines are applied to the systematic scoping review of existing literature. The PRISMA guidelines and checklist are re-purposed and applied to the manual extraction and analysis of social media posts. Bjerglund-Andersen and Söderqvist’s typology of social media uses in research and Conway’s taxonomy of ethical considerations are used to classify the ethics guidelines available to researchers. The findings of these were used to inform the research design of the four empirical studies. The methods applied in the conduct of the empirical studies include a content and narrative analysis of cross-sectional and longitudinal data sourced from Twitter, Facebook, the Type 1 diabetes discussion forum on Diabetes.co.uk and the lung cancer discussion forum on Macmillan.org.uk, as well as the application of Bales’ Interaction Process Analysis and Emanuel and Emanuel’s framework for a good death.
Results: Of the 49 systematic, quasi-systematic and scoping reviews identified, 24 relate to the secondary use of data from social media, with eight of these focused on infectious disease surveillance and only two on NCDs. Existing reviews tend to be fragmented, narrow in scope and siloed in different academic communities, with limited consideration of the different types of data, analytical methods and ethical issues involved, therefore creating a need for further reviews to synthesise the emerging evidence base. The rapid increase in the volume of published research is evident from the results of RQ1, with 87% of the eligible studies published between 2013 and 2017. Of the 105 eligible empirical studies that focused on NCDs, cancer (54%) and diabetes (20%) dominate the literature. Data are sourced from Twitter (26%), Facebook (14%) and blogs (10%), with the studies conducted, published and funded by the medical community. Since 2012, automated methods have increasingly been applied to extract and analyse large volumes of data.
Studies that used manual methods for extraction did not apply a consistent approach to doing so; the PRISMA guidelines and checklist were therefore re-purposed and applied to analyse data extracted from social media in response to RQ2. The deficit of ethical guidance available to inform research that involves social media data was also identified as a result of RQ3 and the guidelines provided by the ESRC, BPS, AoIR and NIHR were prioritised for the purposes of this research project. Results from the four empirical studies (RQ4-7) reveal that different forms of social interaction and support are represented in the variety of social media platforms available and that this is influenced by the type and nature of the condition with which people are affected, as well as the affordances offered by such platforms. In the pilot study associated with RQ4, Twitter was identified as a ‘noisy’ source of data about diabetes, with only 66% of the sample being relevant to the clinical condition. Twelve per cent of the eligible sample was associated with Type 2 diabetes, compared to 6% for Type 1, and most were information-giving in nature (49%) and correlated with the diagnosis, treatment and management of the condition (44%). A comparison of Twitter to the Type 1 Diabetes community on Facebook and the discussion forum on Diabetes.co.uk for RQ5 indicated that all three social media platforms were used to disseminate information about the condition. However, the Type 1 Diabetes Group on Facebook and the Type 1 discussion forum on Diabetes.co.uk were also used for social interaction and peer support, hence defying the generalisations made in public health studies, where social media platforms were often considered equal or synonymous.
The results from the third empirical study into lung cancer (RQ6) support this, indicating that, by virtue of their digital architecture, user base and self-moderating communities, the Lung Cancer Support Group on Facebook and the lung cancer discussion forum on Macmillan.org.uk are more successful in their utility for social interaction and emotional and informational support. Meanwhile, the sample derived from Twitter hashtags showed greater companionship support. The final empirical study in this PhD research project is associated with RQ7 and used longitudinal data posted by a terminally ill patient on Twitter. This revealed that patient activity on social media mirrors the different phases of the end-of-life illness trajectory described in the literature and that it is comparable to or complements insights garnered using more traditional qualitative research techniques. It also shows the value of such innovative methods for understanding how terminal disease is experienced by and affects individuals, how they cope, how support is sought and obtained and how patients feel about the ability of palliative care services to meet their needs at different stages.
Conclusions: The analysis of health data posted on social media continues to be an expanding and evolving field of multi-disciplinary research. The results of the studies included in this thesis reveal the emergence of new methods and ethical considerations to inform research design as well as ethics policy. The re-purposed PRISMA guidelines and checklist were presented at the 2014 Medicine 2.0 Summit and World Congress whilst the review of ethical guidelines was published in the Research Ethics journal. The four empirical studies that extracted and analysed data from social media provide novel insight into the social narratives of those impacted by diabetes and cancer and can be used to inform future research and practice.
The results of these studies have, to date, been presented at four international conferences and published in npj Digital Medicine and BMC Palliative Care. Although this thesis and associated publications contribute to an emerging body of knowledge, further research is warranted into the manual versus automated techniques that can be applied and the differences in social interaction and support needed by people affected by different NCDs.