2 research outputs found

    Disease mentions in airport and hospital geolocations expose dominance of news events for disease concerns

    Background: In recent years, Twitter has been used to monitor diseases through real-time access to users' comments and concerns. The analysis of tweets for disease mentions should reflect not only user-specific concerns but also disease outbreaks. This requires the use of standard terminological resources and can be focused on selected geographic locations. In our study, we differentiate between hospital and airport locations to better distinguish disease outbreaks from background mentions of disease concerns. Results: Our analysis covers all geolocated tweets over a 6-month period, uses SNOMED-CT as a standard medical terminology, and explores language patterns (as well as MetaMap) to identify mentions of diseases in reference to the geolocation of tweets. Contrary to our expectation, hospital and airport geolocations are not suitable for collecting significant portions of tweets concerned with disease outcomes. Overall, geolocated tweets exposed a large number of messages commenting on disease-related news articles. Furthermore, the geolocated messages exposed an over-representation of non-communicable diseases in contrast to infectious diseases. Conclusions: Our findings suggest that disease mentions on Twitter serve not only to share personal statements but also to share concerns about news articles. In particular, our assumption that hospital and airport geolocations would show an increased frequency of disease mentions has not been met. To further address the linguistic cues, we propose the study of health forums to understand how a change in medium affects the language used. Finally, our research on language use may provide essential clues to distinguish complementary trends in the use of language on Twitter when analysing health-related topics.

    COHORT IDENTIFICATION FROM FREE-TEXT CLINICAL NOTES USING SNOMED CT’S SEMANTIC RELATIONS

    This paper proposes a new cohort identification framework that exploits the semantic hierarchy of SNOMED CT to overcome the limitations of supervised machine learning-based approaches. Eligibility criteria descriptions and free-text clinical notes from the 2018 National NLP Clinical Challenge (n2c2) were mapped to relevant SNOMED CT concepts, and the semantic similarity between the eligibility criteria and each patient was measured. A patient was deemed eligible if their similarity score exceeded a threshold cut-off value, which was set where the best F1 score could be achieved. The performance of the proposed system was evaluated for three eligibility criteria. The framework's macro-average F1 score across these three criteria was higher than the previously reported results of the 2018 n2c2 (0.933 vs. 0.889). This study demonstrated that SNOMED CT alone can be leveraged for cohort identification tasks without referring to external textual sources for training. Doctor of Philosophy
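    The threshold step described in this abstract can be illustrated with a minimal sketch. It assumes each patient already has a similarity score for a criterion and a gold eligibility label. The function names, the midpoint candidate thresholds, and the toy numbers are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores: Sequence[float],
                   labels: Sequence[bool]) -> Tuple[float, float]:
    """Return (threshold, F1) maximizing F1, predicting eligible if score > threshold.

    Candidate thresholds are an assumption of this sketch: one value below
    the minimum score, plus midpoints between consecutive distinct scores.
    """
    uniq = sorted(set(scores))
    candidates: List[float] = [uniq[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]

    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        preds = [s > t for s in scores]
        f1 = f1_score(labels, preds)
        if f1 > best_f1:  # keep the first (lowest) threshold on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1


def macro_f1(per_criterion_f1: Sequence[float]) -> float:
    """Macro-average: unweighted mean of the per-criterion F1 scores."""
    return sum(per_criterion_f1) / len(per_criterion_f1)


# Toy data for a single criterion (illustrative values only).
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [False, False, True, True]
threshold, f1 = best_threshold(toy_scores, toy_labels)
```

    Tuning the cut-off on the same data used for evaluation can overstate performance, so a held-out split would normally be used for the final score. The abstract does not say how this was handled.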