12 research outputs found

    A real-time biosurveillance mechanism for early-stage disease detection from microblogs: a case study of interconnection between emotional and climatic factors related to migraine disease

    For many years, certain climatic factors have been used to predict potential disease outcomes of relevance to humans, because early discovery of a disease (or its symptoms) helps people and healthcare professionals take the necessary precautions. Although microblogs are widely used to create new connections and maintain existing relationships, disease detection in microblogs is still considered a serious problem for many healthcare systems, especially for establishing a successful epidemic recognition procedure. To tackle this issue, this study proposes a novel tracking approach to diagnose illnesses in microblogs. It is based on the interconnection between certain emotional types and climatic factors associated with a specific disease (e.g., migraine). In this study, detailed migraine data were collected from Twitter. We used K-means and Apriori algorithms to extract migraine-related emotions and to investigate the potential associations between migraine symptoms and climatic factors. The results showed that sad emotions were highly interrelated with migraine symptoms. The classification results showed that Sequential Minimal Optimization (SMO) was efficient (95.53% accuracy) in detecting migraine symptoms from Twitter. The proposed mechanism can be used efficiently in biosurveillance systems due to its capability to identify the hidden symptoms of a sickness on microblogs. This study paves the way to discovering disease-related features using both emotional and climatic factors.
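
    The abstract names Apriori as the association-mining step but gives no implementation details. As a rough illustration only, the sketch below runs a minimal Apriori pass over a made-up set of tagged posts. The function names `apriori` and `confidence`, the `TOY_POSTS` data, and the 0.6 support threshold are all assumptions for this example and are not taken from the study.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from stored supports."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical toy data: each post reduced to a set of tags (not study data).
TOY_POSTS = [
    {"migraine", "sad", "humid"},
    {"migraine", "sad"},
    {"migraine", "humid"},
    {"sad", "humid"},
    {"migraine", "sad", "humid"},
]

frequent_sets = apriori(TOY_POSTS, min_support=0.6)
rule_conf = confidence(frequent_sets, {"migraine"}, {"sad"})
```

    On this toy input every pair of tags clears the threshold while the full triple does not, which shows the level-wise pruning at work; real data would need tokenization and emotion labelling first, which this sketch leaves out.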

    A Multilingual Approach to Discover Cross-Language Links in Wikipedia

    Wikipedia is a well-known public and collaborative encyclopaedia consisting of millions of articles. Initially in English, the popular website has grown to include versions in over 288 languages. These versions and their articles are interconnected via cross-language links, which not only facilitate navigation and understanding of concepts in multiple languages, but have been used in natural language processing applications, developments in linked open data, and expansion of minor Wikipedia language versions. These applications are the motivation for an automatic, robust, and accurate technique to identify cross-language links. In this paper, we present a multilingual approach called EurekaCL to automatically identify missing cross-language links in Wikipedia. More precisely, given a Wikipedia article (the source), EurekaCL uses the multilingual and semantic features of BabelNet 2.0 in order to efficiently identify a set of candidate articles in a target language that are likely to cover the same topic as the source. The Wikipedia graph structure is then exploited both to prune and to rank the candidates. Our evaluation carried out on 42,000 pairs of articles in eight language versions of Wikipedia shows that our candidate selection and pruning procedures allow an effective selection of candidates which significantly helps the determination of the correct article in the target language version.