581 research outputs found

    Combining Clinical Symptoms and Patient Features for Malaria Diagnosis: Machine Learning Approach

    This research article was published by Taylor & Francis Online, 2022. Presumptive treatment and self-medication for malaria have been used in limited-resource countries. However, these approaches have been considered unreliable due to the unnecessary use of malaria medication. This study aims to demonstrate supervised machine learning models in diagnosing malaria using patient symptoms and demographic features. The malaria diagnosis dataset was extracted from two regions of Tanzania: Morogoro and Kilimanjaro. Important features were selected to improve model performance and reduce processing time. Machine learning classifiers with the k-fold cross-validation method were used to train and validate the model. The dataset was used to develop a machine learning model for malaria diagnosis using patient symptoms and demographic features. A malaria diagnosis dataset of 2556 patients’ records with 36 features was used. It was observed that the ranking of features differs among regions and when the datasets are combined. The significant features selected were residence area, fever, age, general body malaise, visit date, and headache. Random Forest was the best classifier, with an accuracy of 95% in Kilimanjaro, 87% in Morogoro and 82% in the combined dataset. Based on clinical symptoms and demographic features, a regional-specific malaria predictive model was developed to demonstrate relevant machine learning classifiers. Important features are useful in making the disease prediction.

    A decision support system to follow up and diagnose primary headache patients using semantically enriched data

    Background: Headache disorders are an important health burden, having a large health-economic impact worldwide. Current treatment and follow-up processes are often archaic, creating opportunities for computer-aided and decision support systems to increase their efficiency. Existing systems are mostly completely data-driven, and the underlying models are a black box, deteriorating interpretability and transparency, which are key factors for deployment in a clinical setting. Methods: In this paper, a decision support system is proposed, composed of three components: (i) a cross-platform mobile application to capture the required data from patients to formulate a diagnosis, (ii) an automated diagnosis support module that generates an interpretable decision tree, based on data semantically annotated with expert knowledge, in order to support physicians in formulating the correct diagnosis and (iii) a web application such that the physician can efficiently interpret captured data and learned insights by means of visualizations. Results: We show that decision tree induction techniques achieve competitive accuracy rates, compared to other black- and white-box techniques, on a publicly available dataset, referred to as migbase. Migbase contains aggregated information of headache attacks from 849 patients. Each sample is labeled with one of three possible primary headache disorders. We demonstrate that we are able to reduce the classification error by more than 10%, a statistically significant improvement (ρ≤0.05), by balancing the dataset using prior expert knowledge. Furthermore, we achieve high accuracy rates by using features extracted using the Weisfeiler-Lehman kernel, which is completely unsupervised. This makes it an ideal approach to solve a potential cold start problem. Conclusion: Decision trees are the perfect candidate for the automated diagnosis support module. They achieve predictive performances competitive with other techniques on the migbase dataset and are, foremost, completely interpretable. Moreover, the incorporation of prior knowledge increases both the predictive performance and the transparency of the resulting predictive model on the studied dataset.

    Drug side-effect prediction using machine learning methods

    Drug toxicity (or adverse side effects) is a pressing health problem which is also an impediment to the development of therapeutically effective drugs. Despite many ongoing efforts to determine the toxicity beforehand, computational prediction of drug side-effects remains a challenging task. This thesis presents an approach to predict side-effects by utilizing side-information sources for the drugs, while simultaneously comparing state-of-the-art machine learning methods to improve accuracy. Specifically, the thesis implements a data-analysis pipeline for obtaining side-information that is useful for the prediction task. This thesis then formulates drug side-effect prediction as a machine learning problem: given disease indications and structural features (as side-information sources) of drugs, for which some measurements of side-effects exist, predict side-effects for a new drug. As case studies, the prediction accuracies are compared for ten different side-effects using linear as well as non-linear machine learning methods. The thesis summarizes three key findings. First, the drug side-information sources are predictive of the side-effects. Second, non-linear methods show improved prediction accuracies as compared to their linear analogs. Third, the integration of disease indications and structural features with a principled machine learning approach further improves the drug side-effect predictions. However, the current study limits the analysis by assuming side-effects are independent. In future, modeling the joint relationships of several side-effects could yield stronger predictions and help to better understand the underlying biological mechanism.

    Subgrouping factors influencing migraine intensity in women: A semi-automatic methodology based on machine learning and information geometry

    This is the peer reviewed version of the following article: Pérez-Benito, F.J., Conejero, J.A., Sáez, C., García-Gómez, J.M., Navarro-Pardo, E., Florencio, L.L. and Fernández-de-las-Peñas, C. (2020), Subgrouping Factors Influencing Migraine Intensity in Women: A Semi-automatic Methodology Based on Machine Learning and Information Geometry. Pain Pract, 20: 297-309, which has been published in final form at https://doi.org/10.1111/papr.12854. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving. Background: Migraine is a heterogeneous condition with multiple clinical manifestations. Machine learning algorithms permit the identification of population groups, providing analytical advantages over other modeling techniques. Objective: The aim of this study was to analyze critical features that permit the differentiation of subgroups of patients with migraine according to the intensity and frequency of attacks by using machine learning algorithms. Methods: Sixty-seven women with migraine participated. Clinical features of migraine, related disability (Migraine Disability Assessment Scale), anxiety/depressive levels (Hospital Anxiety and Depression Scale), anxiety state/trait levels (State-Trait Anxiety Inventory), and pressure pain thresholds (PPTs) over the temporalis, neck, second metacarpal, and tibialis anterior were collected. Physical examination included the flexion-rotation test, cervical range of motion, forward head position while sitting and standing, passive accessory intervertebral movements (PAIVMs) with headache reproduction, and joint positioning sense error. Subgrouping was based on machine learning algorithms by using the nearest neighbors algorithm, multisource variability assessment, and random forest model. Results: For migraine intensity, group 2 (women with a regular migraine headache intensity score of 7 on an 11-point Numeric Pain Rating Scale [where 0 = no pain and 10 = maximum pain]) were younger and had lower joint positioning sense error in cervical rotation, greater cervical mobility in rotation and flexion, lower flexion-rotation test scores, positive PAIVMs reproducing migraine, normal PPTs over the tibialis anterior, shorter migraine history, and lower cranio-vertebral angles while standing than the remaining migraine intensity subgroups. The most discriminative variable was the flexion-rotation test score of the symptomatic side. For migraine frequency, no model was able to identify differences between groups (i.e., patients with episodic or chronic migraine). Conclusions: A subgroup of women with migraine who had common migraine intensity was identified with machine learning algorithms.

    In silico phenotyping via co-training for improved phenotype prediction from genotype

    Motivation: Predicting disease phenotypes from genotypes is a key challenge in medical applications in the postgenomic era. Large training datasets of patients that have been both genotyped and phenotyped are the key requisite when aiming for high prediction accuracy. With current genotyping projects producing genetic data for hundreds of thousands of patients, large-scale phenotyping has become the bottleneck in disease phenotype prediction. Results: Here we present an approach for imputing missing disease phenotypes given the genotype of a patient. Our approach is based on co-training, which predicts the phenotype of unlabeled patients based on a second class of information, e.g. clinical health record information. Augmenting training datasets by this type of in silico phenotyping can lead to significant improvements in prediction accuracy. We demonstrate this on a dataset of patients with two diagnostic types of migraine, termed migraine with aura and migraine without aura, from the International Headache Genetics Consortium. Conclusions: Imputing missing disease phenotypes for patients via co-training leads to larger training datasets and improved prediction accuracy in phenotype prediction. Availability and implementation: The code can be obtained at: http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/co-training.html Supplementary information: Supplementary data are available at Bioinformatics online.

    COVID-19: Symptoms Clustering and Severity Classification Using Machine Learning Approach

    COVID-19 is an extremely contagious illness that causes conditions ranging from the common cold to more chronic illnesses or even death. The constant mutation of COVID-19 into new variants makes it important to identify the symptoms of COVID-19 in order to contain the infection. Clustering and classification in machine learning are in mainstream use in different areas of research, and in recent years especially to generate useful knowledge on the COVID-19 outbreak. Many researchers have shared their COVID-19 data on public databases and many studies have been carried out. However, the merit of the dataset is unknown and analysis needs to be carried out by researchers to check its reliability. The dataset used in this work was sourced from the Kaggle website. The data was obtained through a survey collected from participants of various genders and ages who had been to at least ten countries. There are four levels of severity based on the COVID-19 symptoms, which were developed in accordance with World Health Organization (WHO) and Indian Ministry of Health and Family Welfare recommendations. This paper presents an inquiry into the dataset utilising supervised and unsupervised machine learning approaches in order to better comprehend the dataset. In this study, the analysis of the severity group based on the COVID-19 symptoms using supervised learning techniques employed a total of seven classifiers, namely K-NN, Linear SVM, Naive Bayes, Decision Tree (J48), Ada Boost, Bagging, and Stacking. For the unsupervised learning techniques, the clustering algorithms utilized in this work are Simple K-Means and Expectation-Maximization. From the results obtained from both supervised and unsupervised learning techniques, we observed that the analysis yielded relatively poor classification and clustering results. The findings for the dataset analysed in this study do not appear to provide the correct result for the symptoms categorized against the severity level, which raises concerns about the validity and reliability of the dataset.

    Machine Learning in Chronic Pain Research: A Scoping Review

    Given the high prevalence and associated cost of chronic pain, it has a significant impact on individuals and society. Improvements in the treatment and management of chronic pain may increase patients’ quality of life and reduce societal costs. In this paper, we evaluate state-of-the-art machine learning approaches in chronic pain research. A literature search was conducted using the PubMed, IEEE Xplore, and Association for Computing Machinery (ACM) Digital Library databases. Relevant studies were identified by screening titles and abstracts for keywords related to chronic pain and machine learning, followed by analysing full texts. Two hundred and eighty-seven publications were identified in the literature search. In total, fifty-three papers on chronic pain research and machine learning were reviewed. The review showed that while many studies have emphasised machine learning-based classification for the diagnosis of chronic pain, far less attention has been paid to the treatment and management of chronic pain. More research is needed on machine learning approaches to the treatment, rehabilitation, and self-management of chronic pain. As with other chronic conditions, patient involvement and self-management are crucial. In order to achieve this, patients with chronic pain need digital tools that can help them make decisions about their own treatment and care.