Framework to predict the metabolic syndrome without doing a blood test: based on machine learning for a clinical decision support system
Metabolic Syndrome (MetS) is a cluster of risk factors that increase the likelihood of heart disease and diabetes mellitus, and researchers have recently linked it to worse outcomes for the novel Covid-19 disease. Timely diagnosis is crucial for taking preventive measures, especially for patients in locations without proper laboratories and medical consultations. This work presents a new model to diagnose metabolic syndrome using machine learning and non-biochemical variables that healthcare professionals can obtain from initial consultations. To evaluate and compare the model, this work also proposes a new methodology for data mining research called RAMAD. The methodology standardizes the comparison of the novel model with similar classification models, using their reported variables and previously obtained data from a study in Colombia, and applies the holdout and random subsampling validation methods to obtain performance indicators for the models. The resulting ANN model used three hidden layers and only Hip Circumference, dichotomous Waist Circumference, and dichotomous blood pressure variables. It gave an Area under the Receiver Operating Characteristic curve (AROC) of 87.75% under the International Diabetes Federation (IDF) and 85.12% under the Harmonized Diagnosis or Joint Interim Statement (HMS) diagnosis criteria, higher than previous models. Thanks to the new methodology, diagnosis models can be thoroughly documented for appropriate future comparisons, thus benefiting the diagnosis of the studied diseases. Medical personnel need to know the factors involved in the syndrome to start a treatment, so this work also presents the segmentation of metabolic syndrome into types related to each biochemical variable. It uses the RAMAD methodology together with several machine learning techniques to design a framework that predicts MetS and its types without a blood test, using only anthropometric and clinical information.
The results show that the system predicts six MetS types, each combining several of the factors mentioned above, with AROC values ranging from 71% to 96%, and an AROC of 82.86%. This thesis finishes with the proposal of using a SCRUM Thinking framework for creating mobile health applications to implement the new models and serve as decision tools for healthcare professionals. The standard and fundamental characteristics were analyzed, finding the quality attributes verified in the framework’s early stages. Keywords: Metabolic Syndrome, Segmentation, Quine–McCluskey, Random Subsampling validation, RAMAD, Machine learning, Framework, International Diabetes Federation (IDF), Harmonized Diagnosis or Joint Interim Statement (HMS). Degree: Doctorate (Doctor en Ingeniería de Sistemas y Computación).
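The holdout and random subsampling validation named in this abstract can be illustrated with a minimal sketch. It is not the RAMAD procedure itself, whose details the abstract does not give. The function names, the 30% test fraction, the number of rounds, and the `fit`/`predict` callables are all assumptions made for illustration; labels are assumed to be coded 0/1.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve, computed as the Mann-Whitney rank statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_subsampling_auc(X, y, fit, predict, rounds=10, test_frac=0.3, seed=0):
    """Repeat a random holdout split several times and average the test AUC.

    With rounds=1 this reduces to a single holdout evaluation.
    `fit(X_train, y_train)` and `predict(model, x)` are hypothetical callables
    supplied by the caller; `predict` must return a score, not a class label.
    """
    rng = random.Random(seed)
    idx = list(range(len(y)))
    n_test = max(1, int(len(y) * test_frac))
    aucs = []
    for _ in range(rounds):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        y_test = [y[i] for i in test]
        if len(set(y_test)) < 2:
            continue  # skip splits whose test part contains a single class
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [predict(model, X[i]) for i in test]
        aucs.append(auc(y_test, scores))
    if not aucs:
        raise ValueError("no split contained both classes")
    return sum(aucs) / len(aucs)
```

Averaging over several random splits reduces the dependence of the reported AROC on one particular partition, which is presumably why the abstract pairs it with the plain holdout method.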
Training and Classification of PCA with LRM model for Diabetes Prediction
The number of families with members diagnosed with diabetes mellitus is increasing exponentially because of lifestyle and other non-determinable factors. Most patients pay little attention to the consequences they face or to the risk factors that threaten them. In this work, we establish a novel model for predicting type 2 diabetes mellitus (T2DM) based on data mining methods. Our main goals are to improve the accuracy of the prediction model and to avoid limiting the model to a single data set. The model comprises improved NB, DT, KSTAR, LOGISTIC REGRESSION, and SVM classifiers, compared across pre-processing techniques. To compare our results with those of other researchers, we use the Pima Indians diabetes data set and the Waikato Environment for Knowledge Analysis (WEKA) toolkit. In addition, the model we intend to implement assumes adequate data set quality. For further analysis, we applied it to two more diabetes data sets, both of which gave satisfactory results. The model is therefore expected to be valuable for the field of diabetology.
GIHAT: An Efficient Prediction Technique for Measure for Diabetes Mellitus
The healthcare industry is a constantly developing field that generates vast amounts of data every day. The modernization of the sector is directly associated with this growth. The resulting data sets are partly structured but mostly unstructured in nature. They must be processed with the utmost care to extract complete, usable patterns for qualitative and predictive analyses. Even after processing, these huge data records remain very complex to use. Diabetes is a lifelong disease marked by elevated levels of sugar in the blood. It is the second leading cause of blindness and renal disease worldwide. Type 2 diabetes mellitus (T2DM) is a serious and costly metabolic disease that is a growing concern among people. T2DM is associated with various comorbid conditions that can lead to negative patient outcomes. Comorbid chronic pain is very common in T2DM because of the presence of diabetic neuropathy and musculoskeletal conditions associated with prolonged hyperglycemia. Using the General Integrated High Availability Transaction (GIHAT) algorithm, this paper concentrates on the causes, types, and factors influencing diabetes mellitus (DM), preventive measures, and treatment of diabetes, other than those directly associated with diabetic patients' structured and unstructured data sets. The algorithm is implemented in the R programming language, which is used for statistical analysis, and provides accurate results compared with existing algorithms.
A Predictive Model for Diabetes Mellitus Using Machine Learning Techniques (A Study in Nigeria)
Diabetes Mellitus (DM) is a metabolic disorder in which the blood sugar level is high because of inadequate insulin in the body, leading to myriad complications. The World Health Organization in 2021 indicated that in 2019, diabetes was the direct cause of 1.5 million deaths. Though some research has been carried out in the area of DM prediction in high-income countries, not much has been done in middle/low-income countries like Nigeria, using factors that are peculiar to their environment. This paper, therefore, aims to develop a machine learning model that predicts DM in individuals at an early stage. The study identified nine DM attributes and used three supervised learning algorithms, K Nearest Neighbors (KNN), decision trees, and artificial neural networks (ANN), to predict DM from a locally collected dataset in Nigeria. The results indicate that ANN produced the highest accuracy, at 97.40%.
Computing resources sensitive parallelization of neural networks for large scale diabetes data modelling, diagnosis and prediction
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Diabetes has become one of the most severe diseases due to an increasing number of diabetes patients globally. A large amount of digital data on diabetes has been collected through various channels. How to utilize these data sets to help doctors make decisions on the diagnosis, treatment and prediction of diabetic patients poses many challenges to the research community. The thesis investigates mathematical models with a focus on neural networks for large scale diabetes data modelling and analysis by utilizing modern computing technologies such as grid computing and cloud computing. These computing technologies provide users with an inexpensive way to access extensive computing resources over the Internet for solving data and computationally intensive problems. This thesis evaluates the performance of seven representative machine learning techniques in the classification of diabetes data, and the results show that the neural network produces the best classification accuracy but incurs high overhead in data training. As a result, the thesis develops MRNN, a parallel neural network model based on the MapReduce programming model, which has become an enabling technology in support of data intensive applications in the clouds.
By partitioning the diabetic data set into a number of equally sized data blocks, the training workload is distributed among a number of computing nodes to speed up data training. MRNN is first evaluated in small scale experimental environments using 12 mappers and subsequently in large scale simulated environments using up to 1000 mappers. Both the experimental and simulation results have shown the effectiveness of MRNN in classification and its high scalability in data training.
MapReduce does not have a sophisticated job scheduling scheme for heterogeneous computing environments in which the computing nodes may have varied computing capabilities. For this purpose, this thesis develops a load balancing scheme based on genetic algorithms that aims to balance the training workload among heterogeneous computing nodes. The nodes with greater computing capacity receive more MapReduce jobs for execution. Divisible load theory is employed to guide the evolutionary process of the genetic algorithm in order to achieve fast convergence. The proposed load balancing scheme is evaluated in large scale simulated MapReduce environments with varied levels of heterogeneity using different sizes of data sets. All the results show that the genetic algorithm based load balancing scheme significantly reduces the makespan in job execution in comparison with the time consumed without load balancing. This work is funded by the EPSRC and China Market Association.
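The idea that nodes with greater capacity should receive more of the training workload can be illustrated with a small sketch. This is not the thesis's genetic algorithm; it only shows a capacity-proportional split of records, which is the kind of allocation divisible load theory favours, and compares its makespan with an equal split. The function names and the capacity figures are hypothetical.

```python
def split_by_capacity(n_records, capacities):
    """Split n_records into per-node block sizes proportional to node capacity."""
    total = sum(capacities)
    exact = [n_records * c / total for c in capacities]
    sizes = [int(e) for e in exact]
    # Hand the records lost to rounding to the largest fractional remainders.
    leftover = n_records - sum(sizes)
    order = sorted(range(len(capacities)),
                   key=lambda i: exact[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


def makespan(sizes, capacities):
    """Time until the slowest node finishes, with capacity in records per unit time."""
    return max(s / c for s, c in zip(sizes, capacities))


# Hypothetical cluster: two slow nodes and one node that is twice as fast.
capacities = [1, 1, 2]
equal = split_by_capacity(1000, [1, 1, 1])       # equally sized blocks
balanced = split_by_capacity(1000, capacities)   # capacity-aware blocks
```

In this toy setting the equal split leaves the fast node idle while the slow nodes finish, so its makespan is larger than that of the capacity-aware split.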
Utilizing Temporal Information in The EHR for Developing a Novel Continuous Prediction Model
Type 2 diabetes mellitus (T2DM) is a nation-wide prevalent chronic condition, which includes direct and indirect healthcare costs. T2DM, however, is a preventable chronic condition based on previous clinical research. Many prediction models were based on the risk factors identified by clinical trials. One of the major tasks of the T2DM prediction models is to estimate the risks for further testing by HbA1c or fasting plasma glucose to determine whether the patient has or does not have T2DM because nation-wide screening is not cost-effective.
Those models had substantial limitations on data quality, such as missing values. In this dissertation, I tested the conventional models which were based on the most widely used risk factors to predict the possibility of developing T2DM. The AUC was an average of 0.5, which implies the conventional model cannot be used to screen for T2DM risks. Based on this result, I further implemented three types of temporal representations, including non-temporal representation, interval-temporal representation, and continuous-temporal representation for building the T2DM prediction model. According to the results, continuous-temporal representation had the best performance. Continuous-temporal representation was based on deep learning methods. The result implied that the deep learning method could overcome the data quality issue and could achieve better performance.
This dissertation also contributes to a continuous risk output model based on the seq2seq model. This model can generate a monotonic increasing function for a given patient to predict the future probability of developing T2DM. The model is workable but still has many limitations to overcome.
Finally, this dissertation identifies some risk factors that are underestimated and worthy of further research to revise the current T2DM screening guideline. The results are still preliminary. I need to collaborate with an epidemiologist and experts from other fields to verify the findings. In the future, the methods for building a T2DM prediction model can also be used for prediction models of other chronic conditions.
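The continuous risk output described in this abstract is a function of time that never decreases. One common way to guarantee that property, shown here purely as an assumption and not as the dissertation's seq2seq design, is to let a model emit a per-interval hazard between 0 and 1 and accumulate the hazards into a cumulative probability. The function name and the hazard values in the test are hypothetical.

```python
def cumulative_risk(hazards):
    """Turn per-interval hazards into a non-decreasing cumulative risk curve.

    hazards[i] is the assumed probability of developing the condition in
    interval i, given that it has not developed before interval i.
    """
    survival = 1.0  # probability of still being free of the condition
    curve = []
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        survival *= 1.0 - h
        curve.append(1.0 - survival)
    return curve
```

Because the survival term can only shrink or stay the same, the curve cannot decrease, whatever values the underlying model produces for the individual intervals.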
Machine Learning of Lifestyle Data for Diabetes
Self-Monitoring of Blood Glucose (SMBG) for Type-2 Diabetes (T2D) remains highly challenging for both patients and doctors due to the complexities of diabetic lifestyle data logging and insufficient short-term and personalized recommendations/advice. Recent mobile diabetes management systems have been proven clinically effective in facilitating self-management. However, most such systems have poor usability and limited data analytic functionality. These two challenges are connected and affect each other. Easier data recording yields better data for the applicable data analytic algorithms. On the other hand, irrelevant or inaccurate data input will certainly introduce errors and noise. The output of data analysis, in the form of potentially valuable patterns or knowledge, could be an incentive for users to contribute more data.
We believe that the incorporation of machine learning technologies in mobile diabetes management could tackle these challenges simultaneously. In this thesis, we propose, build, and evaluate an intelligent mobile diabetes management system, called GlucoGuide, for T2D patients. GlucoGuide conveniently aggregates varieties of lifestyle data collected via mobile devices, analyzes the data with machine learning models, and outputs recommendations.
The most complicated part of SMBG is diet management. GlucoGuide aims to address this crucial issue using classification models and camera-based automatic data logging. The proposed model classifies each food item into three recommendation classes using its nutrient and textual features. Empirical studies show that the food classification task is effective.
A lifestyle-data-driven recommendation framework in GlucoGuide can output short-term and personalized recommendations of lifestyle changes to help patients stabilize their blood glucose level. To evaluate the performance and clinical effectiveness of this framework, we conducted a three-month clinical trial on human subjects, in collaboration with Dr. Petrella (MD). Due to the high cost and complexity of trials on humans, a small but representative subject group was involved. Two standard laboratory blood tests for diabetes were used before and after the trial. The results are quite remarkable. Generally speaking, GlucoGuide amounted to turning an early diabetic patient into a pre-diabetic one, and a pre-diabetic patient into a non-diabetic one, in only three months, depending on their pre-trial diabetic conditions. This clinical dataset has also been expanded and enhanced to generate scientifically controlled artificial datasets. Such datasets can be used for a variety of empirical machine learning studies in our ongoing and future research.
GlucoGuide is now a university spin-off, allowing us to collect practical diabetic lifestyle data on a large scale and to make a potential impact on diabetes treatment and management.
A Rule Based Classification Model to Predict Colon Cancer Survival
Introduction: Colon cancer is the second most common cancer in the world and the fourth most common cancer in both sexes in Iran, where it accounts for 8.12% of all cancers. Predicting the outcome of cancer from basic clinical data is very important, and data mining techniques can be used for this purpose. In our country, data mining studies have covered colon cancer less than lung or breast cancer. Identifying the factors that influence survival and modifying them may increase the survival of colon cancer patients. Given the high rate of colon cancer and the benefits of data mining for predicting survival, this study examined the factors influencing the survival of these patients.
Materials and Methods: We used a dataset with four attributes containing the records of 570 patients, of whom 327 (57.4%) were male and 243 (42.6%) were female. The Trees Random Forest (TRF), AdaBoost (AD), RBF Network (RBFN), and Multilayer Perceptron (MLP) machine learning techniques were used with 10-fold cross-validation in the proposed model for predicting colon cancer survival. The performance of the techniques was evaluated by accuracy, precision, sensitivity, specificity, and area under the ROC curve.
Results: Of the 570 patients, 338 were alive and 232 were dead. Among these techniques, Trees Random Forest (TRF) appeared to give better results than the others (AD, RBFN, and MLP). The accuracy, sensitivity, specificity, and area under the ROC curve of TRF were 0.76, 0.808, 0.70, and 0.83, respectively.
Conclusions: In this study, the Trees Random Forest (TRF) model, a rule-based classification model, appeared to be the best model, with the highest accuracy. This model is therefore recommended as a useful tool for colon cancer survival prediction as well as for medical decision making.
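The evaluation measures named in this abstract (accuracy, precision, sensitivity, specificity) all follow from the four cells of a confusion matrix. The sketch below shows the standard definitions. The abstract does not say which outcome is treated as the positive class, and the counts used in the test are hypothetical, not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics from confusion-matrix counts.

    tp/fn: positive cases predicted correctly/incorrectly.
    tn/fp: negative cases predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "precision": tp / (tp + fp),      # share of predicted positives that are real
        "sensitivity": tp / (tp + fn),    # share of real positives that are found
        "specificity": tn / (tn + fp),    # share of real negatives that are found
    }
```

Under 10-fold cross-validation, these counts would typically be accumulated over the ten held-out folds before the metrics are computed.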