16 research outputs found
Applying Artificial Intelligence to Medical Data
Machine learning, data mining, and deep learning have become the methodologies of choice for analyzing medical data and images. In this study, we applied three different machine learning techniques to medical data and image analysis. Our first study implemented entropy with different logarithm bases for a decision tree algorithm. Our results suggested that using a higher log base can yield higher accuracy on datasets whose attributes are mostly categorical with three or more categories each. For the second study, we analyzed mental health data by tuning the parameters of the decision tree (splitting method, depth, and entropy). Our results identified the most crucial attributes for the dataset. The final study used the Kimia Path24 image dataset. We built and trained a deep convolutional neural network and tested different hypotheses about batch size, number of epochs, and learning rate. For the final study, all the hypotheses were supported by our experimental results.
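The entropy criterion with a configurable logarithm base, as used in the first study, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `entropy` and `information_gain`, the `base` parameter, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels, base=2.0):
    """Shannon entropy of a sequence of class labels, using logarithms of `base`."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


def information_gain(rows, labels, attr_index, base=2.0):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        # Collect the labels that fall under each category of the attribute.
        groups.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(g) / n * entropy(g, base) for g in groups.values())
    return entropy(labels, base) - remainder


# Hypothetical toy data: one attribute with three categories, three classes.
rows = [("low",), ("low",), ("mid",), ("mid",), ("high",), ("high",)]
labels = ["a", "a", "b", "b", "c", "c"]

for b in (2.0, 3.0):
    print(b, entropy(labels, b), information_gain(rows, labels, 0, b))
```

With three equally likely classes, the base-3 entropy is 1, which is one reason a base equal to the number of categories is a natural candidate to try.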
Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction
Deep learning (DL) based predictive models from electronic health records
(EHR) deliver impressive performance in many clinical tasks. Large training
cohorts, however, are often required to achieve high accuracy, hindering the
adoption of DL-based models in scenarios with limited training data size.
Recently, bidirectional encoder representations from transformers (BERT) and
related models have achieved tremendous successes in the natural language
processing domain. The pre-training of BERT on a very large training corpus
generates contextualized embeddings that can boost the performance of models
trained on smaller datasets. We propose Med-BERT, which adapts the BERT
framework for pre-training contextualized embedding models on structured
diagnosis data from an EHR dataset of 28,490,650 patients. Fine-tuning experiments
are conducted on two disease-prediction tasks: (1) prediction of heart failure
in patients with diabetes and (2) prediction of pancreatic cancer from two
clinical databases. Med-BERT substantially improves prediction accuracy,
boosting the area under the receiver operating characteristic curve (AUC) by
2.02-7.12%. In particular, pre-trained Med-BERT substantially improves the
performance of tasks with very small fine-tuning training sets (300-500
samples), boosting the AUC by more than 20%, or to a level equivalent to the AUC
of a training set 10 times larger. We believe that Med-BERT will benefit disease-prediction
studies with small local training datasets, reduce data collection expenses,
and accelerate the pace of artificial-intelligence-aided healthcare.
Comment: L.R., X.Y., and Z.X. share first authorship of this work
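The AUC figures reported for Med-BERT can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes AUC directly from that definition. It is illustrative only: the function name `roc_auc` and the toy labels and scores are assumptions, not data from the paper.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four patients (1 = disease, 0 = no disease).
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_score))  # 3 of 4 pairs are ranked correctly -> 0.75
```

The pairwise loop is quadratic in the number of samples, which is fine for small fine-tuning sets but would normally be replaced by a rank-based computation on large cohorts.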
Diagnosis and monitoring of Alzheimer's patients using classical and deep learning techniques
Machine-based analysis and prediction systems are widely used for the diagnosis of Alzheimer's Disease (AD). However, the lower accuracy of existing techniques and the lack of post-diagnosis monitoring systems limit the scope of such studies. In this paper, a novel machine-learning-based approach to the diagnosis and monitoring of AD-like diseases is proposed. Diagnosis is performed by analysing magnetic resonance imaging (MRI) scans using deep learning, and is followed by an activity monitoring framework that tracks the subjects’ activities of daily living using body-worn inertial sensors. The activity monitoring provides an assistive framework for daily life activities and evaluates the vulnerability of patients based on their activity level. The AD diagnosis results show up to 82% improvement in comparison to well-known existing techniques. Moreover, above 95% accuracy is achieved in classifying the activities of daily living, which is encouraging for monitoring the activity profile of the subject.
Digital Rock Segmentation for Petrophysical Analysis With Reduced User Bias Using Convolutional Neural Networks
Pore‐scale digital images are usually obtained from microcomputed tomography data that has been segmented into void and grain space. Image segmentation is a crucial step in the process of digital rock analysis that can influence pore‐scale characterization studies and/or the numerical simulation of petrophysical properties. This is concerning since all segmentation methods have user‐selected parameters that result in biases. Convolutional neural networks (CNNs) provide a way forward since, once trained, a CNN can provide consistent and reliable image segmentation with no user‐defined inputs. In this paper, a CNN is used to segment digital sandstone data, and various ground truth data sets are tested. The ground truth images are created based on high‐resolution microcomputed tomography data and corresponding scanning electron microscope data. The results are evaluated in terms of porosity, permeability, and pore size distribution computed from the segmented data. We find that watershed‐based segmentation provides a wide range of possible petrophysical values depending on user‐selected thresholds, whereas CNN provides a smaller variance when trained on scanning electron microscope data. It can be concluded that CNN offers a reliable and consistent way to segment digital sandstone data for petrophysical analyses.
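To illustrate how a user-selected parameter can shift a petrophysical value, the sketch below segments a tiny grayscale slice with a plain global threshold and computes porosity as the void fraction. This is not the watershed or CNN method from the paper: simple thresholding is swapped in for brevity, and the function names, the convention that dark pixels are void, and the pixel values are all assumptions.

```python
def threshold_segment(gray, threshold):
    """Label each pixel 0 (void) if darker than `threshold`, else 1 (grain)."""
    return [[0 if value < threshold else 1 for value in row] for row in gray]


def porosity(segmented, void_label=0):
    """Fraction of pixels in a 2D segmented slice that are labelled as void."""
    pixels = [label for row in segmented for label in row]
    return sum(1 for label in pixels if label == void_label) / len(pixels)


# Hypothetical 2x3 grayscale slice (0 = darkest, 255 = brightest).
gray = [
    [10, 200, 90],
    [120, 30, 220],
]

for t in (50, 100):
    print(t, porosity(threshold_segment(gray, t)))
```

Two thresholds that a user might reasonably pick give different porosities for the same slice, which is the kind of sensitivity a trained network with no user-defined inputs avoids.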
Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker
Machine learning analysis of neuroimaging data can accurately predict chronological age in healthy people. Deviations from healthy brain ageing have been associated with cognitive impairment and disease. Here we sought to further establish the credentials of 'brain-predicted age' as a biomarker of individual differences in the brain ageing process, using a predictive modelling approach based on deep learning, specifically convolutional neural networks (CNN), applied to both pre-processed and raw T1-weighted MRI data. Firstly, we aimed to demonstrate the accuracy of CNN brain-predicted age using a large dataset of healthy adults (N = 2001). Next, we sought to establish the heritability of brain-predicted age using a sample of monozygotic and dizygotic female twins (N = 62). Thirdly, we examined the test-retest and multi-centre reliability of brain-predicted age using two samples (within-scanner N = 20; between-scanner N = 11). CNN brain-predicted ages were generated and compared to a Gaussian Process Regression (GPR) approach on all datasets. Input data were grey matter (GM) or white matter (WM) volumetric maps generated by Statistical Parametric Mapping (SPM), or raw data. CNN accurately predicted chronological age using GM (correlation between brain-predicted age and chronological age r = 0.96, mean absolute error [MAE] = 4.16 years) and raw (r = 0.94, MAE = 4.65 years) data. This was comparable to GPR brain-predicted age using GM data (r = 0.95, MAE = 4.66 years). Brain-predicted age was a heritable phenotype for all models and input data (h² ≥ 0.5). Brain-predicted age showed high test-retest reliability (intraclass correlation coefficient [ICC] = 0.90–0.99). Multi-centre reliability was more variable, with high ICCs for GM (0.83–0.96) and poor-to-moderate levels for WM and raw data (0.51–0.77). Brain-predicted age represents an accurate, highly reliable and genetically influenced phenotype that has potential to be used as a biomarker of brain ageing. Moreover, age predictions can be accurately generated on raw T1-MRI data, substantially reducing computation time for novel data and bringing the process closer to giving real-time information on brain health in clinical settings.
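The two accuracy measures quoted for brain-predicted age, the Pearson correlation r and the mean absolute error (MAE), can be computed as in the sketch below. The ages and predictions are made-up numbers for illustration and the function names are assumptions; none of this reproduces the study's data or pipeline.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical chronological ages and model predictions, in years.
chronological = [20, 40, 60, 80]
brain_predicted = [24, 38, 65, 77]

print(mean_absolute_error(chronological, brain_predicted))  # (4 + 2 + 5 + 3) / 4 = 3.5
print(pearson_r(chronological, brain_predicted))
```

The two measures answer different questions: r shows how well predictions track age across the sample, while MAE gives the typical size of the error in years for an individual.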
Building a neural network classification model for evaluating agent policy in deep reinforcement learning, using the game Minecraft as an example
Master’s thesis: 84 pages, 19 figures, 15 tables, 30 sources.
Theme: Deep reinforcement learning for Minecraft, image classification.
This work reviews and analyzes some of the most widely used modern data mining methods. Known classification methods are studied, along with the effectiveness of ensembles of base classifiers. In addition, a neural network model is proposed for classification and for training the Q-function, and its effectiveness is demonstrated on a practical task, namely classification for the game Minecraft.
The work covers general background on machine learning and the main components of the Q-Learning methodology. The current use of modifications of Q-Learning, such as Fuzzy Q-learning, is analyzed.
The object of the study is photo and video data from the game Minecraft, together with their navigation and labeled segments.
The subject of the study is mathematical models of data mining and their ensembles for classification based on statistical data.
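The core of the Q-Learning methodology mentioned above is the update rule Q(s, a) ← Q(s, a) + α (r + γ max Q(s', a') − Q(s, a)). The sketch below applies it to a lookup table. It is a tabular simplification, not the neural network model proposed in the thesis, and the state names, action names, and the values of `alpha` and `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical Minecraft-like action set; unseen (state, action) pairs start at 0.
actions = ["forward", "turn_left", "turn_right"]
q = defaultdict(float)

# The agent moves forward from s0, receives reward 1.0, and lands in s1.
print(q_update(q, "s0", "forward", 1.0, "s1", actions))
```

In deep reinforcement learning the table is replaced by a network that maps an observation, such as a game frame, to one Q-value per action, but the update target has the same form.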
Data mining methods for decision-making in patient diagnosis
Master’s thesis: 111 pages, 12 figures, 36 tables, 2 appendices, 46 sources.
Theme: Data mining methods for diagnostic decision-making.
This work reviews and analyzes some of the most widely used modern data mining methods. Known classification methods are studied, along with the effectiveness of ensembles of base classifiers. In addition, a two-level classification model is proposed and its effectiveness is demonstrated on a practical task, namely diagnosing patients for ischemic heart disease and chronic kidney disease.
The object of the study is medical indicators (demographics, symptoms, ECG, and examination results) and their significance for successful diagnosis of the disease.
The subject of the study is mathematical models of data mining and their ensembles for classification based on statistical data.