6 research outputs found

    Liver imaging features by convolutional neural network to predict the metachronous liver metastasis in stage I-III colorectal cancer patients based on preoperative abdominal CT scan

    Background: Applying deep learning to medical images has made a large amount of previously un-decoded information usable in clinical research. Most work, however, has focused on the performance of prediction models for disease-related entities rather than on the clinical implication of the features themselves. Here we analyzed liver imaging features of abdominal CT images collected from 2019 patients with stage I–III colorectal cancer (CRC) using a convolutional neural network (CNN) to elucidate their clinical implication from an oncological perspective. Results: The CNN generated imaging features from the liver parenchyma. The dimension of the features was reduced by principal component analysis. We designed multiple prediction models for 5-year metachronous liver metastasis (5YLM) using combinations of clinical variables (age, sex, T stage, N stage) and top principal components (PCs), with logistic regression classification. The model using the first PC (PC1) plus clinical information had the highest performance in predicting 5YLM (mean AUC = 0.747), compared with the model using clinical features alone (mean AUC = 0.709). PC1 was independently associated with 5YLM in multivariate analysis (beta = −3.831, P < 0.001). For the 5-year mortality rate, PC1 did not improve on the model with clinical features alone. Kaplan-Meier plots showed a significant difference between the low-PC1 and high-PC1 groups: 5YLM-free survival was 89.6% in the low-PC1 group and 95.9% in the high-PC1 group. In addition, PC1 correlated significantly with sex, body mass index, alcohol consumption, and fatty liver status. Conclusion: The imaging features combined with clinical information improved performance compared with the standardized prediction model using only clinical information. The liver imaging features generated by the CNN may have the potential to predict liver metastasis.
These results suggest that even though no liver metastasis was present at the time of the primary colectomy, liver imaging features can carry characteristics that are predictive of metachronous liver metastasis. Support for the design of the study, the analysis and interpretation of data, and the writing of the manuscript was provided by NLM R01 LM012535. Publication costs are funded by NLM R01 funding (LM012535).
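
The model comparison in this abstract rests on the AUC. As a minimal sketch of how that metric can be computed for two competing score sets, the snippet below uses the pairwise (Mann-Whitney) definition of AUC. The labels, scores, and variable names are hypothetical toy values chosen for illustration; they are not data from the study, and this is not the authors' code.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = metastasis within 5 years, 0 = none.
labels = [0, 0, 1, 1]
# Hypothetical predicted probabilities from two models.
clinical_only = [0.1, 0.4, 0.35, 0.8]
clinical_plus_pc1 = [0.1, 0.3, 0.35, 0.8]

auc_clinical = auc(labels, clinical_only)
auc_combined = auc(labels, clinical_plus_pc1)
print(auc_clinical, auc_combined)
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be the usual choice for a large cohort.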

    Development of a Survival Prediction Model for Korean Disease-Free Lung Cancer Survivors Using Patient-Reported Outcome Measures

    Doctoral dissertation, Seoul National University Graduate School, College of Medicine, Department of Biomedical Sciences, February 2018. Young Ho Yun. Introduction: The prediction of lung cancer survival is a crucial factor for successful cancer survivorship and follow-up planning. The principal objective of this study is to construct a novel Korean prognostic model of 5-year survival among disease-free lung cancer survivors using socio-clinical and HRQOL variables, and to compare its predictive performance with that of a prediction model based on traditionally known clinical variables. Diverse techniques, including the Cox proportional hazards model and machine learning technologies (MLT), were applied in the modeling process. Methods: Data from 809 survivors who underwent lung cancer surgery between 1994 and 2002 at two Korean tertiary teaching hospitals were used. The following were selected as independent variables for the prognostic model through literature review and univariate analysis: clinical and socio-demographic variables, including age, sex, stage, metastatic lymph node, and income; and health-related quality of life (HRQOL) factors from the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire Core 30, the Quality of Life Questionnaire Lung Cancer Module, the Hospital Anxiety and Depression Scale, and the Post-traumatic Growth Inventory. Survivors' body mass index before surgery and physical activity were also chosen. The three feature sets for prediction modeling included 1) only clinical and socio-demographic variables, 2) only HRQOL and lifestyle factors, and 3) the variables from feature sets 1 and 2 together. A Cox proportional hazards regression model was constructed for each feature set, and the three models were compared in terms of discrimination and calibration using the C-statistic and the Hosmer-Lemeshow chi-square statistic.
Further, four machine learning algorithms, decision tree (DT), random forest (RF), bagging, and adaptive boosting (AdaBoost), were applied to the three feature sets, and their performances were compared with one another. The performance of the MLT-based predictive models was internally validated by K-fold cross-validation. Results: In the Cox modeling, Model Cox-3 (based on feature set 3: HRQOL factors added to clinical and socio-demographic variables) showed the highest area under the curve (AUC = 0.809) compared with the two other Cox regression models (Cox-1 and Cox-2). When the same modeling approach was applied to the MLT-based models, the most effective models were Model DT-3 for DT, Model RF-3 for RF, Model Bag-3 for bagging, and Model AdaBoost-3 for AdaBoost, each showing the highest accuracy within its MLT. Model RF-3, Model Bag-3, and Model AdaBoost-3 showed the highest accuracy even after K-fold cross-validation was conducted. Conclusions: With HRQOL factors added to clinical and socio-demographic variables, the proposed model proved useful for predicting lung cancer survival, whether built with the Cox model or with MLT algorithms. Improved accuracy of lung cancer survival prediction has the potential to help clinicians and survivors make more meaningful decisions about future plans and support for cancer care.

I. INTRODUCTION
    A. Background
        1. Lung cancer statistics
        2. The importance of suggesting survival prediction model to cancer survivors
        3. HRQOL and lifestyle measurement as important predictors for lung cancer survival
        4. Traditional survival analysis versus machine learning techniques (MLTs)
    B. Hypothesis and objectives
        1. Hypothesis
        2. Objectives
II. MATERIALS AND METHODS
    A. Study subjects
        1. Subject selection
        2. Data collection
            2.1. Socio-demographic and clinical variables
            2.2. Patient lifestyle characteristics
        3. Study process
    B. Prognostic variables selection and data preprocessing
        1. Prognostic variables selection
            1.1. Literature review for the selection of candidate predictors
            1.2. Grading the evidence and mapping into the conceptual framework
            1.3. Examination of prognosis variables selection from statistical analyses
        2. Data preprocessing
            2.1. Data cleaning, missing imputation
            2.2. Test of multi-collinearity
            2.3. Decisions of cut-off points
            2.4. Data sampling for data balancing, SMOTE
            2.5. Data splitting (holdout strategy)
    C. Model development
        1. Cox model development
        3. Random forest model
        4. Bagging (bootstrap aggregating)
        5. Adaptive boosting (AdaBoost)
    D. Model validation
        1. Model validation for Cox model
            1.1. Discrimination for Cox model
            1.2. Calibration for Cox model
        2. Model validation of other MLTs
        3. K-fold Cross Validation for MLT based prediction models to avoid over-fitting
III. RESULTS
    A. Literature review for selection of candidate predictors
        1. Selection of candidate prognostic factors with literature review
        2. Model constructing feature sets with selecting prognostic factors
    B. Baseline characteristics
        1. Demographics of participants characteristics and survival data
        2. Candidate selection from statistical analyses
            2.1. Univariate analysis of HRQOL mean scores between non-event and event groups
            2.2. Univariate analysis of BMI, weight change, and MET of lung cancer survivors
        3. Final candidate variable selection for phased modeling
        4. Result of data preprocessing
            4.1. Missing imputation
    C. Model development
        1. Cox model development
            1.1. Prediction model based on Cox regression analysis
            1.2. Final prediction model equation for Cox models
        2. Decision tree model development
            2.1. Assessment of the relative importance and model developing
            2.2. Selecting CP value for decision tree pruning using rpart packages
        3. Random forest model development
        4. Bagged decision tree model development
        5. AdaBoost model development
        6. Developed models applied with MLTs
    D. Model validation and performance
        1. Cox proportional hazard ratio model internal validation
            1.1. Discrimination
            1.2. Calibration
        2. Comparison model performance of Cox model and other MLTs
IV. DISCUSSION
    A. Literature review for selection of candidate predictors
    B. Model development using Cox and other MLTs
    C. Model validation of Cox regression model and application of the predictive models to other MLT based models
    D. Clinical and practical implications
    E. Strengths and limitations of this study
CONCLUSION
REFERENCES
Abstract in Korean
APPENDIX
Docto
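
The abstract of this entry evaluates discrimination of the Cox models with the C-statistic. As a rough illustration of what that statistic measures, the sketch below computes Harrell's concordance index for right-censored data in plain Python. The follow-up times, event flags, and risk scores are hypothetical toy values, and the function is an illustrative assumption, not code from the dissertation (whose outline mentions R's rpart package).

```python
def concordance_index(times, events, risks):
    """Harrell's C: among comparable pairs, the fraction in which the
    subject who died earlier was assigned the higher predicted risk.
    A pair (i, j) is comparable when subject i had an observed event
    and a strictly shorter follow-up time than subject j."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: follow-up in years, 1 = death observed, 0 = censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.5, 0.1]  # hypothetical model risk scores

c_stat = concordance_index(times, events, risks)
print(c_stat)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why the C-statistic is read on the same scale as the AUC.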

    Development and evaluation of machine learning algorithms for biomedical applications

    Gene network inference and drug response prediction are two important problems in computational biomedicine. The former helps scientists better understand the functional elements and regulatory circuits of cells. The latter helps physicians understand which treatments are effective for their patients. Both problems have been widely studied, though current solutions are far from perfect, and more research is needed to improve the accuracy of existing approaches. This dissertation develops machine learning and data mining algorithms and applies them to these two biomedical problems. Specifically, to tackle the gene network inference problem, the dissertation proposes (i) new techniques for selecting topological features suitable for link prediction in gene networks; (ii) a graph sparsification method for network sampling; (iii) combined supervised and unsupervised methods to infer gene networks; and (iv) sampling and boosting techniques for reverse engineering gene networks. For the drug sensitivity prediction problem, the dissertation presents (i) an instance selection technique and hybrid method for drug sensitivity prediction; (ii) a link prediction approach to drug sensitivity prediction; (iii) a noise-filtering method for drug sensitivity prediction; and (iv) transfer learning approaches for enhancing the performance of drug sensitivity prediction. Substantial experiments are conducted to evaluate the effectiveness and efficiency of the proposed algorithms. Experimental results demonstrate the feasibility of the algorithms and their superiority over existing approaches.
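
The abstract above mentions topological features for link prediction in gene networks without naming them. As a generic illustration only, the sketch below computes two standard topological features, common-neighbor count and Jaccard coefficient, for candidate gene pairs. The gene names and edges are hypothetical, and these features are assumed examples, not necessarily the ones the dissertation selects.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by all neighbors of u or v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


# Hypothetical toy gene network.
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g3", "g4")]
adj = build_adjacency(edges)

# Feature vector for a candidate (currently unlinked) gene pair.
candidate = ("g1", "g4")
features = (common_neighbors(adj, *candidate), jaccard(adj, *candidate))
print(candidate, features)
```

In a supervised setting, feature vectors like this one would be computed for known linked and unlinked pairs and passed to a classifier.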

    A Learning Health System for Radiation Oncology

    The proposed research addresses the challenges that clinical data science researchers in radiation oncology face in accessing, integrating, and analyzing heterogeneous data from various sources. The research presents a scalable intelligent infrastructure, called the Health Information Gateway and Exchange (HINGE), which captures and structures data from multiple sources into a knowledge base with semantically interlinked entities. This infrastructure enables researchers to mine novel associations and gather relevant knowledge for personalized clinical outcomes. The dissertation discusses the design framework and implementation of HINGE, which abstracts structured data from treatment planning systems, treatment management systems, and electronic health records. It uses disease-specific smart templates to capture clinical information in a discrete manner. HINGE performs data extraction, aggregation, and quality and outcome assessment functions automatically, connecting seamlessly with local IT and medical infrastructure. Furthermore, the research presents a knowledge graph-based approach to map radiotherapy data to an ontology-based data repository using FAIR (Findable, Accessible, Interoperable, Reusable) concepts. This approach ensures that the data is easily discoverable and accessible for clinical decision support systems. The dissertation explores the ETL (Extract, Transform, Load) process, data model frameworks, and ontologies, and provides a real-world clinical use case for this data mapping. To improve the efficiency of retrieving information from large clinical datasets, a search engine was developed that combines ontology-based keyword searching with a synonym-based term-matching tool. The hierarchical nature of ontologies is leveraged to retrieve patient records based on parent and child classes.
Additionally, patient similarity analysis is conducted using vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) to identify similar patients based on text corpus creation methods. Results from the analysis using these models are presented. The implementation of a learning health system for predicting radiation pneumonitis following stereotactic body radiotherapy is also discussed. 3D convolutional neural networks (CNNs) are used with radiographic and dosimetric datasets to predict the likelihood of radiation pneumonitis. DenseNet-121 and ResNet-50 models are employed for this study, along with integrated gradient techniques to identify salient regions within the input 3D image dataset. The predictive performance of the 3D CNN models is evaluated based on clinical outcomes. Overall, the proposed Learning Health System provides a comprehensive solution for capturing, integrating, and analyzing heterogeneous data in a knowledge base. It offers researchers the ability to extract valuable insights and associations from diverse sources, ultimately leading to improved clinical outcomes. This work can serve as a model for implementing LHS in other medical specialties, advancing personalized and data-driven medicine.
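
The patient similarity analysis described above compares patients through embedding vectors. The abstract does not state which similarity measure is used; as a sketch under the assumption that cosine similarity is applied, the snippet below ranks patients by similarity to a query patient. The patient identifiers and three-dimensional vectors are hypothetical stand-ins for the output of an embedding model such as Doc2Vec.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, top_k=2):
    """Rank all other patients by cosine similarity to the query patient."""
    query = embeddings[query_id]
    scored = [
        (pid, cosine_similarity(query, vec))
        for pid, vec in embeddings.items()
        if pid != query_id
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical patient embeddings (real ones would have many more dimensions).
embeddings = {
    "patient_a": [1.0, 0.0, 0.0],
    "patient_b": [0.9, 0.1, 0.0],
    "patient_c": [0.0, 1.0, 0.0],
}

ranked = most_similar("patient_a", embeddings)
print(ranked)
```

Cosine similarity ignores vector length, so two records of very different size can still score as similar if their embeddings point in the same direction.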

    Research Achievements, Division of Electronics and Information Science
