Deep Generative Modeling Based Retinal Image Analysis
In the recent past, deep learning algorithms have been widely used in retinal image analysis (fundus and OCT) to perform tasks such as segmentation and classification. However, building robust and efficient deep learning models requires both a sufficient number of training images and training images of high quality. Image quality is also an extremely important factor in the clinical diagnosis of different diseases. The main aim of this thesis is to explore two relatively under-explored areas of retinal image analysis, namely, retinal image quality enhancement and artificial image synthesis.
In this thesis, we propose a series of deep generative modeling based algorithms to perform the above-mentioned tasks. From a mathematical perspective, a generative model is a statistical model of the joint probability distribution of an observable variable and a target variable. The generative adversarial network (GAN) and the variational auto-encoder (VAE) are popular generative models. Generative models can be used to generate new samples from a given distribution.
OCT images have inherent speckle noise. Fundus images do not suffer from noise in general, but newly developed tele-ophthalmoscope devices produce images with relatively low spatial resolution and blur. Different GAN-based algorithms were developed to generate high-quality images from their low-quality counterparts.
A combination of residual VAE and GAN was implemented to generate artificial retinal fundus images with their corresponding artificial blood vessel segmentation maps. This not only helps to generate as many new training images as needed but also helps to reduce the privacy concerns of releasing personal medical data.
Deep Learning-Based Glaucoma Diagnosis Support System
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Interdisciplinary Program in Bioengineering, 2021. 2. Kim Hee Chan.
This paper presents deep learning-based methods for improving glaucoma diagnosis support systems. Novel methods were applied to glaucoma clinical cases and the results were evaluated.
In the first study, a deep learning classifier for glaucoma diagnosis based on spectral-domain optical coherence tomography (SD-OCT) images was proposed and evaluated. SD-OCT is commonly employed as an imaging modality for the evaluation of glaucomatous structural damage. The classification model was developed using a convolutional neural network (CNN) as a base, and was trained with SD-OCT retinal nerve fiber layer (RNFL) and macular ganglion cell-inner plexiform layer (GCIPL) images. The proposed network architecture, termed Dual-Input Convolutional Neural Network (DICNN), showed great potential as an effective classification algorithm based on two input images. DICNN was trained with both RNFL and GCIPL thickness maps, which enabled it to discriminate between normal and glaucomatous eyes. The performance of the proposed DICNN was evaluated with accuracy and area under the receiver operating characteristic curve (AUC), and was compared to other methods using these metrics. Compared to other methods, the proposed DICNN model demonstrated high diagnostic ability for discriminating early-stage glaucoma patients from normal subjects. AUC, sensitivity, and specificity were 0.869, 0.921, and 0.756, respectively.
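The three metrics reported above have standard definitions, and a minimal sketch of how they can be computed may help when reading the numbers. The function names, the toy labels and scores, and the 0.5 decision threshold below are illustrative assumptions and are not taken from the thesis; AUC is computed here by the pairwise-ranking definition rather than by any particular library routine.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = glaucoma."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)      # glaucoma correctly flagged
    fn = np.sum(y_true & ~y_pred)     # glaucoma missed
    tn = np.sum(~y_true & ~y_pred)    # normal correctly cleared
    fp = np.sum(~y_true & y_pred)     # normal wrongly flagged
    return float(tp / (tp + fn)), float(tn / (tn + fp))

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true]
    neg = scores[~y_true]
    diff = pos[:, None] - neg[None, :]   # every positive/negative pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

# Hypothetical toy data, not results from the thesis.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
preds = [s >= 0.5 for s in scores]    # assumed threshold of 0.5
sens, spec = sensitivity_specificity(labels, preds)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three values are usually reported together.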
In the second study, a deep-learning method for increasing the resolution and improving the legibility of optic-disc photography (ODP) was proposed. ODP has been proven to be useful for optic nerve evaluation in glaucoma. In clinical practice, however, limited patient cooperation, small pupils, or media opacities can limit the performance of ODP. A model to enhance the resolution of ODP images, termed super-resolution, was developed using a Super-Resolution Generative Adversarial Network (SR-GAN). To train this model, high-resolution original ODP images were transformed into two counterparts: (1) down-scaled low-resolution ODPs, and (2) compensated high-resolution ODPs with enhanced visibility of the optic disc margin and surrounding retinal vessels, which were produced using a customized image post-processing algorithm. The SR-GAN was trained to learn and recognize the differences between these two counterparts. The performance of the network was evaluated using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Mean Opinion Score (MOS). The study demonstrated that deep learning can be applied to create a generative model capable of producing enhanced ophthalmic images with 4x resolution and improved structural details. The proposed method can be used to enhance ODPs and thereby significantly increase the detection accuracy of optic disc pathology. The average PSNR, SSIM, and MOS were 25.01, 0.75, and 4.33, respectively.
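For readers unfamiliar with PSNR, the following is a minimal sketch of its standard definition, 10 * log10(MAX^2 / MSE), in decibels. The function name and the assumption of 8-bit images (peak value 255) are illustrative and are not stated in the abstract; SSIM and MOS are not sketched here.

```python
import numpy as np

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images.

    max_value is the assumed peak pixel value (255 for 8-bit images).
    """
    reference = np.asarray(reference, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Hypothetical toy images: a uniform error of one grey level.
reference_image = np.zeros((4, 4))
restored_image = np.ones((4, 4))
toy_psnr = psnr(reference_image, restored_image)
```

Higher PSNR means a smaller pixel-wise error relative to the peak value. It does not measure perceived quality directly, which is presumably why it is paired with SSIM and a reader-based MOS.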
In the third study, a deep-learning model was used to classify suspected glaucoma and to predict the subsequent glaucoma onset year in glaucoma suspects using clinical data and retinal images (ODP and red-free fundus RNFL photographs). Clinical data contain useful information for glaucoma diagnosis and prediction. However, no study has investigated how combining different types of clinical information would help predict the subsequent course of glaucoma in an individual patient. For this study, image features extracted using a Convolutional Auto Encoder (CAE), along with clinical features, were used for glaucoma suspect classification and onset-year prediction. The performance of the proposed model was evaluated using accuracy and Mean Squared Error (MSE). Combining the CAE-extracted image features and clinical features improved glaucoma suspect classification and onset-year prediction performance compared to using the image features and patient features separately. The average MSE between the actual and predicted onset year was 2.613.
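The idea of combining image features with clinical features and scoring onset-year predictions by MSE can be sketched as follows. This is only an illustration under stated assumptions: the feature sizes, the plain column-wise concatenation, and the toy numbers are hypothetical, and the abstract does not specify how the thesis actually fuses the two feature sets.

```python
import numpy as np

def combine_features(image_features, clinical_features):
    """Concatenate per-patient image and clinical features column-wise.

    Plain concatenation is an assumption made for illustration.
    """
    image_features = np.asarray(image_features, dtype=float)
    clinical_features = np.asarray(clinical_features, dtype=float)
    return np.concatenate([image_features, clinical_features], axis=1)

def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical toy data: 3 patients, 4 image features, 2 clinical features.
toy_image_features = np.zeros((3, 4))
toy_clinical_features = np.ones((3, 2))
combined = combine_features(toy_image_features, toy_clinical_features)

# Hypothetical onset years (years until onset) and predictions.
toy_mse = mean_squared_error([2, 5, 3], [3, 5, 1])
```

Because MSE is in squared units, an MSE of 2.613 corresponds to a root-mean-square error of roughly 1.6 years.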
In this study, deep learning methodology was applied to clinical images related to glaucoma. DICNN with RNFL and GCIPL images was used for glaucoma classification, SR-GAN with ODP images was used to increase the detection accuracy of optic disc pathology, and CAE with machine learning algorithms, clinical data, and retinal images was used for glaucoma suspect classification and onset-year prediction. The improved glaucoma diagnosis performance was validated using both technical and clinical parameters. The proposed methods as a whole can significantly improve outcomes of glaucoma patients through early detection, prediction, and enhanced detection accuracy.
Contents
Abstract
Contents
List of Tables
List of Figures
Chapter 1 General Introduction
1.1 Glaucoma
1.2 Deep Learning for Glaucoma Diagnosis
1.4 Thesis Objectives
Chapter 2 Dual-Input Convolutional Neural Network for Glaucoma Diagnosis using Spectral-Domain Optical Coherence Tomography
2.1 Introduction
2.1.1 Background
2.1.2 Related Work
2.2 Methods
2.2.1 Study Design
2.2.2 Dataset
2.2.3 Dual-Input Convolutional Neural Network (DICNN)
2.2.4 Training Environment
2.2.5 Statistical Analysis
2.3 Results
2.3.1 DICNN Performance
2.3.1 Grad-CAM for DICNN
2.4 Discussion
2.4.1 Research Significance
2.4.2 Limitations
2.5 Conclusion
Chapter 3 Deep-learning-based enhanced optic-disc photography
3.1 Introduction
3.1.1 Background
3.1.2 Needs
3.1.3 Related Work
3.2 Methods
3.2.1 Study Design
3.2.2 Dataset
3.2.2.1 Details on Customized Image Post-Processing Algorithm
3.2.3 SR-GAN Network
3.2.3.1 Design of Generative Adversarial Network
3.2.3.2 Loss Functions
3.2.4 Assessment of Clinical Implications of Enhanced ODPs
3.2.5 Statistical Analysis
3.2.6 Hardware Specifications & Software Specifications
3.3 Results
3.3.1 Training Loss of Modified SR-GAN
3.3.2 Performance of Final Network
3.3.3 Clinical Validation of Enhanced ODP by MOS comparison
3.3.4 Comparison of DH-Detection Accuracy
3.4 Discussion
3.4.1 Research Significance
3.4.2 Limitations
3.5 Conclusion
Chapter 4 Deep Learning Based Prediction of Glaucoma Onset Using Retinal Image and Patient Data
4.1 Introduction
4.1.1 Background
4.1.2 Related Work
4.2 Methods
4.2.1 Study Design
4.2.2 Dataset
4.2.3 Design of Overall System
4.2.4 Design of Convolutional Auto Encoder
4.2.5 Glaucoma Suspect Classification
4.2.6 Glaucoma Onset-Year Prediction
4.3 Result
4.3.1 Performance of Designed CAE
4.3.2 Performance of Designed Glaucoma Suspect Classification
4.3.3 Performance of Designed Glaucoma Onset-Year Prediction
4.4 Discussion
4.4.1 Research Significance
4.4.2 Limitations
4.5 Conclusion
Chapter 5 Summary and Future Works
5.1 Thesis Summary
5.2 Limitations and Future Works
Bibliography
Abstract in Korean
Acknowledgement
RFormer: Transformer-based Generative Adversarial Network for Real Fundus Image Restoration on A New Clinical Benchmark
Ophthalmologists have used fundus images to screen and diagnose eye diseases.
However, differences in equipment and among ophthalmologists introduce large variations in the
quality of fundus images. Low-quality (LQ) degraded fundus images easily lead
to uncertainty in clinical screening and generally increase the risk of
misdiagnosis. Thus, real fundus image restoration is worth studying.
Unfortunately, no real clinical benchmark has been established for this task so
far. In this paper, we investigate the real clinical fundus image restoration
problem. First, we establish a clinical dataset, Real Fundus (RF), including
120 low- and high-quality (HQ) image pairs. Then we propose a novel
Transformer-based Generative Adversarial Network (RFormer) to restore the real
degradation of clinical fundus images. The key component in our network is the
Window-based Self-Attention Block (WSAB) which captures non-local
self-similarity and long-range dependencies. To produce more visually pleasant
results, a Transformer-based discriminator is introduced. Extensive experiments
on our clinical benchmark show that the proposed RFormer significantly
outperforms the state-of-the-art (SOTA) methods. In addition, experiments of
downstream tasks such as vessel segmentation and optic disc/cup detection
demonstrate that our proposed RFormer benefits clinical fundus image analysis
and applications. The dataset, code, and models are publicly available at
https://github.com/dengzhuo-AI/Real-Fundus
Comment: IEEE J-BHI 2022; The First Benchmark and First Transformer-based Method for Real Clinical Fundus Image Restoration
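The abstract names window-based self-attention as the key component but gives no implementation details. The sketch below shows only the general technique: split a feature map into non-overlapping windows and apply scaled dot-product self-attention inside each window. It is not the RFormer code. The function names are hypothetical, and the queries, keys, and values are taken directly from the input (identity projections) as a simplifying assumption, whereas a real block would use learned projections.

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping windows.

    Returns an array of shape (num_windows, window_size * window_size, C).
    H and W are assumed to be divisible by window_size.
    """
    h, w, c = x.shape
    x = x.reshape(h // window_size, window_size, w // window_size, window_size, c)
    x = x.transpose(0, 2, 1, 3, 4)  # group the two window-grid axes first
    return x.reshape(-1, window_size * window_size, c)

def window_self_attention(windows):
    """Scaled dot-product self-attention computed independently per window.

    Assumption: queries, keys, and values are the raw tokens themselves.
    """
    d = windows.shape[-1]
    scores = windows @ windows.transpose(0, 2, 1) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax per row
    return weights @ windows, weights

# Hypothetical 8x8 feature map with 3 channels, split into 4x4 windows.
feature_map = np.arange(8 * 8 * 3, dtype=float).reshape(8, 8, 3) / 100.0
windows = window_partition(feature_map, 4)
attended, attention_weights = window_self_attention(windows)
```

Restricting attention to windows keeps the cost proportional to the number of windows times the squared window size, rather than the squared number of pixels.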
GAN-Based Super-Resolution And Segmentation Of Retinal Layers In Optical Coherence Tomography Scans
Optical Coherence Tomography (OCT) has been identified as a noninvasive and cost-effective imaging modality for identifying potential biomarkers for Alzheimer's diagnosis and progress detection. Current hypotheses indicate that retinal layer thickness, which can be assessed via OCT scans, is an efficient biomarker for identifying Alzheimer's disease. Due to factors such as speckle noise, a small target region, and unfavorable imaging conditions, manual segmentation of retinal layers is a challenging task. Therefore, as a reasonable first step, this study focuses on automatically segmenting retinal layers to separate them for subsequent investigations. Another important challenge commonly faced is the lack of clarity of the layer boundaries in retinal OCT scans, which motivates research into super-resolving the images for improved clarity.
Deep learning pipelines have stimulated substantial progress in segmentation tasks. Generative adversarial networks (GANs) are a prominent field of deep learning that has achieved astonishing performance in semantic segmentation. Conditional adversarial networks, as a general-purpose solution to image-to-image translation problems, not only learn the mapping from the input image to the output image but also learn a loss function to train this mapping. We propose a GAN-based segmentation model and evaluate incorporating popular networks, namely U-Net and ResNet, in the GAN architecture with additional blocks of transposed convolution and sub-pixel convolution for the task of upscaling OCT images from low to high resolution by a factor of four. We also incorporate the Dice loss as an additional reconstruction loss term to improve the performance of this joint optimization task. Our best model configuration empirically achieved a Dice coefficient of 0.867 and an mIOU of 0.765.
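The two segmentation metrics reported above can be sketched from their standard definitions: Dice = 2|A ∩ B| / (|A| + |B|) for a binary mask, and mean IoU as the average over classes of |A ∩ B| / |A ∪ B|. The function names and the toy label maps are hypothetical, and skipping classes absent from both maps is one common convention that the abstract does not confirm.

```python
import numpy as np

def dice_coefficient(pred_mask, target_mask):
    """Dice overlap between two binary masks (1.0 if both are empty)."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    target_mask = np.asarray(target_mask, dtype=bool)
    total = pred_mask.sum() + target_mask.sum()
    if total == 0:
        return 1.0
    return float(2.0 * np.sum(pred_mask & target_mask) / total)

def mean_iou(pred_labels, target_labels, num_classes):
    """Mean intersection-over-union across classes.

    Classes absent from both maps are skipped (an assumed convention).
    """
    pred_labels = np.asarray(pred_labels)
    target_labels = np.asarray(target_labels)
    ious = []
    for c in range(num_classes):
        p = pred_labels == c
        t = target_labels == c
        union = np.sum(p | t)
        if union == 0:
            continue
        ious.append(np.sum(p & t) / union)
    return float(np.mean(ious))

# Hypothetical 2x2 label maps with two classes (0 and 1).
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
toy_dice = dice_coefficient(pred == 1, target == 1)
toy_miou = mean_iou(pred, target, num_classes=2)
```

Dice weights the intersection twice, so for the same prediction it is never lower than IoU, which is consistent with the reported Dice (0.867) exceeding the reported mIOU (0.765).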
Enhancing Image Quality: A Comparative Study of Spatial, Frequency Domain, and Deep Learning Methods
Image restoration and noise reduction methods have been created to restore deteriorated images and improve their quality. These methods have garnered substantial significance in recent times, mainly due to the growing utilization of digital imaging across diverse domains, including but not limited to medical imaging, surveillance, satellite imaging, and numerous others.
In this paper, we conduct a comparative analysis of three distinct approaches to image restoration: the spatial method, the frequency domain method, and the deep learning method. The study was conducted on a dataset of 10,000 images, and the performance of each method was evaluated using accuracy and loss metrics. The results show that the deep learning method outperformed the other two methods, achieving a validation accuracy of 72.68% after 10 epochs. The spatial method achieved a validation accuracy of 69.98% after 10 epochs. The FFT frequency domain method had the lowest accuracy of the three, with a validation accuracy of 52.87% after 10 epochs, significantly lower than the other two methods. The study demonstrates that deep learning is a promising approach for image classification tasks and outperforms traditional methods such as spatial and frequency domain techniques.