7 research outputs found

    Automatic face and VLP's recognition for smart parking system

    Smart parking is one of the central concerns of the smart city. Researchers have proposed solutions and breakthroughs on several smart-parking topics, including security systems, single-space availability detection, and IoT frameworks. In this study, we propose a security system for smart parking based on face recognition and VLP (Vehicle License Plate) identification. The SSIM (Structural Similarity) method, an IQA technique, is applied to the face detection and recognition process because of its reliability and simple computation. In tests on 30 samples, the highest SSIM value obtained was 0.83, with a best accuracy rate of 76.67%. This accuracy still falls short of the 99.9% standard required for deployment, so the system needs further improvement in future studies, particularly in the noise-filtering stage.
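
The face-matching step described above reduces to comparing two aligned grayscale face crops with SSIM and accepting the match when the score clears a threshold. Below is a minimal sketch of that idea using scikit-image; the file names, the 128x128 crop size, and the 0.8 acceptance threshold are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of SSIM-based face matching for a parking-gate check.
# File names, crop size, and the acceptance threshold are illustrative
# assumptions, not values from the paper.
from skimage.io import imread
from skimage.transform import resize
from skimage.metrics import structural_similarity as ssim

def face_matches(probe_path, enrolled_path, threshold=0.8, size=(128, 128)):
    """Compare a camera face crop against an enrolled template."""
    probe = resize(imread(probe_path, as_gray=True), size)      # floats in [0, 1]
    enrolled = resize(imread(enrolled_path, as_gray=True), size)
    score = ssim(probe, enrolled, data_range=1.0)               # 1.0 = identical
    return score, score >= threshold

score, accepted = face_matches("gate_face.png", "enrolled_driver.png")
print(f"SSIM = {score:.2f}, accepted = {accepted}")
```

Because SSIM is sensitive to noise and misalignment, detection, alignment, and denoising would precede this comparison in practice, consistent with the abstract's note that the noise-filtering stage needs further work.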

    Deep-Learning-Based Glaucoma Diagnosis Support System

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Interdisciplinary Program in Bioengineering, February 2021 (advisor: Hee Chan Kim). This paper presents deep learning-based methods for improving glaucoma diagnosis support systems. Novel methods were applied to clinical glaucoma cases and the results were evaluated. In the first study, a deep learning classifier for glaucoma diagnosis based on spectral-domain optical coherence tomography (SD-OCT) images was proposed and evaluated. SD-OCT is commonly employed as an imaging modality for the evaluation of glaucomatous structural damage. The classification model was developed using a convolutional neural network (CNN) as a base and was trained with SD-OCT retinal nerve fiber layer (RNFL) and macular ganglion cell-inner plexiform layer (GCIPL) images. The proposed architecture, termed the Dual-Input Convolutional Neural Network (DICNN), showed great potential as an effective classification algorithm based on two input images. The DICNN was trained with both RNFL and GCIPL thickness maps, enabling it to discriminate between normal and glaucomatous eyes. Its performance was evaluated with accuracy and the area under the receiver operating characteristic curve (AUC) and compared to other methods using these metrics. The proposed DICNN model demonstrated high diagnostic ability for discriminating early-stage glaucoma patients from normal subjects, with an AUC of 0.869, sensitivity of 0.921, and specificity of 0.756. In the second study, a deep-learning method for increasing the resolution and improving the legibility of optic-disc photography (ODP) was proposed. ODP has been proven useful for optic nerve evaluation in glaucoma, but in clinical practice limited patient cooperation, small pupils, or media opacities can limit its performance. A model to enhance the resolution of ODP images, termed super-resolution, was developed using a Super-Resolution Generative Adversarial Network (SR-GAN). To train this model, high-resolution original ODP images were transformed into two counterparts: (1) down-scaled low-resolution ODPs, and (2) compensated high-resolution ODPs with enhanced visibility of the optic disc margin and surrounding retinal vessels, produced using a customized image post-processing algorithm.
The SR-GAN was trained to learn and recognize the differences between these two counterparts. The performance of the network was evaluated using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Mean Opinion Score (MOS). The study demonstrated that deep learning can be applied to create a generative model capable of producing enhanced ophthalmic images with 4x resolution and improved structural details. The proposed method can be used to enhance ODPs and thereby significantly increase the detection accuracy of optic disc pathology. The average PSNR, SSIM, and MOS were 25.01, 0.75, and 4.33, respectively. In the third study, a deep-learning model was used to classify suspected glaucoma and to predict the subsequent glaucoma onset year in glaucoma suspects using clinical data and retinal images (ODPs and red-free fundus RNFL photographs). Clinical data contain useful information for glaucoma diagnosis and prediction. However, no study has investigated how combining different types of clinical information would help predict the subsequent course of glaucoma in an individual patient. In this study, image features extracted using a Convolutional Auto-Encoder (CAE), along with clinical features, were used for glaucoma-suspect classification and onset-year prediction. The performance of the proposed model was evaluated using accuracy and Mean Squared Error (MSE). Combining the CAE-extracted image features and clinical features improved glaucoma-suspect classification and onset-year prediction performance compared to using the image features and patient features separately. The average MSE between the actual and predicted onset year was 2.613. In this study, deep learning methodology was applied to clinical images related to glaucoma: a DICNN with RNFL and GCIPL images was used for glaucoma classification, an SR-GAN with ODP images was used to increase the detection accuracy of optic disc pathology, and a CAE with machine-learning algorithms on clinical data and retinal images was used for glaucoma-suspect classification and onset-year prediction. The improved glaucoma diagnosis performance was validated using both technical and clinical parameters.
The proposed methods as a whole can significantly improve outcomes for glaucoma patients through early detection, prediction, and enhanced detection accuracy.
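
The dual-input pattern at the core of the DICNN, two convolutional branches (one per thickness map) whose features are fused before a shared classifier head, can be sketched compactly in PyTorch. The layer widths, the 64x64 input resolution, and the head size below are illustrative assumptions, not the dissertation's exact architecture.

```python
# Sketch of the dual-input pattern (DICNN-style): one convolutional branch
# per input image (RNFL and GCIPL thickness maps), with features
# concatenated before a shared classifier head. All layer widths and the
# 64x64 input size are illustrative assumptions.
import torch
import torch.nn as nn

def conv_branch():
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.AdaptiveAvgPool2d(4), nn.Flatten(),   # -> 32 * 4 * 4 = 512 features
    )

class DICNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnfl_branch = conv_branch()
        self.gcipl_branch = conv_branch()
        self.head = nn.Sequential(
            nn.Linear(2 * 512, 64), nn.ReLU(),
            nn.Linear(64, 1),                    # logit: glaucoma vs. normal
        )

    def forward(self, rnfl, gcipl):
        feats = torch.cat([self.rnfl_branch(rnfl), self.gcipl_branch(gcipl)], dim=1)
        return self.head(feats)

model = DICNN()
logits = model(torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64))
print(logits.shape)  # torch.Size([8, 1])
```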

    Efficient and effective objective image quality assessment metrics

    The acquisition, transmission, and storage of images and videos have increased greatly in recent years. At the same time, there has been an increasing demand for high-quality images and videos to provide a satisfactory quality of experience for viewers. In this respect, high dynamic range (HDR) imaging, with more than 8-bit depth, has been an interesting approach to capturing more realistic images and videos. Objective image and video quality assessment plays a significant role in monitoring and enhancing image and video quality in applications such as image acquisition, compression, multimedia streaming, restoration, enhancement, and display. The main contributions of this work are efficient features and similarity maps that can be used to design perceptually consistent image quality assessment tools. In this thesis, perceptually consistent full-reference image quality assessment (FR-IQA) metrics are proposed to assess the quality of natural, synthetic, photo-retouched, and tone-mapped images. In addition, efficient no-reference image quality metrics are proposed to assess JPEG-compressed and contrast-distorted images. Finally, we propose a perceptually consistent color-to-gray conversion method, perform a subjective rating study, and evaluate existing color-to-gray assessment metrics. Existing FR-IQA metrics may have the following limitations: first, their performance is not consistent across different distortions and datasets; second, better-performing metrics usually have high complexity. We propose in this thesis an efficient and reliable full-reference image quality evaluator based on new gradient and color similarities. We derive a general deviation-pooling formulation and use it to compute a final quality score from the similarity maps. Extensive experimental results verify the high accuracy and consistent performance of the proposed metric on natural, synthetic, and photo-retouched datasets, as well as its low complexity. To visualize HDR images on standard low dynamic range (LDR) displays, tone-mapping operators are used to convert HDR into LDR. Given the different bit depths of HDR and LDR, traditional FR-IQA metrics cannot assess the quality of tone-mapped images. The existing full-reference metric for tone-mapped images, TMQI, converts both HDR and LDR to an intermediate color space and measures their similarity in the spatial domain. We propose in this thesis a feature-similarity full-reference metric in which the local phase of the HDR image is compared with that of the LDR image. Phase carries important image information, and previous studies have shown that the human visual system responds strongly to points in an image where the phase information is ordered. Experimental results on two available datasets show the very promising performance of the proposed metric. No-reference image quality assessment (NR-IQA) metrics are of high interest because in most present and emerging practical real-world applications, reference signals are not available. In this thesis, we propose two perceptually consistent, distortion-specific NR-IQA metrics for JPEG-compressed and contrast-distorted images. Based on the edge statistics of JPEG-compressed images, an efficient NR-IQA metric for the blockiness artifact is proposed that is robust to block size and misalignment. Then, we consider the quality assessment of contrast-distorted images, a common type of distortion.
Higher orders of the Minkowski distance and a power transformation are used to train a low-complexity model that can assess contrast distortion with high accuracy. For the first time, the proposed model is used to classify the type of contrast distortion, which is very useful additional information for image contrast enhancement. Beyond its traditional use in the assessment of distortions, objective IQA can be used in other applications, such as the quality assessment of image fusion, color-to-gray conversion, inpainting, and background subtraction. In the last part of this thesis, a real-time and perceptually consistent color-to-gray image conversion methodology is proposed. The proposed correlation-based method and state-of-the-art methods are compared by subjective and objective evaluation, and a conclusion is drawn on the choice of objective quality assessment metric for color-to-gray image conversion. The conducted subjective ratings can be used in the development of quality assessment metrics for color-to-gray image conversion and to test their performance.
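
The gradient-similarity maps and deviation pooling described in this abstract can be illustrated with a toy metric: compute a local gradient-similarity map between the reference and distorted images, then pool it with the mean absolute deviation, so a spatially uniform map (identical images) scores 0 while spatially varying degradation raises the score. The Prewitt gradients, the constant c, and plain MAD pooling below are illustrative stand-ins, not the thesis's exact formulation.

```python
# Toy FR-IQA metric: gradient-similarity map + deviation (MAD) pooling.
# Identical images give a uniform similarity map and a score of 0;
# spatially varying degradation raises the score. The Prewitt gradients,
# the constant c, and plain MAD pooling are illustrative assumptions.
import numpy as np
from scipy import ndimage

def gradient_similarity_deviation(ref, dist, c=170.0):
    def grad_mag(img):
        img = img.astype(float)
        return np.hypot(ndimage.prewitt(img, axis=0), ndimage.prewitt(img, axis=1))
    g_ref, g_dist = grad_mag(ref), grad_mag(dist)
    gs = (2.0 * g_ref * g_dist + c) / (g_ref**2 + g_dist**2 + c)  # similarity map in (0, 1]
    return float(np.mean(np.abs(gs - gs.mean())))                 # deviation pooling

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (64, 64))
print(gradient_similarity_deviation(ref, ref))                             # 0.0 (identical)
print(gradient_similarity_deviation(ref, rng.integers(0, 256, (64, 64))))  # > 0 (degraded)
```

Deviation pooling of this kind rewards spatial uniformity of the similarity map rather than its raw mean, which is one way a single score can stay consistent across distortion types.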

    Image Quality Assessment: Addressing the Data Shortage and Multi-Stage Distortion Challenges

    Visual content constitutes the vast majority of the ever-increasing global Internet traffic, highlighting the central role it plays in our daily lives. The perceived quality of such content can be degraded by a number of distortions introduced during acquisition, storage, transmission under bandwidth constraints, and display. Since the subjective evaluation of such large volumes of visual content is impossible, the development of perceptually well-aligned and practically applicable objective image quality assessment (IQA) methods has taken on crucial importance to ensure the delivery of an adequate quality of experience to the end user. Substantial strides have been made in the last two decades in designing perceptual quality methods, and three major paradigms are now well established in IQA research: Full-Reference (FR), Reduced-Reference (RR), and No-Reference (NR), which require complete, partial, and no access to the pristine reference content, respectively. Notwithstanding the progress made so far, significant challenges restrict the development of practically applicable IQA methods. In this dissertation we aim to address two major challenges: 1) the data shortage challenge, and 2) the multi-stage distortion challenge. NR or blind IQA (BIQA) methods usually rely on machine learning methods, such as deep neural networks (DNNs), to learn a quality model by training on subject-rated IQA databases. Due to the constraints of subjective testing, such annotated datasets are quite small-scale, containing at best a few thousand images. This is in sharp contrast to the area of visual recognition, where tens of millions of annotated images are available. This data challenge has become a major hurdle to the breakthrough of DNN-based IQA approaches. We address the data challenge by developing the largest IQA dataset, called the Waterloo Exploration-II database, which consists of 3,570 pristine images and around 3.45 million distorted images generated using content-adaptive distortion parameters and comprising both singly and multiply distorted content. As a prerequisite for developing an alternative annotation mechanism, we conduct the largest performance evaluation survey in the IQA area to date to ascertain the top-performing FR and fused-FR methods. Based on the findings of this survey, we develop a technique called the Synthetic Quality Benchmark (SQB) to automatically assign perceptually well-aligned quality labels to large-scale IQA datasets. We train a DNN-based BIQA model, called EONSS, on the SQB-annotated Waterloo Exploration-II database. Extensive tests on a large collection of completely independent, subject-rated IQA datasets show that EONSS outperforms the state of the art in BIQA, both in perceptual quality prediction performance and in computation time, demonstrating the efficacy of our approach to the data challenge. In practical media distribution systems, visual content undergoes a number of degradations as it is transmitted along the delivery chain, making it multiply distorted. Yet research in IQA has mainly focused on the simplistic case of singly distorted content. In many practical systems, apart from the final multiply distorted content, earlier degraded versions of that content are also available. However, the three major IQA paradigms (FR, RR, and NR) are unable to take advantage of this additional information.
To address this challenge, we make one of the first attempts to study the behavior of multiple simultaneous distortion combinations in a two-stage distortion pipeline. Next, we introduce a new major IQA paradigm, called degraded-reference (DR) IQA, to evaluate the quality of multiply distorted images by also taking their respective degraded references into consideration. We construct two datasets for DR IQA model development, called DR IQA database V1 and V2. These datasets are designed on the pattern of the Waterloo Exploration-II database and contain 32,912 SQB-annotated distorted images, composed of both singly distorted degraded references and multiply distorted content. We develop distortion-behavior-based and SVR-based DR IQA models. Extensive testing on an independent set of IQA datasets, including three subject-rated datasets, demonstrates that by utilizing the additional information available in the form of degraded references, the DR IQA models perform significantly better than their BIQA counterparts, establishing DR IQA as a new paradigm in IQA.
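
The overall pipeline, automatic SQB-style labels produced by a full-reference metric plus an SVR-based DR IQA regressor that sees only the degraded reference and the final image, can be miniaturized as follows. SSIM stands in for the fused FR methods behind SQB, and the two-stage blur-then-noise pipeline and the two hand-crafted features are illustrative assumptions, not the dissertation's actual models.

```python
# Toy DR-IQA pipeline: SQB-style automatic labels from a full-reference
# metric (SSIM as a stand-in), and an SVR that predicts the quality of a
# multiply distorted image from its *degraded* reference only.
# The blur -> noise pipeline and both features are illustrative assumptions.
import numpy as np
from scipy import ndimage
from skimage.metrics import structural_similarity as ssim
from sklearn.svm import SVR

rng = np.random.default_rng(1)

def dr_features(degraded_ref, final):
    # Computed without the pristine image, as the DR paradigm requires.
    return [ssim(degraded_ref, final),
            float(np.std(final.astype(float)) - np.std(degraded_ref.astype(float)))]

X, y = [], []
for _ in range(300):
    pristine = rng.integers(0, 256, (32, 32)).astype(np.uint8)
    # Stage 1: blur -> singly distorted "degraded reference".
    blur = ndimage.gaussian_filter(pristine.astype(float), sigma=rng.uniform(0.1, 2.0))
    degraded_ref = np.clip(blur, 0, 255).astype(np.uint8)
    # Stage 2: additive noise -> final multiply distorted image.
    noisy = degraded_ref + rng.normal(0.0, rng.uniform(0.0, 30.0), (32, 32))
    final = np.clip(noisy, 0, 255).astype(np.uint8)
    X.append(dr_features(degraded_ref, final))
    y.append(ssim(pristine, final))  # SQB stand-in: FR score vs. the pristine image

model = SVR().fit(X, y)  # DR-IQA regressor: degraded-reference features -> quality
print(model.predict([dr_features(degraded_ref, final)]))  # quality of the last sample
```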