
    ํ•ฉ์„ฑ๊ณฑ ์ปค๋„ ์ •๊ทœํ™”๋ฅผ ์œ„ํ•œ ๊ณ ๋ฅธ ๊ฐ๋„๋ถ„์‚ฐ๋ฐฉ๋ฒ•

    Doctoral thesis -- Graduate School of Seoul National University: College of Natural Sciences, Department of Mathematical Sciences, August 2022. Advisor: ๊ฐ•๋ช…์ฃผ (Myungjoo Kang).

In this thesis, we propose new convolutional kernel regularization methods. Along with the development of deep learning, there have been attempts to effectively regularize the convolutional layer, an important basic module of deep neural networks. Convolutional neural networks (CNNs) are excellent at abstracting input data, but deepening them causes gradient vanishing or explosion and produces redundant features. One approach to these issues is to directly regularize the convolutional kernel weights of the CNN. The basic idea is to convert a convolutional kernel weight into a matrix and make the row or column vectors of that matrix orthogonal. However, this approach has some shortcomings. Firstly, it requires appropriate manipulation because an overcompleteness issue arises when the number of vectors exceeds their dimension. To deal with this issue, we define the concept of an evenly dispersed state and use it to propose the PH0 and MST regularizations. Secondly, prior regularizations, which force the Gram matrix of a matrix to be the identity, may not be an optimal approach to orthogonality: we point out that they actually reduce the update of the angle between two vectors when the vectors are close to each other. To remedy this, we propose the EADK and EADC regularizations, which update the angles directly. Through various experiments, we demonstrate that the EADK and EADC regularizations outperform prior methods on several neural network architectures and that, in particular, EADK has a fast training time.

Abstract (in Korean, translated): In this thesis, we propose new regularization methods for convolutional kernels. Along with the development of deep learning, there have been attempts to effectively regularize the convolutional layer, the most basic module of neural networks.
ํ•ฉ์„ฑ๊ณฑ์‹ ๊ฒฝ๋ง๋Š” ์ธํ’‹๋ฐ์ดํ„ฐ๋ฅผ ์ถ”์ƒํ™”ํ•˜๋Š”๋ฐ ํƒ์›”ํ•˜์ง€๋งŒ ๋„คํŠธ์›Œํฌ์˜ ๊นŠ์ด๊ฐ€ ๊นŠ์–ด์ง€๋ฉด ๊ทธ๋ ˆ๋””์–ธํŠธ ์†Œ๋ฉธ์ด๋‚˜ ํญ๋ฐœ ๋ฌธ์ œ๋ฅผ ์ผ์œผํ‚ค๊ณ  ์ค‘๋ณต๋œ ํ”ผ์ณ๋“ค์„ ๋งŒ๋“ ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฌธ์ œ๋“ค์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•œ ์ ‘๊ทผ๋ฒ• ์ค‘ ํ•˜๋‚˜๋Š” ์ง์ ‘ ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง์˜ ํ•ฉ์„ฑ๊ณฑ์ปค๋„์„ ์ง์ ‘ ์ •๊ทœํ™” ํ•˜๋Š” ๊ฒƒ์ด๋‹ค. ์ด ๋ฐฉ๋ฒ•์€ ํ•ฉ์„ฑ๊ณฑ์ปค๋„์„ ์–ด๋–ค ํ–‰๋ ฌ๋กœ ๋ณ€ํ™˜ํ•˜๊ณ  ํ–‰๋ ฌ์˜ ํ–‰ ๋˜๋Š” ์—ด๋“ค์˜ ๋ฒกํ„ฐ๋“ค์„ ์ง๊ต์‹œํ‚ค๋Š” ๊ฒƒ์ด๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ด๋Ÿฌํ•œ ์ ‘๊ทผ๋ฒ•์€ ๋ช‡๊ฐ€์ง€ ๋‹จ์ ์ด ์žˆ๋‹ค. ์ฒซ์งธ๋กœ, ๋ฒกํ„ฐ์˜ ์ˆ˜๊ฐ€ ๋ฒกํ„ฐ์˜ ์ฐจ์›๋ณด๋‹ค ๋งŽ์„ ๋•Œ๋Š” ๋ชจ๋“  ๋ฒกํ„ฐ๋ฅผ ์ง๊ตํ™” ์‹œํ‚ฌ ์ˆ˜ ์—†๊ฒŒ ๋˜๋ฏ€๋กœ ์ ์ ˆํ•œ ๊ธฐ๋ฒ•๋“ค์„ ํ•„์š”๋กœ ํ•œ๋‹ค. ์ด ๋ฌธ์ œ๋ฅผ ๋‹ค๋ฃจ๊ธฐ ์œ„ํ•œ ํ•œ ๊ฐ€์ง€ ๋ฐฉ๋ฒ•์œผ๋กœ ์šฐ๋ฆฌ๋Š” ๋ถ„์‚ฐ ์ƒํƒœ๋ผ๋Š” ๊ฐœ๋…์„ ์ •์˜ํ•˜๊ณ  ์ด ๊ฐœ๋…์„ ํ™œ์šฉํ•œ PH0์™€ MST ์ •๊ทœํ™”๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ๋‘˜์งธ๋กœ, ๊ทธ๋žŒํ–‰๋ ฌ์„ ํ•ญ๋“ฑํ–‰๋ ฌ๋กœ ๊ทผ์‚ฌ์‹œํ‚ค๋Š” ๋ฐฉ๋ฒ•์„ ์‚ฌ์šฉํ•˜๋Š” ๊ธฐ์กด ์ •๊ทœํ™”๋ฒ•์ด ๋ฒกํ„ฐ๋“ค์„ ์ง๊ตํ™”์‹œํ‚ค๋Š” ์ตœ์ ์˜ ๋ฐฉ๋ฒ•์ด ์•„๋‹ ์ˆ˜ ์žˆ๋‹ค๋Š” ์ ์ด๋‹ค. ์ฆ‰, ๊ธฐ์กด์˜ ์ •๊ทœํ™”๋ฒ•์ด ๋‘ ๋ฒกํ„ฐ๊ฐ€ ๊ฐ€๊นŒ์šธ ๋•Œ๋Š” ์˜คํžˆ๋ ค ๊ฐ๋„์˜ ์—…๋ฐ์ดํŠธ๋ฅผ ์ค„์ด๊ฒŒ ๋œ๋‹ค.๋”ฐ๋ผ์„œ ์ด๋ฅผ ๋ณด์™„ํ•˜๊ธฐ ์œ„ํ•˜์—ฌ ์šฐ๋ฆฌ๋Š” ๊ฐ๋„๋ฅผ ์ง์ ‘ ์—…๋ฐ์ดํŠธํ•˜๋Š” EADK์™€ EADC ์ •๊ทœํ™”๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. 
๊ทธ๋ฆฌ๊ณ  ๋‹ค์–‘ํ•œ ์‹คํ—˜์„ ํ†ตํ•ด EADK์™€ EADC ์ •๊ทœํ™”๋ฒ•์ด ๋‹ค์ˆ˜์˜ ์‹ ๊ฒฝ๋ง๊ตฌ์กฐ์—์„œ ๊ธฐ์กด์˜ ๋ฐฉ๋ฒ•๋“ค๋ณด๋‹ค ์šฐ์ˆ˜ํ•œ ์„ฑ๋Šฅ์„ ๋ณด์ด๊ณ  ํŠนํžˆ EADK๋Š” ๋น ๋ฅธ ํ•™์Šต์‹œ๊ฐ„์„ ๊ฐ€์ง„๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธํ•œ๋‹ค.Abstract i 1 Introduction 1 2 Preliminaries 4 2.1 Two Ways of Understanding CNN Layers as Matrix Operations 5 2.1.1 Kernel Matrix 6 2.1.2 Convolution Matrix 7 2.2 Soft Orthogonality 11 2.2.1 SO Regularization 11 2.2.2 DSO Regularization 12 2.3 Mutual Coherence 13 2.3.1 MC Regularization 13 2.4 Spectral Restricted Isometry Property 13 2.4.1 Restricted Isometry Property 13 2.4.2 SRIP Regularization 15 2.5 Orthogonal Convolutional Neural Networks 18 2.5.1 OCNN Regularizaiton 18 3 Topological Dispersing Regularizations 22 3.1 Evenly Dispersed State 23 3.1.1 Dispersing Vectors on Sphere 23 3.1.2 Evenly Dispersed State in the Real Projective Spaces 25 3.2 Persistent Homology Regularization 33 3.2.1 Cech and Vietoris-Rips Complexes 35 3.2.2 Persistent Homology 36 3.2.3 PH0 Regularization 38 3.3 Minimum Spanning Tree Regularization 39 3.3.1 Minimum Spanning Tree 39 3.3.2 MST Regularization 41 4 Evenly Angle Dispersing Regularizations 42 4.1 Analysis of Soft Orthogonality 43 4.1.1 Analysis of Soft Orthogonality 43 4.2 Evenly Angle Dispersing Regularizations 47 4.2.1 Evenly Angle Dispersing Regularization with Kernel Matrix 47 4.2.2 Evenly Angle Dispersing Regularization with Convolution Matrix 52 5 Algorithms & Experiments 54 5.1 Algorithms 55 5.1.1 PH0 and MST 55 5.1.2 EADK 57 5.1.3 EADC 58 5.2 Experiments 59 5.2.1 Analysis for Angle Dispersing 59 5.2.2 Experimental Setups 62 5.2.3 Classification Accuracy 68 5.2.4 Additional Experiments 76 6 Conclusion 78 The bibliography 80 Abstract (in Korean) 85๋ฐ•