
    A Study on Reducing the Learning Burden in Weakly Supervised Object Localization (약한 지도학습 기반의 물체 탐지에서의 학습 부담을 줄이기 위한 연구)

    Thesis (Ph.D.) -- Seoul National University Graduate School : College of Natural Sciences, Department of Mathematical Sciences, 2023. 2. 강명주.

    In this thesis, we propose two models for weakly supervised object localization (WSOL). Many existing WSOL models carry various learning burdens, e.g., the non-negligible cost of hyperparameter search for the loss function. We therefore first propose a model called SFPN to reduce this cost. SFPN enriches the information in the feature maps by exploiting the structure of a feature pyramid network; these feature maps then take part in predicting the bounding box. This design improves performance while allowing us to use only the cross-entropy loss. We then propose a second model, A2E Net, which uses fewer parameters. A2E Net consists of a spatial attention branch and a refinement branch. The spatial attention branch strengthens spatial information using few parameters. The refinement branch is composed of an attention module and an erasing module, neither of which has trainable parameters. Taking the output feature map of the spatial attention branch as input, the attention module produces a more accurate feature map by using the relations between pixels, while the erasing module erases the most discriminative region of that feature map so that the network also takes less discriminative regions into account. Erasing at multiple sizes further boosts performance. Finally, we sum the output feature maps of the attention module and the erasing module to use the information from both. Extensive experiments on CUB-200-2011 and ILSVRC show that SFPN and A2E Net perform better than existing WSOL models.

    Contents:
    1 Introduction
    2 Preliminaries
      2.1 Convolutional Neural Networks
        2.1.1 Convolution Operation
        2.1.2 Some Convolutional Neural Networks
    3 SFPN: Simple Feature Pyramid Network for Weakly Supervised Object Localization
      3.1 Introduction
      3.2 Related Works
        3.2.1 Some Object Detection Methods
        3.2.2 Existing Methods for Weakly Supervised Object Localization
      3.3 Proposed Method
      3.4 Experiment
        3.4.1 Datasets
        3.4.2 Evaluation Metrics
        3.4.3 Implementation Details
        3.4.4 Result
        3.4.5 Ablation Study
    4 A2E Net: Aggregation of Attention and Erasing for Weakly Supervised Object Localization
      4.1 Introduction
      4.2 Related Works
        4.2.1 Attention Mechanism
        4.2.2 Erasing Methods
        4.2.3 Existing Methods for Weakly Supervised Object Localization
      4.3 Proposed Method
        4.3.1 Spatial Attention Branch
        4.3.2 Refinement Branch
      4.4 Experiment
        4.4.1 Implementation Details
        4.4.2 Result
        4.4.3 Ablation Study
    5 Conclusion
    The bibliography
    Abstract (in Korean)
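
    The abstract describes the A2E Net refinement branch only in words: a parameter-free attention module, a parameter-free erasing module applied at multiple sizes, and a sum of the two outputs. The sketch below is one possible reading of that description, not the thesis's implementation. The function names, the softmax pixel-similarity weighting, the channel-mean saliency map, the threshold values, and the averaging over erasing sizes are all assumptions made for illustration, and the input is assumed to be a non-negative (post-ReLU) feature map of shape (C, H, W).

```python
import numpy as np

def attention_module(feat):
    """Hypothetical parameter-free attention: every pixel becomes a
    similarity-weighted average of all pixels in the feature map."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                  # (C, N) with N = H * W
    sim = x.T @ x / np.sqrt(c)                  # (N, N) pixel-to-pixel similarity
    sim = sim - sim.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(sim)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    return (x @ weights.T).reshape(c, h, w)

def erasing_module(feat, thresholds=(0.7, 0.9)):
    """Hypothetical parameter-free erasing: zero out the most activated
    region, once per threshold (the 'multiple sizes'), then average."""
    saliency = feat.mean(axis=0)                # (H, W) channel-mean activation
    peak = saliency.max()
    # Keep only pixels whose saliency is below t * peak; the rest are erased.
    erased = [feat * (saliency < t * peak) for t in thresholds]
    return np.mean(erased, axis=0)

def refinement_branch(feat):
    """Sum the outputs of the two modules, as the abstract describes."""
    return attention_module(feat) + erasing_module(feat)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.random((4, 5, 5))                # assumed non-negative input
    print(refinement_branch(feat).shape)
```

    Neither function holds any trainable weights, which matches the abstract's claim about the two modules; everything else about the real modules would have to be taken from Section 4.3.2 of the thesis.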

    표창장
