4,255 research outputs found

    ๊ฐ์ฒด ์ธ์‹์˜ ๋ ˆ์ด๋ธ” ํšจ์œจ์  ํ•™์Šต

    Thesis (Ph.D.) -- Graduate School of Seoul National University: College of Engineering, Department of Electrical and Computer Engineering, February 2023. Advisor: Sungroh Yoon.
Advances in deep neural network approaches have produced tremendous progress in object recognition tasks, but this progress has come at the cost of annotating a huge number of training images with explicit localization cues. Using object recognition in real-life applications requires a large variety of object classes and a great deal of labeled data for each class. However, labeling pixel-level annotations for each object class is laborious and hampers the expansion of object classes. The need for such expensive annotations is sidestepped by weakly supervised learning, in which a DNN is trained on images with some form of abbreviated annotation that is cheaper than explicit localization cues. In this dissertation, we study methods that use various forms of weak supervision, i.e., image-level class labels, out-of-distribution data, and bounding box labels. We first study image-level class labels for weakly supervised semantic segmentation.
Most weakly supervised methods based on image-level class labels depend on attribution maps obtained from a trained classifier, but their focus tends to be restricted to a small discriminative region of the target object. We theoretically discuss the root cause of this problem and propose three novel techniques to address it. However, localization maps produced from class labels alone are known to suffer from confusion between foreground and background cues, i.e., spurious correlation. We address the spurious correlation problem by utilizing out-of-distribution data. Finally, methods based on class labels cannot separate different object instances of the same class, which is essential for instance segmentation. We therefore utilize bounding box labels for weakly supervised instance segmentation, as boxes provide information about individual objects and their locations. Experimental results show that the annotation cost of learning semantic segmentation and instance segmentation can be significantly reduced: on the challenging Pascal VOC dataset, we achieve 89% of the performance of the fully supervised equivalent using only class labels, reducing the label cost by 91%. In addition, we achieve 96% of the performance of the fully supervised equivalent using bounding box labels, reducing the label cost by 83%.
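The attribution maps that these class-label methods start from are typically class activation maps (CAMs), computed by projecting the classifier's final fully connected weights onto its last convolutional features. The sketch below is a generic NumPy illustration of that standard construction, not the dissertation's proposed techniques; the function name, toy shapes, and threshold are invented for the example.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Compute a class activation map (CAM) for one class.

    features:   (C, H, W) feature maps from the last conv layer
    fc_weights: (num_classes, C) weights of the final linear classifier
    Returns an (H, W) map, min-max normalized to [0, 1].
    """
    # Weighted sum of feature channels using the class's classifier weights.
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    cam = np.maximum(cam, 0)  # keep positive evidence only
    rng = cam.max() - cam.min()
    return (cam - cam.min()) / rng if rng > 0 else np.zeros_like(cam)

# Toy example: 4 channels, 8x8 features, 3 classes.
gen = np.random.default_rng(0)
feats = gen.random((4, 8, 8))
w = gen.random((3, 4))
cam = class_activation_map(feats, w, class_idx=1)
mask = cam > 0.5  # threshold into a coarse foreground pseudo-mask
```

In the weakly supervised pipeline, such a map (after refinement) serves as a pseudo ground-truth mask for training a segmentation network; the cited limitation is that the high-activation region usually covers only the most discriminative part of the object.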
We expect that the methods introduced in this dissertation will be helpful for applying deep learning based object recognition in a variety of domains and scenarios.

Contents:
1 Introduction
2 Background
  2.1 Object Recognition
  2.2 Weak Supervision
  2.3 Preliminary Algorithms
    2.3.1 Attribution Methods for Image Classifier
    2.3.2 Refinement Techniques of Localization Maps
3 Learning with Image-Level Class Labels
  3.1 Introduction
  3.2 Related Work
    3.2.1 FickleNet: Stochastic Inference Approach
    3.2.2 Other Recent Approaches
  3.3 Anti-Adversarially Manipulated Attribution
    3.3.1 Adversarial Attack
    3.3.2 Proposed Method
    3.3.3 Experiments
    3.3.4 Discussion
    3.3.5 Analysis of Results by Class
  3.4 Reducing Information Bottleneck
    3.4.1 Information Bottleneck
    3.4.2 Motivation
    3.4.3 Proposed Method
    3.4.4 Experiments
  3.5 Summary
4 Learning with Auxiliary Data
  4.1 Introduction
  4.2 Related Work
  4.3 Methods
    4.3.1 Collecting the Hard Out-of-Distribution Data
    4.3.2 Learning with the Hard Out-of-Distribution Data
    4.3.3 Training Segmentation Networks
  4.4 Experiments
    4.4.1 Experimental Setup
    4.4.2 Experimental Results
    4.4.3 Analysis and Discussion
  4.5 Analysis of OoD Collection Process
  4.6 Integrating Proposed Methods
  4.7 Summary
5 Learning with Bounding Box Labels
  5.1 Introduction
  5.2 Related Work
  5.3 Methods
    5.3.1 Revisiting Object Detectors
    5.3.2 Bounding Box Attribution Map
    5.3.3 Training the Segmentation Network
  5.4 Experiments
    5.4.1 Experimental Setup
    5.4.2 Weakly Supervised Instance Segmentation
    5.4.3 Weakly Supervised Semantic Segmentation
    5.4.4 Ablation Study
  5.5 Detailed Analysis of the BBAM
  5.6 Summary
6 Conclusion
  6.1 Dissertation Summary
  6.2 Limitations and Future Direction
Abstract (In Korean)

    Cross Modal Distillation for Flood Extent Mapping

    The increasing intensity and frequency of floods is one of the many consequences of our changing climate. In this work, we explore ML techniques that improve the flood detection module of an operational early flood warning system. Our method exploits an unlabelled dataset of paired multi-spectral and Synthetic Aperture Radar (SAR) imagery to reduce the labeling requirements of a purely supervised learning approach. Prior works have used unlabelled data by creating weak labels from it; however, our experiments show that such a model still ends up learning the label mistakes in those weak labels. Motivated by knowledge distillation and semi-supervised learning, we explore the use of a teacher to train a student with the help of a small hand-labelled dataset and a large unlabelled dataset. Unlike the conventional self-distillation setup, we propose a cross modal distillation framework that transfers supervision from a teacher trained on the richer modality (multi-spectral images) to a student model trained on SAR imagery. The trained models are then tested on the Sen1Floods11 dataset. Our model outperforms the Sen1Floods11 baseline model trained on the weakly labeled SAR imagery by an absolute margin of 6.53% Intersection-over-Union (IoU) on the test split.
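The cross modal teacher-to-student transfer can be sketched as a soft-label cross-entropy between the teacher's and student's per-pixel predictions. The NumPy sketch below is illustrative only, assuming a standard temperature-scaled distillation loss; the function names, shapes, and loss weighting are assumptions, not details from the paper.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the class axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Per-pixel soft-label cross-entropy between teacher and student.

    Both logit arrays have shape (num_classes, H, W). The teacher was
    trained on multi-spectral imagery; the student sees only SAR.
    """
    t = softmax(teacher_logits / temperature, axis=0)           # soft targets
    log_s = np.log(softmax(student_logits / temperature, axis=0) + 1e-12)
    return float(-(t * log_s).sum(axis=0).mean())

# Toy example with 2 classes (water / not water) on a 4x4 image.
gen = np.random.default_rng(1)
s = gen.normal(size=(2, 4, 4))   # student logits (SAR branch)
t = gen.normal(size=(2, 4, 4))   # teacher logits (multi-spectral branch)
loss = distillation_loss(s, t)

# A plausible total student objective (hypothetical weighting):
#   L = L_ce(student, hand_labels) + lambda * distillation_loss(student, teacher)
```

The cross-entropy is minimized exactly when the student's distribution matches the teacher's, which is what lets the unlabelled paired imagery supervise the SAR-only student.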

    A Novel Hybrid CNN Denoising Technique (HDCNN) for Image Denoising with Improved Performance

    Image denoising has been tackled by deep convolutional neural networks (CNNs) with powerful learning capabilities. Unfortunately, some CNNs perform poorly on complex scenes because they train only a single deep network for their image denoising model. We propose a hybrid CNN denoising technique (HDCNN) to address this problem. An HDCNN consists of a dilated block (DB), a RepVGG block (RVB), a feature refinement block (FB), and a single convolution. To gather more contextual data, the DB combines dilated convolution, batch normalization (BN), common convolution, and the ReLU activation function. The RVB combines convolution, BN, and ReLU in parallel to obtain complementary width features. The refined features from the RVB are passed to the FB, which is used to obtain more accurate information. To produce a clean image, a single convolution works together with a residual learning operation. These key components enable the HDCNN to perform image denoising efficiently. Experiments show that the proposed HDCNN achieves good denoising performance on public datasets.
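The RVB's parallel branches follow the RepVGG idea of structural reparameterization: at inference time, parallel 3x3 convolution, 1x1 convolution, and identity branches can be folded into one equivalent 3x3 kernel. The sketch below illustrates only that folding step (BN folding is omitted for brevity); the function name and shapes are illustrative assumptions, not code from the paper.

```python
import numpy as np

def merge_repvgg_branches(k3, k1, has_identity, channels):
    """Fold RepVGG-style parallel branches into one 3x3 kernel.

    k3: (C, C, 3, 3) weights of the 3x3 branch
    k1: (C, C, 1, 1) weights of the 1x1 branch
    The 1x1 kernel and the identity map both act only on the center
    tap of an equivalent 3x3 kernel, so they add into position [1, 1].
    """
    merged = k3.copy()
    merged[:, :, 1, 1] += k1[:, :, 0, 0]      # 1x1 branch -> center tap
    if has_identity:
        for c in range(channels):
            merged[c, c, 1, 1] += 1.0         # identity -> center tap, own channel
    return merged

# Toy check with 2 channels: zero 3x3 branch, all-ones 1x1 branch, identity on.
k3 = np.zeros((2, 2, 3, 3))
k1 = np.ones((2, 2, 1, 1))
merged = merge_repvgg_branches(k3, k1, has_identity=True, channels=2)
```

Because convolution is linear, the merged kernel produces the same output as summing the three branch outputs, which is what lets a multi-branch training-time block collapse into a single fast convolution at inference.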