1,043 research outputs found

    MARA-Net: Single Image Deraining Network with Multi-level connections and Adaptive Regional Attentions

    Removing rain streaks from single images is an important problem in various computer vision tasks because rain streaks can degrade outdoor images and reduce their visibility. While recent convolutional neural network-based deraining models have succeeded in capturing rain streaks effectively, recovering the details of the rain-free image remains difficult. In this paper, we present a multi-level connection and adaptive regional attention network (MARA-Net) to properly restore the original background textures in rainy images. The first main idea is a multi-level connection design that repeatedly connects multi-level features of the encoder network to the decoder network. Multi-level connections encourage the decoding process to use the feature information of all levels. Channel attention is applied in the multi-level connections to learn which level of features is important in the decoding process of the current level. The second main idea is a wide regional non-local block (WRNL). As rain streaks primarily exhibit a vertical distribution, we divide the grid of the image into horizontally wide patches and apply a non-local operation to each region to explore the rich rain-free background information. Experimental results on both synthetic and real-world rainy datasets demonstrate that the proposed model significantly outperforms existing state-of-the-art models. Furthermore, the results of the joint deraining and segmentation experiment indicate that our model contributes effectively to other vision tasks.
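    The regional non-local idea in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' WRNL block: the function name `regional_non_local`, the grid parameters, and the parameter-free dot-product attention (no learned projections) are all assumptions made for illustration. It only shows attention being computed separately inside each horizontally wide region.

```python
import numpy as np

def regional_non_local(features, n_rows, n_cols):
    """Apply a parameter-free non-local (self-attention) step inside each grid region.

    features: array of shape (H, W, C); H must be divisible by n_rows, W by n_cols.
    Choosing n_rows > n_cols yields horizontally wide regions.
    """
    h, w, c = features.shape
    if h % n_rows or w % n_cols:
        raise ValueError("H and W must be divisible by the grid size")
    rh, rw = h // n_rows, w // n_cols
    out = np.empty_like(features, dtype=np.float64)
    for i in range(n_rows):
        for j in range(n_cols):
            rs = slice(i * rh, (i + 1) * rh)
            cs = slice(j * rw, (j + 1) * rw)
            # Flatten the region into N = rh * rw feature vectors.
            x = features[rs, cs].reshape(-1, c).astype(np.float64)
            # Pairwise similarities, then a numerically stable row-wise softmax.
            scores = x @ x.T / np.sqrt(c)
            scores -= scores.max(axis=1, keepdims=True)
            weights = np.exp(scores)
            weights /= weights.sum(axis=1, keepdims=True)
            # Residual connection: each position adds a weighted sum over its own region only.
            out[rs, cs] = (x + weights @ x).reshape(rh, rw, c)
    return out
```

    For example, `regional_non_local(feature_map, n_rows=4, n_cols=1)` splits a feature map into four full-width strips, so no position attends outside its own strip.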

    A Study on the Loss Surface of Deep Neural Networks and Several Applications of Deep Learning

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Natural Sciences, Department of Mathematical Sciences, 2022. 8. Myungjoo Kang.
    In this thesis, we study the loss surface of deep neural networks. Does the loss function of a deep neural network, like a convex function, have no bad local minima? Although the answer is well known for piecewise linear activations, not much is known for general smooth activations. We show that bad local minima also exist for general smooth activations. In addition, we characterize the types of such local minima. This provides a partial explanation of the loss surface of deep neural networks. Additionally, we present several applications of deep neural networks in learning theory, private machine learning, and computer vision.
    Contents:
    Abstract
    1 Introduction
    2 Existence of local minimum in neural network
      2.1 Introduction
      2.2 Local Minima and Deep Neural Network (2.2.1 Notation and Model; 2.2.2 Local Minima and Deep Linear Network; 2.2.3 Local Minima and Deep Neural Network with piece-wise linear activations; 2.2.4 Local Minima and Deep Neural Network with smooth activations; 2.2.5 Local Valley and Deep Neural Network)
      2.3 Existence of local minimum for partially linear activations
      2.4 Absence of local minimum in the shallow network for small N
      2.5 Existence of local minimum in the shallow network
      2.6 Local Minimum Embedding
    3 Self-Knowledge Distillation via Dropout
      3.1 Introduction
      3.2 Related work (3.2.1 Knowledge Distillation; 3.2.2 Self-Knowledge Distillation; 3.2.3 Semi-supervised and Self-supervised Learning)
      3.3 Self Distillation via Dropout (3.3.1 Method Formulation; 3.3.2 Collaboration with other method; 3.3.3 Forward versus reverse KL-Divergence)
      3.4 Experiments (3.4.1 Implementation Details; 3.4.2 Results)
      3.5 Conclusion
    4 Membership inference attacks against object detection models
      4.1 Introduction
      4.2 Background and Related Work (4.2.1 Membership Inference Attack; 4.2.2 Object Detection; 4.2.3 Datasets)
      4.3 Attack Methodology (4.3.1 Motivation; 4.3.2 Gradient Tree Boosting; 4.3.3 Convolutional Neural Network Based Method; 4.3.4 Transfer Attack)
      4.4 Defense (4.4.1 Dropout; 4.4.2 Differentially Private Algorithm)
      4.5 Experiments (4.5.1 Target and Shadow Model Setup; 4.5.2 Attack Model Setup; 4.5.3 Experiment Results; 4.5.4 Transfer Attacks; 4.5.5 Defense)
      4.6 Conclusion
    5 Single Image Deraining
      5.1 Introduction
      5.2 Related Work
      5.3 Proposed Network (5.3.1 Multi-Level Connection; 5.3.2 Wide Regional Non-Local Block; 5.3.3 Discrete Wavelet Transform; 5.3.4 Loss Function)
      5.4 Experiments (5.4.1 Datasets and Evaluation Metrics; 5.4.2 Datasets and Experiment Details; 5.4.3 Evaluations; 5.4.4 Ablation Study; 5.4.5 Applications for Other Tasks; 5.4.6 Analysis on multi-level features)
      5.5 Conclusion
    The bibliography
    Abstract (in Korean)

    ๋‹จ์ผ ์ด๋ฏธ์ง€ ๋‚ด ๋น„์ œ๊ฑฐ๋ฅผ ์œ„ํ•œ ๋‹ค์ค‘์Šค์ผ€์ผ ์—ฐ๊ฒฐ ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง

    Thesis (M.S.) -- Seoul National University Graduate School: College of Natural Sciences, Interdisciplinary Program in Computational Science, 2021. 8. Myungjoo Kang.
    In this thesis, we propose an end-to-end multi-scale connected convolutional neural network (MC-CNN) that leverages features from all scales to remove rain streaks while recovering detailed image information. The first key point for recovering details is a multi-scale connection, which connects the features of all scales in the encoder to the decoder so that the image is restored with as much information as possible. Rather than simply combining the features of each scale, the multi-scale connection uses channel-wise attention to learn which scale's features are important in the current process. The second key point is a wide regional non-local (WRNL) block. We find that dividing images into wide rectangular patches gives each patch a more even distribution than the existing method does; based on this, we propose the WRNL block. Experimental results on synthetic and real-world datasets demonstrate that MC-CNN quantitatively outperforms existing state-of-the-art models and qualitatively achieves several improvements.
    Contents:
    1 Introduction
    2 Related Work
    3 Proposed Network
      3.1 Multi-scale Connection
      3.2 Wide Regional Non-Local Block (3.2.1 Analysis)
      3.3 Discrete Wavelet Transform
      3.4 Data Augmentation
      3.5 Loss Function
    4 Experiments
      4.1 Datasets and Evaluation Metrics
      4.2 Experiment Details
      4.3 Results (4.3.1 Synthetic Datasets; 4.3.2 Real-world Datasets)
      4.4 Ablation Study (4.4.1 Multi-scale connection; 4.4.2 Region types of non-Local block)
    5 Conclusion
    Abstract (In Korean)

    Artificial Intelligence for Multimedia Signal Processing

    Artificial intelligence technologies are actively applied to broadcasting and multimedia processing. Much research has been conducted in a wide variety of fields, such as content creation, transmission, and security, and over the past two to three years there have been attempts to improve image, video, speech, and other data compression efficiency in areas related to MPEG media processing technology. Additionally, technologies such as media creation, processing, editing, and scenario creation are very important areas of research in multimedia processing and engineering. This book collects topics spanning advanced computational intelligence algorithms and technologies for emerging multimedia signal processing, including computer vision, speech/sound/text processing, and content analysis/information mining.

    Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition

    Visual Place Recognition is a challenging task for robotics and autonomous systems, which must deal with the twin problems of appearance and viewpoint change in an ever-changing world. This paper introduces Patch-NetVLAD, which provides a novel formulation for combining the advantages of both local and global descriptor methods by deriving patch-level features from NetVLAD residuals. Unlike the fixed spatial neighborhood regime of existing local keypoint features, our method enables aggregation and matching of deep-learned local features defined over the feature-space grid. We further introduce a multi-scale fusion of patch features that have complementary scales (i.e. patch sizes) via an integral feature space and show that the fused features are highly invariant to both condition (season, structure, and illumination) and viewpoint (translation and rotation) changes. Patch-NetVLAD outperforms both global and local feature descriptor-based methods with comparable compute, achieving state-of-the-art visual place recognition results on a range of challenging real-world datasets, including winning the Facebook Mapillary Visual Place Recognition Challenge at ECCV2020. It is also adaptable to user requirements, with a speed-optimised version operating over an order of magnitude faster than the state-of-the-art. By combining superior performance with improved computational efficiency in a configurable framework, Patch-NetVLAD is well suited to enhance both stand-alone place recognition capabilities and the overall performance of SLAM systems.
    Comment: Accepted to IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2021)
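    To make the "integral feature space" mentioned above more concrete, here is a small NumPy sketch of how a summed-area table lets patch features be pooled at several patch sizes from a single precomputation. It is an assumption-laden illustration, not the Patch-NetVLAD code: the names `integral_features` and `patch_means` are hypothetical, and plain mean pooling stands in for the aggregation of NetVLAD residuals described in the abstract.

```python
import numpy as np

def integral_features(fmap):
    """Summed-area table of an (H, W, D) feature map, zero-padded to (H+1, W+1, D)."""
    h, w, d = fmap.shape
    table = np.zeros((h + 1, w + 1, d), dtype=np.float64)
    table[1:, 1:] = fmap.cumsum(axis=0).cumsum(axis=1)
    return table

def patch_means(table, patch, stride=1):
    """Mean feature of every patch x patch window, using four table lookups per window."""
    h, w = table.shape[0] - 1, table.shape[1] - 1
    ys = np.arange(0, h - patch + 1, stride)
    xs = np.arange(0, w - patch + 1, stride)
    y0, x0 = ys[:, None], xs[None, :]   # top-left corners, broadcast to a grid
    y1, x1 = y0 + patch, x0 + patch     # bottom-right corners (exclusive)
    sums = table[y1, x1] - table[y0, x1] - table[y1, x0] + table[y0, x0]
    return sums / (patch * patch)
```

    Because the table is built once, calling `patch_means(table, p)` for several patch sizes `p` reuses the same precomputation, which is what makes combining multiple patch scales cheap in this sketch.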

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.
    Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing

    Generic Object Detection and Segmentation for Real-World Environments


    Video Desnowing and Deraining via Saliency and Dual Adaptive Spatiotemporal Filtering

    Outdoor vision sensing systems often struggle with poor weather conditions, such as snow and rain, which pose a great challenge to existing video desnowing and deraining methods. In this paper, we propose a novel video desnowing and deraining model that utilizes the salience information of moving objects to address this problem. First, we remove the snow and rain from the video by low-rank tensor decomposition, which makes full use of the spatial location information and the correlation between the three channels of the color video. Second, because existing algorithms often regard sparse snowflakes and rain streaks as moving objects, this paper injects salience information into moving object detection, which reduces the false alarms and missed alarms of moving objects. At the same time, feature point matching is used to mine the redundant information of moving objects in continuous frames, and we propose a dual adaptive minimum filtering algorithm in the spatiotemporal domain to remove snow and rain in front of moving objects. Both qualitative and quantitative experimental results show that the proposed algorithm is more competitive than other state-of-the-art snow and rain removal methods.
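    The low-rank step described above can be illustrated in a much simplified form. The sketch below swaps the paper's low-rank tensor decomposition of color video for a plain truncated matrix SVD on grayscale frames; the function name `low_rank_split` and the `rank` parameter are hypothetical. The intuition is only that a mostly static background is well approximated by a few components, while sparse rain or snow falls into the residual.

```python
import numpy as np

def low_rank_split(frames, rank=1):
    """Split a grayscale video of shape (T, H, W) into low-rank and residual parts.

    Each frame is flattened to a row; a truncated SVD keeps the `rank` strongest
    components as the background estimate. The residual holds everything else,
    including sparse streaks. This is a matrix stand-in for tensor decomposition.
    """
    t, h, w = frames.shape
    m = frames.reshape(t, h * w).astype(np.float64)
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    low = (u[:, :rank] * s[:rank]) @ vt[:rank]
    background = low.reshape(t, h, w)
    return background, frames - background
```

    In this simplified setting the residual would still contain moving objects, which is the ambiguity the abstract addresses with salience information.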