A Study on the Loss Surface of Deep Neural Networks and Several Applications of Deep Learning

Abstract

Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School: Department of Mathematical Sciences, College of Natural Sciences, August 2022. Advisor: Kang Myungjoo.

In this thesis, we study the loss surface of deep neural networks. Does the loss function of a deep neural network have no bad local minima, as a convex function does? Although the answer is well understood for piecewise-linear activations, much less is known for general smooth activations. We show that bad local minima also exist for general smooth activations, and we characterize the types of such local minima. This provides a partial explanation toward understanding the loss surface of deep neural networks. Additionally, we present several applications of deep neural networks in learning theory, private machine learning, and computer vision.

Contents

Abstract v
1 Introduction 1
2 Existence of local minimum in neural network 4
2.1 Introduction 4
2.2 Local Minima and Deep Neural Network 6
2.2.1 Notation and Model 6
2.2.2 Local Minima and Deep Linear Network 6
2.2.3 Local Minima and Deep Neural Network with piece-wise linear activations 8
2.2.4 Local Minima and Deep Neural Network with smooth activations 10
2.2.5 Local Valley and Deep Neural Network 11
2.3 Existence of local minimum for partially linear activations 12
2.4 Absence of local minimum in the shallow network for small N 17
2.5 Existence of local minimum in the shallow network 20
2.6 Local Minimum Embedding 36
3 Self-Knowledge Distillation via Dropout 40
3.1 Introduction 40
3.2 Related work 43
3.2.1 Knowledge Distillation 43
3.2.2 Self-Knowledge Distillation 44
3.2.3 Semi-supervised and Self-supervised Learning 44
3.3 Self Distillation via Dropout 45
3.3.1 Method Formulation 46
3.3.2 Collaboration with other methods 47
3.3.3 Forward versus reverse KL-Divergence 48
3.4 Experiments 53
3.4.1 Implementation Details 53
3.4.2 Results 54
3.5 Conclusion 62
4 Membership inference attacks against object detection models 63
4.1 Introduction 63
4.2 Background and Related Work 65
4.2.1 Membership Inference Attack 65
4.2.2 Object Detection 66
4.2.3 Datasets 67
4.3 Attack Methodology 67
4.3.1 Motivation 69
4.3.2 Gradient Tree Boosting 69
4.3.3 Convolutional Neural Network Based Method 70
4.3.4 Transfer Attack 73
4.4 Defense 73
4.4.1 Dropout 73
4.4.2 Differentially Private Algorithm 74
4.5 Experiments 75
4.5.1 Target and Shadow Model Setup 75
4.5.2 Attack Model Setup 77
4.5.3 Experiment Results 78
4.5.4 Transfer Attacks 80
4.5.5 Defense 81
4.6 Conclusion 81
5 Single Image Deraining 82
5.1 Introduction 82
5.2 Related Work 86
5.3 Proposed Network 89
5.3.1 Multi-Level Connection 89
5.3.2 Wide Regional Non-Local Block 92
5.3.3 Discrete Wavelet Transform 94
5.3.4 Loss Function 94
5.4 Experiments 95
5.4.1 Datasets and Evaluation Metrics 95
5.4.2 Datasets and Experiment Details 96
5.4.3 Evaluations 97
5.4.4 Ablation Study 104
5.4.5 Applications for Other Tasks 107
5.4.6 Analysis on multi-level features 109
5.5 Conclusion 111
Bibliography 112
Abstract (in Korean) 129
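The contrast the abstract draws, that bad local minima are well understood for piecewise-linear activations but not for smooth ones, can be illustrated with the classic "dead ReLU" construction. The sketch below is not taken from the thesis; the tiny network, data point, and parameter values are all chosen purely for illustration. It checks numerically that a flat region where the ReLU is inactive is a (non-strict) local minimum with loss strictly above the global minimum.

```python
def relu(z):
    return max(0.0, z)

def loss(w1, w2, x=1.0, y=1.0):
    # Squared error of the tiny network f(x) = w2 * relu(w1 * x)
    return (w2 * relu(w1 * x) - y) ** 2

# At (w1, w2) = (-1.0, 1.0) the ReLU is inactive for x = 1, so the
# network outputs 0 and the loss is (0 - 1)^2 = 1. Any small
# perturbation keeps w1 < 0, hence the loss stays 1: a local minimum.
base = loss(-1.0, 1.0)
eps = 1e-3
neighbors = [loss(-1.0 + dw1, 1.0 + dw2)
             for dw1 in (-eps, 0.0, eps)
             for dw2 in (-eps, 0.0, eps)]
assert all(l >= base for l in neighbors)

# Yet the global minimum is strictly lower: loss 0 at (w1, w2) = (1, 1).
assert loss(1.0, 1.0) == 0.0
```

Chapter 2 of the thesis concerns the harder question of whether analogous suboptimal minima survive when the kink of the ReLU is replaced by a general smooth activation, where no flat "dead" region is available.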
