4 research outputs found
Visualising Basins of Attraction for the Cross-Entropy and the Squared Error Neural Network Loss Functions
Quantification of the stationary points and the associated basins of
attraction of neural network loss surfaces is an important step towards a
better understanding of neural network loss surfaces at large. This work
proposes a novel method to visualise basins of attraction together with the
associated stationary points via gradient-based random sampling. The proposed
technique is used to perform an empirical study of the loss surfaces generated
by two different error metrics: quadratic loss and entropic loss. The empirical
observations confirm the theoretical hypothesis regarding the nature of neural
network attraction basins. Entropic loss is shown to exhibit stronger gradients
and fewer stationary points than quadratic loss, indicating that entropic loss
has a more searchable landscape. Quadratic loss is shown to be more resilient
to overfitting than entropic loss. Both losses are shown to exhibit local
minima, but the number of local minima is shown to decrease with an increase in
dimensionality. Thus, the proposed visualisation technique successfully
captures the local minima properties exhibited by the neural network loss
surfaces, and can be used for the purpose of fitness landscape analysis of
neural networks.Comment: Preprint submitted to the Neural Networks journa
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Understanding the loss surface of neural networks is essential for the design
of models with predictable performance and their success in applications.
Experimental results suggest that sufficiently deep and wide neural networks
are not negatively impacted by suboptimal local minima. Despite recent
progress, the reason for this outcome is not fully understood. Could deep
networks have very few, if any, suboptimal local optima? Or could all of
them be equally good? We provide a construction to show that suboptimal local
minima (i.e., non-global ones), even though degenerate, exist for fully
connected neural networks with sigmoid activation functions. The local minima
obtained by our construction belong to a connected set of local solutions that
can be escaped from via a non-increasing path on the loss curve. For extremely
wide neural networks of decreasing width after the wide layer, we prove that
every suboptimal local minimum belongs to such a connected set. This provides a
partial explanation for the successful application of deep neural networks. In
addition, we characterize under what conditions the same construction
leads to saddle points instead of local minima for deep neural networks.
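The saddle-versus-minimum distinction at the end of the abstract can be checked numerically on a toy example. This is a hedged sketch, not the paper's construction: the one-unit sigmoid network, the data, and the constant targets (chosen so that the origin is an exact stationary point) are assumptions for illustration. A candidate point is classified by its gradient norm and the eigenvalues of a finite-difference Hessian.

```python
# Hedged sketch (NOT the paper's construction): classify a candidate point
# of a tiny sigmoid-network loss as local minimum vs saddle by checking the
# gradient norm and the eigenvalues of a finite-difference Hessian.
import numpy as np

X = np.array([-1.0, 1.0])
y = np.array([0.5, 0.5])   # targets chosen so the origin is exactly stationary

def sigmoid(z):
    return 0.5 * (1.0 + np.tanh(0.5 * z))   # stable logistic

def loss(w):
    # One sigmoid unit with weight w[0] and bias w[1], squared error.
    return float(np.mean((sigmoid(w[0] * X + w[1]) - y) ** 2))

def grad(w, eps=1e-5):
    g = np.zeros(2)
    for i in range(2):
        e = np.zeros(2); e[i] = eps
        g[i] = (loss(w + e) - loss(w - e)) / (2.0 * eps)
    return g

def hessian(w, eps=1e-4):
    H = np.zeros((2, 2))
    for i in range(2):
        e = np.zeros(2); e[i] = eps
        H[:, i] = (grad(w + e) - grad(w - e)) / (2.0 * eps)
    return 0.5 * (H + H.T)   # symmetrise away finite-difference noise

def classify(w, tol=1e-4):
    if np.linalg.norm(grad(w)) > tol:
        return "not stationary"
    evals = np.linalg.eigvalsh(hessian(w))
    if np.all(evals > tol):
        return "local minimum"
    if np.any(evals < -tol):
        return "saddle (or maximum)"
    return "degenerate (flat directions)"

print(classify(np.array([0.0, 0.0])))  # stationary, positive-definite Hessian
print(classify(np.array([1.0, 1.0])))  # gradient is non-zero here
```

The "degenerate" branch matters for the abstract's setting: the suboptimal minima constructed there are degenerate, so near-zero Hessian eigenvalues (flat directions along the connected set of solutions) are exactly what one should expect to see.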
A Study on the Loss Surfaces of Deep Neural Networks and Several Applications of Deep Learning
Thesis (Ph.D.) -- Graduate School of Seoul National University: Department of Mathematical Sciences, College of Natural Sciences, August 2022. Advisor: Myungjoo Kang.
In this thesis, we study the loss surface of deep neural networks. Does the loss function of a deep neural network have no bad local minima, as a convex function does? Although the answer is well understood for piece-wise linear activations, little is known for general smooth activations. We show that bad local minima also exist for general smooth activations. In addition, we characterize the types of such local minima. This provides a partial explanation toward understanding the loss surface of deep neural networks. Additionally, we present several applications of deep neural networks in learning theory, private machine learning, and computer vision.
Abstract
1 Introduction
2 Existence of local minimum in neural network
2.1 Introduction
2.2 Local Minima and Deep Neural Network
2.2.1 Notation and Model
2.2.2 Local Minima and Deep Linear Network
2.2.3 Local Minima and Deep Neural Network with piece-wise linear activations
2.2.4 Local Minima and Deep Neural Network with smooth activations
2.2.5 Local Valley and Deep Neural Network
2.3 Existence of local minimum for partially linear activations
2.4 Absence of local minimum in the shallow network for small N
2.5 Existence of local minimum in the shallow network
2.6 Local Minimum Embedding
3 Self-Knowledge Distillation via Dropout
3.1 Introduction
3.2 Related work
3.2.1 Knowledge Distillation
3.2.2 Self-Knowledge Distillation
3.2.3 Semi-supervised and Self-supervised Learning
3.3 Self Distillation via Dropout
3.3.1 Method Formulation
3.3.2 Collaboration with other methods
3.3.3 Forward versus reverse KL-Divergence
3.4 Experiments
3.4.1 Implementation Details
3.4.2 Results
3.5 Conclusion
4 Membership inference attacks against object detection models
4.1 Introduction
4.2 Background and Related Work
4.2.1 Membership Inference Attack
4.2.2 Object Detection
4.2.3 Datasets
4.3 Attack Methodology
4.3.1 Motivation
4.3.2 Gradient Tree Boosting
4.3.3 Convolutional Neural Network Based Method
4.3.4 Transfer Attack
4.4 Defense
4.4.1 Dropout
4.4.2 Differentially Private Algorithm
4.5 Experiments
4.5.1 Target and Shadow Model Setup
4.5.2 Attack Model Setup
4.5.3 Experiment Results
4.5.4 Transfer Attacks
4.5.5 Defense
4.6 Conclusion
5 Single Image Deraining
5.1 Introduction
5.2 Related Work
5.3 Proposed Network
5.3.1 Multi-Level Connection
5.3.2 Wide Regional Non-Local Block
5.3.3 Discrete Wavelet Transform
5.3.4 Loss Function
5.4 Experiments
5.4.1 Datasets and Evaluation Metrics
5.4.2 Datasets and Experiment Details
5.4.3 Evaluations
5.4.4 Ablation Study
5.4.5 Applications for Other Tasks
5.4.6 Analysis on multi-level features
5.5 Conclusion
Bibliography
Abstract (in Korean)
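The self-knowledge-distillation chapter above (3.3, including the forward-versus-reverse KL comparison in 3.3.3) can be illustrated with a toy sketch. This is an assumption-laden illustration, not the thesis's method: the tiny numpy "network", the dropout rate, and the weight shapes are invented for the example. The idea sketched is that two independent dropout passes of the same network yield two predictive distributions, and the KL divergence between them can serve as a self-distillation signal.

```python
# Hedged sketch of self-distillation via dropout: two stochastic forward
# passes of the SAME network, and the KL divergence between their outputs.
# The toy network and all shapes/rates here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def forward(x, W, drop_rate=0.5):
    # One hidden layer with inverted dropout; each call draws a fresh mask,
    # so two calls give two stochastic views of the same network.
    h = np.maximum(0.0, W["W1"] @ x)
    mask = rng.random(h.shape) >= drop_rate
    h = h * mask / (1.0 - drop_rate)
    return softmax(W["W2"] @ h)

def kl(p, q, eps=1e-9):
    # KL(p || q); eps guards against log(0).
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

W = {"W1": rng.normal(size=(8, 4)), "W2": rng.normal(size=(3, 8))}
x = rng.normal(size=4)

p = forward(x, W)   # first dropout pass ("teacher" view)
q = forward(x, W)   # second dropout pass ("student" view)

# Forward KL(p||q) and reverse KL(q||p) give different training signals
# (mean-seeking vs. mode-seeking), which is the comparison made in 3.3.3.
print("KL(p||q) =", kl(p, q))
print("KL(q||p) =", kl(q, p))
```

In an actual training loop one of the two KL terms would be added to the usual supervised loss and backpropagated; this sketch only shows the quantity being computed, not the optimisation.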