86,002 research outputs found
Normalization and Generalization in Deep Learning
In this thesis, we discuss the importance of data normalization in deep learning and its relationship with generalization. Normalization is a staple of deep learning architectures and has been shown to improve the stability and generalizability of deep learning models, yet why these normalization techniques work remains unknown and is an active area of research. Motivated by this uncertainty, we explore how different normalization techniques perform when employed in different deep learning architectures, alongside an investigation of generalization and the metrics associated with it. The goal of our experiments was to determine whether identifiable trends exist for the different normalization methods across an array of training schemes with respect to the various metrics employed. We found that class similarity was seemingly the strongest predictor of train accuracy, test accuracy, and generalization ratio across all employed metrics. Overall, BatchNorm and EvoNormB0 generally performed best on measures of test and train accuracy, while InstanceNorm and Plain performed the worst.
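The normalization variants compared in the thesis differ mainly in which axes the statistics are computed over. As an illustration (not the thesis's own code), here is a minimal NumPy sketch of BatchNorm versus InstanceNorm, with the learnable scale and shift omitted:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """BatchNorm: statistics over batch and spatial axes (N, H, W), per channel."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    """InstanceNorm: statistics over spatial axes (H, W) only, per sample and channel."""
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(8, 3, 4, 4)   # a batch of (N, C, H, W) activations
bn, inn = batch_norm(x), instance_norm(x)
```

BatchNorm shares statistics across the whole batch, so its behavior depends on batch size; InstanceNorm normalizes each sample independently, which is one reason the two can generalize differently.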
Why Clean Generalization and Robust Overfitting Both Happen in Adversarial Training
Adversarial training is a standard method for training deep neural networks to be robust to adversarial perturbations. Similar to the surprising generalization ability seen in the standard deep learning setting, neural networks trained by adversarial training also generalize well on clean data. However, in contrast with clean generalization, while the adversarial training method is able to achieve low robust training error, there still exists a significant robust generalization gap, which prompts us to explore what mechanism leads to both clean generalization and robust overfitting (CGRO) during the learning process. In this paper, we provide a theoretical understanding of this CGRO phenomenon in adversarial training. First, we propose a theoretical framework of adversarial training, in which we analyze the feature learning process to explain how adversarial training leads the network learner to the CGRO regime. Specifically, we prove that, under our patch-structured dataset, the CNN model provably partially learns the true feature but exactly memorizes the spurious features from training-adversarial examples, which thus results in clean generalization and robust overfitting. For a more general data assumption, we then show the efficiency of the CGRO classifier from the perspective of representation complexity. On the empirical side, to verify our theoretical analysis on real-world vision datasets, we investigate the learning dynamics during training. Moreover, inspired by our experiments, we prove a robust generalization bound based on the flatness of the loss landscape, which may be of independent interest. Comment: 27 pages, comments welcome
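Adversarial training, as studied above, alternates an inner maximization (crafting perturbations) with an outer minimization (fitting the perturbed examples). Below is a toy sketch for a linear classifier with a one-step (FGSM-style) inner attack; this is a simplification of the multi-step attacks typically used, and not the paper's construction:

```python
import numpy as np

def fgsm_perturb(w, b, x, y, eps):
    """One-step inner maximization: for a linear classifier with logistic loss,
    the loss gradient w.r.t. the input x is -y * sigmoid(-y * (w @ x + b)) * w."""
    margin = y * (x @ w + b)
    s = 1.0 / (1.0 + np.exp(margin))          # sigmoid(-margin)
    grad_x = -(y * s)[:, None] * w[None, :]
    return x + eps * np.sign(grad_x)          # ascend the loss

def adversarial_train(x, y, eps=0.1, lr=0.1, steps=200):
    """Outer minimization on adversarially perturbed inputs (labels in {-1, +1})."""
    rng = np.random.default_rng(0)
    w, b = 0.01 * rng.normal(size=x.shape[1]), 0.0
    for _ in range(steps):
        x_adv = fgsm_perturb(w, b, x, y, eps)     # craft adversarial examples
        s = 1.0 / (1.0 + np.exp(y * (x_adv @ w + b)))
        w -= lr * (-(s * y) @ x_adv) / len(y)     # logistic-loss gradient step
        b -= lr * (-(s * y).mean())
    return w, b

rng = np.random.default_rng(1)
y = np.where(rng.random(200) < 0.5, 1.0, -1.0)
x = y[:, None] * 2.0 + rng.normal(scale=0.3, size=(200, 2))
w, b = adversarial_train(x, y)
clean_acc = np.mean(np.sign(x @ w + b) == y)
```

The CGRO phenomenon concerns deep networks; a linear model like this one cannot exhibit the memorization mechanism the paper analyzes, so the sketch only illustrates the training loop itself.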
Self-similarity-based super-resolution of photoacoustic angiography from hand-drawn doodles
Deep-learning-based super-resolution photoacoustic angiography (PAA) is a
powerful tool that restores blood vessel images from under-sampled images to
facilitate disease diagnosis. Nonetheless, due to the scarcity of training
samples, PAA super-resolution models often exhibit inadequate generalization
capabilities, particularly in the context of continuous monitoring tasks. To
address this challenge, we propose a novel approach that employs a
super-resolution PAA method trained with forged PAA images. We start by
generating realistic PAA images of human lips from hand-drawn curves using a
diffusion-based image generation model. Subsequently, we train a
self-similarity-based super-resolution model with these forged PAA images.
Experimental results show that our method outperforms the super-resolution
model trained with authentic PAA images in both original-domain and
cross-domain tests. Specifically, our approach boosts the quality of super-resolution reconstruction using images forged by the deep learning model, indicating that collaboration between deep learning models can facilitate generalization despite a limited initial dataset. This approach shows promising potential for exploring zero-shot learning neural networks for vision tasks. Comment: 12 pages, 6 figures, journal
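Because the method trains on forged images, the supervision itself is synthesized: each training pair comes from degrading a (possibly generated) high-resolution image. A minimal sketch of such pair construction, using block-average downsampling as a generic stand-in for the paper's actual under-sampling model:

```python
import numpy as np

def downsample(img, factor=2):
    """Block-average downsampling, a generic stand-in for under-sampled acquisition."""
    h, w = img.shape
    img = img[:h - h % factor, :w - w % factor]   # crop to a multiple of factor
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def make_training_pair(hr_img, factor=2):
    """Forge a (low-res, high-res) supervision pair from one (possibly synthetic) image."""
    return downsample(hr_img, factor), hr_img

hr = np.arange(64, dtype=float).reshape(8, 8)
lr_img, _ = make_training_pair(hr)
```

A super-resolution model is then trained to map `lr_img` back to `hr`; in the paper the high-resolution inputs are themselves produced by a diffusion model from hand-drawn curves.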
Exploring Deep Learning for deformative operators in vector-based cartographic road generalization
Cartographic generalisation is the process by which geographical data are simplified and abstracted to increase the legibility of maps at reduced scales. As map scales decrease, irrelevant map features are removed (selective generalisation), and relevant map features are deformed, eliminating unnecessary details while preserving the general shapes (deformative generalisation). The automation of cartographic generalisation has been a tough nut to crack for years because it is governed not only by explicit rules but also by a large body of implicit cartographic knowledge that conventional automation approaches struggle to acquire and formalise. In recent years, the introduction of Deep Learning (DL) and its inductive capabilities has raised hope for further progress. This thesis explores the potential of three Deep Learning architectures — Graph Convolutional Neural Network (GCNN), Auto Encoder, and Recurrent Neural Network (RNN) — in their application to the deformative generalisation of roads using a vector-based approach. The generated small-scale representations of the input roads differ substantially across the architectures, not only in their included frequency spectra but also in their ability to apply certain generalisation operators. However, the most apparent generalisation operator learnt and applied by all architectures is the smoothing of the large-scale roads. The outcome of this thesis is encouraging but suggests pursuing further research on the effect of the pre-processing of the input geometries, the inclusion of spatial context, and the combination of map features (e.g. buildings) to better capture the implicit knowledge ingrained in the products of the mapping agencies used for training the DL models.
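Of the generalisation operators mentioned, smoothing is the one the architectures most visibly learn. For intuition, a classical (non-learned) smoothing operator on a vector road can be as simple as iterated moving averages over the polyline vertices; this is an illustrative baseline, not the thesis's DL approach:

```python
import numpy as np

def smooth_polyline(points, iterations=3):
    """Iterated moving-average smoothing of a road polyline; the endpoints stay
    fixed so the road still connects to the rest of the network."""
    pts = np.array(points, dtype=float)          # copy, leave the input intact
    for _ in range(iterations):
        pts[1:-1] = (pts[:-2] + pts[1:-1] + pts[2:]) / 3.0
    return pts

road = np.array([[0, 0], [1, 1], [2, 0], [3, 1], [4, 0]], dtype=float)
smoothed = smooth_polyline(road)
```

Each pass damps the high-frequency zigzag while keeping the road's overall course, which matches the frequency-spectrum effect the thesis observes in the learned outputs.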
Exploit Where Optimizer Explores via Residuals
In order to train the neural networks faster, many efforts have been devoted
to exploring a better solution trajectory, but few have been put into
exploiting the existing solution trajectory. To exploit the trajectory of
(momentum) stochastic gradient descent (SGD(m)), we propose a novel
method named SGD(m) with residuals (RSGD(m)), which boosts both
convergence and generalization. Our new method can also be
applied to other optimizers such as ASGD and Adam. We provide theoretical
analysis to show that RSGD achieves a smaller growth rate of the generalization
error and the same (but empirically better) convergence rate compared with SGD.
Extensive deep learning experiments on image classification, language modeling
and graph convolutional neural networks show that the proposed algorithm is
faster than SGD(m)/Adam at the initial training stage, and similar to or better
than SGD(m) at the end of training, with better generalization error.
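For context, the SGD(m) baseline that RSGD(m) extends keeps a velocity that accumulates gradients along the solution trajectory; the residual mechanism itself is specific to the paper and is not reproduced here. A minimal sketch of the baseline step:

```python
import numpy as np

def sgd_momentum_step(theta, velocity, grad, lr=0.01, mu=0.9):
    """One SGD(m) step: the velocity accumulates past gradients along the
    solution trajectory. RSGD(m) additionally reuses ('exploits') information
    from past iterates, which is paper-specific and not reproduced here."""
    velocity = mu * velocity + grad
    return theta - lr * velocity, velocity

# minimize f(theta) = ||theta||^2 / 2, whose gradient is theta itself
theta, v = np.array([5.0, -3.0]), np.zeros(2)
for _ in range(300):
    theta, v = sgd_momentum_step(theta, v, theta)
```

The abstract's claim is that exploiting this same trajectory further, rather than only exploring along it, is what yields the improved generalization bound.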
Energy Confused Adversarial Metric Learning for Zero-Shot Image Retrieval and Clustering
Deep metric learning has been widely applied in many computer vision tasks,
and recently it has attracted growing attention in zero-shot image retrieval and
clustering (ZSRC), where a good embedding is required such that the unseen
classes can be distinguished well. Most existing works deem this 'good'
embedding just to be the discriminative one and thus race to devise powerful
metric objectives or hard-sample mining strategies for learning discriminative
embedding. However, in this paper, we first emphasize that the generalization
ability is a core ingredient of this 'good' embedding as well and largely
affects the metric performance in zero-shot settings as a matter of fact. Then,
we propose the Energy Confused Adversarial Metric Learning(ECAML) framework to
explicitly optimize a robust metric. It is mainly achieved by introducing an
interesting Energy Confusion regularization term, which daringly breaks away
from the traditional metric learning idea of discriminative objective devising,
and seeks to 'confuse' the learned model so as to encourage its generalization
ability by reducing overfitting on the seen classes. We train this confusion
term together with the conventional metric objective in an adversarial manner.
Although it seems weird to 'confuse' the network, we show that our ECAML indeed
serves as an efficient regularization technique for metric learning and is
applicable to various conventional metric methods. This paper empirically and
experimentally demonstrates the importance of learning embedding with good
generalization, achieving state-of-the-art performances on the popular CUB,
CARS, Stanford Online Products and In-Shop datasets for ZSRC tasks.
Code available at http://www.bhchen.cn/. Comment: AAAI 2019, Spotlight
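Schematically, the ECAML objective combines a conventional discriminative metric loss with a confusion penalty trained adversarially against it, i.e. roughly L = L_metric + λ · L_confuse. The sketch below uses a contrastive loss and a hypothetical centroid-distance stand-in for the Energy Confusion term (the paper's exact energy-based formulation differs):

```python
import numpy as np

def contrastive_metric_loss(emb, labels, margin=1.0):
    """Conventional discriminative term: pull same-class pairs together and
    push different-class pairs beyond a margin."""
    loss, n = 0.0, 0
    for i in range(len(emb)):
        for j in range(i + 1, len(emb)):
            d = np.linalg.norm(emb[i] - emb[j])
            loss += d**2 if labels[i] == labels[j] else max(0.0, margin - d)**2
            n += 1
    return loss / n

def confusion_term(emb, labels):
    """Hypothetical stand-in for the Energy Confusion regularizer: penalize the
    separation of two class centroids, 'confusing' the model so it does not
    overfit seen-class separability (assumes two classes for simplicity)."""
    classes = np.unique(labels)
    c0 = emb[labels == classes[0]].mean(axis=0)
    c1 = emb[labels == classes[1]].mean(axis=0)
    return float(np.linalg.norm(c0 - c1) ** 2)

def ecaml_style_objective(emb, labels, lam=0.1):
    # adversarial in spirit: the two terms pull the embedding in opposite directions
    return contrastive_metric_loss(emb, labels) + lam * confusion_term(emb, labels)

emb = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
labels = np.array([0, 0, 1, 1])
obj = ecaml_style_objective(emb, labels)
```

The opposing terms regularize the embedding: the metric loss drives separation of seen classes while the confusion term limits how far that separation can overfit, which is the paper's route to better zero-shot generalization.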