86,002 research outputs found

    Normalization and Generalization in Deep Learning

    Get PDF
    In this thesis, we discuss the importance of data normalization in deep learning and its relationship with generalization. Normalization is a staple of deep learning architectures and has been shown to improve the stability and generalizability of deep learning models, yet the reason why these normalization techniques work is still unknown and is an active area of research. Inspired by this uncertainty, we explore how different normalization techniques perform when employed in different deep learning architectures, while also examining generalization and the metrics associated with it alongside our investigation into normalization. The goal of our experiments was to investigate whether there are identifiable trends for the different normalization methods across an array of training schemes with respect to the various metrics employed. We found that class similarity was seemingly the strongest predictor of train accuracy, test accuracy, and generalization ratio across all employed metrics. Overall, BatchNorm and EvoNormB0 generally performed best on measures of test and train accuracy, while InstanceNorm and Plain performed the worst.
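    A minimal PyTorch sketch of the kind of comparison described above, swapping BatchNorm, InstanceNorm, or no normalization ("Plain") into a small CNN; the architecture, layer widths, and names are illustrative and not taken from the thesis.

    ```python
    # Minimal sketch (not the thesis code): swap normalization layers in a
    # small CNN to compare BatchNorm, InstanceNorm, and a "Plain" variant.
    import torch
    import torch.nn as nn

    def make_norm(kind: str, channels: int) -> nn.Module:
        if kind == "batch":
            return nn.BatchNorm2d(channels)
        if kind == "instance":
            return nn.InstanceNorm2d(channels)
        return nn.Identity()  # "Plain": no normalization

    class SmallCNN(nn.Module):
        def __init__(self, norm: str = "batch", num_classes: int = 10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), make_norm(norm, 32), nn.ReLU(),
                nn.Conv2d(32, 64, 3, padding=1), make_norm(norm, 64), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.classifier = nn.Linear(64, num_classes)

        def forward(self, x):
            return self.classifier(self.features(x).flatten(1))

    # Train one model per normalization scheme and compare train/test accuracy.
    models = {k: SmallCNN(norm=k) for k in ("batch", "instance", "plain")}
    ```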

    Why Clean Generalization and Robust Overfitting Both Happen in Adversarial Training

    Full text link
    Adversarial training is a standard method to train deep neural networks to be robust to adversarial perturbation. Similar to the surprising clean generalization ability in the standard deep learning setting, neural networks trained by adversarial training also generalize well on unseen clean data. However, in contrast with clean generalization, while adversarial training is able to achieve a low robust training error, there still exists a significant robust generalization gap, which prompts us to explore what mechanism leads to both clean generalization and robust overfitting (CGRO) during the learning process. In this paper, we provide a theoretical understanding of this CGRO phenomenon in adversarial training. First, we propose a theoretical framework of adversarial training, in which we analyze the feature learning process to explain how adversarial training drives the network learner into the CGRO regime. Specifically, we prove that, under our patch-structured dataset, the CNN model provably partially learns the true feature but exactly memorizes the spurious features from training-adversarial examples, which thus results in clean generalization and robust overfitting. Under more general data assumptions, we then show the efficiency of the CGRO classifier from the perspective of representation complexity. On the empirical side, to verify our theoretical analysis on real-world vision datasets, we investigate the dynamics of the loss landscape during training. Moreover, inspired by our experiments, we prove a robust generalization bound based on the global flatness of the loss landscape, which may be of independent interest. Comment: 27 pages, comments welcome
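    A minimal sketch of the standard PGD-based adversarial training loop that the paper analyzes, not the paper's own framework or code; the hyperparameters epsilon, alpha, and steps are illustrative.

    ```python
    # Sketch of standard PGD adversarial training (the general setting the
    # paper studies). Inner loop: maximize the loss over an L_inf ball;
    # outer step: minimize the loss on the adversarial examples.
    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, epsilon=8/255, alpha=2/255, steps=10):
        """Generate L_inf-bounded adversarial examples around x."""
        delta = torch.zeros_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            loss.backward()
            delta.data = (delta + alpha * delta.grad.sign()).clamp(-epsilon, epsilon)
            delta.data = (x + delta.data).clamp(0, 1) - x   # keep images valid
            delta.grad.zero_()
        return (x + delta).detach()

    def adversarial_training_step(model, optimizer, x, y):
        x_adv = pgd_attack(model, x, y)           # inner maximization
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)   # outer minimization
        loss.backward()
        optimizer.step()
        return loss.item()
    ```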

    Self-similarity-based super-resolution of photoacoustic angiography from hand-drawn doodles

    Full text link
    Deep-learning-based super-resolution photoacoustic angiography (PAA) is a powerful tool that restores blood vessel images from under-sampled images to facilitate disease diagnosis. Nonetheless, due to the scarcity of training samples, PAA super-resolution models often exhibit inadequate generalization capabilities, particularly in the context of continuous monitoring tasks. To address this challenge, we propose a novel approach that employs a super-resolution PAA method trained with forged PAA images. We start by generating realistic PAA images of human lips from hand-drawn curves using a diffusion-based image generation model. Subsequently, we train a self-similarity-based super-resolution model with these forged PAA images. Experimental results show that our method outperforms the super-resolution model trained with authentic PAA images in both original-domain and cross-domain tests. Notably, our approach boosts the quality of super-resolution reconstruction using the images forged by the deep learning model, indicating that collaboration between deep learning models can facilitate generalization despite a limited initial dataset. This approach shows promising potential for exploring zero-shot learning neural networks for vision tasks. Comment: 12 pages, 6 figures, journal
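    An illustrative sketch of the training pipeline described above: build low-/high-resolution pairs from the diffusion-forged images and fit a super-resolution model. The sr_model argument stands in for the authors' self-similarity-based network (assumed to upsample by the chosen factor); it and the loss choice are assumptions, not details given in the abstract.

    ```python
    # Sketch only: LR/HR pair construction and a simple training loop over
    # diffusion-forged PAA images. sr_model is a placeholder SR network.
    import torch.nn.functional as F

    def make_lr_hr_pairs(hr_images, scale=4):
        """Downsample forged high-resolution images to create LR/HR pairs."""
        lr = F.interpolate(hr_images, scale_factor=1.0 / scale,
                           mode="bicubic", align_corners=False)
        return lr, hr_images

    def train_sr(sr_model, optimizer, forged_hr_batches, scale=4):
        for hr in forged_hr_batches:             # batches of forged PAA images
            lr, hr = make_lr_hr_pairs(hr, scale)
            optimizer.zero_grad()
            loss = F.l1_loss(sr_model(lr), hr)   # simple reconstruction objective
            loss.backward()
            optimizer.step()
    ```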

    Exploring Deep Learning for deformative operators in vector-based cartographic road generalization

    Full text link
    Cartographic generalisation is the process by which geographical data is simplified and abstracted to increase the legibility of maps at reduced scales. As map scales decrease, irrelevant map features are removed (selective generalisation), and relevant map features are deformed, eliminating unnecessary details while preserving the general shapes (deformative generalisation). The automation of cartographic generalisation has been a tough nut to crack for years because it is governed not only by explicit rules but also by a large body of implicit cartographic knowledge that conventional automation approaches struggle to acquire and formalise. In recent years, the introduction of Deep Learning (DL) and its inductive capabilities has raised hope for further progress. This thesis explores the potential of three Deep Learning architectures, namely the Graph Convolutional Neural Network (GCNN), the Auto Encoder, and the Recurrent Neural Network (RNN), in their application to the deformative generalisation of roads using a vector-based approach. The generated small-scale representations of the input roads differ substantially across the architectures, not only in their included frequency spectra but also in their ability to apply certain generalisation operators. However, the most apparent generalisation operator learnt and applied by all architectures is the smoothing of the large-scale roads. The outcome of this thesis has been encouraging but suggests pursuing further research on the effect of pre-processing the input geometries, the inclusion of spatial context, and the combination of map features (e.g. buildings) to better capture the implicit knowledge ingrained in the products of the mapping agencies used for training the DL models.
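    An illustrative sketch (not the thesis code) of the vector-based setup: each road is a polyline of (x, y) vertices, and a sequence model such as the simple RNN below maps the large-scale polyline to a generalized small-scale counterpart. The hidden size and layer count are arbitrary.

    ```python
    # Sketch: an RNN over polyline coordinates that predicts a generalized
    # (e.g. smoothed) vertex sequence, trained against the mapping agency's
    # small-scale representation of the same road.
    import torch
    import torch.nn as nn

    class RoadGeneralizer(nn.Module):
        def __init__(self, hidden: int = 64):
            super().__init__()
            self.rnn = nn.GRU(input_size=2, hidden_size=hidden,
                              num_layers=2, batch_first=True)
            self.head = nn.Linear(hidden, 2)   # generalized (x, y) per vertex

        def forward(self, polyline):           # polyline: (batch, n_vertices, 2)
            out, _ = self.rnn(polyline)
            return self.head(out)

    model = RoadGeneralizer()
    pred = model(torch.randn(8, 50, 2))        # 8 roads, 50 vertices each
    ```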

    Exploit Where Optimizer Explores via Residuals

    Full text link
    In order to train neural networks faster, many efforts have been devoted to exploring a better solution trajectory, but few have been devoted to exploiting the existing solution trajectory. To exploit the trajectory of the (momentum) stochastic gradient descent (SGD(m)) method, we propose a novel method named SGD(m) with residuals (RSGD(m)), which boosts both convergence and generalization. Our new method can also be applied to other optimizers such as ASGD and Adam. We provide a theoretical analysis showing that RSGD achieves a smaller growth rate of the generalization error and the same (but empirically better) convergence rate compared with SGD. Extensive deep learning experiments on image classification, language modeling and graph convolutional neural networks show that the proposed algorithm is faster than SGD(m)/Adam at the initial training stage, and similar to or better than SGD(m) at the end of training, with better generalization error.
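    The exact RSGD(m) update is defined in the paper; the sketch below only illustrates the general idea of exploiting the existing trajectory, here by adding a small residual toward a running average of past iterates on top of an ordinary SGD-with-momentum step. The class name, hyperparameters, and update rule are assumptions for illustration, not the authors' algorithm.

    ```python
    # Illustrative only: NOT the paper's RSGD(m) update rule.
    import torch

    class TrajectoryResidualSGD(torch.optim.SGD):
        def __init__(self, params, lr=0.1, momentum=0.9,
                     residual=0.01, avg_decay=0.9):
            super().__init__(params, lr=lr, momentum=momentum)
            self.residual = residual
            self.avg_decay = avg_decay

        def step(self, closure=None):
            loss = super().step(closure)        # ordinary SGD(m) update
            with torch.no_grad():
                for group in self.param_groups:
                    for p in group["params"]:
                        state = self.state[p]
                        if "traj_avg" not in state:
                            state["traj_avg"] = p.detach().clone()
                        avg = state["traj_avg"]
                        # exponential average of past iterates (the "trajectory")
                        avg.mul_(self.avg_decay).add_(p, alpha=1 - self.avg_decay)
                        # residual step pulling parameters toward that average
                        p.add_(avg - p, alpha=self.residual)
            return loss
    ```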

    Energy Confused Adversarial Metric Learning for Zero-Shot Image Retrieval and Clustering

    Full text link
    Deep metric learning has been widely applied in many computer vision tasks, and recently it has become particularly attractive for zero-shot image retrieval and clustering (ZSRC), where a good embedding is required so that unseen classes can be distinguished well. Most existing works deem this 'good' embedding to be merely a discriminative one and thus race to devise powerful metric objectives or hard-sample mining strategies for learning discriminative embeddings. However, in this paper, we first emphasize that generalization ability is, in fact, also a core ingredient of this 'good' embedding and largely affects metric performance in zero-shot settings. Then, we propose the Energy Confused Adversarial Metric Learning (ECAML) framework to explicitly optimize a robust metric. It is mainly achieved by introducing an interesting Energy Confusion regularization term, which departs from the traditional metric learning idea of devising discriminative objectives and instead seeks to 'confuse' the learned model so as to encourage its generalization ability by reducing overfitting on the seen classes. We train this confusion term together with the conventional metric objective in an adversarial manner. Although it may seem counterintuitive to 'confuse' the network, we show that our ECAML indeed serves as an efficient regularization technique for metric learning and is applicable to various conventional metric methods. This paper empirically demonstrates the importance of learning embeddings with good generalization, achieving state-of-the-art performance on the popular CUB, CARS, Stanford Online Products and In-Shop datasets for ZSRC tasks. Code available at http://www.bhchen.cn/. Comment: AAAI 2019, Spotlight
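    A loose sketch of the general pattern, not the paper's exact Energy Confusion term: a conventional metric loss is trained jointly with a confusion regularizer that pulls differently-labelled embeddings together, so the model cannot rely purely on discriminative cues from the seen classes. The specific confusion term and weight below are assumptions for illustration.

    ```python
    # Sketch of "metric objective + confusion regularizer" training, where
    # minimizing the confusion term works against pure class separation.
    import torch

    def confusion_term(embeddings, labels):
        """Mean squared distance between embeddings from different classes."""
        dists = torch.cdist(embeddings, embeddings).pow(2)        # (B, B) pairwise distances
        diff_class = labels.unsqueeze(0) != labels.unsqueeze(1)   # cross-class pairs
        return dists[diff_class].mean()

    def confusion_regularized_loss(embeddings, labels, metric_loss_fn, lam=0.1):
        # metric_loss_fn: e.g. a triplet or contrastive loss over the batch
        return metric_loss_fn(embeddings, labels) + lam * confusion_term(embeddings, labels)
    ```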