
    Negative Sampling in Variational Autoencoders

    Modern deep artificial neural networks have achieved great success in the domain of computer vision and beyond. However, their application to many real-world tasks is undermined by certain limitations, such as overconfident uncertainty estimates on out-of-distribution data or performance deterioration under data distribution shifts. Several types of deep learning models used for density estimation through probabilistic generative modeling have been shown to fail to detect out-of-distribution samples, assigning higher likelihoods to anomalous data. We investigate this failure mode in Variational Autoencoder models and improve their out-of-distribution generalization by employing an alternative training scheme that utilizes negative samples. We present a fully unsupervised version: when the model is trained in an adversarial manner, the generator's own outputs can be used as negative samples. We demonstrate empirically that the approach reduces the overconfident likelihood estimates of out-of-distribution inputs on image data.
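    The failure mode described above can be made concrete with a small sketch. The following is a minimal, hypothetical PyTorch example (not code from the paper; architecture sizes, data, and the threshold are stand-ins) of scoring inputs with a per-example ELBO from a VAE and flagging low-scoring inputs as out-of-distribution; the problem the paper addresses is that this score can come out higher, not lower, on anomalous inputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    """A deliberately small VAE used only to illustrate ELBO-based OOD scoring."""
    def __init__(self, x_dim=784, z_dim=16):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs [mu, logvar]
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        return torch.sigmoid(self.dec(z)), mu, logvar

def elbo_score(model, x):
    """Per-example ELBO (a lower bound on log-likelihood); higher means 'more in-distribution'."""
    recon, mu, logvar = model(x)
    rec = -F.binary_cross_entropy(recon, x, reduction="none").sum(dim=1)
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=1)
    return rec - kl

# Hypothetical usage with random stand-in batches (pixel values in [0, 1]):
model = TinyVAE()
x_in, x_ood = torch.rand(8, 784), torch.rand(8, 784)
scores_in, scores_ood = elbo_score(model, x_in), elbo_score(model, x_ood)
flagged = scores_ood < (scores_in.mean() - 3 * scores_in.std())   # illustrative threshold
```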

    Gradient Regularization Improves Accuracy of Discriminative Models

    Regularizing the gradient norm of the output of a neural network is a powerful technique, rediscovered several times. This paper presents evidence that gradient regularization can consistently improve classification accuracy on vision tasks, using modern deep neural networks, especially when the amount of training data is small. We introduce our regularizers as members of a broader class of Jacobian-based regularizers. We demonstrate empirically, on real and synthetic data, that the learning process keeps gradients controlled beyond the training points and results in solutions that generalize well.
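    As a concrete illustration, here is a minimal sketch (assuming PyTorch; this is not the paper's code, and the penalty weight is a hypothetical value) of one common member of this Jacobian-based family: penalizing the squared norm of the gradient of the loss with respect to the input, alongside the usual cross-entropy objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_reg = 0.01   # regularization strength (hypothetical value)

def training_step(x, y):
    x = x.requires_grad_(True)                      # track gradients w.r.t. the input
    ce = F.cross_entropy(model(x), y)
    # Gradient of the loss w.r.t. the input, kept in the graph so that the
    # penalty term itself can be backpropagated through the model parameters.
    grad_x, = torch.autograd.grad(ce, x, create_graph=True)
    penalty = grad_x.pow(2).sum(dim=(1, 2, 3)).mean()
    loss = ce + lambda_reg * penalty
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage with random stand-in data:
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
print(training_step(x, y))
```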

    Negative Sampling in Variational Autoencoders

    We propose negative sampling as an approach to improve the notoriously bad out-of-distribution likelihood estimates of Variational Autoencoder models. Our model pushes the latent representations of negative samples away from the prior. When the source of negative samples is an auxiliary dataset, such a model can vastly improve on baselines when evaluated on OOD detection tasks. Perhaps more surprisingly, we present a fully unsupervised version of negative sampling in VAEs: when the generator is trained in an adversarial manner, using the generator's own outputs as negative samples can also significantly improve the robustness of OOD likelihood estimates.
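    A minimal sketch of what such an objective could look like, assuming PyTorch (this is an illustration, not the paper's exact loss): in-distribution data is trained with the usual ELBO, while the approximate posterior of each negative sample is pushed away from the prior through a hinged KL term. The margin and weight are hypothetical hyperparameters, and inputs are assumed to be flattened to shape (batch, features).

```python
import torch
import torch.nn.functional as F

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions
    return 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=1)

def negative_sampling_vae_loss(recon_pos, x_pos, mu_pos, logvar_pos,
                               mu_neg, logvar_neg, margin=5.0, beta_neg=1.0):
    # Standard ELBO terms for positive (in-distribution) samples
    recon_loss = F.binary_cross_entropy(recon_pos, x_pos, reduction="none").sum(dim=1).mean()
    kl_pos = kl_to_standard_normal(mu_pos, logvar_pos).mean()
    # Negative samples (auxiliary data, or the generator's own outputs in the
    # unsupervised variant): penalize their latents for sitting close to the
    # prior; the hinge keeps the push-away term bounded.
    kl_neg = kl_to_standard_normal(mu_neg, logvar_neg)
    push_away = F.relu(margin - kl_neg).mean()
    return recon_loss + kl_pos + beta_neg * push_away

# Quick shape check with random stand-in tensors:
b, d_x, d_z = 4, 784, 16
loss = negative_sampling_vae_loss(torch.rand(b, d_x), torch.rand(b, d_x),
                                  torch.randn(b, d_z), torch.randn(b, d_z),
                                  torch.randn(b, d_z), torch.randn(b, d_z))
```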

    The density of planar sets avoiding unit distances

    By improving upon previous estimates on a problem posed by L. Moser, we prove a conjecture of Erdős that the density of any measurable planar set avoiding unit distances cannot exceed $1/4$. Our argument implies the upper bound of $0.2470$. Comment: 24 pages, 6 figures. Final version, to appear in Mathematical Programming.
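    The claim can be restated compactly in LaTeX, following the abstract's description (the upper-density notation is illustrative and not taken from the paper itself):

```latex
% Restatement of the result described in the abstract; \overline{\delta}(A)
% denotes the upper density of the measurable set A (notation illustrative).
\[
  \lVert x - y \rVert \neq 1 \ \text{ for all } x, y \in A \subseteq \mathbb{R}^2
  \;\Longrightarrow\;
  \overline{\delta}(A) \le 0.2470 < \tfrac{1}{4}.
\]
```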

    Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

    We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors $\Theta_A$ and $\Theta_B$ of size $d$. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube $[0,1]^d$ and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property: two models re-based to a common third model are also linear mode connected, and a robustness property: even with significant perturbations of the neuron matchings, the resulting combinations continue to form a working model. Moreover, we analyze the functional and weight similarity of model combinations and show that such combinations are non-vacuous in the sense that there are significant functional differences between the resulting models.
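    A minimal sketch of the mechanics (assuming PyTorch; the model shapes and random mixing vector are illustrative, and $\Theta_B$ is assumed to have already been permutation-aligned to $\Theta_A$): form an element-wise convex combination of the two flattened parameter vectors with a per-coordinate mixing vector $\lambda \in [0,1]^d$ and load it back into a model.

```python
import torch
import torch.nn as nn
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def make_net():
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

model_a, model_b = make_net(), make_net()   # model_b assumed already permutation-aligned to model_a

theta_a = parameters_to_vector(model_a.parameters()).detach()
theta_b = parameters_to_vector(model_b.parameters()).detach()

d = theta_a.numel()
lam = torch.rand(d)                                  # one point of the hypercube [0, 1]^d
theta_mix = lam * theta_a + (1.0 - lam) * theta_b    # element-wise convex combination

combined = make_net()
vector_to_parameters(theta_mix, combined.parameters())
logits = combined(torch.randn(4, 10))                # evaluate the combined model on a random batch
```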

    Towards Finding Longer Proofs
