
    Negative Sampling in Variational Autoencoders

    Modern deep artificial neural networks have achieved great success in the domain of computer vision and beyond. However, their application to many real-world tasks is undermined by certain limitations, such as overconfident uncertainty estimates on out-of-distribution data or performance deterioration under data distribution shifts. Several types of deep learning models used for density estimation through probabilistic generative modeling have been shown to fail to detect out-of-distribution samples, assigning higher likelihoods to anomalous data. We investigate this failure mode in Variational Autoencoder models and improve their out-of-distribution generalization by employing an alternative training scheme that utilizes negative samples. We present a fully unsupervised version: when the model is trained in an adversarial manner, the generator's own outputs can be used as negative samples. We demonstrate empirically that the approach reduces the overconfident likelihood estimates of out-of-distribution inputs on image data.
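    The failure mode described above can be made concrete with a small sketch. The following is a minimal, hypothetical PyTorch example (not code from the paper; architecture sizes, data, and the threshold are stand-ins) of scoring inputs with a per-example ELBO from a VAE and flagging low-scoring inputs as out-of-distribution; the problem the paper addresses is that this score can come out higher, not lower, on anomalous inputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    """A deliberately small VAE used only to illustrate ELBO-based OOD scoring."""
    def __init__(self, x_dim=784, z_dim=16):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs [mu, logvar]
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        return torch.sigmoid(self.dec(z)), mu, logvar

def elbo_score(model, x):
    """Per-example ELBO (a lower bound on log-likelihood); higher means 'more in-distribution'."""
    recon, mu, logvar = model(x)
    rec = -F.binary_cross_entropy(recon, x, reduction="none").sum(dim=1)
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=1)
    return rec - kl

# Hypothetical usage with random stand-in batches (pixel values in [0, 1]):
model = TinyVAE()
x_in, x_ood = torch.rand(8, 784), torch.rand(8, 784)
scores_in, scores_ood = elbo_score(model, x_in), elbo_score(model, x_ood)
flagged = scores_ood < (scores_in.mean() - 3 * scores_in.std())   # illustrative threshold
```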

    Gradient Regularization Improves Accuracy of Discriminative Models

    Regularizing the gradient norm of the output of a neural network is a powerful technique, rediscovered several times. This paper presents evidence that gradient regularization can consistently improve classification accuracy on vision tasks, using modern deep neural networks, especially when the amount of training data is small. We introduce our regularizers as members of a broader class of Jacobian-based regularizers. We demonstrate empirically, on real and synthetic data, that the learning process keeps gradients controlled beyond the training points and results in solutions that generalize well.
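    As a concrete illustration, here is a minimal sketch (assuming PyTorch; this is not the paper's code, and the penalty weight is a hypothetical value) of one common member of this Jacobian-based family: penalizing the squared norm of the gradient of the loss with respect to the input, alongside the usual cross-entropy objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_reg = 0.01   # regularization strength (hypothetical value)

def training_step(x, y):
    x = x.requires_grad_(True)                      # track gradients w.r.t. the input
    ce = F.cross_entropy(model(x), y)
    # Gradient of the loss w.r.t. the input, kept in the graph so that the
    # penalty term itself can be backpropagated through the model parameters.
    grad_x, = torch.autograd.grad(ce, x, create_graph=True)
    penalty = grad_x.pow(2).sum(dim=(1, 2, 3)).mean()
    loss = ce + lambda_reg * penalty
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage with random stand-in data:
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
print(training_step(x, y))
```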

    Negative Sampling in Variational Autoencoders

    We propose negative sampling as an approach to improve the notoriously bad out-of-distribution likelihood estimates of Variational Autoencoder models. Our model pushes the latent representations of negative samples away from the prior. When the source of negative samples is an auxiliary dataset, such a model can vastly improve on baselines when evaluated on OOD detection tasks. Perhaps more surprisingly, we present a fully unsupervised version of negative sampling in VAEs: when the generator is trained in an adversarial manner, using the generator's own outputs as negative samples can also significantly improve the robustness of OOD likelihood estimates.
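    A minimal sketch of what such an objective could look like, assuming PyTorch (this is an illustration, not the paper's exact loss): in-distribution data is trained with the usual ELBO, while the approximate posterior of each negative sample is pushed away from the prior through a hinged KL term. The margin and weight are hypothetical hyperparameters, and inputs are assumed to be flattened to shape (batch, features).

```python
import torch
import torch.nn.functional as F

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions
    return 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=1)

def negative_sampling_vae_loss(recon_pos, x_pos, mu_pos, logvar_pos,
                               mu_neg, logvar_neg, margin=5.0, beta_neg=1.0):
    # Standard ELBO terms for positive (in-distribution) samples
    recon_loss = F.binary_cross_entropy(recon_pos, x_pos, reduction="none").sum(dim=1).mean()
    kl_pos = kl_to_standard_normal(mu_pos, logvar_pos).mean()
    # Negative samples (auxiliary data, or the generator's own outputs in the
    # unsupervised variant): penalize their latents for sitting close to the
    # prior; the hinge keeps the push-away term bounded.
    kl_neg = kl_to_standard_normal(mu_neg, logvar_neg)
    push_away = F.relu(margin - kl_neg).mean()
    return recon_loss + kl_pos + beta_neg * push_away

# Quick shape check with random stand-in tensors:
b, d_x, d_z = 4, 784, 16
loss = negative_sampling_vae_loss(torch.rand(b, d_x), torch.rand(b, d_x),
                                  torch.randn(b, d_z), torch.randn(b, d_z),
                                  torch.randn(b, d_z), torch.randn(b, d_z))
```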

    The density of planar sets avoiding unit distances

    By improving upon previous estimates on a problem posed by L. Moser, we prove a conjecture of Erdős that the density of any measurable planar set avoiding unit distances cannot exceed $1/4$. Our argument implies the upper bound of $0.2470$. Comment: 24 pages, 6 figures. Final version, to appear in Mathematical Programming.
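    The claim can be restated compactly in LaTeX, following the abstract's description (the upper-density notation is illustrative and not taken from the paper itself):

```latex
% Restatement of the result described in the abstract; \overline{\delta}(A)
% denotes the upper density of the measurable set A (notation illustrative).
\[
  \lVert x - y \rVert \neq 1 \ \text{ for all } x, y \in A \subseteq \mathbb{R}^2
  \;\Longrightarrow\;
  \overline{\delta}(A) \le 0.2470 < \tfrac{1}{4}.
\]
```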

    Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

    We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors $\Theta_A$ and $\Theta_B$ of size $d$. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube $[0,1]^d$ and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property: two models re-based to a common third model are also linear mode connected, and a robustness property: even with significant perturbations of the neuron matchings, the resulting combinations continue to form a working model. Moreover, we analyze the functional and weight similarity of model combinations and show that such combinations are non-vacuous in the sense that there are significant functional differences between the resulting models.
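    A minimal sketch of the mechanics (assuming PyTorch; the model shapes and random mixing vector are illustrative, and $\Theta_B$ is assumed to have already been permutation-aligned to $\Theta_A$): form an element-wise convex combination of the two flattened parameter vectors with a per-coordinate mixing vector $\lambda \in [0,1]^d$ and load it back into a model.

```python
import torch
import torch.nn as nn
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def make_net():
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

model_a, model_b = make_net(), make_net()   # model_b assumed already permutation-aligned to model_a

theta_a = parameters_to_vector(model_a.parameters()).detach()
theta_b = parameters_to_vector(model_b.parameters()).detach()

d = theta_a.numel()
lam = torch.rand(d)                                  # one point of the hypercube [0, 1]^d
theta_mix = lam * theta_a + (1.0 - lam) * theta_b    # element-wise convex combination

combined = make_net()
vector_to_parameters(theta_mix, combined.parameters())
logits = combined(torch.randn(4, 10))                # evaluate the combined model on a random batch
```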

    Towards Finding Longer Proofs
