Variance Loss in Variational Autoencoders
In this article, we highlight what appears to be a major issue of Variational
Autoencoders, evidenced by extensive experimentation with different network
architectures and datasets: the variance of generated data is significantly
lower than that of training data. Since generative models are usually evaluated
with metrics such as the Frechet Inception Distance (FID) that compare the
distributions of (features of) real versus generated images, the variance loss
typically results in degraded scores. This problem is particularly relevant in
a two stage setting, where we use a second VAE to sample in the latent space of
the first VAE. The reduced variance creates a mismatch between the actual
distribution of latent variables and the one generated by the second VAE, which
hinders the beneficial effects of the second stage. By renormalizing the output of
the second VAE towards the expected spherical normal distribution, we obtain a
marked improvement in the quality of generated samples, as also confirmed by
FID.
Comment: Article accepted at the Sixth International Conference on Machine
Learning, Optimization, and Data Science. July 19-23, 2020 - Certosa di
Pontignano, Siena, Ital
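The renormalization step described in this abstract can be illustrated with a minimal sketch. It assumes that the latent samples produced by the second-stage VAE are available as a NumPy array and that "renormalizing" means rescaling each latent dimension to zero mean and unit variance; the article may use a different scheme. The function name `renormalize_latents` and the synthetic data are hypothetical.

```python
import numpy as np

def renormalize_latents(z, eps=1e-8):
    """Rescale each latent dimension to zero mean and unit variance.

    Assumption: per-dimension standardization stands in for the
    renormalization towards a spherical normal distribution.
    """
    mean = z.mean(axis=0, keepdims=True)
    std = z.std(axis=0, keepdims=True)
    return (z - mean) / (std + eps)

rng = np.random.default_rng(0)
# Hypothetical second-stage output whose variance is too low (std 0.7, not 1).
z_generated = 0.7 * rng.standard_normal((10000, 16))
z_fixed = renormalize_latents(z_generated)
```

After the call, `z_fixed` has approximately unit variance in every dimension, so it matches the prior expected by the first-stage decoder more closely than `z_generated` does.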
Sparsity in Variational Autoencoders
Working in high-dimensional latent spaces, the internal encoding of data in
Variational Autoencoders becomes naturally sparse. We discuss this known but
controversial phenomenon, sometimes referred to as overpruning to emphasize the
under-use of the model capacity. In fact, it is an important form of
self-regularization, with all the typical benefits associated with sparsity: it
forces the model to focus on the really important features, greatly reducing the
risk of overfitting. In particular, it is a major methodological guide for the
correct tuning of the model capacity: progressively increasing it until sparsity
appears, or conversely reducing the size of the network by removing links to
zeroed out neurons. The degree of sparsity crucially depends on the network
architecture: for instance, convolutional networks typically show less
sparsity, likely due to the tighter relation of features to different spatial
regions of the input.
Comment: An Extended Abstract of this survey will be presented at the 1st
International Conference on Advances in Signal Processing and Artificial
Intelligence (ASPAI' 2019), 20-22 March 2019, Barcelona, Spai
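One common way to detect the sparsity discussed in this abstract is to look at the per-dimension KL divergence between the encoder's Gaussian posterior and the standard normal prior: a collapsed latent variable has a KL term close to zero for every input. The sketch below is an illustration under that assumption, not the procedure of the article; the function names, the threshold of 0.01, and the simulated encoder outputs are hypothetical.

```python
import numpy as np

def kl_per_dimension(mu, log_var):
    """Mean KL(N(mu, sigma^2) || N(0, 1)) for each latent dimension."""
    return 0.5 * (mu**2 + np.exp(log_var) - log_var - 1.0).mean(axis=0)

def inactive_dimensions(mu, log_var, threshold=1e-2):
    """Indices of latent dimensions whose mean KL falls below the threshold."""
    return np.flatnonzero(kl_per_dimension(mu, log_var) < threshold)

rng = np.random.default_rng(0)
n_samples, n_latent = 1000, 8
# Simulated encoder outputs: informative means with small variance ...
mu = rng.standard_normal((n_samples, n_latent))
log_var = np.full((n_samples, n_latent), -2.0)
# ... except for the last three dimensions, which collapse onto the prior.
mu[:, 5:] = 0.0
log_var[:, 5:] = 0.0

collapsed = inactive_dimensions(mu, log_var)
```

Here `collapsed` contains the indices 5, 6 and 7, the dimensions that carry no information about the input.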
The Effectiveness of Data Augmentation for Detection of Gastrointestinal Diseases from Endoscopical Images
The lack of large public databases of medical pathologies, due to privacy
concerns, is a well-known and major problem that substantially hinders the
application of deep learning techniques in this field. In this article, we
investigate whether the shortage of data can be compensated for
by means of data augmentation techniques, working on the recent Kvasir dataset
of endoscopical images of gastrointestinal diseases. The dataset comprises
4,000 colored images labeled and verified by medical endoscopists, covering a
few common pathologies at different anatomical landmarks: Z-line, pylorus and
cecum. We show that applying data augmentation techniques yields
appreciable improvements in classification with respect to previous
approaches, both in terms of precision and recall
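The abstract does not say which augmentation techniques were applied. As a minimal sketch, assuming simple geometric transformations (90-degree rotations and horizontal flips) on images stored as height x width x channel NumPy arrays, augmentation could look like the following; the function name `augment` and the toy image are hypothetical.

```python
import numpy as np

def augment(image):
    """Return rotated and flipped variants of an H x W x C image.

    Assumption: rotations by multiples of 90 degrees and horizontal
    flips are label-preserving for the task at hand.
    """
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k=k, axes=(0, 1))
        variants.append(rotated)
        variants.append(np.flip(rotated, axis=1))
    return variants

# Toy 2 x 3 image with 3 channels standing in for an endoscopic frame.
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)
augmented = augment(image)
```

Each original image yields eight variants, the first of which is the unchanged input.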
Smart matching
One of the most annoying aspects in the formalization of mathematics is the
need to transform notions to match a given, existing result. Such
transformations, often based on substantial background knowledge in the given
scientific domain (mostly expressed in the form of equalities or isomorphisms),
are usually implicit in the mathematical discourse, and it would be highly
desirable to obtain a similar behavior in interactive provers. The paper
describes the superposition-based implementation of this feature inside the
Matita interactive theorem prover, focusing in particular on the so-called
smart application tactic, supporting smart matching between a goal and a given
result.
Comment: To appear in The 9th International Conference on Mathematical
Knowledge Management: MKM 201
Crawling in Rogue's dungeons with (partitioned) A3C
Rogue is a famous dungeon-crawling video game of the 1980s, the ancestor of
its genre. Rogue-like games are known for the necessity to explore partially
observable and always different randomly-generated labyrinths, preventing any
form of level replay. As such, they serve as a very natural and challenging
task for reinforcement learning, requiring the acquisition of complex,
non-reactive behaviors involving memory and planning. In this article we show
how, exploiting a version of A3C partitioned on different situations, the agent
is able to reach the stairs and descend to the next level in 98% of cases.
Comment: Accepted at the Fourth International Conference on Machine Learning,
Optimization, and Data Science (LOD 2018
Variational Autoencoders and the Variable Collapse Phenomenon
In Variational Autoencoders, when working in high-dimensional latent spaces, there is a natural collapse of latent variables of minor significance, which are altogether neglected by the generator. We discuss this known but controversial phenomenon, sometimes referred to as overpruning to emphasize the under-use of the model capacity. In fact, it is an important form of self-regularization, with all the typical benefits associated with sparsity: it forces the model to focus on the really important features, enhancing their disentanglement and reducing the risk of overfitting. In this article, we discuss the issue, surveying past works, and particularly focusing on the exploitation of the variable collapse phenomenon as a methodological guideline for the correct tuning of the model capacity, and of the loss function parameters
A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions
The paper describes the refinement algorithm for the Calculus of
(Co)Inductive Constructions (CIC) implemented in the interactive theorem prover
Matita. The refinement algorithm is in charge of giving a meaning to the terms,
types and proof terms directly written by the user or generated by using
tactics, decision procedures or general automation. The terms are written in an
"external syntax" meant to be user-friendly, which allows omission of
information, untyped binders and a fairly liberal use of user-defined
sub-typing. The refiner modifies the terms to obtain related well-typed terms
in the internal syntax understood by the kernel of the ITP. In particular, it
acts as a type inference algorithm when all the binders are untyped. The
proposed algorithm is bi-directional: given a term in external syntax and a
type expected for the term, it propagates as much typing information as
possible towards the leaves of the term. Traditional mono-directional
algorithms, instead, proceed in a bottom-up way by inferring the type of a
sub-term and comparing (unifying) it with the type expected by its context only
at the end. We propose some novel bi-directional rules for CIC that are
particularly effective. Among the benefits of bi-directionality we have better
error message reporting and better inference of dependent types. Moreover,
thanks to bi-directionality, the coercion system for sub-typing is more
effective and type inference generates simpler unification problems that are
more likely to be solved by the inherently incomplete higher order unification
algorithms implemented. Finally, we introduce in the external syntax the notion
of a vector of placeholders, which makes it possible to omit an arbitrary number of
arguments at once. Vectors of placeholders allow a trivial implementation of implicit
arguments and greatly simplify the implementation of primitive and simple
tactics
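The idea of propagating an expected type towards the leaves of a term can be illustrated on a much smaller system than CIC. The sketch below is a bi-directional checker for the simply typed lambda calculus with untyped binders; it is an illustration of the general technique only, not the refinement algorithm implemented in Matita, and all class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:            # function type dom -> cod; base types are plain strings
    dom: object
    cod: object

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:              # untyped binder: fun param => body
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

@dataclass(frozen=True)
class Ann:              # user-supplied annotation: (term : ty)
    term: object
    ty: object

def infer(ctx, term):
    """Synthesize the type of a term bottom-up."""
    if isinstance(term, Var):
        return ctx[term.name]
    if isinstance(term, Ann):
        check(ctx, term.term, term.ty)
        return term.ty
    if isinstance(term, App):
        fn_type = infer(ctx, term.fn)
        if not isinstance(fn_type, Arrow):
            raise TypeError("applying a non-function")
        # The expected argument type is pushed down into the argument.
        check(ctx, term.arg, fn_type.dom)
        return fn_type.cod
    raise TypeError("cannot infer a type without an annotation")

def check(ctx, term, expected):
    """Check a term against an expected type, propagating it to the leaves."""
    if isinstance(term, Lam):
        if not isinstance(expected, Arrow):
            raise TypeError("lambda checked against a non-function type")
        # The binder is untyped: its type comes from the expected type.
        check({**ctx, term.param: expected.dom}, term.body, expected.cod)
        return
    actual = infer(ctx, term)
    if actual != expected:
        raise TypeError(f"expected {expected}, got {actual}")

identity = Lam("x", Var("x"))
nat_to_nat = Arrow("nat", "nat")
check({}, identity, nat_to_nat)   # succeeds: the type of x is propagated
result = infer({"n": "nat"}, App(Ann(identity, nat_to_nat), Var("n")))
```

In this sketch the untyped identity function cannot be typed by `infer` alone, but it is accepted by `check` once an expected type is available, which mirrors the benefit of propagating typing information downwards described in the abstract.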