Piecewise Latent Variables for Neural Variational Text Processing
Advances in neural variational inference have facilitated the learning of
powerful directed graphical models with continuous latent variables, such as
variational autoencoders. The hope is that such models will learn to represent
rich, multi-modal latent factors in real-world data, such as natural language
text. However, current models often assume simplistic priors on the latent
variables - such as the uni-modal Gaussian distribution - which are incapable
of representing complex latent factors efficiently. To overcome this
restriction, we propose the simple, but highly flexible, piecewise constant
distribution. This distribution has the capacity to represent an exponential
number of modes of a latent target distribution, while remaining mathematically
tractable. Our results demonstrate that incorporating this new latent
distribution into different models yields substantial improvements in natural
language processing tasks such as document modeling and natural language
generation for dialogue. Comment: 19 pages, 2 figures, 8 tables; EMNLP 201
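The piecewise constant distribution named in the abstract above can be illustrated in one dimension. The following is a minimal sketch, not the authors' implementation: it assumes the unit interval is split into equal-width pieces with positive weights, and the names `piecewise_density`, `piecewise_sample`, and the example weights are hypothetical.

```python
import random


def piecewise_density(z, weights):
    """Density at z in [0, 1) for equal-width pieces with positive weights."""
    n = len(weights)
    norm = sum(weights) / n          # integral of the unnormalized step function
    index = min(int(z * n), n - 1)   # index of the piece that contains z
    return weights[index] / norm


def piecewise_sample(weights, rng):
    """Pick a piece in proportion to its weight, then a uniform point inside it."""
    n = len(weights)
    index = rng.choices(range(n), weights=weights, k=1)[0]
    return (index + rng.random()) / n


# Hypothetical weights: pieces 1 and 3 are heavier, giving two separated modes.
weights = [1.0, 3.0, 1.0, 3.0]
rng = random.Random(0)
samples = [piecewise_sample(weights, rng) for _ in range(10000)]
```

In this sketch the two heavier pieces act as two modes. Stacking several independent dimensions multiplies the number of mode combinations, which is one way to read the abstract's claim about an exponential number of modes.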
Platonic model of mind as an approximation to neurodynamics
A hierarchy of approximations involved in the simplification of microscopic theories, from the sub-cellular to the whole-brain level, is presented. A new approximation to neural dynamics is described, leading to a Platonic-like model of mind based on psychological spaces. Objects and events in these spaces correspond to quasi-stable states of brain dynamics and may be interpreted from a psychological point of view. The Platonic model bridges the gap between neurosciences and psychological sciences. Static and dynamic versions of this model are outlined, and Feature Space Mapping, a neurofuzzy realization of the static version of the Platonic model, is described. Categorization experiments with human subjects are analyzed from the neurodynamical and Platonic model points of view.
A Transfer Principle: Universal Approximators Between Metric Spaces From Euclidean Universal Approximators
We build universal approximators of continuous maps between arbitrary Polish
metric spaces $\mathcal{X}$ and $\mathcal{Y}$ using universal approximators
between Euclidean spaces as building blocks. Earlier results assume that the
output space $\mathcal{Y}$ is a topological vector space. We overcome this
limitation by "randomization": our approximators output discrete probability
measures over $\mathcal{Y}$. When $\mathcal{X}$ and $\mathcal{Y}$ are Polish
without additional structure, we prove very general qualitative guarantees;
when they have suitable combinatorial structure, we prove quantitative
guarantees for Hölder-like maps, including maps between finite graphs,
solution operators to rough differential equations between certain Carnot
groups, and continuous non-linear operators between Banach spaces arising in
inverse problems. In particular, we show that the required number of Dirac
measures is determined by the combinatorial structure of $\mathcal{X}$ and
$\mathcal{Y}$. For barycentric $\mathcal{Y}$, including Banach spaces,
$\mathbb{R}$-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric
spaces, our approximators reduce to $\mathcal{Y}$-valued functions. When the
Euclidean approximators are neural networks, our constructions generalize
transformer networks, providing a new probabilistic viewpoint of geometric deep
learning. Comment: 14 Figures, 3 Tables, 78 Pages (Main 40, Proofs 26, Acknowledgments
and References 12)
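The idea of an approximator that outputs a discrete probability measure, and of collapsing that measure to a single point when the output space is barycentric, can be illustrated in the simplest Euclidean case. This is a rough sketch under strong assumptions, not the paper's construction: the output space is taken to be the plane, the weights come from a softmax over hypothetical scores, and the names `softmax`, `discrete_measure`, `euclidean_barycenter`, `anchors`, and `scores` are all illustrative.

```python
import math


def softmax(scores):
    """Turn real-valued scores into probability weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def discrete_measure(scores, anchors):
    """Represent a sum of weighted Dirac masses as (weight, point) pairs."""
    return list(zip(softmax(scores), anchors))


def euclidean_barycenter(measure):
    """In Euclidean space the barycenter is the weighted mean of the points."""
    dim = len(measure[0][1])
    return tuple(sum(w * p[d] for w, p in measure) for d in range(dim))


# Hypothetical anchor points and scores; in practice a network would emit the scores.
anchors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
scores = [0.0, 0.0, 0.0]
measure = discrete_measure(scores, anchors)
center = euclidean_barycenter(measure)
```

The weighted mean of anchor points under softmax weights has the same shape as an attention readout, which is consistent with the abstract's remark about transformer networks. In a non-Euclidean barycentric space, the weighted mean would be replaced by that space's own barycenter map.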
Deep Learning for Single Image Super-Resolution: A Brief Review
Single image super-resolution (SISR) is a notoriously challenging ill-posed
problem, which aims to obtain a high-resolution (HR) output from one of its
low-resolution (LR) versions. To solve the SISR problem, powerful deep
learning algorithms have recently been employed and have achieved state-of-the-art
performance. In this survey, we review representative deep learning-based SISR
methods, and group them into two categories according to their major
contributions to two essential aspects of SISR: the exploration of efficient
neural network architectures for SISR, and the development of effective
optimization objectives for deep SISR learning. For each category, a baseline
is first established and several critical limitations of the baseline are
summarized. Then representative works on overcoming these limitations are
presented based on their original contents as well as our critical
understandings and analyses, and relevant comparisons are conducted from a
variety of perspectives. Finally, we conclude this review with some vital
current challenges and future trends in SISR leveraging deep learning
algorithms. Comment: Accepted by IEEE Transactions on Multimedia (TMM)
A Theory of Networks for Approximation and Learning
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multi-dimensional function, that is, solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
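The abstract above describes networks that combine a small set of radial basis "prototypes" with coefficients found by a regularized fit. The following is a minimal sketch of that general idea, not the paper's method: it assumes one-dimensional inputs, Gaussian basis functions with a fixed width, fixed centers, and a plain ridge penalty standing in for the paper's regularizer. All function names and parameter values are hypothetical.

```python
import math


def gaussian(r, width):
    """Gaussian radial basis function of the distance r."""
    return math.exp(-(r / width) ** 2)


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(xs, ys, centers, width, lam):
    """Ridge-regularized least squares for the basis coefficients."""
    g = [[gaussian(abs(x - t), width) for t in centers] for x in xs]
    k = len(centers)
    gtg = [[sum(row[i] * row[j] for row in g) + (lam if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    gty = [sum(row[i] * y for row, y in zip(g, ys)) for i in range(k)]
    return solve(gtg, gty)


def predict(x, centers, coeffs, width):
    """Evaluate the fitted network: a weighted sum of basis functions."""
    return sum(c * gaussian(abs(x - t), width) for c, t in zip(coeffs, centers))


# Hypothetical example: 21 samples of one sine period, 7 centers (fewer than samples).
xs = [i / 20 for i in range(21)]
ys = [math.sin(2 * math.pi * x) for x in xs]
centers = [i / 6 for i in range(7)]
coeffs = fit(xs, ys, centers, width=0.25, lam=1e-6)
```

Using fewer centers than data points is what separates this kind of fit from strict interpolation, where every data point carries its own basis function. In this sketch the centers stay fixed and only the coefficients are fitted.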