Entropy of Overcomplete Kernel Dictionaries
In signal analysis and synthesis, linear approximation theory considers a
linear decomposition of any given signal in a set of atoms, collected into a
so-called dictionary. Relevant sparse representations are obtained by relaxing
the orthogonality condition of the atoms, yielding overcomplete dictionaries
with an extended number of atoms. More generally than the linear decomposition,
overcomplete kernel dictionaries provide an elegant nonlinear extension by
defining the atoms through a mapping kernel function (e.g., the Gaussian
kernel). Models based on such kernel dictionaries are used in neural networks,
Gaussian processes and online learning with kernels.
The quality of an overcomplete dictionary is evaluated with a diversity
measure, such as the distance, the approximation, the coherence or the Babel measure.
In this paper, we develop a framework to examine overcomplete kernel
dictionaries with the entropy from information theory. Indeed, a higher value
of the entropy is associated with a more uniform spread of the atoms over the
space. For each of the aforementioned diversity measures, we derive lower
bounds on the entropy. Several definitions of the entropy are examined, with an
extensive analysis in both the input space and the mapped feature space.
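To make two of the diversity measures named in this abstract concrete, the following is a minimal sketch that computes the coherence and a Babel-type measure of a small Gaussian kernel dictionary. It assumes the common definitions for unit-norm kernels (coherence as the largest off-diagonal kernel value, Babel as the largest off-diagonal row sum of the Gram matrix); the function names, the bandwidth parameter, and the example atoms are illustrative and are not taken from the paper.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel; kappa(x, x) = 1, so atoms are unit-norm in feature space."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def coherence(atoms, bandwidth=1.0):
    """Largest absolute kernel value between two distinct atoms (assumed definition)."""
    m = len(atoms)
    return max(
        abs(gaussian_kernel(atoms[i], atoms[j], bandwidth))
        for i in range(m) for j in range(m) if i != j
    )

def babel(atoms, bandwidth=1.0):
    """Largest off-diagonal row sum of the Gram matrix (assumed definition)."""
    m = len(atoms)
    return max(
        sum(abs(gaussian_kernel(atoms[i], atoms[j], bandwidth))
            for j in range(m) if j != i)
        for i in range(m)
    )

# Hypothetical dictionary of three atoms in the plane.
atoms = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
mu = coherence(atoms)
gamma = babel(atoms)
```

Under these definitions, atoms that are spread further apart give smaller values of both measures, which is the intuition the abstract connects to a higher entropy.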
Deep Divergence-Based Approach to Clustering
A promising direction in deep learning research consists in learning
representations and simultaneously discovering cluster structure in unlabeled
data by optimizing a discriminative loss function. As opposed to supervised
deep learning, this line of research is in its infancy, and how to design and
optimize suitable loss functions to train deep neural networks for clustering
is still an open question. Our contribution to this emerging field is a new
deep clustering network that leverages the discriminative power of
information-theoretic divergence measures, which have been shown to be
effective in traditional clustering. We propose a novel loss function that
incorporates geometric regularization constraints, thus avoiding degenerate
structures of the resulting clustering partition. Experiments on synthetic
benchmarks and real datasets show that the proposed network achieves
competitive performance with respect to other state-of-the-art methods, scales
well to large datasets, and does not require pre-training steps.
Probabilistic Neural Networks for Special Tasks in Electromagnetics
The thesis deals with behavioural modelling techniques capable of solving special tasks in electromagnetics that can be formulated as approximation, classification, probability density estimation, or combinatorial optimization problems. The concept of the work lies in applying a probabilistic approach to behavioural modelling. The examined methods address two general problems in machine learning and combinatorial optimization: the "bias vs. variance dilemma" and NP computational complexity. The Boltzmann machine is employed to simplify complex impedance networks. The Parzen window is regularized using the Bayesian strategy in order to obtain a general model selection criterion for probabilistic and general regression neural networks.
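As a rough illustration of the Parzen window and the general regression neural network mentioned in this abstract, here is a minimal one-dimensional sketch with a Gaussian window. The bandwidth is fixed by hand; the Bayesian selection criterion described in the thesis is not reproduced, and all names and data below are hypothetical.

```python
import math

def parzen_density(x, train_x, bandwidth=0.5):
    """Parzen window density estimate with a Gaussian window of fixed bandwidth."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * bandwidth)
    total = sum(
        norm * math.exp(-((x - xi) ** 2) / (2.0 * bandwidth ** 2))
        for xi in train_x
    )
    return total / len(train_x)

def grnn_predict(x, train_x, train_y, bandwidth=0.5):
    """General regression neural network output: a kernel-weighted average of targets."""
    weights = [
        math.exp(-((x - xi) ** 2) / (2.0 * bandwidth ** 2)) for xi in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: x is too far from the training data for this bandwidth.
        raise ValueError("query point too far from training data for this bandwidth")
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical training data.
xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 4.0]
smooth = grnn_predict(1.0, xs, ys, bandwidth=0.5)   # pulled toward neighbouring targets
sharp = grnn_predict(1.0, xs, ys, bandwidth=0.05)   # close to the target at x = 1
```

The bandwidth plays the role of the smoothing parameter: a small value follows the training samples closely (low bias, high variance), while a large value averages over many samples, which is one way to read the "bias vs. variance dilemma" in this setting.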
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
A central problem in machine learning involves modeling complex data-sets
using highly flexible families of probability distributions in which learning,
sampling, inference, and evaluation are still analytically or computationally
tractable. Here, we develop an approach that simultaneously achieves both
flexibility and tractability. The essential idea, inspired by non-equilibrium
statistical physics, is to systematically and slowly destroy structure in a
data distribution through an iterative forward diffusion process. We then learn
a reverse diffusion process that restores structure in data, yielding a highly
flexible and tractable generative model of the data. This approach allows us to
rapidly learn, sample from, and evaluate probabilities in deep generative
models with thousands of layers or time steps, as well as to compute
conditional and posterior probabilities under the learned model. We
additionally release an open source reference implementation of the algorithm.
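The forward half of the idea in this abstract can be sketched in a few lines. The example below assumes a Gaussian diffusion step of the form x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * noise with a constant, hand-picked beta; the learned reverse process is not shown, and the function name, schedule, and toy data are illustrative rather than the authors' reference implementation.

```python
import math
import random

def forward_diffusion(samples, betas, rng):
    """Apply one Gaussian diffusion step per beta, gradually destroying structure."""
    xs = list(samples)
    for beta in betas:
        keep = math.sqrt(1.0 - beta)   # shrink the signal
        noise = math.sqrt(beta)        # inject matching Gaussian noise
        xs = [keep * x + noise * rng.gauss(0.0, 1.0) for x in xs]
    return xs

rng = random.Random(0)
# Hypothetical bimodal data: two sharp modes at -3 and +3.
data = [rng.choice((-3.0, 3.0)) for _ in range(2000)]
diffused = forward_diffusion(data, [0.05] * 200, rng)

mean = sum(diffused) / len(diffused)
var = sum((x - mean) ** 2 for x in diffused) / len(diffused)
```

After enough steps the two modes are no longer distinguishable and the samples are close to a standard normal distribution, which is the structureless end point from which a reverse process would have to restore the data distribution.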
Using RBF nets in rubber industry process control
This paper describes the use of a radial basis function (RBF) neural network to approximate the process parameters for the extrusion of a rubber profile used in tyre production. After introducing the problem, we describe the RBF net algorithm and the modeling of the industrial problem. The algorithm shows good results even with only a few training samples. It turns out that the "curse of dimensionality" plays an important role in the model. The paper concludes with a discussion of possible systematic error influences and improvements.