
    Entropy of Overcomplete Kernel Dictionaries

    In signal analysis and synthesis, linear approximation theory considers a linear decomposition of any given signal over a set of atoms collected into a so-called dictionary. Relevant sparse representations are obtained by relaxing the orthogonality condition on the atoms, yielding overcomplete dictionaries with an extended number of atoms. Going beyond the linear decomposition, overcomplete kernel dictionaries provide an elegant nonlinear extension by defining the atoms through a kernel function associated with a nonlinear mapping (e.g., the Gaussian kernel). Models based on such kernel dictionaries are used in neural networks, Gaussian processes, and online learning with kernels. The quality of an overcomplete dictionary is evaluated with a diversity measure, such as the distance, the approximation, the coherence, and the Babel measures. In this paper, we develop a framework for examining overcomplete kernel dictionaries with the entropy from information theory. Indeed, a higher value of the entropy is associated with a more uniform spread of the atoms over the space. For each of the aforementioned diversity measures, we derive lower bounds on the entropy. Several definitions of the entropy are examined, with an extensive analysis in both the input space and the mapped feature space.
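
    As a minimal illustration of the quantities discussed above, the sketch below (assuming a Gaussian kernel; the function names are our own) computes the coherence of a kernel dictionary, i.e., the largest inner product between distinct unit-norm atoms in the feature space, together with a kernel-based estimate of the quadratic Rényi entropy, which grows as the atoms spread more uniformly.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma**2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def coherence(K):
    """Largest inner product between distinct unit-norm atoms in feature space."""
    d = np.sqrt(np.diag(K))
    Kn = K / np.outer(d, d)                     # normalize the mapped atoms
    return np.max(np.abs(Kn - np.eye(len(K))))  # ignore the unit diagonal

def renyi_entropy_estimate(K):
    """Quadratic Renyi entropy estimate, up to the kernel's normalization constant."""
    return -np.log(np.mean(K))

rng = np.random.default_rng(0)
atoms = rng.standard_normal((50, 3))            # toy dictionary of 50 atoms
K = gaussian_gram(atoms, sigma=1.0)
print(coherence(K), renyi_entropy_estimate(K))
```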

    Deep Divergence-Based Approach to Clustering

    A promising direction in deep learning research consists of learning representations and simultaneously discovering cluster structure in unlabeled data by optimizing a discriminative loss function. As opposed to supervised deep learning, this line of research is in its infancy, and how to design and optimize suitable loss functions to train deep neural networks for clustering is still an open question. Our contribution to this emerging field is a new deep clustering network that leverages the discriminative power of information-theoretic divergence measures, which have been shown to be effective in traditional clustering. We propose a novel loss function that incorporates geometric regularization constraints, thus avoiding degenerate structures in the resulting clustering partition. Experiments on synthetic benchmarks and real datasets show that the proposed network achieves competitive performance with respect to other state-of-the-art methods, scales well to large datasets, and does not require pre-training steps.
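
    As one concrete instance of such a divergence-based loss (a sketch under our own assumptions, not necessarily the paper's exact formulation), the Cauchy-Schwarz divergence between soft cluster assignments can be estimated from a kernel matrix over the learned representations; minimizing the pairwise cluster similarities below is equivalent to maximizing the divergence, pushing the clusters apart in feature space.

```python
import numpy as np

def cs_cluster_loss(A, K, eps=1e-9):
    """Sum of pairwise Cauchy-Schwarz similarities between soft clusters.

    A: (n, k) soft assignments (rows sum to 1), e.g., softmax outputs.
    K: (n, n) kernel matrix over the network's hidden representations.
    Minimizing this sum maximizes the CS divergence between cluster pairs.
    """
    M = A.T @ K @ A                   # M[i, j] = a_i^T K a_j
    k = A.shape[1]
    loss = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            loss += M[i, j] / np.sqrt(M[i, i] * M[j, j] + eps)
    return loss
```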

    Probabilistic Neural Networks for Special Tasks in Electromagnetics

    The thesis deals with behavioural modelling techniques capable of solving special tasks in electromagnetics which can be formulated as approximation, classification, probability-density estimation, and combinatorial optimization problems. The concept of the work lies in applying a probabilistic approach to behavioural modelling. The examined methods address two general problems in machine learning and combinatorial optimization: the "bias vs. variance" dilemma and NP computational complexity. The Boltzmann machine is employed to simplify a complex impedance network. The Parzen window is regularized using a Bayesian strategy to obtain a model-selection criterion for probabilistic and general regression neural networks.
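
    To make the Parzen-window part concrete, the following sketch estimates a probability density with Gaussian windows; the bandwidth h is fixed by hand here, whereas the thesis derives a Bayesian model-selection criterion for choosing it.

```python
import numpy as np

def parzen_density(x, samples, h):
    """Parzen-window estimate of the density at x with Gaussian kernel width h."""
    d = samples.shape[1]
    norm = (2.0 * np.pi * h ** 2) ** (-d / 2.0)
    sq = np.sum((samples - x) ** 2, axis=1)
    return np.mean(norm * np.exp(-sq / (2.0 * h ** 2)))

rng = np.random.default_rng(1)
train = rng.normal(0.0, 1.0, size=(200, 1))        # samples from N(0, 1)
print(parzen_density(np.zeros(1), train, h=0.3))   # close to 1/sqrt(2*pi) ~ 0.399
```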

    Deep Unsupervised Learning using Nonequilibrium Thermodynamics

    A central problem in machine learning involves modeling complex datasets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation remain analytically or computationally tractable. Here, we develop an approach that simultaneously achieves both flexibility and tractability. The essential idea, inspired by non-equilibrium statistical physics, is to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process. We then learn a reverse diffusion process that restores structure in the data, yielding a highly flexible and tractable generative model of the data. This approach allows us to rapidly learn, sample from, and evaluate probabilities in deep generative models with thousands of layers or time steps, as well as to compute conditional and posterior probabilities under the learned model. We additionally release an open-source reference implementation of the algorithm.
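
    A minimal sketch of the forward half of this idea (the learned reverse process is omitted): Gaussian noise is injected step by step according to a variance schedule, so the data distribution is slowly converted into an isotropic Gaussian.

```python
import numpy as np

def forward_diffusion(x0, betas, rng):
    """Iteratively corrupt data x0 with Gaussian noise.

    Each step applies x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise,
    so that after many steps x_T is close to an isotropic Gaussian.
    """
    x = x0.copy()
    trajectory = [x0]
    for beta in betas:
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)
        trajectory.append(x)
    return trajectory

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(1000, 2))   # structured data: a uniform square
betas = np.linspace(1e-4, 0.05, 1000)     # slowly increasing noise schedule
xs = forward_diffusion(x0, betas, rng)
print(np.std(xs[-1]))                     # ~1.0: the original structure is destroyed
```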

    Using RBF nets in rubber industry process control

    This paper describes the use of a radial basis function (RBF) neural network to approximate the process parameters for the extrusion of a rubber profile used in tyre production. After introducing the problem, we describe the RBF net algorithm and the modeling of the industrial problem. The algorithm shows good results even when using only a few training samples. It turns out that the "curse of dimensionality" plays an important role in the model. The paper concludes with a discussion of possible systematic error influences and improvements.
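
    A minimal sketch of the underlying technique on a toy 1-D problem (the industrial model is multi-dimensional, and the centers and widths here are fixed by hand rather than learned): Gaussian basis functions are placed on a grid and the output weights are fitted by least squares from only a few training samples.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Design matrix of Gaussian basis functions evaluated at the inputs X."""
    d2 = (X[:, None] - centers[None, :]) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, 15)                    # only a few training samples
y_train = np.sin(2 * np.pi * x_train) + 0.05 * rng.standard_normal(15)

centers = np.linspace(0, 1, 8)                     # fixed RBF centers on a grid
Phi = rbf_design(x_train, centers, width=0.1)
w, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)  # output weights by least squares

x_test = np.linspace(0, 1, 5)
print(rbf_design(x_test, centers, 0.1) @ w)        # approximated function values
```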