Entropy of Overcomplete Kernel Dictionaries
In signal analysis and synthesis, linear approximation theory considers a
linear decomposition of any given signal in a set of atoms, collected into a
so-called dictionary. Relevant sparse representations are obtained by relaxing
the orthogonality condition of the atoms, yielding overcomplete dictionaries
with an extended number of atoms. More generally than the linear decomposition,
overcomplete kernel dictionaries provide an elegant nonlinear extension by
defining the atoms through a mapping kernel function (e.g., the gaussian
kernel). Models based on such kernel dictionaries are used in neural networks,
gaussian processes and online learning with kernels.
The quality of an overcomplete dictionary is evaluated with a diversity
measure, such as the distance, the approximation, the coherence or the Babel measure.
In this paper, we develop a framework to examine overcomplete kernel
dictionaries with the entropy from information theory. Indeed, a higher value
of the entropy is associated with a more uniform spread of the atoms over the
space. For each of the aforementioned diversity measures, we derive lower
bounds on the entropy. Several definitions of the entropy are examined, with an
extensive analysis in both the input space and the mapped feature space.
Comment: 10 pages
Approximation errors of online sparsification criteria
Many machine learning frameworks, such as resource-allocating networks,
kernel-based methods, Gaussian processes, and radial-basis-function networks,
require a sparsification scheme in order to address the online learning
paradigm. For this purpose, several online sparsification criteria have been
proposed to restrict the model definition to a subset of samples. The
best-known criterion is the (linear) approximation criterion, which discards any
sample that can be well represented by the already contributing samples, an
operation with excessive computational complexity. Several computationally
efficient sparsification criteria have been introduced in the literature, such
as the distance, the coherence and the Babel criteria. In this paper, we
provide a framework that connects these sparsification criteria to the issue of
approximating samples, by deriving theoretical bounds on the approximation
errors. Moreover, we investigate the error of approximating any feature, by
proposing upper bounds on the approximation error for each of the
aforementioned sparsification criteria. Two classes of features are described
in detail: the empirical mean and the principal axes in kernel principal
component analysis.
Comment: 10 pages
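As an illustration of the computationally efficient criteria mentioned in this abstract, the following is a minimal sketch of online sparsification with the coherence criterion. It assumes a Gaussian kernel (unit-norm, since k(x, x) = 1) and the usual form of the rule, in which a new sample joins the dictionary only if its kernel value with every current atom stays at or below a threshold. The names `gaussian_kernel`, `build_dictionary`, `mu0` and `bandwidth`, as well as the sample data, are hypothetical and not taken from the paper.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)); k(x, x) = 1."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def build_dictionary(samples, mu0=0.5, bandwidth=1.0):
    """Process samples one at a time (online) and keep a sample only if
    its kernel value with every atom already kept is at most mu0."""
    dictionary = []
    for x in samples:
        if all(abs(gaussian_kernel(x, atom, bandwidth)) <= mu0
               for atom in dictionary):
            dictionary.append(x)
    return dictionary


# Illustrative data (an assumption): two pairs of near-duplicates and one outlier.
samples = [(0.0, 0.0), (0.1, 0.0), (2.0, 0.0), (2.0, 0.1), (0.0, 3.0)]
dictionary = build_dictionary(samples, mu0=0.5, bandwidth=1.0)
print(dictionary)
```

Each test costs one kernel evaluation per atom, which is why this kind of rule is cheaper than the approximation criterion, whose projection step requires solving a linear system in the Gram matrix.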
Analyzing sparse dictionaries for online learning with kernels
Many signal processing and machine learning methods share essentially the
same linear-in-the-parameters model, with as many parameters as available
samples, as in kernel-based machines. Sparse approximation is essential in many
disciplines, with new challenges emerging in online learning with kernels. To
this end, several sparsity measures have been proposed in the literature to
quantify sparse dictionaries and to construct relevant ones, the most prominent
being the distance, the approximation, the coherence and the Babel
measures. In this paper, we analyze sparse dictionaries based on these
measures. By conducting an eigenvalue analysis, we show that these sparsity
measures share many properties, including the linear independence condition and
inducing a well-posed optimization problem. Furthermore, we prove that there
exists a quasi-isometry between the parameter (i.e., dual) space and the
dictionary's induced feature space.
Comment: 10 pages
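To make three of the measures named in this abstract concrete, here is a minimal sketch that computes the distance, coherence and Babel measures of a small dictionary. It assumes a Gaussian kernel with unit-norm atoms and the commonly used definitions: smallest pairwise distance between atoms, largest off-diagonal absolute Gram entry, and largest off-diagonal absolute row sum of the Gram matrix. The function names and the example atoms are hypothetical. The final quantity, 1 minus the Babel measure, is a lower bound on the smallest Gram eigenvalue that follows from the standard Gershgorin disc argument; it is shown as one example of how such measures relate to linear independence, not as the paper's own derivation.

```python
import math
from itertools import combinations


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)); k(x, x) = 1."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def distance_measure(atoms):
    """Smallest Euclidean distance between two distinct atoms (input space)."""
    return min(math.dist(a, b) for a, b in combinations(atoms, 2))


def coherence(atoms, bandwidth=1.0):
    """Largest absolute kernel value between two distinct unit-norm atoms."""
    return max(abs(gaussian_kernel(a, b, bandwidth))
               for a, b in combinations(atoms, 2))


def babel(atoms, bandwidth=1.0):
    """Largest sum of absolute kernel values between one atom and all others."""
    return max(
        sum(abs(gaussian_kernel(a, b, bandwidth))
            for j, b in enumerate(atoms) if j != i)
        for i, a in enumerate(atoms)
    )


# Illustrative dictionary (an assumption, not data from the paper).
atoms = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
d = distance_measure(atoms)
mu = coherence(atoms)
beta = babel(atoms)
# Gershgorin discs are centred at k(x, x) = 1 with radius at most beta,
# so the smallest Gram eigenvalue is at least 1 - beta.
eig_lower_bound = 1.0 - beta
print(d, mu, beta, eig_lower_bound)
```

When `eig_lower_bound` is positive, the Gram matrix is positive definite and the atoms are linearly independent in the feature space.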
Online kernel adaptive algorithms with dictionary adaptation for MIMO models
Nonlinear system identification has always been a challenging problem. Kernel methods are increasingly used to solve such problems. However, the complexity of these methods increases with time, which makes them unsuitable for online identification. This drawback can be overcome by introducing the coherence criterion. Furthermore, dictionary adaptation using a stochastic gradient method has proved efficient. Most approaches, however, are used to identify single-output models, which form only a particular case of real problems. In this letter, we investigate online kernel adaptive algorithms to identify multiple-input multiple-output (MIMO) models, as well as the possibility of dictionary adaptation for such models.
Adaptive filtering using kernel methods (application to the control of an active magnetic bearing)
Function approximation based on reproducing kernel Hilbert spaces remains an active research topic for the identification of nonlinear systems and is of great importance in kernel-based regression. However, the order of the model is equal to the number of observations (input-output pairs), which makes this method inappropriate for online identification. To overcome this drawback, many sparsification methods have been proposed to control the order of the model; the coherence criterion is one of them. It has been shown that a subset of the most relevant past input vectors can be selected to form a small dictionary from which the model is identified.
A kernel function, once introduced into the dictionary, remains unchanged even if the non-stationarity of the system makes it less influential in estimating the current output of the model. This observation leads to the idea of adapting the elements of the dictionary in order to minimize the resulting instantaneous mean square error and/or to better control the order of the model.
The first part deals with adaptive algorithms using the coherence criterion. The adaptation of the dictionary elements using a stochastic gradient method is presented for two families of kernel functions. This part also covers the derivation of adaptive algorithms using the coherence criterion to identify multiple-output models.
The second part briefly introduces the active magnetic bearing (AMB). A method to control an AMB with an adaptive kernel algorithm is proposed to replace an existing method based on multilayer neural networks.