6 research outputs found

    Entropy of Overcomplete Kernel Dictionaries

    In signal analysis and synthesis, linear approximation theory considers a linear decomposition of any given signal in a set of atoms, collected into a so-called dictionary. Relevant sparse representations are obtained by relaxing the orthogonality condition of the atoms, yielding overcomplete dictionaries with an extended number of atoms. Going beyond linear decomposition, overcomplete kernel dictionaries provide an elegant nonlinear extension by defining the atoms through a mapping kernel function (e.g., the Gaussian kernel). Models based on such kernel dictionaries are used in neural networks, Gaussian processes and online learning with kernels. The quality of an overcomplete dictionary is evaluated with a diversity measure: the distance, the approximation, the coherence or the Babel measure. In this paper, we develop a framework to examine overcomplete kernel dictionaries with the entropy from information theory. Indeed, a higher value of the entropy is associated with a more uniform spread of the atoms over the space. For each of the aforementioned diversity measures, we derive lower bounds on the entropy. Several definitions of the entropy are examined, with an extensive analysis in both the input space and the mapped feature space. Comment: 10 pages

    Approximation errors of online sparsification criteria

    Many machine learning frameworks, such as resource-allocating networks, kernel-based methods, Gaussian processes, and radial-basis-function networks, require a sparsification scheme in order to address the online learning paradigm. For this purpose, several online sparsification criteria have been proposed to restrict the model definition to a subset of samples. The best-known criterion is the (linear) approximation criterion, which discards any sample that can be well represented by the already contributing samples, an operation with excessive computational complexity. Several computationally efficient sparsification criteria have been introduced in the literature, such as the distance, the coherence and the Babel criteria. In this paper, we provide a framework that connects these sparsification criteria to the issue of approximating samples, by deriving theoretical bounds on the approximation errors. Moreover, we investigate the error of approximating any feature, by proposing upper bounds on the approximation error for each of the aforementioned sparsification criteria. Two classes of features are described in detail: the empirical mean and the principal axes in kernel principal component analysis. Comment: 10 pages
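The four criteria named in this abstract can be illustrated on a small Gram matrix. The following is a minimal sketch and not the paper's own code: it assumes a Gaussian kernel, uses the plain (unscaled) feature-space distance between atoms, and all function names are hypothetical.

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Gram matrix of the Gaussian kernel for the rows of X (assumed kernel)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def coherence(K):
    """Largest absolute off-diagonal entry (unit-norm kernel assumed)."""
    off = np.abs(K - np.diag(np.diag(K)))
    return off.max()

def babel(K):
    """Largest row sum of absolute off-diagonal entries."""
    off = np.abs(K - np.diag(np.diag(K)))
    return off.sum(axis=1).max()

def min_distance(K):
    """Smallest feature-space distance between two distinct atoms (unscaled)."""
    d = np.diag(K)
    sq = d[:, None] + d[None, :] - 2.0 * K
    np.fill_diagonal(sq, np.inf)
    return np.sqrt(max(sq.min(), 0.0))

def min_approximation_error(K):
    """Smallest residual when projecting one atom onto the span of the others."""
    m = K.shape[0]
    errs = []
    for i in range(m):
        rest = np.delete(np.arange(m), i)
        k = K[rest, i]
        sq = K[i, i] - k @ np.linalg.solve(K[np.ix_(rest, rest)], k)
        errs.append(max(sq, 0.0))
    return np.sqrt(min(errs))

# Toy dictionary of three one-dimensional atoms.
X = np.array([[0.0], [1.0], [3.0]])
K = gaussian_gram(X, sigma=1.0)
mu = coherence(K)
gamma = babel(K)
dist = min_distance(K)
approx = min_approximation_error(K)
```

The approximation criterion needs a linear solve per atom, whereas the other three only read entries of the Gram matrix, which is the cost difference the abstract alludes to.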

    Analyzing sparse dictionaries for online learning with kernels

    Many signal processing and machine learning methods share essentially the same linear-in-the-parameters model, with as many parameters as available samples, as in kernel-based machines. Sparse approximation is essential in many disciplines, with new challenges emerging in online learning with kernels. To this end, several sparsity measures have been proposed in the literature to quantify sparse dictionaries and construct relevant ones, the most prolific being the distance, the approximation, the coherence and the Babel measures. In this paper, we analyze sparse dictionaries based on these measures. By conducting an eigenvalue analysis, we show that these sparsity measures share many properties, including the linear independence condition and the induction of a well-posed optimization problem. Furthermore, we prove that there exists a quasi-isometry between the parameter (i.e., dual) space and the dictionary's induced feature space. Comment: 10 pages
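One standard way to see why such measures imply linear independence is Gershgorin's circle theorem: for a unit-norm kernel, every eigenvalue of the Gram matrix lies within the Babel measure of 1. The sketch below checks this numerically. It assumes a Gaussian kernel and hypothetical variable names, and it does not reproduce the paper's own bounds.

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Gram matrix of the Gaussian kernel for the rows of X (assumed kernel)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

# Toy dictionary of four equally spaced one-dimensional atoms.
X = np.array([[0.0], [1.5], [3.0], [4.5]])
K = gaussian_gram(X, sigma=1.0)

# Diagonal entries equal 1 for the Gaussian kernel, so K - I is the off-diagonal part.
off = np.abs(K - np.eye(len(K)))
mu = off.max()                 # coherence
gamma = off.sum(axis=1).max()  # Babel measure

# Gershgorin: each eigenvalue lies in [1 - gamma, 1 + gamma].
eig = np.linalg.eigvalsh(K)
lower, upper = 1.0 - gamma, 1.0 + gamma
```

When `gamma` is below 1, the lower bound is positive, so the Gram matrix is nonsingular and the atoms are linearly independent in the feature space.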

    Online kernel adaptive algorithms with dictionary adaptation for MIMO models

    Nonlinear system identification has always been a challenging problem. The use of kernel methods to solve such problems is becoming more prevalent. However, the complexity of these methods increases with time, which makes them unsuitable for online identification. This drawback can be addressed by introducing the coherence criterion. Furthermore, dictionary adaptation using a stochastic gradient method has proved its efficiency. Most existing approaches, however, identify single-output models, which are only a particular case of real problems. In this letter, we investigate online kernel adaptive algorithms to identify multiple-input multiple-output (MIMO) models, as well as the possibility of dictionary adaptation for such models.
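As an illustration of the kind of algorithm this abstract refers to, here is a minimal sketch of a normalized-LMS kernel filter with a coherence-based dictionary and a vector-valued output. It is an assumption-laden example rather than the letter's algorithm: the Gaussian kernel, the class name `CoherenceKNLMS`, the parameter values and the toy system are all hypothetical, and dictionary adaptation is not included.

```python
import numpy as np

def gaussian_kernel(x, D, sigma):
    """Gaussian kernel between one input x (d,) and each row of D (m, d)."""
    return np.exp(-np.sum((D - x) ** 2, axis=1) / (2.0 * sigma ** 2))

class CoherenceKNLMS:
    """Hypothetical multi-output kernel NLMS with a shared dictionary."""

    def __init__(self, n_outputs, sigma=0.5, mu0=0.5, eta=0.5, eps=1e-3):
        self.n_outputs = n_outputs
        self.sigma, self.mu0, self.eta, self.eps = sigma, mu0, eta, eps
        self.D = None                       # dictionary atoms, shape (m, d)
        self.A = np.zeros((0, n_outputs))   # one coefficient row per atom

    def predict(self, x):
        if self.D is None:
            return np.zeros(self.n_outputs)
        return self.A.T @ gaussian_kernel(np.asarray(x, float), self.D, self.sigma)

    def update(self, x, y):
        x, y = np.asarray(x, float), np.asarray(y, float)
        new_row = np.zeros((1, self.n_outputs))
        if self.D is None:
            self.D, self.A = x[None, :], new_row
        elif np.max(gaussian_kernel(x, self.D, self.sigma)) <= self.mu0:
            # Coherence criterion: admit x only if it is not too similar to any atom.
            self.D = np.vstack([self.D, x])
            self.A = np.vstack([self.A, new_row])
        h = gaussian_kernel(x, self.D, self.sigma)
        e = y - self.A.T @ h                # a priori error, one entry per output
        self.A += (self.eta / (self.eps + h @ h)) * np.outer(h, e)
        return e

# Toy two-input, two-output nonlinear system (made up for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = np.column_stack([np.sin(3.0 * X[:, 0]), X[:, 0] * X[:, 1]])

model = CoherenceKNLMS(n_outputs=2)
for x, y in zip(X, Y):
    model.update(x, y)
```

Sharing one dictionary across all outputs keeps the model order governed by a single coherence threshold; only the coefficient matrix grows a column per output.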

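    Filtrage adaptatif à l'aide de méthodes à noyau (application au contrôle d'un palier magnétique actif) [Adaptive filtering using kernel methods (application to the control of an active magnetic bearing)]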

    Function approximation methods based on reproducing kernel Hilbert spaces are of great importance in kernel-based regression and remain an active research topic for nonlinear system identification. However, the order of the model is equal to the number of observations, which makes this method inappropriate for online identification. To overcome this drawback, many sparsification methods have been proposed to control the order of the model. The coherence criterion is one of these sparsification methods. It makes it possible to select a subset of the most relevant past input vectors to form a small dictionary with which to identify the model. A kernel function, once introduced into the dictionary, remains unchanged even if the non-stationarity of the system makes it less influential in estimating the output of the model. This observation leads to the idea of adapting the elements of the dictionary, with the objective of minimizing the resulting instantaneous squared error and/or better controlling the order of the model. The first part deals with adaptive algorithms using the coherence criterion. The adaptation of the elements of the dictionary using a stochastic gradient method is presented for two types of kernel functions. This part also covers the derivation of adaptive algorithms using the coherence criterion to identify multiple-output models. The second part briefly introduces the active magnetic bearing (AMB). A method to control an AMB with a kernel-based adaptive algorithm is proposed to replace an existing method using multilayer neural networks.
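The idea of adapting dictionary elements by stochastic gradient can be sketched for a single output. This is only an illustration under stated assumptions: a Gaussian kernel (the abstract does not say which two kernel families are used), fixed coefficients during the step, a hypothetical function name `adapt_dictionary`, and no check that the adapted dictionary still satisfies the coherence criterion.

```python
import numpy as np

def gaussian_kernel(x, D, sigma):
    """Gaussian kernel between one input x (d,) and each row of D (m, d)."""
    return np.exp(-np.sum((D - x) ** 2, axis=1) / (2.0 * sigma ** 2))

def adapt_dictionary(D, alpha, x, y, sigma, nu):
    """One gradient step on the atoms to reduce the instantaneous squared error."""
    h = gaussian_kernel(x, D, sigma)
    e = y - alpha @ h
    # Gradient of 0.5 * e**2 with respect to atom d_j is
    # -e * alpha_j * h_j * (x - d_j) / sigma**2, so we step the opposite way.
    step = (e * alpha * h)[:, None] * (x - D) / sigma ** 2
    return D + nu * step

# Toy setting: two one-dimensional atoms with fixed coefficients.
D = np.array([[0.0], [1.0]])
alpha = np.array([1.0, -0.5])
x, y = np.array([0.5]), 1.0

e_before = y - alpha @ gaussian_kernel(x, D, 1.0)
D_new = adapt_dictionary(D, alpha, x, y, sigma=1.0, nu=0.1)
e_after = y - alpha @ gaussian_kernel(x, D_new, 1.0)
```

In this toy case the atom with a positive coefficient moves toward the input and the atom with a negative coefficient moves away from it, both of which raise the prediction toward the target.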