In this paper we present new algorithms for training reduced-size nonlinear
representations in the Kernel Dictionary Learning (KDL) problem. Standard KDL
suffers from the large size of the kernel matrix when the data set is large.
There are several ways of reducing the kernel size, notably Nystr\"om
sampling. We propose here a method more in the spirit of dictionary learning,
in which the kernel vectors are obtained with a trained sparse representation
of the input signals. Moreover, we directly optimize the kernel vectors in the
KDL process, using gradient descent steps. We show on three data sets that our
algorithms provide better representations, despite using a small number of
kernel vectors, and also decrease the execution time with respect to standard
KDL.