End-to-End Kernel Learning with Supervised Convolutional Kernel Networks
In this paper, we introduce a new image representation based on a multilayer
kernel machine. Unlike traditional kernel methods where data representation is
decoupled from the prediction task, we learn how to shape the kernel with
supervision. We proceed by first proposing improvements of the
recently-introduced convolutional kernel networks (CKNs) in the context of
unsupervised learning; then, we derive backpropagation rules to take advantage
of labeled training data. The resulting model is a new type of convolutional
neural network, where optimizing the filters at each layer is equivalent to
learning a linear subspace in a reproducing kernel Hilbert space (RKHS). We
show that our method achieves reasonably competitive performance for image
classification on some standard "deep learning" datasets such as CIFAR-10 and
SVHN, and also for image super-resolution, demonstrating the applicability of
our approach to a large variety of image-related tasks.
Comment: to appear in Advances in Neural Information Processing Systems (NIPS)
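To make the construction concrete, below is a minimal sketch, not the authors' code, of one convolutional kernel layer in PyTorch: image patches are compared to learnable anchor filters through a Gaussian kernel, giving a finite-dimensional approximation of the RKHS map that can be trained by backpropagation. The class and parameter names are illustrative, and details of full CKNs such as the Gram-matrix correction and pooling are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CKNLayer(nn.Module):
    """One convolutional kernel layer: Gaussian-kernel similarities between
    image patches and learnable anchor filters (a subspace basis in the RKHS)."""
    def __init__(self, in_channels, n_filters, patch_size, sigma=0.6):
        super().__init__()
        self.filters = nn.Parameter(
            torch.randn(n_filters, in_channels, patch_size, patch_size))
        self.sigma = sigma

    def forward(self, x):
        # Normalize filters so the kernel depends only on the angle to each patch.
        w = F.normalize(self.filters.flatten(1), dim=1).view_as(self.filters)
        dots = F.conv2d(x, w)  # <patch, w_j> at every spatial location
        # Patch norms, computed by convolving the squared input with ones.
        ones = torch.ones_like(w[:1])
        norms = torch.sqrt(F.conv2d(x * x, ones).clamp_min(1e-8))
        # Gaussian kernel on the sphere: k(u, v) = exp((<u, v> - 1) / sigma^2),
        # rescaled by the patch norm as in kernel-network constructions.
        return norms * torch.exp((dots / norms - 1.0) / self.sigma ** 2)
```

Because the forward pass is differentiable in the filters, stacking such layers and training them on labels is what turns the unsupervised CKN into the supervised, end-to-end model the abstract describes.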
Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification
A stacking-based deep neural network (S-DNN) aggregates a series of basic learning modules, one after another, to synthesize a deep neural network (DNN) alternative for pattern classification. Contrary to DNNs trained end to end by backpropagation (BP), each S-DNN layer, i.e., a self-learnable module, is trained independently, without any BP intervention.
In this paper, a ridge regression-based S-DNN, dubbed the deep analytic network (DAN), and its kernelization (K-DAN) are devised for multilayer feature re-learning from the pre-extracted baseline features and the structured features. Our theoretical formulation demonstrates that DAN/K-DAN re-learn by perturbing the intra-/inter-class variations, in addition to diminishing the prediction errors. We scrutinize the DAN/K-DAN performance for pattern
classification on datasets of varying domains - faces, handwritten digits,
generic objects, to name a few. Unlike typical BP-optimized DNNs, which are trained on gigantic datasets with GPUs, we show that DAN/K-DAN are trainable using only a CPU, even on small-scale training sets. Our experimental results show that DAN/K-DAN outperform the present S-DNNs and also the BP-trained DNNs, including the multilayer perceptron, deep belief network, etc., without data augmentation applied.
Comment: 14 pages, 7 figures, 11 tables
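The layer-wise, BP-free training lends itself to a very short sketch. The version below is an illustrative reconstruction, assuming a tanh nonlinearity and a simple concatenation of baseline features with each layer's outputs, rather than the paper's exact design; what it does share with DAN is that every layer is solved in closed form by ridge regression, with no backpropagation between layers.

```python
import numpy as np

def ridge_fit(X, Y, lam=1e-2):
    # Closed-form ridge regression: W = (X^T X + lam * I)^{-1} X^T Y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def train_stacked(X, y, n_classes, n_layers=3, lam=1e-2):
    Y = np.eye(n_classes)[y]            # one-hot targets
    feats, weights = X, []
    for _ in range(n_layers):
        W = ridge_fit(feats, Y, lam)    # each layer trained independently
        weights.append(W)
        scores = np.tanh(feats @ W)     # re-learned features
        feats = np.hstack([X, scores])  # stack with the baseline features
    return weights

def predict_stacked(X, weights):
    feats = X
    for W in weights[:-1]:
        feats = np.hstack([X, np.tanh(feats @ W)])
    return np.argmax(feats @ weights[-1], axis=1)
```

Since each layer reduces to solving one regularized linear system, training cost scales with the feature dimension rather than with epochs of gradient descent, which is consistent with the CPU-only, small-sample training regime reported above.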
Data-Driven Representation Learning in Multimodal Feature Fusion
Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is at the core of achieving improved model robustness and inference performance. This dissertation focuses on representation learning approaches as the fusion strategy. Specifically, the objective is to learn a shared latent representation that jointly exploits the structural information encoded in all modalities, such that a straightforward learning model can be adopted to obtain the prediction.
We first consider sensor fusion, a typical multimodal fusion problem critical to building a pervasive computing platform. A systematic fusion technique is described to support both multiple sensors and descriptors for activity recognition. Targeting the optimal combination of kernels, Multiple Kernel Learning (MKL) algorithms have been successfully applied to numerous fusion problems in computer vision and beyond. Utilizing the MKL formulation, we next describe an auto-context algorithm for learning image context via fusion with low-level descriptors. Furthermore, a principled fusion algorithm that uses deep learning to optimize kernel machines is developed. By bridging deep architectures with kernel optimization, this approach leverages the benefits of both paradigms and is applied to a wide variety of fusion problems.
In many real-world applications, the modalities exhibit highly specific data structures, such as time sequences and graphs, so special design of the learning architecture is needed. To improve temporal modeling for multivariate sequences, we develop two architectures centered around attention models. A novel clinical time series analysis model is proposed for several critical problems in healthcare. Another model, coupled with a triplet ranking loss as a metric-learning framework, is described for speaker diarization. Compared to state-of-the-art recurrent networks, these attention-based multivariate analysis tools achieve improved performance at lower computational complexity. Finally, to perform community detection on multilayer graphs, a fusion algorithm is described that derives node embeddings from word embedding techniques and exploits the complementary relational information contained in each layer of the graph.
Doctoral Dissertation, Electrical Engineering
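As one concrete instance of the kernel-level fusion described above, the sketch below combines per-modality kernels with weights chosen by kernel-target alignment and feeds the fused kernel to a precomputed-kernel SVM. The alignment weighting is a common MKL heuristic assumed here for illustration; it is not claimed to be the dissertation's exact algorithm.

```python
import numpy as np
from sklearn.svm import SVC

def alignment_weights(kernels, y):
    """Weight each modality kernel by its alignment with the label kernel
    y y^T (y holds binary labels in {-1, +1})."""
    Y = np.outer(y, y).astype(float)
    scores = np.array([
        np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))
        for K in kernels])
    scores = np.clip(scores, 0.0, None)  # keep the combination convex
    return scores / scores.sum()

def fuse_and_train(kernels, y, C=1.0):
    # Fused kernel K = sum_m beta_m K_m, then a standard SVM on top.
    beta = alignment_weights(kernels, y)
    K = sum(b * Km for b, Km in zip(beta, kernels))
    clf = SVC(kernel="precomputed", C=C).fit(K, y)
    return clf, beta
```

The same pattern extends to the deep variant mentioned above by replacing the fixed alignment weights with kernel parameters optimized through a deep architecture.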
Neural Generalization of Multiple Kernel Learning
Multiple Kernel Learning (MKL) is a conventional way to learn the kernel function in kernel-based methods, and MKL algorithms enhance the performance of those methods. However, they have lower model complexity than deep learning models and fall short of them in recognition accuracy. Deep learning models can learn complex functions by applying
nonlinear transformations to data through several layers. In this paper, we
show that a typical MKL algorithm can be interpreted as a one-layer neural
network with linear activation functions. By this interpretation, we propose a
Neural Generalization of Multiple Kernel Learning (NGMKL), which extends the
conventional multiple kernel learning framework to a multi-layer neural network
with nonlinear activation functions. Our experiments on several benchmarks show
that the proposed method increases the representational capacity of MKL algorithms and leads to higher recognition accuracy.
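The one-layer interpretation is easy to state in code, and the sketch below instantiates it under assumed details (the two-layer shape, hidden width, and ReLU choice are illustrative): standard MKL computes k(x, z) = sum_i w_i k_i(x, z), a single linear map over the base-kernel evaluations, while the generalization stacks nonlinear layers on the same inputs.

```python
import torch
import torch.nn as nn

class NeuralMKL(nn.Module):
    """Multilayer, nonlinear combination of base-kernel evaluations.
    With the ReLU removed and a single linear layer, this reduces to
    ordinary MKL: k(x, z) = w^T [k_1(x, z), ..., k_M(x, z)]."""
    def __init__(self, n_kernels, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_kernels, hidden),  # mixes base kernels, like MKL weights
            nn.ReLU(),                     # the nonlinearity plain MKL lacks
            nn.Linear(hidden, 1),
        )

    def forward(self, base_kernel_values):
        # base_kernel_values: (..., n_kernels) tensor of k_i(x, z) evaluations
        return self.net(base_kernel_values).squeeze(-1)
```

Training the combination weights with backpropagation, jointly with whatever predictor consumes the learned kernel, is what gives the multilayer version its extra capacity over the linear mixture.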
Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition
Good old on-line back-propagation for plain multi-layer perceptrons yields a
very low 0.35% error rate on the famous MNIST handwritten digits benchmark. All
we need to achieve this best result so far are many hidden layers, many neurons
per layer, numerous deformed training images, and graphics cards to greatly
speed up learning.
Comment: 14 pages, 2 figures, 4 listings
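A minimal sketch of the recipe in a modern framework; the layer widths and optimizer settings are illustrative assumptions, not the paper's exact values.

```python
import torch
import torch.nn as nn

def big_mlp(sizes=(784, 2500, 2000, 1500, 1000, 500, 10)):
    # Plain deep multi-layer perceptron: many hidden layers, many neurons each.
    layers = []
    for d_in, d_out in zip(sizes, sizes[1:]):
        layers += [nn.Linear(d_in, d_out), nn.Tanh()]
    return nn.Sequential(*layers[:-1])  # no activation after the output layer

model = big_mlp()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)  # plain online backprop
loss_fn = nn.CrossEntropyLoss()
# Per the abstract, each pass would also apply random affine/elastic
# deformations to the training digits before the forward/backward step.
```

Nothing beyond standard supervised training is involved; the claim is that scale (depth, width, deformed data, GPU throughput) alone suffices for the reported error rate.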