End-to-End Kernel Learning with Supervised Convolutional Kernel Networks
In this paper, we introduce a new image representation based on a multilayer
kernel machine. Unlike traditional kernel methods where data representation is
decoupled from the prediction task, we learn how to shape the kernel with
supervision. We proceed by first proposing improvements of the
recently-introduced convolutional kernel networks (CKNs) in the context of
unsupervised learning; then, we derive backpropagation rules to take advantage
of labeled training data. The resulting model is a new type of convolutional
neural network, where optimizing the filters at each layer is equivalent to
learning a linear subspace in a reproducing kernel Hilbert space (RKHS). We
show that our method achieves reasonably competitive performance for image
classification on some standard "deep learning" datasets such as CIFAR-10 and
SVHN, and also for image super-resolution, demonstrating the applicability of
our approach to a large variety of image-related tasks.
Comment: to appear in Advances in Neural Information Processing Systems (NIPS
SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
Computer vision is experiencing an AI renaissance, in which machine learning
models are expediting important breakthroughs in academic research and
commercial applications. Effectively training these models, however, is not
trivial due in part to hyperparameters: user-configured values that control a
model's ability to learn from data. Existing hyperparameter optimization
methods are highly parallel but make no effort to balance the search across
heterogeneous hardware or to prioritize searching high-impact spaces. In this
paper, we introduce a framework for massively Scalable Hardware-Aware
Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the
relative complexity of each search space and monitors performance on the
learning task over all trials. These metrics are then used as heuristics to
assign hyperparameters to distributed workers based on their hardware. We first
demonstrate that our framework achieves double the throughput of a standard
distributed hyperparameter optimization framework by optimizing SVM for MNIST
using 150 distributed workers. We then conduct model search with SHADHO over
the course of one week using 74 GPUs across two compute clusters to optimize
U-Net for a cell segmentation task, discovering 515 models that achieve a lower
validation loss than standard U-Net.
Comment: 10 pages, 6 figures
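The scheduling idea in this abstract (rank search spaces by a complexity and performance heuristic, then match them to hardware) can be illustrated with a minimal sketch. This is not SHADHO's actual API or algorithm: the `SearchSpace` and `Worker` classes, the `complexity * priority` score, the `speed` rating, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchSpace:
    name: str
    complexity: float  # hypothetical: larger means more to explore
    priority: float    # hypothetical: larger means past trials performed better


@dataclass
class Worker:
    name: str
    speed: float  # hypothetical relative throughput of the hardware


def assign(spaces, workers):
    """Give the highest-scoring search spaces to the fastest workers."""
    ranked_spaces = sorted(
        spaces, key=lambda s: s.complexity * s.priority, reverse=True
    )
    ranked_workers = sorted(workers, key=lambda w: w.speed, reverse=True)
    assignment = {}
    for i, worker in enumerate(ranked_workers):
        # Wrap around if there are more workers than search spaces.
        space = ranked_spaces[i % len(ranked_spaces)]
        assignment[worker.name] = space.name
    return assignment


# Illustrative values only; they are not taken from the paper.
spaces = [
    SearchSpace("svm_rbf", complexity=3.0, priority=0.9),
    SearchSpace("svm_linear", complexity=1.0, priority=0.5),
    SearchSpace("svm_poly", complexity=4.0, priority=0.4),
]
workers = [
    Worker("cpu_small", speed=1.0),
    Worker("gpu_node", speed=10.0),
    Worker("cpu_big", speed=4.0),
]
assignment = assign(spaces, workers)
print(assignment)
```

In this toy setup the fastest worker receives the space with the largest combined score; a real scheduler would also update the priorities as trial results arrive.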
Machine learning-guided directed evolution for protein engineering
Machine learning (ML)-guided directed evolution is a new paradigm for
biological design that enables optimization of complex functions. ML methods
use data to predict how sequence maps to function without requiring a detailed
model of the underlying physics or biological pathways. To demonstrate
ML-guided directed evolution, we introduce the steps required to build ML
sequence-function models and use them to guide engineering, making
recommendations at each stage. This review covers basic concepts relevant to
using ML for protein engineering as well as the current literature and
applications of this new engineering paradigm. ML methods accelerate directed
evolution by learning from information contained in all measured variants and
using that information to select sequences that are likely to be improved. We
then provide two case studies that demonstrate the ML-guided directed evolution
process. We also look to future opportunities where ML will enable discovery of
new protein functions and uncover the relationship between protein sequence and
function.
Comment: Made significant revisions to focus on aspects most relevant to
applying machine learning to speed up directed evolution
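One round of the loop described in this abstract (fit a sequence-function model on measured variants, then pick unmeasured sequences predicted to be improved) might look like the toy sketch below. The additive per-site model, the function names, the two-letter alphabet, and the fitness values are all assumptions made for illustration; the review itself discusses a range of model choices and does not prescribe this one.

```python
from collections import defaultdict
from itertools import product


def fit_additive_model(measured):
    """Map each (position, residue) to the mean fitness of variants carrying it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for seq, fitness in measured.items():
        for pos, residue in enumerate(seq):
            totals[(pos, residue)] += fitness
            counts[(pos, residue)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def predict(model, seq, default):
    """Average the per-site estimates; unseen (position, residue) pairs use default."""
    scores = [model.get((pos, res), default) for pos, res in enumerate(seq)]
    return sum(scores) / len(scores)


def select_next_round(model, candidates, measured, k):
    """Return the k unmeasured candidates with the highest predicted fitness."""
    default = sum(measured.values()) / len(measured)
    unmeasured = [c for c in candidates if c not in measured]
    ranked = sorted(unmeasured, key=lambda s: predict(model, s, default), reverse=True)
    return ranked[:k]


# Hypothetical measurements: the parent "AAA" and its three single mutants.
measured = {"AAA": 1.0, "GAA": 2.0, "AGA": 1.5, "AAG": 0.5}
candidates = ["".join(p) for p in product("AG", repeat=3)]

model = fit_additive_model(measured)
next_round = select_next_round(model, candidates, measured, k=2)
print(next_round)
```

The selected variants would then be measured experimentally and added to `measured` before refitting, which is how the model learns from all variants rather than only the best one.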
Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets
Bayesian optimization has become a successful tool for hyperparameter
optimization of machine learning algorithms, such as support vector machines or
deep neural networks. Despite its success, for large datasets, training and
validating a single configuration often takes hours, days, or even weeks, which
limits the achievable performance. To accelerate hyperparameter optimization,
we propose a generative model for the validation error as a function of
training set size, which is learned during the optimization process and allows
exploration of preliminary configurations on small subsets, by extrapolating to
the full dataset. We construct a Bayesian optimization procedure, dubbed
Fabolas, which models loss and training time as a function of dataset size and
automatically trades off high information gain about the global optimum against
computational cost. Experiments optimizing support vector machines and deep
neural networks show that Fabolas often finds high-quality solutions 10 to 100
times faster than other state-of-the-art Bayesian optimization methods or the
recently proposed bandit strategy Hyperband.
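The idea of evaluating a configuration on small subsets and extrapolating to the full dataset can be sketched with a simple curve fit. This is not the Fabolas method, which according to the abstract uses a generative model and an information-gain criterion; the sketch below swaps in an ordinary least-squares fit of a power law, err(s) = a * s^(-b), and the subset sizes and error values are synthetic, generated from an assumed curve rather than measured.

```python
import math


def fit_power_law(sizes, errors):
    """Fit log(err) = log(a) - b * log(size) by ordinary least squares."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def extrapolate(a, b, size):
    """Predicted validation error at a training set of the given size."""
    return a * size ** (-b)


# Synthetic "measurements" drawn from an assumed power law, for illustration only.
true_a, true_b = 2.0, 0.35
subset_sizes = [500, 1000, 2000, 4000]
subset_errors = [true_a * s ** (-true_b) for s in subset_sizes]

a, b = fit_power_law(subset_sizes, subset_errors)
full_error = extrapolate(a, b, 50000)
print(f"a={a:.3f}, b={b:.3f}, predicted full-data error={full_error:.4f}")
```

Training on the four small subsets is far cheaper than one run on all 50,000 points, which is the cost trade-off the abstract describes; real validation curves are noisy, so a point estimate like this one would need an uncertainty model to be trusted.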