Neural Network Memory Architectures for Autonomous Robot Navigation
This paper highlights the significance of including memory structures in
neural networks when the latter are used to learn perception-action loops for
autonomous robot navigation. Traditional navigation approaches rely on global
maps of the environment to overcome cul-de-sacs and plan feasible motions. Yet,
maintaining an accurate global map may be challenging in real-world settings. A
possible way to mitigate this limitation is to use learning techniques that
forgo hand-engineered map representations and infer appropriate control
responses directly from sensed information. An important but unexplored aspect
of such approaches is the effect of memory on their performance. This work is a
first thorough study of memory structures for deep-neural-network-based robot
navigation, and offers novel tools to train such networks from supervision and
quantify their ability to generalize to unseen scenarios. We analyze the
separation and generalization abilities of feedforward, long short-term memory,
and differentiable neural computer networks. We introduce a new method to
evaluate the generalization ability by estimating the VC-dimension of networks
with a final linear readout layer. We validate that the VC estimates are good
predictors of actual test performance. The reported method can be applied to
deep learning problems beyond robotics.
Distributed Training Large-Scale Deep Architectures
Scale of data and scale of computation infrastructures together enable the
current deep learning renaissance. However, training large-scale deep
architectures demands both algorithmic improvement and careful system
configuration. In this paper, we focus on employing the system approach to
speed up large-scale training. Via lessons learned from our routine
benchmarking effort, we first identify bottlenecks and overheads that hinder
data parallelism. We then devise guidelines that help practitioners to
configure an effective system and fine-tune parameters to achieve desired
speedup. Specifically, we develop a procedure for setting minibatch size and
choosing computation algorithms. We also derive lemmas for determining the
quantity of key components such as the number of GPUs and parameter servers.
Experiments and examples show that these guidelines help effectively speed up
large-scale deep learning training.
High-performance Kernel Machines with Implicit Distributed Optimization and Randomization
In order to fully utilize "big data", it is often required to use "big
models". Such models tend to grow with the complexity and size of the training
data, and do not make strong parametric assumptions upfront on the nature of
the underlying statistical dependencies. Kernel methods fit this need well, as
they constitute a versatile and principled statistical methodology for solving
a wide range of non-parametric modelling problems. However, their high
computational costs (in storage and time) pose a significant barrier to their
widespread adoption in big data applications.
We propose an algorithmic framework and high-performance implementation for
massive-scale training of kernel-based statistical models, based on combining
two key technical ingredients: (i) distributed general purpose convex
optimization, and (ii) the use of randomization to improve the scalability of
kernel methods. Our approach is based on a block-splitting variant of the
Alternating Directions Method of Multipliers, carefully reconfigured to handle
very large random feature matrices, while exploiting hybrid parallelism
typically found in modern clusters of multicore machines. Our implementation
supports a variety of statistical learning tasks by enabling several loss
functions, regularization schemes, kernels, and layers of randomized
approximations for both dense and sparse datasets, in a highly extensible
framework. We evaluate the ability of our framework to learn models on data
from applications, and provide a comparison against existing sequential and
parallel libraries.
Comment: Work presented at MMDS 2014 (June 2014) and JSM 201
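To make the "randomization" ingredient concrete, here is a minimal sketch of one widely used scheme, random Fourier features for the Gaussian (RBF) kernel. The abstract does not say which randomized approximation the framework uses, so this choice is an assumption, and the names `rbf_kernel`, `make_random_features`, and `dot` are hypothetical helpers written for illustration, not part of the authors' implementation.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Exact Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def make_random_features(dim, n_features, gamma, seed=0):
    """Build a feature map z(x) such that z(x) . z(y) approximates rbf_kernel(x, y).

    Illustrative assumption: random Fourier features, not necessarily
    the scheme used in the paper.
    """
    rng = random.Random(seed)
    # Frequencies are sampled from N(0, 2 * gamma * I), which is the
    # Fourier transform of the kernel exp(-gamma * ||delta||^2).
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)]
               for _ in range(n_features)]
    # Random phases drawn uniformly from [0, 2*pi).
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def z(x):
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return z


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    x = [0.5, -1.0, 0.25]
    y = [0.1, -0.7, 0.9]
    z = make_random_features(dim=3, n_features=5000, gamma=0.5)
    print("exact kernel :", rbf_kernel(x, y, 0.5))
    print("approximation:", dot(z(x), z(y)))
```

With an explicit feature map of this kind, a kernel model reduces to a linear model over a large random feature matrix, which is the sort of object a distributed convex solver such as ADMM can then be applied to.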
Converting Your Thoughts to Texts: Enabling Brain Typing via Deep Feature Learning of EEG Signals
An electroencephalography (EEG) based Brain Computer Interface (BCI) enables
people to communicate with the outside world by interpreting the EEG signals of
their brains to interact with devices such as wheelchairs and intelligent
robots. More specifically, motor imagery EEG (MI-EEG), which reflects a
subject's active intent, is attracting increasing attention for a variety of BCI
applications. Accurate classification of MI-EEG signals, while essential for
effective operation of BCI systems, is challenging due to the significant noise
inherent in the signals and the lack of informative correlation between the
signals and brain activities. In this paper, we propose a novel deep neural
network based learning framework that affords perceptive insights into the
relationship between the MI-EEG data and brain activities. We design a joint
convolutional recurrent neural network that simultaneously learns robust
high-level feature representations through low-dimensional dense embeddings from
raw MI-EEG signals. We also employ an Autoencoder layer to eliminate various
artifacts such as background activities. The proposed approach has been
evaluated extensively on a large-scale public MI-EEG dataset and a limited but
easy-to-deploy dataset collected in our lab. The results show that our approach
outperforms a series of baselines and the competitive state-of-the-art
methods, yielding a classification accuracy of 95.53%. The applicability of our
proposed approach is further demonstrated with a practical BCI system for
typing.
Comment: 10 page