Dynamic Routing Between Capsules
A capsule is a group of neurons whose activity vector represents the
instantiation parameters of a specific type of entity such as an object or an
object part. We use the length of the activity vector to represent the
probability that the entity exists and its orientation to represent the
instantiation parameters. Active capsules at one level make predictions, via
transformation matrices, for the instantiation parameters of higher-level
capsules. When multiple predictions agree, a higher level capsule becomes
active. We show that a discriminatively trained, multi-layer capsule system
achieves state-of-the-art performance on MNIST and is considerably better than
a convolutional net at recognizing highly overlapping digits. To achieve these
results we use an iterative routing-by-agreement mechanism: A lower-level
capsule prefers to send its output to higher level capsules whose activity
vectors have a big scalar product with the prediction coming from the
lower-level capsule.
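A minimal NumPy sketch of this routing-by-agreement loop (variable names, the three-iteration default, and the example shapes are illustrative assumptions, not the authors' reference code):

    import numpy as np

    def squash(s, axis=-1, eps=1e-8):
        # Scale vector length into [0, 1) while preserving orientation.
        sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
        return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

    def dynamic_routing(u_hat, num_iters=3):
        # u_hat: predictions from lower capsules, shape (num_lower, num_upper, dim).
        b = np.zeros(u_hat.shape[:2])                  # routing logits
        for _ in range(num_iters):
            c = np.exp(b - b.max(axis=1, keepdims=True))
            c /= c.sum(axis=1, keepdims=True)          # softmax over upper capsules
            s = np.einsum('ij,ijd->jd', c, u_hat)      # weighted sum of predictions
            v = squash(s)                              # upper-capsule activity vectors
            b += np.einsum('ijd,jd->ij', u_hat, v)     # agreement: scalar product
        return v

    v = dynamic_routing(np.random.randn(1152, 10, 16))  # e.g., MNIST-like shapes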
Fast Dynamic Routing Based on Weighted Kernel Density Estimation
Capsules, as well as dynamic routing between them, are recently proposed
structures for deep neural networks. A capsule groups data into vectors or
matrices, as poses, rather than conventional scalars, to represent specific
properties of a target instance. In addition to its pose, a capsule carries a
probability (often denoted as activation) of its presence. Dynamic routing
helps capsules achieve greater generalization capacity with far fewer model
parameters. However, the bottleneck that prevents widespread application of
capsules is the computational expense of routing. To address this problem, we
generalize existing routing methods within the framework of weighted kernel
density estimation, and propose two fast routing methods with different
optimization strategies. Our methods improve the time efficiency of routing by
nearly 40% with negligible performance degradation. By stacking a hybrid of
convolutional layers and capsule layers, we construct a network architecture
to handle inputs at a resolution of pixels. The proposed models achieve
performance on par with other leading methods on multiple benchmarks.
Comment: 16 pages, 4 figures, submitted to ECCV 201
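The paper's exact estimators are not reproduced here, but its mean-shift reading of routing as weighted kernel density estimation can be sketched roughly as follows (the Gaussian kernel, bandwidth, and two-iteration default are assumptions for illustration):

    import numpy as np

    def kde_routing(votes, num_iters=2, bandwidth=1.0, eps=1e-8):
        # votes: (num_lower, num_upper, dim). Treat each upper capsule's pose as
        # a mode of a weighted kernel density over its votes (mean-shift style).
        mu = votes.mean(axis=0)                           # initial poses (num_upper, dim)
        for _ in range(num_iters):
            d2 = ((votes - mu[None]) ** 2).sum(axis=-1)   # squared distances (lower, upper)
            w = np.exp(-0.5 * d2 / bandwidth ** 2)        # Gaussian kernel weights
            w /= w.sum(axis=1, keepdims=True) + eps       # normalize like routing coefficients
            mu = np.einsum('ij,ijd->jd', w, votes) / (w.sum(axis=0)[:, None] + eps)
        return mu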
Capsule Networks with Max-Min Normalization
Capsule Networks (CapsNet) use the Softmax function to convert the logits of
the routing coefficients into a set of normalized values that signify the
assignment probabilities between capsules in adjacent layers. We show that the
use of Softmax prevents capsule layers from forming optimal couplings between
lower and higher-level capsules. Softmax constrains the dynamic range of the
routing coefficients and leads to probabilities that remain mostly uniform
after several routing iterations. Instead, we propose the use of Max-Min
normalization. Max-Min performs a scale-invariant normalization of the logits
that allows each lower-level capsule to take on an independent value,
constrained only by the bounds of normalization. Max-Min provides consistent
improvement in test accuracy across five datasets and allows more routing
iterations without a decrease in network performance. A single CapsNet trained
using Max-Min achieves an improved test error of 0.20% on the MNIST dataset.
With a simple 3-model majority vote, we achieve a test error of 0.17% on MNIST.
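A sketch of the normalization itself, assuming it is applied row-wise over each lower capsule's logits with bounds [0, 1] (both assumptions for illustration):

    import numpy as np

    def max_min_normalize(b, lo=0.0, hi=1.0, eps=1e-8):
        # b: routing logits of shape (num_lower, num_upper). Unlike softmax, this
        # rescaling is scale-invariant and does not force rows toward uniformity.
        b_min = b.min(axis=1, keepdims=True)
        b_max = b.max(axis=1, keepdims=True)
        return lo + (hi - lo) * (b - b_min) / (b_max - b_min + eps)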
Training capsules as a routing-weighted product of expert neurons
Capsules are the multidimensional analogue to scalar neurons in neural
networks, and because they are multidimensional, much more complex routing
schemes can be used to pass information forward through the network than what
can be used in traditional neural networks. This work treats capsules as
collections of neurons in a fully connected neural network, where sub-networks
connecting capsules are weighted according to the routing coefficients
determined by routing by agreement. An energy function is designed to reflect
this model, and it follows that capsule networks with dynamic routing can be
formulated as a product of expert neurons. By alternating between dynamic
routing, which acts to both find subnetworks within the overall network as well
as to mix the model distribution, and updating the parameters by the gradient
of the contrastive divergence, a bottom-up, unsupervised learning algorithm is
constructed for capsule networks with dynamic routing. The model and its
training algorithm are qualitatively tested in the generative sense, and the
model is able to produce realistic-looking images from standard vision datasets.
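As a loose illustration only: the alternation described above resembles a contrastive-divergence update on an RBM-like energy whose expert contributions are weighted by routing coefficients. The sketch below assumes binary experts and per-expert routing weights c held fixed during the step, which simplifies the paper's model:

    import numpy as np

    def cd1_step(W, x, c, lr=0.01):
        # One contrastive-divergence (CD-1) step; c weights each expert's
        # contribution, standing in for routing coefficients.
        def expert_probs(v):
            return 1.0 / (1.0 + np.exp(-(v @ W) * c))      # routing-weighted experts
        p0 = expert_probs(x)                               # positive phase (data)
        h0 = (np.random.rand(*p0.shape) < p0).astype(float)
        v1 = 1.0 / (1.0 + np.exp(-(h0 @ W.T)))             # one-step reconstruction
        p1 = expert_probs(v1)                              # negative phase (model)
        W += lr * (np.outer(x, p0) - np.outer(v1, p1))     # CD-1 gradient estimate
        return W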
Fast Inference in Capsule Networks Using Accumulated Routing Coefficients
We present a method for fast inference in Capsule Networks (CapsNets) by
taking advantage of a key insight regarding the routing coefficients that link
capsules between adjacent network layers. Since the routing coefficients are
responsible for assigning object parts to wholes, and an object whole generally
contains similar intra-class and dissimilar inter-class parts, the routing
coefficients tend to form a unique signature for each object class. For fast
inference, a network is first trained in the usual manner using examples from
the training dataset. Afterward, the routing coefficients associated with the
training examples are accumulated offline and used to create a set of "master"
routing coefficients. During inference, these master routing coefficients are
used in place of the dynamically calculated routing coefficients. Our method
effectively replaces the for-loop iterations in the dynamic routing procedure
with a single matrix multiply operation, providing a significant boost in
inference speed. Compared with the dynamic routing procedure, fast inference
decreases the test accuracy for the MNIST, Background MNIST, Fashion MNIST, and
Rotated MNIST datasets by less than 0.5% and by approximately 5% for CIFAR10.
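A sketch of the offline accumulation and the single-multiply inference step (the per-class mean and all shapes are assumptions; how a master set is selected at test time is elided here):

    import numpy as np

    def build_master_coeffs(coeffs, labels, num_classes):
        # coeffs: converged routing coefficients per training example,
        # shape (N, num_lower, num_upper); one averaged "master" matrix per class.
        return np.stack([coeffs[labels == k].mean(axis=0) for k in range(num_classes)])

    def fast_forward(u_hat, master_c):
        # Replace the routing for-loop with one weighted sum (a single matrix multiply).
        # u_hat: (num_lower, num_upper, dim); master_c: (num_lower, num_upper).
        return np.einsum('ij,ijd->jd', master_c, u_hat)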
Attention routing between capsules
In this paper, we propose a new capsule network architecture called Attention
Routing CapsuleNet (AR CapsNet). We replace the dynamic routing and squash
activation function of CapsuleNet (the capsule network with dynamic routing)
with attention routing and capsule activation. Attention routing is a routing
between capsules through an attention module; it is a fast forward pass that
keeps spatial information. By contrast, the intuitive interpretation of
dynamic routing is finding a centroid of the prediction capsules, so the
squash activation function and its variants focus on preserving vector
orientation, whereas the capsule activation focuses on performing a
capsule-scale activation function.
We evaluate our proposed model on the MNIST, affNIST, and CIFAR-10
classification tasks. The proposed model achieves higher accuracy with fewer
parameters (0.65x on MNIST, 0.82x on CIFAR-10) and less training time than
CapsuleNet (0.19x on MNIST, 0.35x on CIFAR-10). These results validate that
designing a capsule-scale operation is a key factor in implementing the
capsule concept.
Our experiments also show that the proposed model is transformation-equivariant,
like CapsuleNet: as we perturb each element of the output capsule, the decoder
attached to the output capsules shows global variations. Further experiments
show that the differences in capsule features caused by applying affine
transformations to an input image are significantly aligned in one direction.
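A toy rendering of single-pass attention replacing the routing loop; the paper's attention module and capsule activation are learned, whereas the query below is just the mean prediction (an assumption for illustration):

    import numpy as np

    def attention_routing(u_hat):
        # u_hat: (num_lower, num_upper, dim). One forward pass, no iteration.
        q = u_hat.mean(axis=0)                              # stand-in queries (num_upper, dim)
        scores = np.einsum('ijd,jd->ij', u_hat, q) / np.sqrt(u_hat.shape[-1])
        att = np.exp(scores - scores.max(axis=1, keepdims=True))
        att /= att.sum(axis=1, keepdims=True)               # softmax over upper capsules
        return np.einsum('ij,ijd->jd', att, u_hat)          # attended upper poses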
DeepCaps: Going Deeper with Capsule Networks
Capsule Network is a promising concept in deep learning, yet its true
potential is not fully realized thus far, providing sub-par performance on
several key benchmark datasets with complex data. Drawing intuition from the
success achieved by Convolutional Neural Networks (CNNs) by going deeper, we
introduce DeepCaps, a deep capsule network architecture which uses a novel 3D
convolution based dynamic routing algorithm. With DeepCaps, we surpass the
state-of-the-art results in the capsule network domain on CIFAR10, SVHN and
Fashion MNIST, while achieving a 68% reduction in the number of parameters.
Further, we propose a class-independent decoder network, which strengthens the
use of reconstruction loss as a regularization term. This leads to an
interesting property of the decoder, which allows us to identify and control
the physical attributes of the images represented by the instantiation
parameters.
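One way such 3D-convolution-based vote generation might look in PyTorch (the kernel size, layer shapes, and reshape convention are assumptions; DeepCaps' actual layer differs in detail):

    import torch
    import torch.nn as nn

    class Conv3DVotes(nn.Module):
        # Produce routing votes by convolving over the (capsule, height, width)
        # volume instead of using per-pair transformation matrices.
        def __init__(self, in_caps_dim, out_caps, out_caps_dim, k=3):
            super().__init__()
            self.conv = nn.Conv3d(in_caps_dim, out_caps * out_caps_dim,
                                  kernel_size=(1, k, k), padding=(0, k // 2, k // 2))
            self.out_caps, self.out_caps_dim = out_caps, out_caps_dim

        def forward(self, x):
            # x: (batch, in_caps_dim, num_in_caps, H, W)
            v = self.conv(x)
            b, _, n, h, w = v.shape
            return v.view(b, self.out_caps, self.out_caps_dim, n, h, w)

    votes = Conv3DVotes(8, 10, 16)(torch.randn(2, 8, 32, 14, 14))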
Multi-Interest Network with Dynamic Routing for Recommendation at Tmall
Industrial recommender systems usually consist of the matching stage and the
ranking stage, in order to handle the billion-scale of users and items. The
matching stage retrieves candidate items relevant to user interests, while the
ranking stage sorts candidate items by user interests. Thus, the most critical
ability is to model and represent user interests for either stage. Most of the
existing deep learning-based models represent one user as a single vector,
which is insufficient to capture the varying nature of a user's interests. In this
paper, we approach this problem from a different view, to represent one user
with multiple vectors encoding the different aspects of the user's interests.
We propose the Multi-Interest Network with Dynamic routing (MIND) for dealing
with user's diverse interests in the matching stage. Specifically, we design a
multi-interest extractor layer based on the capsule routing mechanism, which is
applicable to clustering historical behaviors and extracting diverse
interests. Furthermore, we develop a technique named label-aware attention to
help learn a user representation with multiple vectors. Through extensive
experiments on several public benchmarks and one large-scale industrial dataset
from Tmall, we demonstrate that MIND achieves performance superior to
state-of-the-art methods for recommendation. MIND is currently deployed to
handle major online traffic on the homepage of the Mobile Tmall App.
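A sketch of the label-aware attention pooling for one user, assuming dot-product relevance scores sharpened by a temperature-like exponent p (this scoring form is an approximation of the paper's):

    import numpy as np

    def label_aware_attention(interests, label_emb, p=2.0):
        # interests: (K, d) interest capsules; label_emb: (d,) target-item embedding.
        scores = interests @ label_emb                  # relevance of each interest
        w = np.exp(p * scores - (p * scores).max())
        w /= w.sum()                                    # softmax over the K interests
        return w @ interests                            # pooled user representation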
Dynamic Past and Future for Neural Machine Translation
Previous studies have shown that neural machine translation (NMT) models can
benefit from explicitly modeling translated (Past) and untranslated (Future)
source contents. This work separates source words into groups of translated
and untranslated contents through parts-to-wholes assignment. The assignment
is learned through a novel variant of the
routing-by-agreement mechanism (Sabour et al., 2017), namely Guided Dynamic
Routing, where the translating status at each decoding step guides the routing
process to assign each source word to its associated group (i.e., translated
or untranslated content) represented by a capsule, enabling
translation to be made from holistic context. Experiments show that our
approach achieves substantial improvements over both RNMT and Transformer by
producing more adequate translations. Extensive analysis demonstrates that our
method is highly interpretable, which is able to recognize the translated and
untranslated contents as expected.
Comment: Camera-ready version. Accepted to EMNLP 2019 as a long paper.
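A toy single-step sketch of the guided assignment (the real Guided Dynamic Routing iterates and uses learned transformations; the opposing-logit scheme below is purely illustrative):

    import numpy as np

    def guided_assignment(src_states, dec_state):
        # src_states: (src_len, d); dec_state: (d,) current translating status.
        score = src_states @ dec_state                  # per-word evidence
        logits = np.stack([score, -score], axis=1)      # translated vs. untranslated
        a = np.exp(logits - logits.max(axis=1, keepdims=True))
        a /= a.sum(axis=1, keepdims=True)               # per-word assignment probs
        past = a[:, 0] @ src_states                     # 'translated' capsule (d,)
        future = a[:, 1] @ src_states                   # 'untranslated' capsule (d,)
        return past, future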
Text Classification using Capsules
This paper presents an empirical exploration of the use of capsule networks
for text classification. While it has been shown that capsule networks are
effective for image classification, their validity in the domain of text has
not been explored. In this paper, we show that capsule networks indeed have the
potential for text classification and that they have several advantages over
convolutional neural networks. We further suggest a simple routing method that
effectively reduces the computational complexity of dynamic routing. We
utilized seven benchmark datasets to demonstrate that capsule networks, along
with the proposed routing method, provide comparable results.