Stacked Capsule Autoencoders
Objects are composed of a set of geometrically organized parts. We introduce
an unsupervised capsule autoencoder (SCAE), which explicitly uses geometric
relationships between parts to reason about objects. Since these relationships
do not depend on the viewpoint, our model is robust to viewpoint changes. SCAE
consists of two stages. In the first stage, the model predicts presences and
poses of part templates directly from the image and tries to reconstruct the
image by appropriately arranging the templates. In the second stage, SCAE
predicts parameters of a few object capsules, which are then used to
reconstruct part poses. Inference in this model is amortized and performed by
off-the-shelf neural encoders, unlike in previous capsule networks. We find
that object capsule presences are highly informative of the object class, which
leads to state-of-the-art results for unsupervised classification on SVHN (55%)
and MNIST (98.7%). The code is available at
https://github.com/google-research/google-research/tree/master/stacked_capsule_autoencoders
Comment: NeurIPS 2019; 14 pages, 7 figures, 4 tables
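The viewpoint argument above can be illustrated with a minimal pose-arithmetic sketch. This is our own construction, not code from the paper's repository: an object capsule holds a pose matrix `OV` (object-viewer relation), and each part pose is predicted as `OV @ OP`, where `OP` (object-part relation) is a learned, viewpoint-independent parameter.

```python
import numpy as np

def make_pose(theta, tx, ty):
    """3x3 homogeneous 2D pose: rotation by theta plus translation (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0,  0,  1]])

OP = make_pose(0.3, 1.0, 0.5)    # learned object->part relation (fixed)
OV1 = make_pose(0.0, 0.0, 0.0)   # object pose under viewpoint 1
OV2 = make_pose(0.7, 2.0, -1.0)  # same object seen from a new viewpoint

part1 = OV1 @ OP                 # predicted part pose, viewpoint 1
part2 = OV2 @ OP                 # predicted part pose, viewpoint 2

# The object->part relation recovered from either viewpoint is identical,
# which is the sense in which the relation does not depend on the viewpoint:
rec1 = np.linalg.inv(OV1) @ part1
rec2 = np.linalg.inv(OV2) @ part2
print(np.allclose(rec1, rec2))   # True
```

In the model itself `OP` would be learned and the poses predicted by neural encoders; the sketch only shows why the part-object relationship is invariant to a change of `OV`.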
An Evasion Attack against Stacked Capsule Autoencoder
A capsule network is a type of neural network that uses spatial
relationships between features to classify images. By capturing the poses and
relative positions of features, its ability to recognize affine
transformations is improved, and it surpasses traditional convolutional neural
networks (CNNs) in handling translation, rotation, and scaling. The Stacked
Capsule Autoencoder (SCAE) is the state-of-the-art capsule network. The SCAE
encodes an image as capsules, each of which contains poses of features and
their correlations. The encoded contents are then input into the downstream
classifier to predict the categories of the images. Existing research mainly
focuses on the security of capsule networks with dynamic routing or EM routing,
and little attention has been given to the security and robustness of the SCAE.
In this paper, we propose an evasion attack against the SCAE. After a
perturbation is generated based on the output of the object capsules in the
model, it is added to an image to reduce the contribution of the object
capsules related to the original category of the image so that the perturbed
image will be misclassified. We evaluate the attack using an image
classification experiment, and the experimental results indicate that the
attack achieves high success rates and stealthiness. This confirms that the
SCAE has a security vulnerability: it is possible to craft adversarial
samples that fool the classifier without changing the visible structure of
the image. We hope that our work will make the community aware of this threat
and draw more attention to the security of the SCAE.
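The attack described above can be sketched in miniature. The stand-in "encoder" below is our own toy construction, not the SCAE: it treats capsule presences as a differentiable function of the image and descends on the presences of the capsules tied to the true class, in the spirit of the perturbation the paper describes.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))      # toy "encoder": presences = sigmoid(W @ x)
x = rng.normal(size=64)            # clean image (flattened)

def presences(x):
    return 1.0 / (1.0 + np.exp(-W @ x))

true_caps = [2, 5]                 # capsules voting for the image's true class
eps, steps = 0.01, 50
x_adv = x.copy()
for _ in range(steps):
    p = presences(x_adv)
    # gradient of the summed true-class presences w.r.t. the image:
    # d sigmoid(w.x)/dx = p (1 - p) w
    g = (p * (1 - p))[true_caps] @ W[true_caps]
    x_adv -= eps * np.sign(g)      # small signed step shrinking those presences

# the contribution of the true-class capsules has been reduced
print(presences(x_adv)[true_caps].sum() < presences(x)[true_caps].sum())  # True
```

The real attack computes the perturbation from the output of the object capsules of the trained SCAE rather than from a linear toy model; the sketch only shows the "reduce the true-class capsule contribution" objective.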
ElegansNet: a brief scientific report and initial experiments
This research report introduces ElegansNet, a neural network that mimics
real-world neuronal network circuitry, with the goal of better understanding
the interplay between connectome topology and deep learning systems. The
proposed approach utilizes the powerful representational capabilities of living
beings' neuronal circuitry to design and generate improved deep learning
systems with a topology similar to natural networks. The Caenorhabditis elegans
connectome is used as a reference due to its completeness, reasonable size, and
functional neuron classes annotations. It is demonstrated that the connectome
of simple organisms exhibits specific functional relationships between neurons,
and once transformed into learnable tensor networks and integrated into modern
architectures, it offers bio-plausible structures that efficiently solve
complex tasks. The performance of the models is demonstrated against randomly
wired networks and compared to artificial networks ranked on global benchmarks.
In the first case, ElegansNet outperforms randomly wired networks.
Interestingly, only the randomly wired models based on the Watts-Strogatz
small-world property come close to ElegansNet's performance. When compared to
state-of-the-art artificial neural networks, such as transformers or
attention-based autoencoders, ElegansNet outperforms well-known deep learning
and traditional models in both supervised image classification tasks and
unsupervised handwritten digit reconstruction, achieving top-1 accuracy of
99.99% on CIFAR-10 and 99.84% on MNIST Unsup on the validation sets.
Comment: 4 pages, short report before full paper submission
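One way to read "transformed into learnable tensor networks" is to use the connectome's adjacency matrix as a fixed sparsity mask over learnable weights, so the layer's topology mirrors the organism's wiring. The following is a hypothetical sketch of that idea with invented names, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 6
A = (rng.random((n, n)) < 0.4).astype(np.float32)  # stand-in binary connectome
W = rng.normal(size=(n, n)).astype(np.float32)     # learnable weights

def masked_forward(x, W, A):
    # only edges present in the connectome carry signal
    return np.tanh((W * A) @ x)

x = rng.normal(size=n).astype(np.float32)
y = masked_forward(x, W, A)

# weights outside the connectome's edges have no effect on the output,
# no matter how training changes them:
W2 = W + rng.normal(size=(n, n)) * (1 - A)
print(np.allclose(y, masked_forward(x, W2, A)))    # True
```

In the actual work the reference topology is the C. elegans connectome and the masked layers are integrated into modern deep architectures; the sketch only illustrates fixing a biological topology while leaving the weights trainable.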
Shifting capsule networks from the cloud to the deep edge
Capsule networks (CapsNets) are an emerging trend in image processing. In contrast to convolutional neural networks (CNNs), CapsNets are not vulnerable to object deformation, as the relative spatial information of objects is preserved across the network. However, their complexity stems mainly from the capsule structure and the dynamic routing mechanism, which makes it impractical to deploy a CapsNet, in its original form, on a resource-constrained device powered by a small microcontroller (MCU). In an era where intelligence is rapidly shifting from the cloud to the edge, this high complexity poses serious challenges to the adoption of CapsNets at the very edge. To tackle this issue, we present an API for the execution of quantized CapsNets on Arm Cortex-M and RISC-V MCUs. Our software kernels extend Arm CMSIS-NN and RISC-V PULP-NN to support capsule operations with 8-bit integers as operands. Along with them, we propose a framework to perform post-training quantization of a CapsNet. Results show a reduction in memory footprint of almost 75%, with accuracy loss ranging from 0.07% to 0.18%. In terms of throughput, our Arm Cortex-M API enables the execution of primary capsule and capsule layers with medium-sized kernels in just 119.94 and 90.60 milliseconds (ms), respectively (STM32H755ZIT6U, Cortex-M7 @ 480 MHz). For the GAP-8 SoC (RISC-V RV32IMCXpulp @ 170 MHz), the latency drops to 7.02 and 38.03 ms, respectively.
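The ~75% memory reduction follows directly from storing 8-bit integers instead of 32-bit floats. A minimal sketch of symmetric per-tensor post-training quantization, the general scheme such frameworks apply (the paper's exact calibration may differ):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization of float weights to int8."""
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(scale=0.1, size=(8, 16)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# int8 storage is 4x smaller than float32: an ~75% memory reduction
print(q.dtype, w.nbytes // q.nbytes)   # int8 4
```

The accuracy loss reported in the abstract (0.07% to 0.18%) comes from exactly this rounding error: each weight is reconstructed to within half a quantization step of its original value.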
Deep Learning in Cardiology
The medical field generates large amounts of data that physicians are unable
to decipher and use efficiently. Moreover, rule-based expert systems are
inefficient in solving complicated medical tasks or for creating insights using
big data. Deep learning has emerged as a more accurate and effective technology
in a wide range of medical problems such as diagnosis, prediction and
intervention. Deep learning is a representation learning method that consists
of layers that transform the data non-linearly, thus, revealing hierarchical
relationships and structures. In this review we survey deep learning
application papers that use structured data, signal and imaging modalities from
cardiology. We discuss the advantages and limitations of applying deep learning
in cardiology that also apply in medicine in general, while proposing certain
directions as the most viable for clinical use.
Comment: 27 pages, 2 figures, 10 tables