Deformable Capsules for Object Detection
In this study, we introduce a new family of capsule networks, deformable
capsules (DeformCaps), to address a very important problem in computer vision:
object detection. We propose two new algorithms associated with our DeformCaps:
a novel capsule structure (SplitCaps) and a novel dynamic routing algorithm
(SE-Routing), which balance computational efficiency with the need for modeling
a large number of objects and classes, a balance that has never been achieved
with capsule networks before. We demonstrate that the proposed methods allow
capsules to efficiently scale up to large-scale computer vision tasks for the
first time, and create the first-ever capsule network for object detection in
the literature. Our proposed architecture is a one-stage detection framework
and obtains results on MS COCO which are on par with state-of-the-art one-stage
CNN-based methods, while producing fewer false positive detections and
generalizing to unusual poses/viewpoints of objects.
Polyphonic Sound Event Detection by using Capsule Neural Networks
Artificial sound event detection (SED) aims to mimic the human ability
to perceive and understand what is happening in the surroundings. Nowadays,
Deep Learning offers valuable techniques for this goal such as Convolutional
Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture has
been recently introduced in the image processing field with the intent to
overcome some of the known limitations of CNNs, specifically regarding the
scarce robustness to affine transformations (i.e., perspective, size,
orientation) and the detection of overlapping images. This motivated us
to employ CapsNets to deal with the polyphonic-SED task, in which multiple
sound events occur simultaneously. Specifically, we propose to exploit the
capsule units to represent a set of distinctive properties for each individual
sound event. Capsule units are connected through a so-called "dynamic routing"
that encourages learning part-whole relationships and improves the detection
performance in a polyphonic context. This paper reports extensive evaluations
carried out on three publicly available datasets, showing how the CapsNet-based
algorithm not only outperforms standard CNNs but also achieves the
best results relative to state-of-the-art algorithms.
3D Point Capsule Networks
In this paper, we propose 3D point-capsule networks, an auto-encoder designed
to process sparse 3D point clouds while preserving spatial arrangements of the
input data. 3D capsule networks arise as a direct consequence of our novel
unified 3D auto-encoder formulation. Their dynamic routing scheme and the
peculiar 2D latent space deployed by our approach bring in improvements for
several common point cloud-related tasks, such as object classification, object
reconstruction and part segmentation as substantiated by our extensive
evaluations. Moreover, it enables new applications such as part interpolation
and replacement.
Comment: As published in CVPR 2019 (camera-ready version), with supplementary
material
SECaps: A Sequence Enhanced Capsule Model for Charge Prediction
Automatic charge prediction aims to predict appropriate final charges
according to the fact descriptions for a given criminal case. Automatic charge
prediction plays a critical role in assisting judges and lawyers to improve the
efficiency of legal decisions, and thus has received much attention.
Nevertheless, most existing works on automatic charge prediction perform
adequately on high-frequency charges but are not yet capable of predicting
few-shot charges with limited cases. In this paper, we propose a Sequence
Enhanced Capsule model, dubbed as SECaps model, to relieve this problem.
Specifically, following the work of capsule networks, we propose the seq-caps
layer, which considers sequence information and spatial information of legal
texts simultaneously. Then we design an attention residual unit, which provides
auxiliary information for charge prediction. In addition, our SECaps model
introduces focal loss, which relieves the problem of imbalanced charges.
Compared with state-of-the-art methods, our SECaps model obtains considerable
absolute improvements of 4.5% and 6.4% in Macro F1 on Criminal-S and
Criminal-L, respectively. The experimental results consistently demonstrate the
superiority and competitiveness of our proposed model.
Comment: 13 pages, 3 figures, 5 tables
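The role of focal loss mentioned in the abstract above can be illustrated with a small sketch. It uses the commonly cited form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t); the abstract does not say which variant or hyperparameters SECaps uses, so the function names, `gamma=2.0`, and `alpha=1.0` here are assumptions chosen only for illustration.

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)**gamma * log(p_t).

    probs  -- predicted class probabilities (assumed to sum to 1)
    target -- index of the true class
    gamma, alpha -- illustrative defaults, not taken from the SECaps paper
    """
    # Clamp to avoid log(0) when the model assigns zero probability.
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

def cross_entropy(probs, target):
    """Plain cross-entropy is the gamma = 0, alpha = 1 special case."""
    return focal_loss(probs, target, gamma=0.0, alpha=1.0)

# A confidently correct ("easy") example and a poorly predicted ("hard") one.
easy = [0.9, 0.05, 0.05]
hard = [0.2, 0.5, 0.3]

# Fraction of the cross-entropy loss that survives the focal weighting.
easy_ratio = focal_loss(easy, 0) / cross_entropy(easy, 0)
hard_ratio = focal_loss(hard, 0) / cross_entropy(hard, 0)
```

With these settings the ratio equals (1 - p_t)^gamma, so the easy example keeps far less of its loss than the hard one. Down-weighting well-classified examples in this way is how the loss shifts training emphasis toward rare, poorly predicted classes.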
Capsule Routing for Sound Event Detection
The detection of acoustic scenes is a challenging problem in which
environmental sound events must be detected from a given audio signal. This
includes classifying the events as well as estimating their onset and offset
times. We approach this problem with a neural network architecture that uses
the recently-proposed capsule routing mechanism. A capsule is a group of
activation units representing a set of properties for an entity of interest,
and the purpose of routing is to identify part-whole relationships between
capsules. That is, a capsule in one layer is assumed to belong to a capsule in
the layer above in terms of the entity being represented. Using capsule
routing, we wish to train a network that can learn global coherence implicitly,
thereby improving generalization performance. Our proposed method is evaluated
on Task 4 of the DCASE 2017 challenge. Results show that classification
performance is state-of-the-art, achieving an F-score of 58.6%. In addition,
overfitting is reduced considerably compared to other architectures.
Comment: Paper accepted for 26th European Signal Processing Conference
(EUSIPCO 2018)
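The part-whole routing idea described in the abstract above can be sketched in a few lines. This is a minimal pure-Python version of routing-by-agreement as it is usually presented for capsule networks; the abstract does not specify the exact routing variant, so treating it as this algorithm is an assumption, and all names (`squash`, `route`, `u_hat`) and the toy input are hypothetical.

```python
import math

def squash(v):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0 for _ in v]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in v]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(u_hat, iterations=3):
    """u_hat[i][j] is the vector lower capsule i predicts for upper capsule j.

    Returns the upper capsule outputs from the last iteration and the
    coupling coefficients recomputed from the final routing logits.
    """
    n_lower, n_upper, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_upper for _ in range(n_lower)]  # routing logits
    v = []
    for _ in range(iterations):
        # Each lower capsule distributes its vote across the upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_upper):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_lower))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit wherever a prediction agrees with the output.
        for i in range(n_lower):
            for j in range(n_upper):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, [softmax(row) for row in b]

# Toy input: lower capsules 0 and 1 agree about upper capsule 0,
# while lower capsule 2 makes a strong prediction for upper capsule 1.
u_hat = [
    [[1.0, 0.0], [0.0, 0.1]],
    [[0.9, 0.1], [0.1, 0.0]],
    [[-1.0, 0.0], [0.0, 1.0]],
]
outputs, couplings = route(u_hat)
```

After routing, lower capsules 0 and 1 send most of their coupling to upper capsule 0 and lower capsule 2 to upper capsule 1, which is the "a capsule in one layer belongs to a capsule in the layer above" assignment in miniature.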