Independent Modular Networks
Monolithic neural networks that make use of a single set of weights to learn
useful representations for downstream tasks explicitly dismiss the
compositional nature of data generation processes. This characteristic exists
in data where every instance can be regarded as the combination of an identity
concept, such as the shape of an object, combined with modifying concepts, such
as orientation, color, and size. The dismissal of compositionality is
especially detrimental in robotics, where state estimation relies heavily on
the compositional nature of physical mechanisms (e.g., rotations and
transformations) to model interactions. To accommodate this data
characteristic, modular networks have been proposed. However, a lack of
structure in each module's role, and modular network-specific issues such as
module collapse have restricted their usability. We propose a modular network
architecture that accommodates this compositional structure by using a design
that splits the modules into predetermined roles.
Additionally, we provide regularizations that improve the resiliency of the
modular network to the problem of module collapse while improving the
decomposition accuracy of the model.
Comment: ICRA23 RAP4Robots Workshop
Transitioning between Convolutional and Fully Connected Layers in Neural Networks
Digital pathology has advanced substantially over the last decade; however,
tumor localization continues to be a challenging problem due to highly complex
patterns and textures in the underlying tissue bed. The use of convolutional
neural networks (CNNs) to analyze such complex images has been well adopted in
digital pathology. However, in recent years the architecture of CNNs has
changed with the introduction of inception modules, which have shown great
promise for classification tasks. In this paper, we propose a modified
"transition" module which learns global average pooling layers from filters of
varying sizes to encourage class-specific filters at multiple spatial
resolutions. We demonstrate the performance of the transition module in AlexNet
and ZFNet for classifying breast tumors in two independent datasets of scanned
histology sections, where the transition module proved superior.
Comment: This work is to appear at the 3rd workshop on Deep Learning in
Medical Image Analysis (DLMIA), MICCAI 201
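The pooling step mentioned in this abstract can be illustrated with a minimal sketch. It shows only plain global average pooling applied to feature maps of different spatial sizes, with the results concatenated into one vector; it is not the authors' transition module, and the function name `global_average_pool` and the list-of-rows data layout are assumptions made for illustration.

```python
from typing import List, Sequence

FeatureMap = Sequence[Sequence[float]]  # hypothetical layout: a list of rows


def global_average_pool(feature_maps: Sequence[FeatureMap]) -> List[float]:
    """Reduce each 2-D feature map to a single value, its mean.

    The maps may have different spatial sizes, as would happen when they
    come from filters of varying sizes. The output has one entry per map.
    """
    pooled = []
    for fmap in feature_maps:
        values = [v for row in fmap for v in row]
        if not values:
            raise ValueError("feature map must not be empty")
        pooled.append(sum(values) / len(values))
    return pooled


# Three maps of sizes 2x2, 1x1, and 3x3 collapse to a 3-element vector.
maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[6.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
pooled_vector = global_average_pool(maps)
```

Because every map is reduced to one number regardless of its size, the pooled vector has a fixed length that a classifier layer can consume directly.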
CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction
Pedestrian path prediction is an essential topic in computer vision and video
understanding. Having insight into the movement of pedestrians is crucial for
ensuring safe operation in a variety of applications including autonomous
vehicles, social robots, and environmental monitoring. Current works in this
area utilize complex generative or recurrent methods to capture many possible
futures. However, despite the inherent real-time nature of predicting future
paths, little work has been done to explore accurate and computationally
efficient approaches for this task. To this end, we propose a convolutional
approach for real-time pedestrian path prediction, CARPe. It utilizes a
variation of Graph Isomorphism Networks in combination with an agile
convolutional neural network design to form a fast and accurate path prediction
approach. Notable results in both inference speed and prediction accuracy are
achieved, improving FPS considerably in comparison to current state-of-the-art
methods while delivering competitive accuracy on well-known path prediction
datasets.
Comment: AAAI-21 Camera Ready
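For readers unfamiliar with Graph Isomorphism Networks, the following is a minimal sketch of the standard GIN neighborhood aggregation, h_v' = (1 + eps) * h_v + sum of neighbor features. The abstract says CARPe uses a variation of GIN, so this is not CARPe's layer. The function name `gin_aggregate`, the dictionary-based graph, and the omission of the MLP that normally follows the aggregation are all assumptions made to keep the example short.

```python
from typing import Dict, List, Sequence

Features = Dict[str, List[float]]


def gin_aggregate(
    features: Features,
    neighbors: Dict[str, Sequence[str]],
    eps: float = 0.0,
) -> Features:
    """Standard GIN aggregation: (1 + eps) * h_v + sum over neighbors of h_u.

    A full GIN layer would pass the result through an MLP; that step is
    left out of this sketch.
    """
    updated = {}
    for node, h_v in features.items():
        agg = [(1.0 + eps) * x for x in h_v]
        for other in neighbors.get(node, ()):
            agg = [a + b for a, b in zip(agg, features[other])]
        updated[node] = agg
    return updated


# Hypothetical scene with three pedestrians as nodes; "a" is near "b" and "c".
positions = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [3.0, 3.0]}
adjacency = {"a": ["b", "c"], "b": ["a"], "c": []}
aggregated = gin_aggregate(positions, adjacency)
```

The sum aggregation makes the update independent of the order in which neighbors are listed.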
Signed Distance-based Deep Memory Recommender
Personalized recommendation algorithms learn a user's preference for an item
by measuring a distance/similarity between them. However, some of the existing
recommendation models (e.g., matrix factorization) assume a linear relationship
between the user and item. This approach limits the capacity of recommender
systems, since the interactions between users and items in real-world
applications are much more complex than a linear relationship. To overcome
this limitation, in this paper, we design and propose a deep learning framework
called Signed Distance-based Deep Memory Recommender, which captures non-linear
relationships between users and items explicitly and implicitly, and works well
in both the general recommendation task and the shopping basket-based
recommendation task. In an extensive empirical study on six real-world datasets
across the two recommendation tasks, our proposed approach achieved significant
improvement over ten state-of-the-art recommendation models.
Res2Net: A New Multi-scale Backbone Architecture
Representing features at multiple scales is of great importance for numerous
vision tasks. Recent advances in backbone convolutional neural networks (CNNs)
continually demonstrate stronger multi-scale representation ability, leading to
consistent performance gains on a wide range of applications. However, most
existing methods represent the multi-scale features in a layer-wise manner. In
this paper, we propose a novel building block for CNNs, namely Res2Net, by
constructing hierarchical residual-like connections within one single residual
block. The Res2Net represents multi-scale features at a granular level and
increases the range of receptive fields for each network layer. The proposed
Res2Net block can be plugged into the state-of-the-art backbone CNN models,
e.g., ResNet, ResNeXt, and DLA. We evaluate the Res2Net block on all these
models and demonstrate consistent performance gains over baseline models on
widely used datasets, e.g., CIFAR-100 and ImageNet. Additional ablation studies
and experimental results on representative computer vision tasks, i.e., object
detection, class activation mapping, and salient object detection, further
verify the superiority of the Res2Net over the state-of-the-art baseline
methods. The source code and trained models are available on
https://mmcheng.net/res2net/.
Comment: 11 pages, 7 figures
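The hierarchical connections inside a single block can be sketched as follows. The abstract does not spell out the rule, so this assumes the split-and-accumulate scheme: the input channels are split into groups x_1..x_s, then y_1 = x_1, y_2 = K_2(x_2), and y_i = K_i(x_i + y_(i-1)) for i > 2. The name `res2net_hierarchy` is hypothetical, and a simple doubling function stands in for the convolutions K_i.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Transform = Callable[[Vector], Vector]


def res2net_hierarchy(
    groups: Sequence[Vector],
    transforms: Sequence[Transform],
) -> List[Vector]:
    """Apply hierarchical residual-like connections across channel groups.

    groups:     channel groups x_1..x_s
    transforms: K_2..K_s, one per group except the first
    """
    if len(transforms) != len(groups) - 1:
        raise ValueError("need exactly one transform per group after the first")
    outputs = [list(groups[0])]  # y_1 = x_1 (identity)
    previous = None
    for x_i, k_i in zip(groups[1:], transforms):
        if previous is None:
            inp = list(x_i)  # y_2 = K_2(x_2)
        else:
            inp = [a + b for a, b in zip(x_i, previous)]  # x_i + y_(i-1)
        previous = k_i(inp)
        outputs.append(previous)
    return outputs


def double(v: Vector) -> Vector:
    """Stand-in for a convolution; an assumption for illustration only."""
    return [2.0 * x for x in v]


# Four single-channel groups; later groups see progressively more transforms.
ys = res2net_hierarchy([[1.0], [2.0], [3.0], [4.0]], [double, double, double])
```

Each later group passes through more transforms than the one before it, which is how a single block can cover several receptive-field sizes at once.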