Oriented Response Networks
Deep Convolution Neural Networks (DCNNs) are capable of learning
unprecedentedly effective image representations. However, their ability in
handling significant local and global image rotations remains limited. In this
paper, we propose Active Rotating Filters (ARFs) that actively rotate during
convolution and produce feature maps with location and orientation explicitly
encoded. An ARF acts as a virtual filter bank containing the filter itself and
its multiple unmaterialised rotated versions. During back-propagation, an ARF
is collectively updated using errors from all its rotated versions. DCNNs using
ARFs, referred to as Oriented Response Networks (ORNs), can produce
within-class rotation-invariant deep features while maintaining inter-class
discrimination for classification tasks. The oriented response produced by ORNs
can also be used for image and object orientation estimation tasks. Over
multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we
consistently observe that replacing regular filters with the proposed ARFs
leads to significant reduction in the number of network parameters and
improvement in classification performance. We report the best results on
several commonly used benchmarks.
Comment: Accepted in CVPR 2017. Source code available at http://yzhou.work/OR
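The abstract above describes an ARF as a single stored filter whose rotated copies are generated on the fly and whose update collects errors from all rotations. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes a single-channel square filter and four 90-degree rotations, and the helper names (`rot90`, `rotate`, `oriented_response`, `collect_gradient`) are hypothetical.

```python
def rot90(m):
    """Rotate a square matrix (list of rows) 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*m)][::-1]


def rotate(m, k):
    """Apply k quarter turns; negative k rotates the other way."""
    for _ in range(k % 4):
        m = rot90(m)
    return m


def oriented_response(patch, w, n=4):
    """Correlate one patch with each rotated copy of w.

    Only w is stored; the rotated copies are produced when needed.
    Returns one response per orientation.
    """
    responses = []
    for k in range(n):
        wk = rotate(w, k)
        responses.append(
            sum(p * q for pr, wr in zip(patch, wk) for p, q in zip(pr, wr))
        )
    return responses


def collect_gradient(grads):
    """Fold per-orientation gradients back onto the single stored filter.

    grads[k] is the gradient with respect to the k-th rotated copy; it is
    rotated back by k quarter turns before being accumulated.
    """
    size = len(grads[0])
    total = [[0.0] * size for _ in range(size)]
    for k, g in enumerate(grads):
        back = rotate(g, -k)
        for i in range(size):
            for j in range(size):
                total[i][j] += back[i][j]
    return total
```

Under these assumptions, rotating the input patch by a quarter turn shifts the response vector by one position instead of changing its values, which is one simple way to see how orientation can be encoded explicitly in the output.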
Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges
Decision-making is a dynamic process requiring perception, memory, and
reasoning to make choices and find optimal policies. Traditional approaches to
decision-making suffer from poor sample efficiency and generalization, while
large-scale self-supervised pretraining has enabled fast adaptation with
fine-tuning or few-shot learning in language and vision. We thus argue to
integrate knowledge acquired from generic large-scale self-supervised
pretraining into downstream decision-making problems. We propose
a Pretrain-Then-Adapt pipeline and survey recent work on data collection,
pretraining objectives and adaptation strategies for decision-making
pretraining and downstream inference. Finally, we identify critical challenges
and future directions for developing decision foundation models with the help of
generic and flexible self-supervised pretraining.
BadRL: Sparse Targeted Backdoor Attack Against Reinforcement Learning
Backdoor attacks in reinforcement learning (RL) have previously employed
intense attack strategies to ensure attack success. However, these methods
suffer from high attack costs and increased detectability. In this work, we
propose a novel approach, BadRL, which focuses on conducting highly sparse
backdoor poisoning efforts during training and testing while maintaining
successful attacks. Our algorithm, BadRL, strategically chooses state
observations with high attack values to inject triggers during training and
testing, thereby reducing the chances of detection. In contrast to the previous
methods that utilize sample-agnostic trigger patterns, BadRL dynamically
generates distinct trigger patterns based on targeted state observations,
thereby enhancing its effectiveness. Theoretical analysis shows that the
targeted backdoor attack is always viable and remains stealthy under specific
assumptions. Empirical results on various classic RL tasks illustrate that
BadRL can substantially degrade the performance of a victim agent with minimal
poisoning efforts (0.003% of total training steps) during training and
infrequent attacks during testing.
Comment: Extended version of the submission accepted by AAAI 2024. It is
revised by integrating review comment
Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes
Object detection under inaccurate bounding-box supervision has attracted
broad interest because high-quality annotation is expensive and low annotation
quality is occasionally inevitable (e.g., tiny objects). Previous works
usually rely on multiple instance learning (MIL), which depends heavily
on category information, to select and refine a low-quality box. Those
methods suffer from object drift, group prediction and part domination problems
without exploring spatial information. In this paper, we heuristically propose
a Spatial Self-Distillation based Object Detector (SSD-Det) to mine
spatial information to refine the inaccurate box in a self-distillation
fashion. SSD-Det utilizes a Spatial Position Self-Distillation (SPSD)
module to exploit spatial information and an interactive structure to combine
spatial information and category information, thus constructing a high-quality
proposal bag. To further improve the selection procedure, a Spatial Identity
Self-Distillation (SISD) module is introduced in SSD-Det to obtain
spatial confidence to help select the best proposals. Experiments on MS-COCO
and VOC datasets with noisy box annotations verify our method's effectiveness
and show state-of-the-art performance. The code is available at
https://github.com/ucas-vg/PointTinyBenchmark/tree/SSD-Det.
Comment: accepted by ICCV 202