Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN
Discriminative localization is essential for the fine-grained image
classification task, which aims to recognize hundreds of subcategories within
the same basic-level category. The key differences among subcategories are
subtle and local, and they lie in the discriminative regions of objects. Existing
methods generally adopt a two-stage learning framework: the first stage
localizes the discriminative regions of objects, and the second encodes the
discriminative features for training classifiers. However, these methods
generally have two limitations: (1) Separating the two learning stages is
time-consuming. (2) Dependence on object and parts annotations for
discriminative localization learning requires labor-intensive labeling.
Addressing both limitations simultaneously is highly challenging, and existing
methods focus on only one of them. Therefore, this
paper proposes a discriminative localization approach via saliency-guided
Faster R-CNN that addresses both limitations at the same time. Our
main novelties and advantages are: (1) An end-to-end network based on Faster R-CNN
is designed to simultaneously localize discriminative regions and encode
discriminative features, which accelerates classification speed. (2)
Saliency-guided localization learning is proposed to localize the
discriminative region automatically, avoiding labor-consuming labeling. Both
are jointly employed to simultaneously accelerate classification speed and
eliminate dependence on object and parts annotations. Compared with
state-of-the-art methods on the widely used CUB-200-2011 dataset, our approach
achieves both the best classification accuracy and efficiency.
Comment: 9 pages, to appear in ACM MM 201
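The abstract above does not say how a saliency map is turned into a localized region. As a rough illustration only, one common approach is to threshold the map and take the tightest box around the surviving cells. The function name `saliency_to_box`, the threshold value, and the toy map below are all hypothetical and are not taken from the paper.

```python
def saliency_to_box(saliency, threshold):
    """Return (row_min, col_min, row_max, col_max) enclosing every cell whose
    saliency is >= threshold, or None if no cell qualifies.

    Assumption: `saliency` is a 2D list of floats; thresholding is a stand-in
    technique, not the method used in the paper.
    """
    rows = [r for r, row in enumerate(saliency) if any(v >= threshold for v in row)]
    cols = [c for row in saliency for c, v in enumerate(row) if v >= threshold]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 4x4 saliency map with a salient blob in the middle.
toy_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
box = saliency_to_box(toy_map, threshold=0.5)
```

A box obtained this way could serve as a pseudo ground-truth region in place of a manual annotation, which is the kind of labeling the abstract says the approach avoids.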
Centroid Distance Keypoint Detector for Colored Point Clouds
Keypoint detection serves as the basis for many computer vision and robotics
applications. Despite the fact that colored point clouds can be readily
obtained, most existing keypoint detectors extract only geometry-salient
keypoints, which can impede the overall performance of systems that intend to
(or have the potential to) leverage color information. To promote advances in
such systems, we propose an efficient multi-modal keypoint detector that can
extract both geometry-salient and color-salient keypoints in colored point
clouds. The proposed CEntroid Distance (CED) keypoint detector comprises an
intuitive and effective saliency measure, the centroid distance, that can be
used in both 3D space and color space, and a multi-modal non-maximum
suppression algorithm that can select keypoints with high saliency in two or
more modalities. The proposed saliency measure directly leverages the
distribution of points in a local neighborhood and does not require normal
estimation or eigenvalue decomposition. We evaluate the proposed method in
terms of repeatability and computational efficiency (i.e. running time) against
state-of-the-art keypoint detectors on both synthetic and real-world datasets.
Results demonstrate that our proposed CED keypoint detector requires minimal
computational time while attaining high repeatability. To showcase one of the
potential applications of the proposed method, we further investigate the task
of colored point cloud registration. Results suggest that our proposed CED
detector outperforms state-of-the-art handcrafted and learning-based keypoint
detectors in the evaluated scenes. The C++ implementation of the proposed
method is made publicly available at
https://github.com/UCR-Robotics/CED_Detector.
Comment: Accepted to IEEE/CVF Winter Conference on Applications of Computer
Vision (WACV) 2023; copyright will be transferred to IEEE upon publication
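The centroid-distance idea described in the abstract above can be sketched as follows: a point is scored by how far its feature (position or color) lies from the centroid of that feature over its local neighborhood. This is a minimal sketch under stated assumptions, not the authors' implementation: the name `centroid_distance`, the brute-force radius search, and the choice to find neighbors in 3D space for both modalities are all hypothetical. The linked C++ repository is the reference for the actual method.

```python
import math


def centroid_distance(positions, features, index, radius):
    """Distance between features[index] and the centroid of the features of
    all points within `radius` of positions[index] (the point itself included).

    Assumptions: neighbors are found by brute force in 3D space; pass
    features=positions for a geometric score, or per-point colors for a
    color score. No normals or eigen-decomposition are needed.
    """
    center = positions[index]
    neighbors = [i for i, q in enumerate(positions) if math.dist(center, q) <= radius]
    dim = len(features[index])
    centroid = [sum(features[i][d] for i in neighbors) / len(neighbors) for d in range(dim)]
    return math.dist(features[index], centroid)


# Toy example: a flat 3x3 grid of points on the plane z = 0.
grid = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
center_idx = grid.index((1.0, 1.0, 0.0))
corner_idx = grid.index((0.0, 0.0, 0.0))

center_score = centroid_distance(grid, grid, center_idx, radius=1.5)
corner_score = centroid_distance(grid, grid, corner_idx, radius=1.5)
```

In this toy grid the interior point sits at the centroid of its neighborhood and scores zero, while the corner point is offset from its neighbors' centroid and scores higher, which matches the intuition that points on boundaries or at sharp changes are more salient.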