Target before Shooting: Accurate Anomaly Detection and Localization under One Millisecond via Cascade Patch Retrieval
In this work, by re-examining the "matching" nature of Anomaly Detection
(AD), we propose a new AD framework that achieves both record AD accuracy
and very high running speed. In this framework, the anomaly
detection problem is solved via a cascade patch retrieval procedure that
retrieves the nearest neighbors for each test image patch in a coarse-to-fine
fashion. First, given a test sample, the top-K most similar training images are
selected by a robust histogram matching process. Second, the nearest
neighbor of each test patch is retrieved over the similar geometrical locations
on those "global nearest neighbors", by using a carefully trained local metric.
Finally, the anomaly score of each test image patch is calculated based on the
distance to its "local nearest neighbor" and the "non-background" probability.
The proposed method is termed "Cascade Patch Retrieval" (CPR) in this work.
Unlike conventional patch-matching-based AD algorithms, CPR selects
proper "targets" (reference images and locations) before "shooting"
(patch-matching). On the widely used MVTec AD, BTAD and MVTec-3D AD
datasets, the proposed algorithm consistently outperforms all compared
SOTA methods by clear margins across various AD metrics.
Furthermore, CPR is extremely efficient. It runs at 113 FPS in
the standard setting, while its simplified version needs less than 1 ms
to process an image at the cost of a small accuracy drop. The code of CPR is
available at https://github.com/flyinghu123/CPR.
Comment: 13 pages, 8 figures
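
The coarse-to-fine retrieval described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: an L1 histogram distance stands in for the "robust histogram matching", a plain Euclidean distance stands in for the trained local metric, the "non-background" probability is supplied as an input, and the final score is taken to be the product of distance and probability. All function names are hypothetical.

```python
import math

def hist_distance(h1, h2):
    """L1 distance between two histograms (assumed stand-in for robust matching)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def top_k_references(test_hist, train_hists, k):
    """Stage 1: indices of the k training images with the closest histograms."""
    order = sorted(range(len(train_hists)),
                   key=lambda i: hist_distance(test_hist, train_hists[i]))
    return order[:k]

def local_nn_distance(test_feats, refs, row, col, radius):
    """Stage 2: nearest-neighbour distance for patch (row, col), searched only
    in a window around the same location on each selected reference image."""
    query = test_feats[row][col]
    best = math.inf
    for ref in refs:
        for r in range(max(0, row - radius), min(len(ref), row + radius + 1)):
            for c in range(max(0, col - radius), min(len(ref[0]), col + radius + 1)):
                best = min(best, math.dist(query, ref[r][c]))
    return best

def anomaly_map(test_feats, test_hist, train_feats, train_hists,
                foreground_prob, k=2, radius=1):
    """Stage 3: per-patch score = local NN distance * non-background probability
    (the product is an assumption; the abstract only says "based on" both)."""
    refs = [train_feats[i] for i in top_k_references(test_hist, train_hists, k)]
    return [[local_nn_distance(test_feats, refs, r, c, radius) * foreground_prob[r][c]
             for c in range(len(test_feats[0]))]
            for r in range(len(test_feats))]

# Toy data: a 2x2 grid of 2-D patch features; one test patch deviates.
normal = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
test = [[[0.0, 0.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
foreground = [[1.0, 0.5], [1.0, 1.0]]
scores = anomaly_map(test, [1.0, 0.0], [normal, normal],
                     [[1.0, 0.0], [0.9, 0.1]], foreground)
```

In this toy run only the deviating patch at position (0, 1) receives a non-zero score. Restricting the search to a small window on a few pre-selected references is what keeps the per-patch cost low in this sketch.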
On-device Scalable Image-based Localization via Prioritized Cascade Search and Fast One-Many RANSAC.
We present the design of an entire on-device system for large-scale urban
localization using images. The proposed design integrates compact image
retrieval and 2D-3D correspondence search to estimate the location in extensive
city regions. Our design is GPS-agnostic and does not require a network
connection. To overcome the resource constraints of mobile devices, we
propose a system design that leverages the scalability advantage of image
retrieval and the accuracy of 3D model-based localization. Furthermore, we propose
a new hashing-based cascade search for fast computation of 2D-3D
correspondences. In addition, we propose a new one-many RANSAC for accurate
pose estimation. The new one-many RANSAC addresses the challenge of repetitive
building structures (e.g. windows, balconies) in urban localization. Extensive
experiments demonstrate that our 2D-3D correspondence search achieves
state-of-the-art localization accuracy on multiple benchmark datasets.
Furthermore, our experiments on a large Google Street View (GSV) image dataset
show the potential of large-scale localization entirely on a typical mobile
device.
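
The idea of keeping several candidate matches per feature and letting geometric consensus resolve the ambiguity can be shown with a toy analogue. This sketch is not the paper's one-many RANSAC: the abstract describes camera pose estimation from 2D-3D correspondences, whereas the model here is assumed to be a simple 2D translation so the example stays self-contained. The function name, parameters and data are hypothetical.

```python
import math
import random

def one_many_ransac(queries, candidates, tol=0.5, iters=200, seed=0):
    """Toy one-to-many RANSAC: each query point keeps several candidate matches.
    A hypothesis is built from one (query, candidate) pair, and a query counts
    as an inlier if ANY of its candidates agrees with that hypothesis."""
    rng = random.Random(seed)
    best_t, best_matches = None, []
    for _ in range(iters):
        i = rng.randrange(len(queries))
        cand = rng.choice(candidates[i])
        # Minimal sample: a single pair fixes a 2D translation.
        t = (cand[0] - queries[i][0], cand[1] - queries[i][1])
        matches = []
        for j, q in enumerate(queries):
            predicted = (q[0] + t[0], q[1] + t[1])
            for c in candidates[j]:
                if math.dist(predicted, c) <= tol:
                    matches.append((j, c))  # keep the first agreeing candidate
                    break
        if len(matches) > len(best_matches):
            best_t, best_matches = t, matches
    return best_t, best_matches

# Repeated structure: three queries also match a copy shifted by 3 units in x.
queries = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
candidates = [
    [(10.0, 5.0), (13.0, 5.0)],
    [(11.0, 5.0), (14.0, 5.0)],
    [(10.0, 6.0), (13.0, 6.0)],
    [(12.0, 7.0), (30.0, 30.0)],
]
translation, matches = one_many_ransac(queries, candidates)
```

Here the shifted copy explains only three of the four queries, so the translation (10, 5), which explains all four, wins the consensus. A one-to-one matcher would have had to commit to a single candidate per query before any geometric check.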
- …