Efficient Anomaly Detection with Budget Annotation Using Semi-Supervised Residual Transformer
Anomaly Detection is challenging as usually only the normal samples are seen
during training and the detector needs to discover anomalies on-the-fly. The
recently proposed deep-learning-based approaches alleviate the problem to
some extent, but there is still a long way to go in obtaining an industrial-class
anomaly detector for real-world applications. On the other hand, in some
particular AD tasks, a few anomalous samples are labeled manually for achieving
higher accuracy. However, this performance gain is at the cost of considerable
annotation efforts, which can be intractable in many practical scenarios.
In this work, the above two problems are addressed in a unified framework.
Firstly, inspired by the success of the patch-matching-based AD algorithms, we
train a sliding vision transformer over the residuals generated by a novel
position-constrained patch-matching. Secondly, the conventional pixel-wise
segmentation problem is cast into a block-wise classification problem. Thus the
sliding transformer can attain even higher accuracy with much less annotation
labor. Thirdly, to further reduce the labeling cost, we propose to label the
anomalous regions using only bounding boxes. The unlabeled regions caused by
the weak labels are effectively exploited using a highly-customized
semi-supervised learning scheme equipped with two novel data augmentation
methods. The proposed method outperforms all the state-of-the-art approaches
on all the evaluation metrics in both the unsupervised and supervised
scenarios. On the popular MVTec-AD dataset, our SemiREST algorithm obtains the
Average Precision (AP) of 81.2% in the unsupervised condition and 84.4% AP for
supervised anomaly detection. Surprisingly, with the bounding-box-based
semi-supervision, SemiREST still outperforms the SOTA methods with full
supervision (83.8% AP) on MVTec-AD.
Comment: 20 pages, 6 figures
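The recasting of pixel-wise segmentation as block-wise classification, and the unlabeled regions that bounding-box labels leave behind, can be illustrated with a minimal sketch. It is not the SemiREST implementation: the function names, the block size, the ratio thresholds, and the rule that every block touching a box is treated as unlabeled are all assumptions made for illustration.

```python
import numpy as np

def mask_to_block_labels(mask, block=8, pos_ratio=0.5, neg_ratio=0.0):
    """Turn a pixel-wise anomaly mask into block-wise class labels.

    Assumed convention: 1 = anomalous, 0 = normal, -1 = ignored.
    The thresholds are illustrative, not taken from the paper.
    """
    h, w = mask.shape
    hb, wb = h // block, w // block
    # Crop to a multiple of the block size, then group pixels by block.
    cropped = mask[:hb * block, :wb * block].astype(float)
    ratio = cropped.reshape(hb, block, wb, block).mean(axis=(1, 3))
    labels = np.full((hb, wb), -1, dtype=int)
    labels[ratio >= pos_ratio] = 1
    labels[ratio <= neg_ratio] = 0
    return labels

def boxes_to_block_labels(shape, boxes, block=8):
    """Weak labels from bounding boxes given as (x0, y0, x1, y1).

    Assumption: blocks outside every box are normal (0); blocks touching
    a box are left unlabeled (-1) for a semi-supervised scheme to use.
    """
    mask = np.zeros(shape, dtype=bool)
    for x0, y0, x1, y1 in boxes:
        mask[y0:y1, x0:x1] = True
    hb, wb = shape[0] // block, shape[1] // block
    cropped = mask[:hb * block, :wb * block]
    touched = cropped.reshape(hb, block, wb, block).any(axis=(1, 3))
    return np.where(touched, -1, 0)
```

Under these assumptions, a block-wise classifier needs one label per block rather than one per pixel, which is where the reduction in annotation effort comes from.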
The metaphoric nature of the ordinal position effect
Serial orders are thought to be spatially represented in working memory: The beginning items in the memorised sequence are associated with the left side of space and the ending items are associated with the right side of space. However, the origin of this ordinal position effect has remained unclear. It was suggested that the direction of serial order–space interaction is related to the reading/writing experience. An alternative hypothesis is that it originates from the “more is right”/“more is up” spatial metaphors we use in daily life. We can adjudicate between the two viewpoints in Chinese readers; they read left-to-right but also have a culturally ancient top-to-bottom reading/writing direction. Thus, the reading/writing viewpoint predicts no or a top-to-bottom effect in serial order–space interaction; whereas the spatial metaphor theory predicts a clear bottom-to-top effect. We designed four experiments to investigate this issue. First, we found a left-to-right ordinal position effect, replicating results obtained in Western populations. However, the vertical ordinal position effect was in the bottom-to-top direction; moreover, it was modulated by hand position (e.g., left hand bottom or up). We suggest that order–space interactions may originate from different sources and are driven by metaphoric comprehension, which itself may ground cognitive processing.
Target before Shooting: Accurate Anomaly Detection and Localization under One Millisecond via Cascade Patch Retrieval
In this work, by re-examining the "matching" nature of Anomaly Detection
(AD), we propose a new AD framework that simultaneously achieves record AD
accuracy and very high running speed. In this framework, the anomaly
detection problem is solved via a cascade patch retrieval procedure that
retrieves the nearest neighbors for each test image patch in a coarse-to-fine
fashion. Given a test sample, the top-K most similar training images are first
selected based on a robust histogram matching process. Secondly, the nearest
neighbor of each test patch is retrieved over the similar geometrical locations
on those "global nearest neighbors", by using a carefully trained local metric.
Finally, the anomaly score of each test image patch is calculated based on the
distance to its "local nearest neighbor" and the "non-background" probability.
The proposed method is termed "Cascade Patch Retrieval" (CPR) in this work.
Different from the conventional patch-matching-based AD algorithms, CPR selects
proper "targets" (reference images and locations) before "shooting"
(patch-matching). On the well-acknowledged MVTec AD, BTAD and MVTec-3D AD
datasets, the proposed algorithm consistently outperforms all the compared
SOTA methods by remarkable margins, measured by various AD metrics.
Furthermore, CPR is extremely efficient. It runs at the speed of 113 FPS with
the standard setting, while its simplified version requires less than 1 ms
to process an image at the cost of a trivial accuracy drop. The code of CPR is
available at https://github.com/flyinghu123/CPR.
Comment: 13 pages, 8 figures
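The coarse-to-fine retrieval described in this abstract can be sketched in a few lines of NumPy. This is a simplified illustration, not the released CPR code: the function name and array layouts are hypothetical, an L1 histogram distance stands in for the robust histogram matching, plain Euclidean distance stands in for the trained local metric, and the "non-background" probability term is omitted.

```python
import numpy as np

def cascade_patch_retrieval(test_feat, train_feats, test_hist, train_hists,
                            k=2, radius=1):
    """Coarse-to-fine nearest-neighbor anomaly scoring (illustrative only).

    test_feat:   (H, W, C) patch features of the test image
    train_feats: (N, H, W, C) patch features of N normal training images
    test_hist:   (B,) global histogram of the test image
    train_hists: (N, B) global histograms of the training images
    """
    # Stage 1: pick the k globally most similar training images.
    d_global = np.abs(train_hists - test_hist).sum(axis=1)
    top_k = np.argsort(d_global)[:k]

    H, W, C = test_feat.shape
    scores = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            # Stage 2: search only near the same location in those images.
            i0, i1 = max(0, i - radius), min(H, i + radius + 1)
            j0, j1 = max(0, j - radius), min(W, j + radius + 1)
            cand = train_feats[top_k, i0:i1, j0:j1].reshape(-1, C)
            # Stage 3: score = distance to the local nearest neighbor.
            scores[i, j] = np.linalg.norm(cand - test_feat[i, j], axis=1).min()
    return scores, top_k
```

Restricting the search to a small window in a few reference images is what keeps the candidate set small in this sketch; the explicit Python loops are for readability and would need vectorizing for speed.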
Estimator: An Effective and Scalable Framework for Transportation Mode Classification over Trajectories
Transportation mode classification, the process of predicting the class
labels of moving objects' transportation modes, has been widely applied to a
variety of real world applications, such as traffic management, urban
computing, and behavior study. However, existing studies of transportation mode
classification typically extract the explicit features of trajectory data but
fail to capture the implicit features that affect the classification
performance. In addition, most of the existing studies also prefer to apply
RNN-based models to embed trajectories, which is only suitable for classifying
small-scale data. To tackle the above challenges, we propose an effective and
scalable framework for transportation mode classification over GPS
trajectories, abbreviated as Estimator. Estimator is built on a custom
CNN-TCN architecture, which is capable of leveraging the spatial and temporal
hidden features of trajectories to achieve high effectiveness and efficiency.
Estimator partitions the entire traffic space into disjoint spatial regions
according to traffic conditions, which enhances the scalability significantly
and thus enables parallel transportation classification. Extensive experiments
using eight public real-life datasets offer evidence that Estimator i) achieves
superior model effectiveness (i.e., 99% Accuracy and 0.98 F1-score), which
substantially outperforms state-of-the-art methods; ii) exhibits prominent model
efficiency, obtaining 7-40x speedups over state-of-the-art learning-based
methods; and iii) shows high model scalability and robustness that enables
large-scale classification analytics.
Comment: 12 pages, 8 figures
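The idea of splitting the traffic space into disjoint regions so that each can be classified in parallel can be sketched as follows. The abstract states that Estimator partitions according to traffic conditions; the uniform latitude/longitude grid used here is a simpler stand-in, and the function name, the cell size, and the rule of assigning a trajectory by its first point are assumptions for illustration.

```python
import math
from collections import defaultdict

def partition_by_grid(trajectories, cell_deg=1.0):
    """Group trajectory ids into disjoint grid cells.

    trajectories: dict mapping an id to a list of (lat, lon) points.
    Each trajectory is assigned to the cell of its first point, so
    every trajectory lands in exactly one region (an assumption).
    """
    regions = defaultdict(list)
    for traj_id, points in trajectories.items():
        lat, lon = points[0]
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        regions[cell].append(traj_id)
    return dict(regions)
```

Each returned region could then be handed to a separate worker, since no trajectory id appears in more than one region.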