From Global to Local: Multi-scale Out-of-distribution Detection
Out-of-distribution (OOD) detection aims to detect "unknown" data whose
labels have not been seen during the in-distribution (ID) training process.
Recent progress in representation learning gives rise to distance-based OOD
detection that recognizes inputs as ID/OOD according to their relative
distances to the training data of ID classes. Previous approaches calculate
pairwise distances relying only on global image representations, which can be
sub-optimal as the inevitable background clutter and intra-class variation may
drive image-level representations from the same ID class far apart in a given
representation space. In this work, we overcome this challenge by proposing
Multi-scale OOD DEtection (MODE), the first framework leveraging both global
visual information and local region details of images to maximally benefit OOD
detection. Specifically, we first find that existing models pretrained by
off-the-shelf cross-entropy or contrastive losses fail to capture
valuable local representations for MODE, due to the scale discrepancy between
the ID training and OOD detection processes. To mitigate this issue and
encourage locally discriminative representations in ID training, we propose
Attention-based Local PropAgation (ALPA), a trainable objective that exploits a
cross-attention mechanism to align and highlight the local regions of the
target objects for pairwise examples. During test-time OOD detection, a
Cross-Scale Decision (CSD) function is further devised on the most
discriminative multi-scale representations to distinguish ID/OOD data more
faithfully. We demonstrate the effectiveness and flexibility of MODE on several
benchmarks -- on average, MODE outperforms the previous state-of-the-art by up
to 19.24% in FPR and 2.77% in AUROC. Code is available at
https://github.com/JimZAI/MODE-OOD.
Comment: 13 pages
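To make the distance-based, multi-scale scoring concrete, the following is a
minimal Python sketch of one plausible reading of the abstract: score an input
by its k-NN distances to ID feature banks at both the global and the local
scale, and let the smaller (more confident) distance decide. All names, the
k-NN rule, and the min-based decision are illustrative assumptions, not the
authors' implementation; see the linked repository for the real code.

import numpy as np

def knn_distance(query, bank, k=5):
    # Mean Euclidean distance from a query feature to its k nearest ID features.
    dists = np.linalg.norm(bank - query, axis=-1)
    return float(np.sort(dists)[:k].mean())

def cross_scale_score(global_feat, local_feats, global_bank, local_bank, k=5):
    # Illustrative cross-scale decision: keep the more discriminative
    # (smaller) of the global distance and the best local-region distance.
    g = knn_distance(global_feat, global_bank, k)
    l = min(knn_distance(patch, local_bank, k) for patch in local_feats)
    return min(g, l)  # lower score = more ID-like

# Toy usage with random stand-ins for features that would come from a
# backbone trained with the ALPA objective:
rng = np.random.default_rng(0)
global_bank = rng.normal(size=(1000, 128))   # ID global features
local_bank = rng.normal(size=(8000, 128))    # ID local-region features
score = cross_scale_score(rng.normal(size=128),
                          rng.normal(size=(49, 128)),  # e.g. 7x7 feature map
                          global_bank, local_bank)
print("OOD" if score > 10.0 else "ID")       # threshold tuned on held-out ID data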
BoWFire: Detection of Fire in Still Images by Integrating Pixel Color and Texture Analysis
Emergency events involving fire are potentially harmful, demanding fast and
precise decision making. The use of crowdsourced images and videos in crisis
management systems can aid in these situations by providing more information
than verbal/textual descriptions. Due to the usual high volume of data,
automatic solutions need to discard non-relevant content without losing
relevant information. There are several methods for fire detection in video
using color-based models. However, they are not adequate for still-image
processing because they can produce high false-positive rates. These
methods also rely on parameters with little physical meaning, which makes
fine-tuning difficult. In this context, we propose a novel fire
detection method for still images that uses classification based on color
features combined with texture classification on superpixel regions. Our method
uses fewer parameters than previous works, easing fine-tuning. Results show the
effectiveness of our method in reducing false positives while keeping precision
comparable to state-of-the-art methods.
Comment: 8 pages, Proceedings of the 28th SIBGRAPI Conference on Graphics,
Patterns and Images, IEEE Press
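The following is a minimal Python sketch of the color-plus-texture idea:
segment the image into superpixels, classify each region's mean color and LBP
texture histogram, and flag fire only where both cues agree. The SLIC
segmentation, LBP features, and sklearn-style classifiers are illustrative
stand-ins for the paper's trained color and texture models, not the authors'
exact pipeline.

import numpy as np
from skimage.segmentation import slic
from skimage.feature import local_binary_pattern
from skimage.color import rgb2gray

def detect_fire(image, color_clf, texture_clf, n_segments=200):
    # Return a boolean mask of pixels whose superpixel is flagged as fire
    # by BOTH the color classifier and the texture classifier.
    segments = slic(image, n_segments=n_segments, start_label=0)
    lbp = local_binary_pattern(rgb2gray(image), P=8, R=1, method="uniform")
    mask = np.zeros(segments.shape, dtype=bool)
    for label in np.unique(segments):
        region = segments == label
        color_feat = image[region].mean(axis=0)           # mean RGB of region
        tex_feat, _ = np.histogram(lbp[region], bins=10,  # uniform-LBP histogram
                                   range=(0, 10), density=True)
        if color_clf.predict([color_feat])[0] and texture_clf.predict([tex_feat])[0]:
            mask |= region
    return mask

# color_clf and texture_clf can be any sklearn-style classifiers (e.g.
# KNeighborsClassifier) fitted on labelled fire/non-fire color and texture
# features. Voting on coherent superpixel regions, rather than isolated
# pixels, is what suppresses the false positives of purely color-based models.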