59 research outputs found
Contextual Affinity Distillation for Image Anomaly Detection
Previous works on unsupervised industrial anomaly detection mainly focus on
local structural anomalies such as cracks and color contamination. While they
achieve high detection performance on this kind of anomaly, they struggle with
logical anomalies that violate long-range dependencies, such as a normal object
placed in the wrong position. In this paper, building on
previous knowledge distillation works, we propose to use two students (local
and global) to better mimic the teacher's behavior. The local student, which is
used in previous studies, mainly focuses on structural anomaly detection, while
the global student pays attention to logical anomalies. To further encourage
the global student to capture long-range dependencies, we design the
global context condensing block (GCCB) and propose a contextual affinity loss
for student training and anomaly scoring. Experimental results show that the
proposed method does not need cumbersome training techniques and achieves new
state-of-the-art performance on the MVTec LOCO AD dataset.
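To make the idea of a contextual affinity loss more concrete, here is a minimal PyTorch-style sketch of one way such a loss could be written: pairwise cosine affinities over spatial locations are computed for the teacher and student feature maps and then matched, pushing the student to reproduce long-range relations rather than only per-location activations. The function name, tensor shapes, and the MSE penalty are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contextual_affinity_loss(teacher_feat, student_feat):
    # Hypothetical sketch: match pairwise (contextual) affinities between
    # teacher and student feature maps so the student mimics long-range
    # relations rather than only per-location activations.
    # teacher_feat, student_feat: [B, C, H, W] tensors.
    B, C, H, W = teacher_feat.shape
    t = teacher_feat.flatten(2).transpose(1, 2)   # [B, H*W, C]
    s = student_feat.flatten(2).transpose(1, 2)   # [B, H*W, C]
    t = F.normalize(t, dim=-1)                    # cosine-normalize features
    s = F.normalize(s, dim=-1)
    t_aff = torch.bmm(t, t.transpose(1, 2))       # teacher affinities [B, H*W, H*W]
    s_aff = torch.bmm(s, s.transpose(1, 2))       # student affinities
    return F.mse_loss(s_aff, t_aff)               # penalize affinity mismatch
```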
RefVSR++: Exploiting Reference Inputs for Reference-based Video Super-resolution
Smartphones equipped with a multi-camera system comprising multiple cameras
with different fields of view (FoVs) are becoming more prevalent. These camera
configurations are compatible with reference-based super-resolution (SR) and video SR, which can
be executed simultaneously while recording video on the device. Thus, combining
these two SR methods can improve image quality. Recently, Lee et al.
presented such a method, RefVSR. In this paper, we consider how to optimally
utilize the observations obtained, including input low-resolution (LR) video
and reference (Ref) video. RefVSR extends conventional video SR quite simply,
aggregating the LR and Ref inputs over time in a single bidirectional stream.
However, considering the content difference between LR and Ref images due to
their FoVs, we can derive the maximum information from the two image sequences
by aggregating them independently in the temporal direction. We therefore propose
an improved method, RefVSR++, which aggregates two feature streams in parallel in
the temporal direction: one for the fused LR and Ref inputs and the other for the
Ref inputs alone. Furthermore, we equip RefVSR++ with enhanced
mechanisms to align image features over time, which is the key to the success
of video SR. We experimentally show that RefVSR++ outperforms RefVSR by over
1 dB in PSNR, achieving a new state of the art.
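The core design, two feature streams propagated in parallel over time, can be sketched roughly as follows. This is a hypothetical simplification: the class name is invented, plain convolutions stand in for RefVSR++'s alignment and fusion modules, and the actual architecture differs in detail.

```python
import torch
import torch.nn as nn

class ParallelTemporalAggregation(nn.Module):
    # Hypothetical sketch of two parallel temporal streams: one hidden state
    # carries the fused LR+Ref feature, the other carries the Ref feature
    # alone. Plain convolutions stand in for alignment/fusion modules.
    def __init__(self, channels=64):
        super().__init__()
        self.fuse = nn.Conv2d(channels * 2, channels, 3, padding=1)
        self.update_main = nn.Conv2d(channels * 2, channels, 3, padding=1)
        self.update_ref = nn.Conv2d(channels * 2, channels, 3, padding=1)

    def forward(self, lr_feats, ref_feats):
        # lr_feats, ref_feats: lists of per-frame features, each [B, C, H, W]
        B, C, H, W = lr_feats[0].shape
        h_main = lr_feats[0].new_zeros(B, C, H, W)  # fused LR+Ref stream
        h_ref = lr_feats[0].new_zeros(B, C, H, W)   # Ref-only stream
        outputs = []
        for lr, ref in zip(lr_feats, ref_feats):
            fused = self.fuse(torch.cat([lr, ref], dim=1))
            h_main = self.update_main(torch.cat([fused, h_main], dim=1))
            h_ref = self.update_ref(torch.cat([ref, h_ref], dim=1))
            outputs.append(h_main + h_ref)          # combine both streams
        return outputs
```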
That's BAD: Blind Anomaly Detection by Implicit Local Feature Clustering
Recent studies on visual anomaly detection (AD) of industrial
objects/textures have achieved quite good performance. They consider an
unsupervised setting, specifically the one-class setting, in which we assume
the availability of a set of normal (i.e., anomaly-free) images for
training. In this paper, we consider a more challenging scenario of
unsupervised AD, in which we detect anomalies in a given set of images that
might contain both normal and anomalous samples. The setting does not assume
the availability of known normal data and thus is completely free from human
annotation, which differs from the standard AD considered in recent studies.
For clarity, we call the setting blind anomaly detection (BAD). We show that
BAD can be converted into a local outlier detection problem and propose a novel
method named PatchCluster that can accurately detect image- and pixel-level
anomalies. Experimental results show that PatchCluster achieves promising
performance without any knowledge of normal data, comparable even to SOTA
methods applied in the one-class setting, which require it.
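As a rough illustration of casting blind anomaly detection as local outlier detection, the snippet below scores every patch by its mean distance to its k nearest neighbors in the pooled, unlabeled patch-feature set; patches with few close neighbors receive high anomaly scores. This is a generic kNN-distance scorer for intuition only, not the PatchCluster algorithm itself, and the function name and parameters are made up for the example.

```python
import numpy as np

def patch_outlier_scores(patch_feats, k=5):
    # Hypothetical sketch: score each patch by its mean distance to its k
    # nearest neighbors among all patches of the unlabeled image set.
    # Patches with few close neighbors get high anomaly scores.
    # patch_feats: [N, D] float array of patch descriptors; returns [N] scores.
    d2 = ((patch_feats[:, None, :] - patch_feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)        # ignore self-matches
    knn = np.sort(d2, axis=1)[:, :k]    # k smallest squared distances
    return np.sqrt(knn).mean(axis=1)    # mean kNN distance as anomaly score
```

Pixel- and image-level scores could then be obtained by mapping patch scores back to their locations and aggregating (e.g., taking the maximum) within each image.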
Reference-based Motion Blur Removal: Learning to Utilize Sharpness in the Reference Image
Despite recent advances in the study of removing motion blur from an
image, it is still hard to deal with strong blur. While there are limits to
removing blur from a single image, using multiple images has more potential,
e.g., using an additional image as a reference to deblur a blurry
image. A typical setting is deblurring an image using nearby sharp image(s) in
a video sequence, as in studies of video deblurring. This paper proposes a
better method to use the information present in a reference image. The method
does not need strong assumptions about the reference image: we can use another
shot of the same scene, as in video deblurring, or even an image of a
different scene. Our method first matches
local patches of the target and reference images and then fuses their features
to estimate a sharp image. We employ a patch-based feature matching strategy to
solve the difficult problem of matching the blurry image with the sharp
reference. Our method can be integrated into pre-existing networks designed for
single-image deblurring. Experimental results show the effectiveness of the
proposed method.
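To illustrate the patch-based matching step, the sketch below finds, for each location of the blurry target's feature map, the most similar patch in the sharp reference's feature map via cosine similarity over unfolded patches, and concatenates the matched reference feature with the target feature. The function name and the simple concatenation-based fusion are assumptions; the paper's learned fusion module is not reproduced here.

```python
import torch
import torch.nn.functional as F

def match_and_fuse(target_feat, ref_feat, patch_size=3):
    # Hypothetical sketch: nearest-neighbor patch matching from the blurry
    # target features to the sharp reference features, then fusion by
    # concatenation. target_feat, ref_feat: [B, C, H, W]; returns [B, 2C, H, W].
    B, C, H, W = target_feat.shape
    # Unfold into per-location patch descriptors: [B, C*p*p, H*W]
    ref_patches = F.unfold(ref_feat, patch_size, padding=patch_size // 2)
    tgt_patches = F.unfold(target_feat, patch_size, padding=patch_size // 2)
    ref_patches = F.normalize(ref_patches, dim=1)
    tgt_patches = F.normalize(tgt_patches, dim=1)
    # Cosine similarity between every target and reference location
    corr = torch.bmm(tgt_patches.transpose(1, 2), ref_patches)  # [B, H*W, H*W]
    idx = corr.argmax(dim=2)                                    # best match per location
    # Gather matched reference features and reshape back to [B, C, H, W]
    ref_flat = ref_feat.flatten(2)                              # [B, C, H*W]
    matched = torch.gather(ref_flat, 2, idx.unsqueeze(1).expand(-1, C, -1))
    matched = matched.view(B, C, H, W)
    return torch.cat([target_feat, matched], dim=1)
```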