Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection
Object detectors usually achieve promising results with the supervision of
complete instance annotations. However, their performance is far from
satisfactory with sparse instance annotations. Most existing methods for
sparsely annotated object detection either re-weight the loss of hard negative
samples or convert the unlabeled instances into ignored regions to reduce the
interference of false negatives. We argue that these strategies are
insufficient since they can at most alleviate the negative effect caused by
missing annotations. In this paper, we propose a simple but effective
mechanism, called Co-mining, for sparsely annotated object detection. In our
Co-mining, two branches of a Siamese network predict the pseudo-label sets for
each other. To enhance multi-view learning and better mine unlabeled instances,
the original image and corresponding augmented image are used as the inputs of
two branches of the Siamese network, respectively. Co-mining can serve as a
general training mechanism applied to most modern object detectors.
Experiments are performed on MS COCO dataset with three different sparsely
annotated settings using two typical frameworks: anchor-based detector
RetinaNet and anchor-free detector FCOS. Experimental results show that our
Co-mining with RetinaNet achieves 1.4%~2.1% improvements compared with
different baselines and surpasses existing methods under the same sparsely
annotated setting. Code is available at
https://github.com/megvii-research/Co-mining.
Comment: Accepted to AAAI 2021.
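The pseudo-label exchange at the core of Co-mining, where each Siamese branch mines confident predictions as extra training targets for the other branch, can be sketched as below. This is a toy illustration under assumed names: the confidence threshold, the (box, score) representation, and the duplicate-dropping merge rule are simplifications, not the paper's exact implementation.

```python
# Toy sketch of the Co-mining label exchange (illustrative, not the paper's code).
# Each "branch" turns its view of the image into a list of (box, score)
# predictions; confident predictions from one branch supplement the sparse
# ground-truth annotations used to train the other branch.

SCORE_THRESHOLD = 0.8  # assumed confidence cutoff for mining pseudo-labels

def mine_pseudo_labels(predictions, threshold=SCORE_THRESHOLD):
    """Keep only confident predictions as pseudo-label boxes."""
    return [box for box, score in predictions if score >= threshold]

def merged_targets(sparse_annotations, other_branch_predictions):
    """Training targets = given sparse labels + labels mined by the OTHER branch."""
    mined = mine_pseudo_labels(other_branch_predictions)
    # Drop mined boxes that duplicate an existing annotation (real code would
    # use IoU matching rather than exact equality).
    return sparse_annotations + [b for b in mined if b not in sparse_annotations]

# Branch 1 sees the original image, branch 2 an augmented view (simulated here).
preds_branch1 = [((10, 10, 50, 50), 0.95), ((80, 80, 120, 120), 0.40)]
preds_branch2 = [((10, 10, 50, 50), 0.90), ((200, 30, 240, 70), 0.85)]
sparse_gt = [(10, 10, 50, 50)]

targets_for_branch1 = merged_targets(sparse_gt, preds_branch2)
targets_for_branch2 = merged_targets(sparse_gt, preds_branch1)
print(targets_for_branch1)  # sparse GT plus the box mined by branch 2
print(targets_for_branch2)  # unchanged: branch 1 mined nothing new
```

Note how the unlabeled instance at (200, 30, 240, 70) is recovered for branch 1 only because branch 2 predicted it confidently, which is the cross-branch mining effect the abstract describes.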
Projection Regret: Reducing Background Bias for Novelty Detection via Diffusion Models
Novelty detection is a fundamental task of machine learning which aims to
detect abnormal (i.e., out-of-distribution (OOD)) samples. Since
diffusion models have recently emerged as the de facto standard generative
framework with surprising generation results, novelty detection via diffusion
models has also gained much attention. Recent methods have mainly utilized the
reconstruction property of in-distribution samples. However, they often suffer
from detecting OOD samples that share similar background information to the
in-distribution data. Based on our observation that diffusion models can
\emph{project} any sample to an in-distribution sample with similar background
information, we propose \emph{Projection Regret (PR)}, an efficient novelty
detection method that mitigates the bias of non-semantic information. To be
specific, PR computes the perceptual distance between the test image and its
diffusion-based projection to detect abnormality. Since the perceptual distance
often fails to capture semantic changes when the background information is
dominant, we cancel out the background bias by comparing it against recursive
projections. Extensive experiments demonstrate that PR outperforms the prior
art of generative-model-based novelty detection methods by a significant
margin.
Comment: NeurIPS 202
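The debiasing idea behind Projection Regret — score a sample by its perceptual distance to its projection, then cancel the background contribution using the distance between the projection and its own recursive projection — can be sketched with simple stand-ins. Here the diffusion-based projection is replaced by a toy pull toward an in-distribution prototype, and the perceptual metric (LPIPS in the paper) by Euclidean distance; both are assumptions for illustration only.

```python
import numpy as np

# Toy sketch of the Projection Regret score (illustrative; the real method uses
# a diffusion model's noise-and-denoise projection and a perceptual metric such
# as LPIPS, both replaced here by simple stand-ins).

PROTOTYPE = np.zeros(4)  # stand-in for the in-distribution data manifold

def project(x, strength=0.5):
    """Toy projection: pull x toward the in-distribution prototype."""
    return (1 - strength) * x + strength * PROTOTYPE

def perceptual_distance(a, b):
    """Stand-in for a perceptual metric (LPIPS in the paper)."""
    return float(np.linalg.norm(a - b))

def projection_regret(x):
    """Distance to the projection, debiased by the distance between the
    projection and its own (recursive) projection."""
    px = project(x)
    ppx = project(px)
    return perceptual_distance(x, px) - perceptual_distance(px, ppx)

in_dist = np.array([0.1, 0.0, -0.1, 0.0])  # close to the prototype
ood = np.array([3.0, -2.0, 4.0, 1.0])      # far from the prototype
print(projection_regret(in_dist), projection_regret(ood))
# OOD samples move much farther under projection, so their regret is larger.
```

The subtraction is the key design choice: background-dominated distances affect both terms similarly, so they largely cancel, while genuinely semantic deviation survives in the score.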
Watermarking for Out-of-distribution Detection
Out-of-distribution (OOD) detection aims to identify OOD data based on
representations extracted from well-trained deep models. However, existing
methods largely ignore the reprogramming property of deep models and thus may
not fully unleash their intrinsic strength: without modifying parameters of a
well-trained deep model, we can reprogram this model for a new purpose via
data-level manipulation (e.g., adding a specific feature perturbation to the
data). This property motivates us to reprogram a classification model to excel
at OOD detection (a new task), and thus we propose a general methodology named
watermarking in this paper. Specifically, we learn a unified pattern that is
superimposed onto features of original data, and the model's detection
capability is largely boosted after watermarking. Extensive experiments verify
the effectiveness of watermarking, demonstrating the significance of the
reprogramming property of deep models in OOD detection.
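The reprogramming step described above — superimposing a unified pattern on inputs so that a frozen classifier becomes a better OOD scorer — can be sketched as follows. Note the assumptions: the paper learns the watermark, whereas this toy uses a fixed pattern, a random linear stand-in for the well-trained model, and max-softmax confidence as the detection score.

```python
import numpy as np

# Toy sketch of watermarking for OOD scoring (illustrative; the paper LEARNS
# the watermark pattern, whereas here it is a fixed assumed perturbation).

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))  # stand-in for a frozen, well-trained classifier

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

WATERMARK = np.array([0.5, -0.5, 0.5, -0.5])  # assumed learned pattern

def ood_score(x, watermark=WATERMARK):
    """Higher max-softmax confidence => more likely in-distribution.
    The watermark is added to the input before the frozen model scores it;
    the model's parameters are never modified."""
    logits = W @ (x + watermark)
    return float(softmax(logits).max())

x = np.array([1.0, 0.2, -0.3, 0.8])
print(ood_score(x))                         # score with the watermark applied
print(ood_score(x, watermark=np.zeros(4)))  # score of the unmodified input
```

The point of the sketch is purely the data-level mechanism: only the input is perturbed, so the same frozen model serves both its original classification task and the new detection task.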
Learning from alternative sources of supervision
With the rise of the internet, data of many varieties, including images, audio,
text, and video, is abundant. Unfortunately, for the specific task one has in
mind, data is typically not abundant: one might have only a small amount of
labelled data, or only noisy labels, or labels for a different task, or perhaps
a simulator and a reward function but no demonstrations, or even a simulator
but no reward function at all. However, arguably no task is truly novel, and so
it is often possible for neural networks to benefit from abundant data related
to the task at hand. This thesis documents three methods for learning from
alternative sources of supervision, as opposed to the preferable case of simply
having unlimited direct examples of the task. First, we show how data from many
related tasks can be described with a simple graphical model and fit using a
Variational Autoencoder, directly modelling and representing the relations
amongst tasks. Second, we investigate various forms of prediction-based
intrinsic reward for agents in a simulator with no extrinsic rewards. Third, we
introduce a novel intrinsic reward and investigate how best to combine it with
an extrinsic reward for the best performance.
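The prediction-based intrinsic rewards mentioned in the second contribution can be sketched as below: the agent is rewarded by its own forward-model prediction error, so no extrinsic reward is needed. Everything here is an illustrative assumption — the forward model is a fixed linear map rather than a trained network, and the thesis investigates several variants of this idea, not this exact formula.

```python
import numpy as np

# Toy sketch of a prediction-based intrinsic reward (illustrative): transitions
# the agent's forward model fails to predict are "novel" and earn reward.

A = np.array([[1.0, 0.1], [0.0, 1.0]])  # assumed forward-model dynamics

def predict_next_state(state, action):
    """Forward model: predicted next state from (state, action)."""
    return A @ state + action

def intrinsic_reward(state, action, next_state):
    """Squared prediction error: large when the transition surprises the model."""
    error = next_state - predict_next_state(state, action)
    return float(error @ error)

s = np.array([1.0, 0.0])
a = np.array([0.0, 0.5])
expected = predict_next_state(s, a)            # transition the model anticipates
surprising = expected + np.array([1.0, -1.0])  # transition it does not

print(intrinsic_reward(s, a, expected))    # 0.0 -- nothing to learn here
print(intrinsic_reward(s, a, surprising))  # 2.0 -- novelty is rewarded
```

In practice the forward model is trained alongside the policy, so rewards shrink as transitions become predictable and the agent is pushed toward unexplored parts of the simulator.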
A Study on Detecting Out-of-distribution Inputs in Deep-Learning-Based Classification
Tohoku University, Takayuki Okatani