3 research outputs found
Spatially Constrained Location Prior for Scene Parsing
Semantic context is an important and useful cue for scene parsing in
complicated natural images with a substantial amount of variations in objects
and the environment. This paper proposes Spatially Constrained Location Prior
(SCLP) for effective modelling of global and local semantic context in the
scene in terms of inter-class spatial relationships. Unlike existing studies
focusing on either relative or absolute location prior of objects, the SCLP
effectively incorporates both relative and absolute location priors by
calculating object co-occurrence frequencies in spatially constrained image
blocks. The SCLP is general and can be used in conjunction with various visual
feature-based prediction models, such as Artificial Neural Networks and Support
Vector Machine (SVM), to enforce spatial contextual constraints on class
labels. Using SVM classifiers and a linear regression model, we demonstrate
that the incorporation of SCLP achieves superior performance compared to the
state-of-the-art methods on the Stanford background and SIFT Flow datasets.Comment: authors' pre-print version of a article published in IJCNN 201
Empirical Upper Bound in Object Detection and More
Object detection remains one of the most notorious open problems in
computer vision. Despite large strides in accuracy in recent years, modern
object detectors have started to saturate on popular benchmarks raising the
question of how far we can reach with deep learning tools and tricks. Here, by
employing 2 state-of-the-art object detection benchmarks, and analyzing more
than 15 models over 4 large-scale datasets, we I) carefully determine the
upper bound in AP, which is 91.6% on VOC (test2007), 78.2% on COCO (val2017),
and 58.9% on OpenImages V4 (validation), regardless of the IOU threshold. These
numbers are much better than the mAP of the best model (47.9% on VOC, and 46.9% on
COCO; IOUs=.5:.05:.95), II) characterize the sources of errors in object detectors,
in a novel and intuitive way, and find that classification error (confusion
with other classes and misses) explains the largest fraction of errors and
weighs more than localization and duplicate errors, and III) analyze the
invariance properties of models when surrounding context of an object is
removed, when an object is placed in an incongruent background, and when images
are blurred or flipped vertically. We find that models generate boxes on empty
regions and that context is more important for detecting small objects than
larger ones. Our work taps into the tight relationship between recognition and
detection and offers insights for building better models.
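The upper-bound experiments score detections against ground truth at varying IoU thresholds. As an illustration of the metric only (not the authors' code), a minimal IoU helper for corner-format (x1, y1, x2, y2) boxes might look like:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

A predicted box counts as a true positive when its IoU with a ground-truth box of the same class exceeds the evaluation threshold (e.g. 0.5).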
Empirical Upper Bound, Error Diagnosis and Invariance Analysis of Modern Object Detectors
Object detection remains one of the most notorious open problems in
computer vision. Despite large strides in accuracy in recent years, modern
object detectors have started to saturate on popular benchmarks raising the
question of how far we can reach with deep learning tools and tricks. Here, by
employing 2 state-of-the-art object detection benchmarks, and analyzing more
than 15 models over 4 large-scale datasets, we I) carefully determine the upper
bound in AP, which is 91.6% on VOC (test2007), 78.2% on COCO (val2017), and
58.9% on OpenImages V4 (validation), regardless of the IOU threshold. These
numbers are much better than the mAP of the best model (47.9% on VOC, and 46.9%
on COCO; IOUs=.5:.05:.95), II) characterize the sources of errors in object
detectors, in a novel and intuitive way, and find that classification error
(confusion with other classes and misses) explains the largest fraction of
errors and weighs more than localization and duplicate errors, and III) analyze
the invariance properties of models when surrounding context of an object is
removed, when an object is placed in an incongruent background, and when images
are blurred or flipped vertically. We find that models generate many boxes
on empty regions and that context is more important for detecting small objects
than larger ones. Our work taps into the tight relationship between object
detection and object recognition and offers insights for building better
models. Our code is publicly available at
https://github.com/aliborji/Deetctionupperbound.git
Comment: arXiv admin note: substantial text overlap with arXiv:1911.1245
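The "IOUs=.5:.05:.95" notation used in both abstracts denotes COCO-style evaluation, where AP is averaged over ten IoU thresholds. A small sketch of that averaging is below; `coco_style_map` and its callable argument are illustrative names, not part of the paper's released code, and in practice the per-threshold AP values come from an evaluator such as pycocotools.

```python
import numpy as np

# Ten IoU thresholds from 0.50 to 0.95 in steps of 0.05.
THRESHOLDS = np.linspace(0.5, 0.95, 10)

def coco_style_map(ap_at_threshold):
    """Average per-threshold AP over the ten COCO-style IoU thresholds.

    `ap_at_threshold` is a hypothetical callable mapping an IoU
    threshold to the AP obtained at that threshold.
    """
    return float(np.mean([ap_at_threshold(t) for t in THRESHOLDS]))
```

The "regardless of the IOU threshold" claim for the upper bound means the averaged score stays the same across this sweep, unlike the sharply threshold-dependent scores of real detectors.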