3D Object Class Detection in the Wild
Object class detection has been a synonym for 2D bounding box localization
for the longest time, fueled by the success of powerful statistical learning
techniques, combined with robust image representations. Only recently, there
has been a growing interest in revisiting the promise of computer vision from
the early days: to precisely delineate the contents of a visual scene, object
by object, in 3D. In this paper, we draw from recent advances in object
detection and 2D-3D object lifting in order to design an object class detector
that is particularly tailored towards 3D object class detection. Our 3D object
class detection method consists of several stages gradually enriching the
object detection output with object viewpoint, keypoints and 3D shape
estimates. By careful design, each stage consistently improves performance, and
the full method achieves state-of-the-art results in simultaneous 2D bounding
box and viewpoint estimation on the challenging Pascal3D+ dataset.
Driving Scene Perception Network: Real-time Joint Detection, Depth Estimation and Semantic Segmentation
Demand for high-level autonomous driving has grown in recent years, and visual
perception is one of the capabilities critical to enabling it. In this paper,
we introduce an efficient approach for simultaneous object detection, depth
estimation and pixel-level semantic
segmentation using a shared convolutional architecture. The proposed network
model, which we named Driving Scene Perception Network (DSPNet), uses
multi-level feature maps and multi-task learning to improve the accuracy and
efficiency of object detection, depth estimation and image segmentation tasks
from a single input image. The resulting network model uses less than
850 MiB of GPU memory and achieves 14.0 fps on an NVIDIA GeForce GTX 1080 with a
1024x512 input image, and both precision and efficiency are improved over a
combination of single-task models. Comment: 9 pages, 7 figures, WACV'1
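The hard-parameter-sharing design the abstract describes (one backbone, three task heads) can be sketched in a few lines. This is a toy illustration only: linear layers stand in for the convolutional backbone, and all layer sizes and head outputs are made-up assumptions, not the DSPNet architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class SharedMultiTaskNet:
    """Minimal sketch of a shared-backbone multi-task model.

    One feature extractor is computed once per input, then three
    task-specific heads consume the same features.
    """
    def __init__(self, in_dim=32, feat_dim=16, n_classes=5):
        self.W_shared = rng.normal(size=(in_dim, feat_dim)) * 0.1
        self.W_det = rng.normal(size=(feat_dim, 4)) * 0.1       # box head (x, y, w, h)
        self.W_depth = rng.normal(size=(feat_dim, 1)) * 0.1     # scalar depth head
        self.W_seg = rng.normal(size=(feat_dim, n_classes)) * 0.1  # per-class logits

    def forward(self, x):
        feat = relu(x @ self.W_shared)  # shared features, computed once
        return {
            "detection": feat @ self.W_det,
            "depth": feat @ self.W_depth,
            "segmentation": feat @ self.W_seg,
        }

net = SharedMultiTaskNet()
out = net.forward(rng.normal(size=(8, 32)))  # batch of 8 toy inputs
```

Because the backbone runs once for all three tasks, inference cost grows only with the (cheap) heads, which is the efficiency argument the abstract makes.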
Holistic, Instance-Level Human Parsing
Object parsing -- the task of decomposing an object into its semantic parts
-- has traditionally been formulated as a category-level segmentation problem.
Consequently, when there are multiple objects in an image, current methods
cannot count the number of objects in the scene, nor can they determine which
part belongs to which object. We address this problem by segmenting the parts
of objects at an instance-level, such that each pixel in the image is assigned
a part label, as well as the identity of the object it belongs to. Moreover, we
show how this approach benefits us in obtaining segmentations at coarser
granularities as well. Our proposed network is trained end-to-end given
detections, and begins with a category-level segmentation module. Thereafter, a
differentiable Conditional Random Field, defined over a variable number of
instances for every input image, reasons about the identity of each part by
associating it with a human detection. In contrast to other approaches, our
method can handle the varying number of people in each image and our holistic
network produces state-of-the-art results in instance-level part and human
segmentation, together with competitive results in category-level part
segmentation, all achieved by a single forward-pass through our neural network.Comment: Poster at BMVC 201
Weakly Supervised Localization using Deep Feature Maps
Object localization is an important computer vision problem with a variety of
applications. The lack of large scale object-level annotations and the relative
abundance of image-level labels makes a compelling case for weak supervision in
the object localization task. Deep Convolutional Neural Networks are a class of
state-of-the-art methods for the related problem of object recognition. In this
paper, we describe a novel object localization algorithm which uses
classification networks trained on only image labels. This weakly supervised
method leverages local spatial and semantic patterns captured in the
convolutional layers of classification networks. We propose an efficient
beam-search-based approach to detect and localize multiple objects in images.
The proposed method significantly outperforms the state of the art on standard
object localization datasets, with an 8-point increase in mAP scores.
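The paper's beam search runs over convolutional feature maps; as a hedged illustration of the underlying idea only, here is a generic beam search that builds sets of mutually non-overlapping candidate boxes ranked by cumulative score. The candidate boxes, scores, beam width, and IoU threshold are all made-up parameters.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def beam_search_boxes(candidates, scores, beam_width=3, max_objects=2, iou_thresh=0.5):
    """Beam search over candidate boxes.

    Each beam state is a set of mutually non-overlapping boxes; states
    are ranked by total score, and only the top beam_width survive
    each expansion step.
    """
    beams = [([], 0.0)]
    for _ in range(max_objects):
        expanded = []
        for chosen, total in beams:
            for i, box in enumerate(candidates):
                if i in chosen:
                    continue
                # Reject boxes that overlap an already-chosen one too much.
                if any(iou(box, candidates[j]) > iou_thresh for j in chosen):
                    continue
                expanded.append((chosen + [i], total + scores[i]))
        if not expanded:
            break
        expanded.sort(key=lambda s: -s[1])
        beams = expanded[:beam_width]
    return beams[0][0]

candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
picked = beam_search_boxes(candidates, scores)  # boxes 0 and 1 overlap heavily
```

Here the search keeps box 0 and skips the near-duplicate box 1, then adds the disjoint box 2, illustrating how beam search can localize multiple objects without double-counting.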
Multiple Moving Object Recognitions in video based on Log Gabor-PCA Approach
Object recognition in video sequences or images is one of the sub-fields of
computer vision. Moving object recognition from a video sequence is an
appealing topic with applications in various areas such as airport safety,
intrusion surveillance, video monitoring, intelligent highway, etc. Moving
object recognition is the most challenging task in intelligent video
surveillance system. In this regard, many techniques have been proposed based
on different methods. Despite its importance, moving object recognition in
complex environments is still far from completely solved for low-resolution,
foggy, and dim video sequences. All of this makes it necessary to develop
highly robust techniques. This paper
introduces multiple moving object recognition in video sequences based on a
Log Gabor-PCA approach, with angle-based distance similarity measures used to
recognize objects such as humans and vehicles. A number of experiments were
conducted on indoor and outdoor video sequences from standard datasets, as
well as on our own collection of video sequences, including partial
night-vision footage. Experimental results show that the proposed approach
achieves an excellent recognition rate; the results obtained are satisfactory
and competitive.
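Setting the Log Gabor filtering stage aside, the PCA-plus-angle-similarity classification the abstract describes can be sketched as follows. The feature dimension, class templates, and synthetic data are illustrative assumptions; angle-based similarity is taken here to mean the cosine of the angle between feature vectors.

```python
import numpy as np

def pca_fit(X, k):
    """Fit PCA on row vectors X (n_samples, n_features).

    Returns the data mean and the top-k principal components.
    """
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def project(X, mean, comps):
    """Project row vectors into the k-dimensional PCA subspace."""
    return (X - mean) @ comps.T

def angle_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Synthetic, well-separated "human" and "vehicle" feature clusters.
rng = np.random.default_rng(0)
humans = rng.normal(size=(20, 64)) + 5.0
vehicles = rng.normal(size=(20, 64)) - 5.0

mean, comps = pca_fit(np.vstack([humans, vehicles]), k=4)
templates = {
    "human": project(humans, mean, comps).mean(axis=0),
    "vehicle": project(vehicles, mean, comps).mean(axis=0),
}

# Classify a query by the largest angle similarity to a class template.
query = project(rng.normal(size=(1, 64)) + 5.0, mean, comps)[0]
label = max(templates, key=lambda c: angle_similarity(query, templates[c]))
```

Because the angle measure ignores vector magnitude, it is less sensitive to overall brightness or contrast changes than Euclidean distance, which is a common motivation for angle-based matching in recognition pipelines.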