    Object Detection in 20 Years: A Survey

    Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc., and makes an in-depth analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible publication

    Utilizing radiation for smart robotic applications using visible, thermal, and polarization images.

    The domain of this research is the use of computer vision methodologies in utilizing radiation for smart robotic applications for driving assistance. Radiation can be emitted by an object, reflected or transmitted. Understanding the nature and the properties of the radiation forming an image is essential in interpreting the information in that image which can then be used by a machine e.g. a smart vehicle to make a decision and perform an action. Throughout this work, different types of images are used to help a robotic vehicle make a decision and perform a certain action. This work presents three smart robotic applications; the first one deals with polarization images, the second one deals with thermal images and the third one deals with visible images. Each type of these images is formed by light (radiation) but in a way different from other types where the information embedded in an image depends on the way it was formed and how the light was generated. For polarization imaging, a direct method utilizing shading and polarization for unambiguous shape recovery without the need for nonlinear optimization routines is proposed. The proposed method utilizes simultaneously polarization and shading to find the surface normals, thus eliminating the reconstruction ambiguity. This can be useful to help a smart vehicle gain knowledge about the terrain surface geometry. Regarding thermal imaging, an automatic method for constructing an annotated thermal imaging pedestrian dataset is proposed. This is done by transferring detections from registered visible images simultaneously captured at day-time where pedestrian detection is well developed in visible images. Histogram of Oriented Gradients (HOG) features are extracted from the constructed dataset and then fed to a discriminatively trained deformable part based classifier that can be used to detect pedestrians at night. 
    The resulting classifier was tested for night driving assistance and succeeded in detecting pedestrians even in situations where visible imaging pedestrian detectors failed because of low light or the glare of oncoming traffic. For visible images, a new feature based on HOG is proposed for pedestrian detection. The proposed feature was added to two state-of-the-art pedestrian detectors: the discriminatively trained Deformable Part based Models (DPM) and the Integral Channel Features (ICF) using fast feature pyramids. The proposed approach is based on computing the image mixed partial derivatives, which are used to redefine the gradients of some pixels and to reweigh the vote at all pixels with respect to the original HOG. The approach was tested on the PASCAL2007, INRIA and Caltech datasets and was shown to have outstanding performance.

    Fast and Robust Object Detection Using Visual Subcategories

    Object classes generally contain large intra-class variation, which poses a challenge to object detection schemes. In this work, we study visual subcategorization as a means of capturing appearance variation. First, training data is clustered using color and gradient features. Second, the clustering is used to learn an ensemble of models that capture visual variation due to varying orientation, truncation, and occlusion degree. Fast object detection is achieved with integral image features and pixel lookup features. The framework is studied in the context of vehicle detection on the challenging KITTI dataset.
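
    The abstract above mentions integral image features without giving details. As a rough illustration of the general technique only (not the authors' implementation), the following sketch builds a summed-area table and uses it to sum any rectangular region in four lookups. The function names `integral_image` and `box_sum` and the toy 3x3 image are assumptions made for this example.

```python
def integral_image(img):
    """Build a summed-area table padded with one leading zero row and column."""
    height = len(img)
    width = len(img[0]) if height else 0
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


# Toy 3x3 "image" (hypothetical values, for illustration only).
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
sat = integral_image(image)
total = box_sum(sat, 0, 0, 3, 3)         # whole image
lower_right = box_sum(sat, 1, 1, 3, 3)   # the 2x2 block 5, 6, 8, 9
print(total, lower_right)
```

    Once the table is built, each rectangular sum costs the same regardless of rectangle size, which is why features of this kind are commonly associated with fast detection.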

    REAL TIME PEDESTRIAN DETECTION-BASED FASTER HOG/DPM AND DEEP LEARNING APPROACH

    The work presented aims to show the feasibility of scientific and technological concepts in embedded vision dedicated to the extraction of image characteristics allowing the detection and the recognition/localization of objects. Object and pedestrian detection are carried out by two methods: 1. A classical image processing approach, improved with Histogram of Oriented Gradients (HOG) and Deformable Part Model (DPM) based detection and pattern recognition. We present how we have improved the HOG/DPM approach to make pedestrian detection a real-time task by reducing calculation time. The developed approach not only detects pedestrians but also calculates the distance between pedestrians and the vehicle. 2. Pedestrian detection based on Artificial Intelligence (AI) approaches such as Deep Learning (DL). This work was first validated on a closed circuit and subsequently under real traffic conditions through mobile platforms (mobile robot, drone and vehicles). Several tests have been carried out in the city center of Rouen in order to validate the platform developed.

    A 58.6 mW 30 Frames/s Real-Time Programmable Multiobject Detection Accelerator With Deformable Parts Models on Full HD 1920×1080 Videos

    This paper presents a programmable, energy-efficient, and real-time object detection hardware accelerator for low power and high throughput applications using deformable parts models, with 2x higher detection accuracy than traditional rigid body models. Three methods are used to address the high computational complexity of eight deformable parts detection: classification pruning for 33x fewer part classifications, vector quantization for 15x memory size reduction, and feature basis projection for 2x reduction in the cost of each classification. The chip was fabricated in a 65 nm CMOS technology, and can process full high definition 1920 × 1080 videos at 60 frames/s without any off-chip storage. The chip has two programmable classification engines (CEs) for multiobject detection. At 30 frames/s, the chip consumes only 58.6 mW (0.94 nJ/pixel, 1168 GOPS/W). At a higher throughput of 60 frames/s, the CEs can be time multiplexed to detect even more than two object classes. This proposed accelerator enables object detection to be as energy-efficient as video compression, which is found in most cameras today. United States. Defense Advanced Research Projects Agency; Texas Instruments Incorporated
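
    The energy-per-pixel figure quoted in the abstract above can be checked from the stated power, frame rate, and resolution. The short calculation below is only an arithmetic sanity check under the assumption that every pixel of every 1920 × 1080 frame is processed; the variable names are chosen for this example.

```python
# Figures taken from the abstract.
POWER_W = 58.6e-3          # 58.6 mW at 30 frames/s
FRAMES_PER_SECOND = 30
WIDTH, HEIGHT = 1920, 1080

# Assumption: every pixel of every frame is processed.
pixels_per_second = FRAMES_PER_SECOND * WIDTH * HEIGHT

# Energy per pixel in nanojoules: watts / (pixels per second) * 1e9.
energy_per_pixel_nj = POWER_W / pixels_per_second * 1e9
print(f"{energy_per_pixel_nj:.2f} nJ/pixel")
```

    Under that assumption the result rounds to the 0.94 nJ/pixel reported in the abstract.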

    Pedestrian Detection at Day/Night Time with Visible and FIR Cameras : A Comparison

    Other grants: DGT (SPIP2014-01352). Despite all the significant advances in pedestrian detection brought by computer vision for driving assistance, it is still a challenging problem. One reason is the extremely varying lighting conditions under which such a detector should operate, namely day and nighttime. Recent research has shown that the combination of visible and non-visible imaging modalities may increase detection accuracy, where the infrared spectrum plays a critical role. The goal of this paper is to assess the accuracy gain of different pedestrian models (holistic, part-based, patch-based) when training with images in the far infrared spectrum. Specifically, we want to compare detection accuracy on test images recorded at day and nighttime if trained (and tested) using (a) plain color images; (b) just infrared images; and (c) both of them. In order to obtain results for the last item, we propose an early fusion approach to combine features from both modalities. We base the evaluation on a new dataset that we have built for this purpose as well as on the publicly available KAIST multispectral dataset.
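
    The abstract above does not describe its early fusion scheme in detail. In general usage, early fusion means combining the features of several modalities before a single classifier sees them, and the sketch below illustrates that idea in its simplest form, by concatenating per-window feature vectors. The function names, the toy feature values, and the linear scorer with hand-picked weights are all hypothetical and are not taken from the paper.

```python
def early_fusion(visible_features, fir_features):
    """Concatenate feature vectors from the two modalities into one vector."""
    return list(visible_features) + list(fir_features)


def linear_score(features, weights, bias=0.0):
    """Score a fused feature vector with a single linear classifier."""
    if len(features) != len(weights):
        raise ValueError("feature and weight vectors must have equal length")
    return sum(f * w for f, w in zip(features, weights)) + bias


# Hypothetical features for one candidate window.
visible = [0.2, 0.5]   # e.g. descriptors computed on the color image
fir = [0.9]            # e.g. a descriptor computed on the far-infrared image

fused = early_fusion(visible, fir)

# Hypothetical weights: one classifier covers both modalities at once.
weights = [1.0, -1.0, 2.0]
score = linear_score(fused, weights)
print(fused, score)
```

    The point of the design is that one model is trained on the joint vector, in contrast to late fusion, where separate per-modality detectors are trained and their outputs are combined afterwards.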