Region of Interest Generation for Pedestrian Detection using Stereo Vision

Author: Chauhan Korra Abhishek
Publication date: 01/01/2016

Pedestrian detection is an active research area in computer vision. The sliding window paradigm is usually followed to extract all possible detector windows; however, it is very time consuming. Stereo vision using a pair of cameras is therefore preferred, because the depth information it provides reduces the search space. Disparity map generation using feature correspondence is an integral part of, and a prerequisite for, depth estimation. In our work, we apply ORB features to speed up the feature correspondence process. Once the ROI generation phase is over, each extracted detector window is represented by low-level histogram of oriented gradients (HOG) features. A linear Support Vector Machine (SVM) is then applied to classify each window as either pedestrian or non-pedestrian. The experimental results reveal that ORB-driven depth estimation is at least seven times faster than with the SURF descriptor and ten times faster than with the SIFT descriptor.
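As an illustration of how depth can narrow the search space, the sketch below uses the standard rectified-stereo relation Z = f · B / d (depth = focal length × baseline / disparity). It is not the thesis's implementation: the function names, the `(x_left, x_right, y)` match format, and the depth limits are assumptions made for this example, and the feature matching itself (ORB in the thesis) is taken as already done.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair, or None if disparity is not positive."""
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


def within_depth_range(matches, focal_px, baseline_m, near_m, far_m):
    """Keep matched keypoints whose estimated depth lies in [near_m, far_m].

    `matches` is a list of (x_left, x_right, y) pixel coordinates
    (a hypothetical format chosen for this sketch).
    """
    kept = []
    for x_left, x_right, y in matches:
        depth = depth_from_disparity(x_left - x_right, focal_px, baseline_m)
        if depth is not None and near_m <= depth <= far_m:
            kept.append((x_left, y, depth))
    return kept
```

Keypoints that survive the depth filter could then seed candidate windows, so that a classifier only scores regions at plausible pedestrian distances.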

ethesis@nitr

Multi-View 3D Object Detection Network for Autonomous Driving

Author: Chen Xiaozhi
Li Bo
Ma Huimin
Wan Ji
Xia Tian
Publication date: 21/06/2017

This paper aims at high-accuracy 3D object detection in autonomous driving scenarios. We propose Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes. We encode the sparse 3D point cloud with a compact multi-view representation. The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the bird's eye view representation of the 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on the challenging KITTI benchmark show that our approach outperforms the state-of-the-art by around 25% and 30% AP on the tasks of 3D localization and 3D detection. In addition, for 2D detection, our approach obtains 10.3% higher AP than the state-of-the-art on the hard data among the LIDAR-based methods. Comment: To appear in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 201
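To make the bird's eye view idea concrete, the sketch below rasterizes a point cloud onto a ground-plane grid and records the maximum height and the point count per cell. It is a simplified illustration, not the MV3D encoding: the function name, the grid parameters, and the choice of exactly two feature maps are assumptions made for this example.

```python
import math


def birds_eye_view(points, x_min, y_min, cell_size, n_rows, n_cols):
    """Rasterize (x, y, z) points into max-height and point-count grids.

    Empty cells keep a height of None and a count of 0.
    """
    heights = [[None] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        row = math.floor((x - x_min) / cell_size)
        col = math.floor((y - y_min) / cell_size)
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            continue  # point falls outside the grid
        counts[row][col] += 1
        if heights[row][col] is None or z > heights[row][col]:
            heights[row][col] = z
    return heights, counts
```

The resulting grids behave like image channels, which is what lets a 2D convolutional proposal network operate on sparse 3D data.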

arXiv.org e-Print Archive

Crossref

What Can Help Pedestrian Detection?

Author: Cao Zhimin
Jiang Yuning
Mao Jiayuan
Xiao Tete
Publication date: 08/05/2017

Aggregating extra features has been considered an effective approach to boost traditional pedestrian detection methods. However, there is still a lack of studies on whether and how CNN-based pedestrian detectors can benefit from these extra features. The first contribution of this paper is exploring this issue by aggregating extra features into a CNN-based pedestrian detection framework. Through extensive experiments, we evaluate the effects of different kinds of extra features quantitatively. Moreover, we propose a novel network architecture, namely HyperLearner, to jointly learn pedestrian detection as well as the given extra feature. Through multi-task training, HyperLearner is able to utilize the information of given features and improve detection performance without extra inputs at inference. The experimental results on multiple pedestrian benchmarks validate the effectiveness of the proposed HyperLearner. Comment: Accepted to IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 201

arXiv.org e-Print Archive

Crossref

Sensor fusion in driving assistance systems

Author: Ponz Vila Aurelio
Publication date: 01/01/2017

International Mention in the doctoral degree.
Life in developed and developing countries is highly dependent on road and urban motor transport. This activity involves a high cost for its active and passive users in terms of pollution and accidents, which are largely attributable to the human factor. New developments in safety and driving assistance, called Advanced Driving Assistance Systems (ADAS), are intended to improve safety in transportation and, in the mid-term, lead to autonomous driving. ADAS, like human driving, rely on sensors that provide information about the environment, and sensor reliability is as crucial for ADAS applications as sensing abilities are for human driving. One way to improve sensor reliability is Sensor Fusion: developing novel strategies for modeling the driving environment with the help of several sensors and obtaining enhanced information from the combination of the available data. The present thesis is intended to offer a novel solution for obstacle detection and classification in automotive applications using sensor fusion with two sensors widely available on the market: the visible spectrum camera and the laser scanner. Cameras and lasers are commonly used sensors in the scientific literature, increasingly affordable and ready to be deployed in real-world applications. The proposed solution provides detection and classification of some obstacles commonly present on the road, such as cyclists and pedestrians.
Novel approaches for detection and classification have been explored in this thesis, from classification using clusters of point clouds obtained from the laser scanner, to domain adaptation techniques for creating synthetic image datasets, including intelligent cluster extraction and ground detection and removal in point clouds.
Official Doctoral Program in Electrical, Electronic and Automatic Engineering. President: Cristina Olaverri Monreal. Secretary: Arturo de la Escalera Hueso. Member: José Eugenio Naranjo Hernánde

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection

Author: Deng Jiajun
Guo Chaoxu
Jiang Li
Li Hongsheng
Shi Jianping
Shi Shaoshuai
Wang Xiaogang
Wang Zhe
Publication date: 07/11/2022

3D object detection is receiving increasing attention from both industry and academia thanks to its wide applications in various fields. In this paper, we propose Point-Voxel Region-based Convolution Neural Networks (PV-RCNNs) for 3D object detection on point clouds. First, we propose a novel 3D detector, PV-RCNN, which boosts the 3D detection performance by deeply integrating the feature learning of both point-based set abstraction and voxel-based sparse convolution through two novel steps, i.e., the voxel-to-keypoint scene encoding and the keypoint-to-grid RoI feature abstraction. Second, we propose an advanced framework, PV-RCNN++, for more efficient and accurate 3D object detection. It consists of two major improvements: sectorized proposal-centric sampling for efficiently producing more representative keypoints, and VectorPool aggregation for better aggregating local point features with much less resource consumption. With these two strategies, our PV-RCNN++ is about 3× faster than PV-RCNN, while also achieving better performance. The experiments demonstrate that our proposed PV-RCNN++ framework achieves state-of-the-art 3D detection performance on the large-scale and highly-competitive Waymo Open Dataset with 10 FPS inference speed on the detection range of 150m × 150m. Comment: Accepted by International Journal of Computer Vision (IJCV), code is available at https://github.com/open-mmlab/OpenPCDe
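A loose sketch of the sectorized, proposal-centric sampling idea follows. It keeps only points near proposal centres, splits them into angular sectors around the sensor origin, and draws keypoints from each sector in proportion to its size. Several parts are assumptions made for this example rather than details from the abstract: the fixed `radius`, the 2D distance test, the function name, and the use of plain stride sampling inside each sector in place of a more careful sampler.

```python
import math


def sectorized_sample(points, centers, radius, n_sectors, n_keypoints):
    """Pick keypoints near proposal centres, sector by sector.

    points:  list of (x, y, z) tuples
    centers: list of (x, y) proposal centres (hypothetical input format)
    """
    # Proposal-centric filtering: drop points far from every proposal centre.
    near = [p for p in points
            if any(math.hypot(p[0] - cx, p[1] - cy) <= radius for cx, cy in centers)]
    if not near:
        return []

    # Split the remaining points into angular sectors around the origin.
    sectors = [[] for _ in range(n_sectors)]
    for p in near:
        angle = math.atan2(p[1], p[0]) % (2 * math.pi)
        idx = min(int(angle / (2 * math.pi) * n_sectors), n_sectors - 1)
        sectors[idx].append(p)

    # Sample each sector with a quota proportional to its share of points.
    keypoints = []
    for sector in sectors:
        quota = min(n_keypoints * len(sector) // len(near), len(sector))
        if quota == 0:
            continue
        step = len(sector) / quota  # evenly spaced stride within the sector
        keypoints.extend(sector[int(i * step)] for i in range(quota))
    return keypoints
```

Because each sector is sampled independently, the per-sector work is small and could run in parallel, which is the intuition behind the efficiency gain this kind of scheme aims for.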

arXiv.org e-Print Archive

PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection

Author: Deng J.
Guo C.
Jiang L.
Li H.
Shi J.
Shi S.
Wang X.
Wang Z.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

MPG.PuRe