FisheyeMultiNet: Real-time Multi-task Learning Architecture for Surround-view Automated Parking System.
Automated Parking is a low speed manoeuvring scenario which is quite unstructured and complex, requiring full 360° near-field sensing around the vehicle. In this paper, we discuss the design and implementation of an automated parking system from the perspective of camera-based deep learning algorithms. We provide a holistic overview of an industrial system covering the embedded system, use cases and the deep learning architecture. We demonstrate a real-time multi-task deep learning network called FisheyeMultiNet, which detects all the necessary objects for parking on a low-power embedded system. FisheyeMultiNet runs at 15 fps for 4 cameras and it has three tasks, namely object detection, semantic segmentation and soiling detection. To encourage further research, we release a partial dataset of 5,000 images containing semantic segmentation and bounding box detection ground truth via the WoodScape project [Yogamani et al., 2019].
Detection of Empty/Occupied States of Parking Slots in Multicamera system using Mask R-CNN Classifier
The rapid growth in the number of vehicles in big cities increases road loads and makes empty parking spaces harder to find. One solution is to develop a parking management system that provides potential users with information about available parking spaces. This paper discusses a new multicamera arrangement, and a function to evaluate the empty/occupied states of the parking slots, as an alternative to the existing single-camera system. The system adopts Mask R-CNN as its classifier because it provides polygon outputs for detected objects, in contrast to the bounding box outputs provided by other classifiers. The proposed function makes the best use of the available information from all cameras by considering the relative position of each camera to the parking spaces, and it can overcome the occlusion problem that occurs in some cameras. The experiment validates the capability to overcome the occlusion problem and shows that the system evaluates the empty/occupied states of the parking slots better than the single-camera system, up to a certain threshold.
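As a rough illustration of the idea of combining several cameras while tolerating occlusion, the sketch below fuses per-camera occupancy scores for a single slot. It is not the function proposed in the paper: the score definition (fraction of the slot covered by a detected vehicle mask), the position-based weights, the 0.5 threshold, and the name `fuse_slot_occupancy` are all assumptions made for this example.

```python
from typing import Optional, Sequence


def fuse_slot_occupancy(
    scores: Sequence[Optional[float]],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> Optional[bool]:
    """Fuse per-camera occupancy scores for one parking slot.

    scores[i]  -- assumed fraction (0.0 to 1.0) of the slot covered by a
                  detected vehicle mask in camera i, or None when that
                  camera's view of the slot is occluded.
    weights[i] -- hypothetical trust weight for camera i, e.g. derived
                  from its position relative to the slot.
    Returns True (occupied), False (empty), or None if no camera sees it.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")

    # Ignore occluded cameras and cameras with no trust weight.
    usable = [(s, w) for s, w in zip(scores, weights) if s is not None and w > 0]
    if not usable:
        return None

    # Weighted average over the cameras that can actually see the slot.
    total_weight = sum(w for _, w in usable)
    fused = sum(s * w for s, w in usable) / total_weight
    return fused >= threshold


# Camera 2 is occluded, so only cameras 1 and 3 contribute.
print(fuse_slot_occupancy([0.9, None, 0.2], [0.6, 0.3, 0.1]))
```

Renormalizing by the weights of the usable cameras keeps the decision meaningful when one view is blocked, which is the kind of behaviour a single-camera system cannot offer.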
LineMarkNet: Line Landmark Detection for Valet Parking
We aim for accurate and efficient line landmark detection for valet parking,
which is a long-standing yet unsolved problem in autonomous driving. To this
end, we present a deep line landmark detection system where we carefully design
the modules to be lightweight. Specifically, we first empirically design four
general line landmarks including three physical lines and one novel mental
line. The four line landmarks are effective for valet parking. We then develop
a deep network (LineMarkNet) to detect line landmarks from surround-view
cameras. Using the pre-calibrated homography, we fuse context from four
separate cameras into a unified bird's-eye-view (BEV) space; specifically, we
fuse the surround-view features and BEV features, then employ a multi-task
decoder to detect multiple line landmarks. For the object detection task we
apply a center-based strategy, and for the semantic segmentation task we design
a graph transformer that enhances the vision transformer with hierarchical
graph reasoning. Finally, we parameterize the detected line landmarks
(e.g., in intercept-slope form), and a novel filtering backend incorporates
temporal and multi-view consistency to achieve smooth and stable detection.
Moreover, we annotate a large-scale dataset to validate our method.
Experimental results show that our framework outperforms several line
detection methods, and that the multi-task network runs line landmark detection
in real time on the Qualcomm 820A platform while maintaining superior accuracy.
Comment: 29 pages, 12 figures
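To make the parameterization and temporal filtering steps mentioned in this abstract more concrete, here is a minimal sketch. It fits a line in slope-intercept form to detected points by ordinary least squares and then smooths the parameters across frames with an exponential moving average. Both choices are stand-ins chosen for illustration; the paper's actual filtering backend (including its multi-view consistency) is not described here, and the names `fit_line`, `smooth_lines`, and `alpha` are hypothetical.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float]  # (slope, intercept) of y = slope * x + intercept


def fit_line(points: Iterable[Point]) -> Line:
    """Least-squares fit of y = slope * x + intercept to 2D points."""
    pts = list(points)
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        # A vertical line cannot be written in slope-intercept form.
        raise ValueError("points are vertically aligned")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def smooth_lines(lines: Iterable[Line], alpha: float = 0.3) -> List[Line]:
    """Exponential moving average over per-frame (slope, intercept) pairs.

    alpha is an assumed smoothing factor: higher values follow new
    detections more closely, lower values give steadier output.
    """
    smoothed: List[Line] = []
    state = None
    for slope, intercept in lines:
        if state is None:
            state = (slope, intercept)  # first frame initializes the filter
        else:
            state = (
                alpha * slope + (1 - alpha) * state[0],
                alpha * intercept + (1 - alpha) * state[1],
            )
        smoothed.append(state)
    return smoothed


# One line fitted per frame, then smoothed over the sequence.
frames = [[(0, 1), (1, 3), (2, 5)], [(0, 1.2), (1, 3.1), (2, 5.3)]]
print(smooth_lines(fit_line(f) for f in frames))
```

Working on two parameters per line rather than on raw pixels is what makes frame-to-frame filtering cheap, though slope-intercept form breaks down for near-vertical lines, so a real system would likely need a different parameterization for those.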
Near-field Perception for Low-Speed Vehicle Automation using Surround-view Fisheye Cameras
Cameras are the primary sensor in automated driving systems. They provide
high information density and are optimal for detecting road infrastructure cues
laid out for human vision. Surround-view camera systems typically comprise
four fisheye cameras with 190°+ field of view covering the entire
360° around the vehicle focused on near-field sensing. They are the
principal sensors for low-speed, high accuracy, and close-range sensing
applications, such as automated parking, traffic jam assistance, and low-speed
emergency braking. In this work, we provide a detailed survey of such vision
systems, setting up the survey in the context of an architecture that can be
decomposed into four modular components, namely Recognition, Reconstruction,
Relocalization, and Reorganization. We jointly call this the 4R Architecture.
We discuss how each component accomplishes a specific aspect and provide a
positional argument that they can be synergized to form a complete perception
system for low-speed automation. We support this argument by presenting results
from previous works and by presenting architecture proposals for such a system.
Qualitative results are presented in the video at https://youtu.be/ae8bCOF77uY.
Comment: Accepted for publication at IEEE Transactions on Intelligent
Transportation Systems
Surround-view Fisheye BEV-Perception for Valet Parking: Dataset, Baseline and Distortion-insensitive Multi-task Framework
Surround-view fisheye perception under valet parking scenes is fundamental
and crucial in autonomous driving. Environmental conditions in parking lots
differ from those in common public datasets, for example imperfect light
and opacity, which substantially impacts perception performance. Most
existing networks based on public datasets may generalize poorly to
these valet parking scenes, and are also affected by fisheye distortion. In this
article, we introduce a new large-scale fisheye dataset called Fisheye Parking
Dataset (FPD) to promote research on diverse real-world
surround-view parking cases. Notably, our compiled FPD exhibits excellent
characteristics for different surround-view perception tasks. In addition, we
also propose our real-time distortion-insensitive multi-task framework Fisheye
Perception Network (FPNet), which improves the surround-view fisheye BEV
perception by enhancing the fisheye distortion operation and multi-task
lightweight designs. Extensive experiments validate the effectiveness of our
approach and the dataset's exceptional generalizability.
Comment: 12 pages, 11 figures