DALNet: A Rail Detection Network Based on Dynamic Anchor Line
Rail detection is one of the key components of intelligent trains. In this
paper, motivated by anchor line-based lane detection methods, we propose a
rail detection network called DALNet based on a dynamic anchor line. Aiming to solve
the problem that the predefined anchor line is image agnostic, we design a
novel dynamic anchor line mechanism. It utilizes a dynamic anchor line
generator to dynamically generate an appropriate anchor line for each rail
instance based on the position and shape of the rails in the input image. These
dynamically generated anchor lines can be considered as better position
references to accurately localize the rails than the predefined anchor lines.
In addition, we present a challenging urban rail detection dataset DL-Rail with
high-quality annotations and scenario diversity. DL-Rail contains 7000 pairs of
images and annotations along with scene tags, and it is expected to encourage
the development of rail detection. We extensively compare DALNet with many
competitive lane detection methods. The results show that DALNet achieves
state-of-the-art performance on our DL-Rail rail detection dataset and the
popular Tusimple and LLAMAS lane detection benchmarks. The code will be
released at https://github.com/Yzichen/mmLaneDet
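As a rough illustration of the idea, a dynamic anchor line can be reduced to a per-instance start point and angle from which evenly spaced reference points are sampled. The sketch below is an assumption about one plausible parameterization, not the paper's actual formulation; the function name and arguments are illustrative:

```python
import numpy as np

def generate_dynamic_anchor(start_xy, angle_deg, n_points=36, img_h=320):
    """Sample n_points along an anchor line given by a start point and angle.

    In DALNet the start point and angle would be regressed per rail instance
    by the anchor line generator; here they are plain inputs.
    """
    ys = np.linspace(start_xy[1], 0.0, n_points)            # rows from start up to image top
    angle = np.deg2rad(angle_deg)                           # angle measured from horizontal
    xs = start_xy[0] + (start_xy[1] - ys) / np.tan(angle)   # x offset along the line
    return np.stack([xs, ys], axis=1)                       # (n_points, 2) reference points

# A rail instance starting at the bottom-center of a 320-row image, tilted 80 degrees.
anchor = generate_dynamic_anchor(start_xy=(160.0, 319.0), angle_deg=80.0)
```

Each sampled point then serves as a position reference from which the network only needs to regress a small residual offset to the true rail.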
LATR: 3D Lane Detection from Monocular Images with Transformer
3D lane detection from monocular images is a fundamental yet challenging task
in autonomous driving. Recent advances primarily rely on structural 3D
surrogates (e.g., bird's eye view) built from front-view image features and
camera parameters. However, the depth ambiguity in monocular images inevitably
causes misalignment between the constructed surrogate feature map and the
original image, posing a great challenge for accurate lane detection. To
address the above issue, we present a novel LATR model, an end-to-end 3D lane
detector that uses 3D-aware front-view features without transformed view
representation. Specifically, LATR detects 3D lanes via cross-attention based
on query and key-value pairs, constructed using our lane-aware query generator
and dynamic 3D ground positional embedding. On the one hand, each query is
generated based on 2D lane-aware features and adopts a hybrid embedding to
enhance lane information. On the other hand, 3D space information is injected
as positional embedding from an iteratively-updated 3D ground plane. LATR
outperforms previous state-of-the-art methods on the synthetic Apollo and the
real-world OpenLane and ONCE-3DLanes benchmarks by large margins (e.g., an 11.4
F1-score gain on OpenLane). Code will be released at
https://github.com/JMoonr/LATR
Comment: Accepted by ICCV 2023 (Oral)
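The query/key-value construction described above can be sketched as a minimal cross-attention module. Everything below (class name, dimensions, the additive way the ground positional embedding is injected) is an illustrative assumption under standard transformer conventions, not LATR's actual code:

```python
import torch
import torch.nn as nn

class LaneCrossAttention(nn.Module):
    # Minimal sketch: lane queries attend to front-view feature tokens whose
    # keys/values carry a 3D ground positional embedding.
    def __init__(self, dim=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, queries, feats, ground_pos_emb):
        kv = feats + ground_pos_emb           # inject 3D ground-plane position
        out, _ = self.attn(queries, kv, kv)   # query / key-value cross-attention
        return out

q = torch.randn(2, 20, 64)      # 2 images, 20 lane-aware queries
f = torch.randn(2, 400, 64)     # flattened front-view feature tokens
pe = torch.randn(2, 400, 64)    # embedding from the estimated ground plane
out = LaneCrossAttention()(q, f, pe)
```

In the paper the ground plane, and hence the positional embedding, is iteratively refined; the sketch keeps it as a fixed input for clarity.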
Real-Time Fully Unsupervised Domain Adaptation for Lane Detection in Autonomous Driving
While deep neural networks are being utilized heavily for autonomous driving,
they need to be adapted to new unseen environmental conditions for which they
were not trained. We focus on the safety-critical application of lane detection,
and propose a lightweight, fully unsupervised, real-time adaptation approach
that only adapts the batch-normalization parameters of the model. We
demonstrate that our technique can perform inference, followed by on-device
adaptation, under a tight constraint of 30 FPS on an Nvidia Jetson Orin. It
achieves accuracy (92.19% on average) comparable to a state-of-the-art
semi-supervised adaptation algorithm, which itself does not support real-time
adaptation.
Comment: Accepted in 2023 Design, Automation & Test in Europe Conference (DATE 2023) - Late Breaking Result
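Restricting adaptation to the batch-normalization parameters can be expressed in a few lines. This is a generic PyTorch pattern for that restriction, not the authors' implementation:

```python
import torch.nn as nn

def freeze_all_but_bn(model):
    """Enable gradients only for batch-norm affine parameters.

    Everything else is frozen, which keeps the per-frame on-device
    adaptation step cheap enough for a real-time budget.
    """
    for p in model.parameters():
        p.requires_grad = False
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            if m.weight is not None:
                m.weight.requires_grad = True
            if m.bias is not None:
                m.bias.requires_grad = True
    return model

net = freeze_all_but_bn(
    nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
)
trainable = [n for n, p in net.named_parameters() if p.requires_grad]
```

Only the BatchNorm scale and shift survive as trainable parameters, so an optimizer built over `filter(lambda p: p.requires_grad, net.parameters())` touches a tiny fraction of the model.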
LineMarkNet: Line Landmark Detection for Valet Parking
We aim for accurate and efficient line landmark detection for valet parking,
which is a long-standing yet unsolved problem in autonomous driving. To this
end, we present a deep line landmark detection system where we carefully design
the modules to be lightweight. Specifically, we first empirically design four
general line landmarks including three physical lines and one novel mental
line. The four line landmarks are effective for valet parking. We then develop
a deep network (LineMarkNet) to detect line landmarks from surround-view
cameras. Via pre-calibrated homographies, we fuse context from the four
separate cameras into a unified bird's-eye-view (BEV) space, combining the
surround-view features with the BEV features. A multi-task decoder then
detects the multiple line landmarks: a center-based strategy handles the
object detection task, and our graph transformer, which enhances the vision
transformer with hierarchical graph reasoning, handles the semantic
segmentation task. Finally, we parameterize the detected line landmarks
(e.g., in intercept-slope form), and a novel filtering backend incorporates
temporal and multi-view consistency to achieve smooth and stable detection.
Moreover, we annotate a large-scale dataset to validate our method.
Experimental results show that our framework outperforms several line
detection methods, and that the multi-task network detects line landmarks in
real time on the Qualcomm 820A platform while maintaining superior accuracy.
Comment: 29 pages, 12 figures
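The intercept-slope parameterization and temporal filtering mentioned above might look like the following. The least-squares fit and the exponential smoothing are illustrative stand-ins under assumed conventions, not the paper's actual filtering backend:

```python
import numpy as np

def to_intercept_slope(points):
    """Fit x = m*y + b to detected line points.

    Treating y as the free variable is a common choice for near-vertical
    markings; the exact form used by the paper is not specified here.
    """
    ys, xs = points[:, 1], points[:, 0]
    m, b = np.polyfit(ys, xs, 1)
    return m, b

class LineSmoother:
    """Exponential smoothing over frames, a simple stand-in for a backend
    that enforces temporal (and, per view, multi-view) consistency."""
    def __init__(self, alpha=0.3):
        self.alpha, self.state = alpha, None

    def update(self, params):
        params = np.asarray(params, dtype=float)
        if self.state is None:
            self.state = params
        else:
            self.state = self.alpha * params + (1 - self.alpha) * self.state
        return self.state

pts = np.array([[100.0, 0.0], [110.0, 50.0], [120.0, 100.0]])
m, b = to_intercept_slope(pts)
smoothed = LineSmoother().update([m, b])
```

Smoothing the two line parameters rather than the raw point set keeps the per-frame state tiny, which matters on an embedded target like the 820A.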