Learning Lightweight Lane Detection CNNs by Self Attention Distillation
Training deep models for lane detection is challenging due to the very subtle
and sparse supervisory signals inherent in lane annotations. Without learning
from much richer context, these models often fail in challenging scenarios,
e.g., severe occlusion, ambiguous lanes, and poor lighting conditions. In this
paper, we present a novel knowledge distillation approach, i.e., Self Attention
Distillation (SAD), which allows a model to learn from itself and gain
substantial improvement without any additional supervision or labels.
Specifically, we observe that attention maps extracted from a model trained to
a reasonable level encode rich contextual information. This valuable
contextual information can be used as a form of 'free' supervision for further
representation learning by performing top-down, layer-wise attention
distillation within the network itself. SAD can be easily incorporated into any
feedforward convolutional neural network (CNN) and does not increase the
inference time. We validate SAD on three popular lane detection benchmarks
(TuSimple, CULane and BDD100K) using lightweight models such as ENet, ResNet-18
and ResNet-34. The lightest model, ENet-SAD, performs comparably to or even
surpasses existing algorithms. Notably, ENet-SAD has 20× fewer parameters and
runs 10× faster than the state-of-the-art SCNN, while still achieving
compelling performance on all benchmarks. Our code is available at
https://github.com/cardwing/Codes-for-Lane-Detection.
Comment: 9 pages, 8 figures; accepted by ICCV 2019; code available at
https://github.com/cardwing/Codes-for-Lane-Detection.
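The core mechanism lends itself to a short sketch. Below is a minimal PyTorch sketch of layer-wise self attention distillation, assuming a CNN whose intermediate feature maps are accessible; the names `attention_map` and `sad_loss`, and the spatial-softmax normalisation of channel-wise squared activations, are illustrative choices rather than a verbatim reproduction of the paper's code.

```python
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor) -> torch.Tensor:
    """Collapse a feature map (N, C, H, W) into a spatial attention map by
    summing squared activations over channels, then normalising spatially."""
    amap = feat.pow(2).sum(dim=1, keepdim=True)          # (N, 1, H, W)
    n, _, h, w = amap.shape
    return F.softmax(amap.view(n, -1), dim=1).view(n, 1, h, w)

def sad_loss(shallow_feat: torch.Tensor, deep_feat: torch.Tensor) -> torch.Tensor:
    """Top-down distillation: the attention map of a deeper block serves as
    the (detached) target for the preceding, shallower block."""
    target = attention_map(deep_feat).detach()           # 'free' supervision
    source = attention_map(shallow_feat)
    if source.shape[-2:] != target.shape[-2:]:
        source = F.interpolate(source, size=target.shape[-2:],
                               mode='bilinear', align_corners=False)
    return F.mse_loss(source, target)

# Usage: the SAD terms are simply added to the ordinary task loss, so the
# deployed network and its inference time are untouched.
# total_loss = seg_loss + alpha * sum(sad_loss(feats[i], feats[i + 1])
#                                     for i in range(len(feats) - 1))
```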
Lane Detection in Low-light Conditions Using an Efficient Data Enhancement: Light Conditions Style Transfer
Nowadays, deep learning techniques are widely used for lane detection, but
applying them in low-light conditions remains a challenge to this day. Although
multi-task learning and contextual-information-based methods have been proposed
to address the problem, they either require additional manual annotations or
introduce extra inference overhead. In this paper, we propose a
style-transfer-based data enhancement method, which uses Generative Adversarial
Networks (GANs) to generate images in low-light conditions, thereby increasing
the environmental adaptability of the lane detector.
Our solution consists of three parts: the proposed SIM-CycleGAN, light
conditions style transfer, and the lane detection network. It requires neither
additional manual annotations nor extra inference overhead. We validated our
methods on the lane detection benchmark CULane using ERFNet. Empirically, the
lane detection model trained with our method demonstrated adaptability in
low-light conditions and robustness in complex scenarios. Our code for this
paper will be made publicly available.
Comment: Accepted by IV 2020.
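As a rough illustration of how such data enhancement can be wired into training, the sketch below wraps a pretrained day-to-low-light generator as an augmentation; `LowLightGenerator` is a hypothetical stand-in for the trained SIM-CycleGAN generator, which is not reproduced here.

```python
import random
import torch
import torch.nn as nn

class StyleTransferAugment:
    """With probability p, replace a training image with its GAN-translated
    low-light counterpart. Lane labels are reused unchanged, so the scheme
    needs no extra annotation and adds no inference overhead."""
    def __init__(self, generator: nn.Module, p: float = 0.3):
        self.generator = generator.eval()   # frozen, pretrained translator
        self.p = p

    @torch.no_grad()
    def __call__(self, image: torch.Tensor) -> torch.Tensor:
        if random.random() < self.p:
            return self.generator(image.unsqueeze(0)).squeeze(0)
        return image

# augment = StyleTransferAugment(low_light_generator)  # hypothetical weights
# x_aug = augment(x)  # ERFNet then trains on the mixed day/low-light data
```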
A light-weight method to foster the (Grad)CAM interpretability and explainability of classification networks
We consider a light-weight method that improves the explainability
of localized classification networks. The method incorporates (Grad)CAM maps
into the training process by modifying the training loss and does not
require additional structural elements. It is demonstrated that the (Grad)CAM
interpretability, as measured by several indicators, can be improved in this
way. Since the method is intended to be applicable on embedded systems and on
standard deep architectures, it essentially takes advantage of second-order
derivatives during training and does not require additional model layers.
Comment: 2020 10th International Conference on Advanced Computer Information
Technologies.
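A minimal sketch of the idea, assuming the forward pass exposes both the logits and the last convolutional activations: the Grad-CAM map is computed with `create_graph=True` so that a CAM-based penalty can itself be backpropagated, which is where the second-order derivatives arise. The concrete penalty below (CAM mass outside a hypothetical foreground mask) is an illustrative choice, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def gradcam_penalty(logits, feats, target_class, fg_mask):
    """feats: (N, C, H, W) activations of the last conv block;
    fg_mask: (N, 1, H, W) binary mask the CAM is encouraged to cover."""
    score = logits.gather(1, target_class.view(-1, 1)).sum()
    # create_graph=True keeps the backward graph, making the penalty itself
    # differentiable -- this is where second-order derivatives enter.
    grads = torch.autograd.grad(score, feats, create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)        # GAP of gradients
    cam = F.relu((weights * feats).sum(dim=1, keepdim=True))
    cam = cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-6)
    return (cam * (1.0 - fg_mask)).mean()                 # mass off-target

# total_loss = ce_loss + beta * gradcam_penalty(logits, feats, y, mask)
```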
Heatmap-based Vanishing Point boosts Lane Detection
Vision-based lane detection (LD) is a key part of autonomous driving
technology, and it is also a challenging problem. As one of the important
constraints of scene composition, vanishing point (VP) may provide a useful
clue for lane detection. In this paper, we propose a new multi-task fusion
network architecture for high-precision lane detection. Firstly, ERFNet was
used as the backbone to extract the hierarchical features of the road image.
Then, the lanes were detected using image segmentation. Finally, combining the
output of lane detection and the hierarchical features extracted by the
backbone, the lane VP was predicted using heatmap regression. The proposed
fusion strategy was tested using the public CULane dataset. The experimental
results suggest that the lane detection accuracy of our method outperforms
that of state-of-the-art (SOTA) methods.
Comment: 5 pages, 3 figures, submitted to an IEEE journal, under review.
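A minimal sketch of the heatmap-regression part, assuming a decoder that outputs a single-channel VP map; `render_gaussian` and the MSE loss are standard keypoint-regression ingredients rather than the paper's exact head.

```python
import torch
import torch.nn.functional as F

def render_gaussian(h, w, vp_xy, sigma=4.0):
    """Ground-truth heatmap: a 2-D Gaussian centred on the VP pixel."""
    ys = torch.arange(h).view(h, 1).float()
    xs = torch.arange(w).view(1, w).float()
    x0, y0 = vp_xy
    return torch.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2 * sigma ** 2))

def vp_loss(pred_heatmap, vp_xy):
    """MSE between the predicted map (N, 1, H, W) and rendered targets."""
    n, _, h, w = pred_heatmap.shape
    target = torch.stack([render_gaussian(h, w, xy) for xy in vp_xy])
    return F.mse_loss(pred_heatmap.squeeze(1), target.to(pred_heatmap.device))

# At inference the VP is simply the argmax of the predicted heatmap:
# idx = pred_heatmap.flatten(2).argmax(-1); vp = (idx % w, idx // w)
```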
Real-time Multi-target Path Prediction and Planning for Autonomous Driving aided by FCN
Real-time multi-target path planning is a key issue in the field of
autonomous driving. Although multiple paths can be generated in real-time with
polynomial curves, the generated paths are not flexible enough to deal with
complex road scenes such as S-shaped roads and unstructured scenes such as
parking lots. Search- and sampling-based methods, such as A* and RRT and their
derivatives, are flexible in generating paths for these complex road
environments. However, the existing algorithms require significant time to plan
paths to multiple targets, which greatly limits their application in autonomous
driving. In this paper, a real-time path planning method for multiple targets
is proposed. We first train a fully convolutional network (FCN) to predict a
path region for each target. Taking the predicted path region as a soft
constraint, the A* algorithm is then applied to search for the exact path to
the target. Experiments show that the FCN can make multiple predictions in a
very short time (50 predictions in 40 ms), and the predicted path region
effectively restricts the search space for the subsequent A* search. Therefore,
A* searches much faster, and multi-target path planning can be achieved in real
time (3 targets in less than 100 ms).
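The soft-constraint idea can be sketched directly on a grid: the FCN's predicted path region does not forbid cells, it only makes cells outside the region more expensive, so A* still finds a path if the prediction is imperfect. The grid layout and `penalty` value below are illustrative assumptions, not the paper's exact parameters.

```python
import heapq
import numpy as np

def astar_soft(grid, region, start, goal, penalty=5.0):
    """grid: numpy array, 0 = free, 1 = obstacle; region: FCN path-region
    mask (1 inside). Steps outside the region cost `penalty` extra."""
    h, w = grid.shape
    def heur(p):  # Manhattan distance; admissible since each step costs >= 1
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(heur(start), 0.0, start, None)]
    came, seen = {}, set()
    while open_set:
        f, g, node, parent = heapq.heappop(open_set)
        if node in seen:
            continue
        seen.add(node)
        came[node] = parent
        if node == goal:                      # reconstruct the path
            path = [node]
            while came[path[-1]] is not None:
                path.append(came[path[-1]])
            return path[::-1]
        y, x = node
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and grid[ny, nx] == 0:
                step = 1.0 + (penalty if region[ny, nx] == 0 else 0.0)
                heapq.heappush(open_set, (g + step + heur((ny, nx)),
                                          g + step, (ny, nx), node))
    return None
```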
Categorical Relation-Preserving Contrastive Knowledge Distillation for Medical Image Classification
Medical images for training deep classification models are typically scarce,
making these deep models prone to overfitting the training data. Studies have
shown that knowledge distillation (KD), especially the mean-teacher framework,
which is more robust to perturbations, can help mitigate
the over-fitting effect. However, directly transferring KD from computer vision
to medical image classification yields inferior performance as medical images
suffer from higher intra-class variance and class imbalance. To address these
issues, we propose a novel Categorical Relation-preserving Contrastive
Knowledge Distillation (CRCKD) algorithm, which takes the commonly used
mean-teacher model as the supervisor. Specifically, we propose a novel
Class-guided Contrastive Distillation (CCD) module to pull closer positive
image pairs from the same class in the teacher and student models, while
pushing apart negative image pairs from different classes. With this
regularization, the feature distribution of the student model shows higher
intra-class similarity and inter-class variance. Besides, we propose a
Categorical Relation Preserving (CRP) loss to distill the teacher's relational
knowledge in a robust and class-balanced manner. With the contribution of the
CCD and CRP, our CRCKD algorithm can distill the relational knowledge more
comprehensively. Extensive experiments on the HAM10000 and APTOS datasets
demonstrate the superiority of the proposed CRCKD method.
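As a rough sketch of the class-guided contrastive component, the InfoNCE-style loss below pulls student embeddings toward same-class teacher embeddings and pushes them from other classes; it illustrates the idea rather than reproducing the paper's exact CCD and CRP formulations.

```python
import torch
import torch.nn.functional as F

def ccd_loss(stu_feat, tea_feat, labels, tau=0.1):
    """stu_feat, tea_feat: (N, D) embeddings; labels: (N,) class ids."""
    s = F.normalize(stu_feat, dim=1)
    t = F.normalize(tea_feat, dim=1).detach()     # mean-teacher is frozen
    sim = s @ t.T / tau                           # (N, N) cross-model sims
    pos = labels.view(-1, 1).eq(labels.view(1, -1)).float()   # same class
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average log-likelihood of same-class teacher anchors per student
    return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()
```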
3D-LaneNet+: Anchor Free Lane Detection using a Semi-Local Representation
3D-LaneNet+ is a camera-based DNN method for anchor-free 3D lane detection
that is able to detect 3D lanes of arbitrary topology, such as splits and
merges, as well as short and perpendicular lanes. We follow the recently
proposed 3D-LaneNet and extend it to enable the detection of these previously
unsupported lane topologies. Our output representation is an anchor-free,
semi-local tile representation that breaks down lanes into simple lane segments
whose parameters can be learnt. In addition, we learn a feature embedding per
lane instance that reasons about the global connectivity of locally detected
segments to form full 3D lanes. This combination allows 3D-LaneNet+ to avoid
using lane anchors, non-maximum suppression, and lane model fitting as in the
original 3D-LaneNet. We demonstrate the efficacy of 3D-LaneNet+ using both
synthetic and real world data. Results show significant improvement relative to
the original 3D-LaneNet that can be attributed to better generalization to
complex lane topologies, curvatures and surface geometries.
Comment: arXiv admin note: substantial text overlap with arXiv:2003.0525
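A minimal sketch of what an anchor-free, semi-local tile head could look like; the per-tile channel layout (confidence, three segment parameters, an embedding) is an assumption loosely following the abstract, not the paper's exact design.

```python
import torch
import torch.nn as nn

class TileHead(nn.Module):
    """Each BEV tile predicts whether a lane segment passes through it, the
    segment's local geometry, and an embedding used to group tiles."""
    def __init__(self, in_ch, embed_dim=4):
        super().__init__()
        # per tile: 1 confidence + 3 segment params (lateral offset, yaw
        # angle, height) + embed_dim embedding channels
        self.conv = nn.Conv2d(in_ch, 1 + 3 + embed_dim, kernel_size=1)

    def forward(self, bev_feat):                  # (N, C, Ht, Wt) tile grid
        out = self.conv(bev_feat)
        conf = torch.sigmoid(out[:, :1])          # is a segment present?
        params = out[:, 1:4]                      # offset / angle / height
        embed = out[:, 4:]                        # clustered at inference
        return conf, params, embed
```

At inference, tiles with sufficient confidence would be grouped by clustering their embeddings (e.g., with mean shift) into lane instances, which is what lets the method dispense with lane anchors, non-maximum suppression, and lane model fitting.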
Model-Agnostic Defense for Lane Detection against Adversarial Attack
Susceptibility of neural networks to adversarial attack prompts serious
safety concerns for lane detection efforts, a domain where such models have
been widely applied. Recent work on adversarial road patches has successfully
induced perception of lane lines of arbitrary form, presenting an avenue for
rogue control of vehicle behavior. In this paper, we propose a modular lane
verification system that can catch such threats before the autonomous driving
system is misled while remaining agnostic to the particular lane detection
model. Our experiments show that implementing the system with a simple
convolutional neural network (CNN) can defend against a wide gamut of attacks
on lane detection models. With a 10% impact on inference time, we can detect
96% of bounded non-adaptive attacks, 90% of bounded adaptive attacks, and 98%
of patch attacks while preserving accurate identification of at least 95% of
true lanes, indicating that our proposed verification system is effective at
mitigating lane detection security risks with minimal overhead.
Comment: 6 pages, 6 figures, 3 tables. Part of the AutoSec 2021 proceedings.
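A minimal sketch of the verification idea, assuming detections are rendered into patches sampled along each reported lane; the architecture below is an illustrative small CNN, not the paper's exact network.

```python
import torch
import torch.nn as nn

class LaneVerifier(nn.Module):
    """Input: an image patch resampled along one detected lane curve.
    Output: probability that the detection corresponds to a true lane."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, lane_patch):
        return torch.sigmoid(self.net(lane_patch))

# Detections scoring below a threshold are rejected before they can steer
# the vehicle, independently of which lane detector produced them.
```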
A system of vision sensor based deep neural networks for complex driving scene analysis in support of crash risk assessment and prevention
To assist human drivers and autonomous vehicles in assessing crash risks,
driving scene analysis using dash cameras on vehicles and deep learning
algorithms is of paramount importance. Although these technologies are
increasingly available, driving scene analysis for this purpose still remains a
challenge. This is mainly due to the lack of annotated large image datasets for
analyzing crash risk indicators and crash likelihood, and the lack of an
effective method to extract the large amount of required information from
complex driving scenes. To fill this gap, this paper develops a scene analysis
system. The
Multi-Net of the system includes two multi-task neural networks that perform
scene classification to provide four labels for each scene. DeepLab v3 and
YOLO v3 are combined by the system to detect and locate risky pedestrians and
the nearest vehicles. All identified information can provide situational
awareness to autonomous vehicles or human drivers for identifying crash risks
from the surrounding traffic. To address the scarcity of annotated image
datasets for studying traffic crashes, two completely new datasets have been
developed by this paper and made available to the public; they have proved
effective in training the proposed deep neural networks. The paper further
evaluates the performance of the Multi-Net and the efficiency of the developed
system. Comprehensive scene analysis is further illustrated with representative
examples. Results demonstrate the effectiveness of the developed system and
datasets for driving scene analysis, and their supportiveness for crash risk
assessment and crash prevention.
Comment: 11 pages, 8 figures. Presented at the TRB conference.
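At the system level the flow could be sketched as below, with hypothetical wrappers standing in for the Multi-Net classifiers, DeepLab v3, and YOLO v3; the object fields (`cls`, `box`, `distance`) and the `on_road` check are illustrative assumptions, and the actual integration logic is not reproduced here.

```python
def analyze_scene(frame, multinet, deeplab, yolo):
    """Return scene labels plus risky-object information for one frame."""
    scene_labels = multinet(frame)      # four labels per scene
    seg_mask = deeplab(frame)           # semantic segmentation of the road
    objects = yolo(frame)               # detected pedestrians / vehicles
    # pedestrians standing on the drivable surface are flagged as risky
    risky_peds = [o for o in objects
                  if o.cls == 'pedestrian' and seg_mask.on_road(o.box)]
    # the nearest vehicle is reported for headway assessment
    nearest = min((o for o in objects if o.cls == 'vehicle'),
                  key=lambda o: o.distance, default=None)
    return scene_labels, risky_peds, nearest
```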
PolyLaneNet: Lane Estimation via Deep Polynomial Regression
One of the main factors that contributed to the large advances in autonomous
driving is the advent of deep learning. For safer self-driving vehicles, one of
the problems that has yet to be solved completely is lane detection. Since
methods for this task have to work in real time (30+ FPS), they not only have
to be effective (i.e., have high accuracy) but they also have to be efficient
(i.e., fast). In this work, we present a novel method for lane detection that
uses as input an image from a forward-looking camera mounted in the vehicle and
outputs polynomials representing each lane marking in the image, via deep
polynomial regression. The proposed method is shown to be competitive with
existing state-of-the-art methods in the TuSimple dataset while maintaining its
efficiency (115 FPS). Additionally, extensive qualitative results on two
additional public datasets are presented, along with a discussion of
limitations in the evaluation metrics used by recent works for lane detection.
Finally, we provide
source code and trained models that allow others to replicate all the results
shown in this paper, which is surprisingly rare in state-of-the-art lane
detection methods. The full source code and pretrained models are available at
https://github.com/lucastabelini/PolyLaneNet.
Comment: Accepted to ICPR 2020.
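A minimal sketch of the decoding step for deep polynomial regression, assuming the network emits four cubic coefficients plus vertical start/end bounds per lane; this layout loosely follows the paper's description and is not its exact output format.

```python
import torch

def decode_lanes(pred, img_h, n_points=40):
    """pred: (N, lanes, 6) = 4 cubic coefficients + (y_start, y_end) per
    lane, all normalised to [0, 1]. Returns x(y) samples (normalised;
    multiply by image width for pixels), y samples in pixels, and a
    validity mask restricting each lane to its vertical extent."""
    coeffs, bounds = pred[..., :4], pred[..., 4:]
    ys = torch.linspace(0, 1, n_points, device=pred.device)      # (P,)
    # Horner evaluation of a*y^3 + b*y^2 + c*y + d for every lane
    a, b, c, d = coeffs.unbind(-1)
    xs = ((a.unsqueeze(-1) * ys + b.unsqueeze(-1)) * ys
          + c.unsqueeze(-1)) * ys + d.unsqueeze(-1)              # (N, L, P)
    valid = (ys >= bounds[..., :1]) & (ys <= bounds[..., 1:])    # (N, L, P)
    return xs, ys * img_h, valid
```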