3,855 research outputs found
Classification of Occluded Objects using Fast Recurrent Processing
Recurrent neural networks are powerful tools for handling incomplete data
problems in computer vision, thanks to their significant generative
capabilities. However, the computational demand for these algorithms is too
high to work in real time, without specialized hardware or software solutions.
In this paper, we propose a framework for augmenting recurrent processing
capabilities into a feedforward network without sacrificing much from
computational efficiency. We assume a mixture model and generate samples of the
last hidden layer according to the class decisions of the output layer, modify
the hidden layer activity using the samples, and propagate to lower layers. For
visual occlusion problem, the iterative procedure emulates feedforward-feedback
loop, filling-in the missing hidden layer activity with meaningful
representations. The proposed algorithm is tested on a widely used dataset, and
shown to achieve 2 improvement in classification accuracy for occluded
objects. When compared to Restricted Boltzmann Machines, our algorithm shows
superior performance for occluded object classification.Comment: arXiv admin note: text overlap with arXiv:1409.8576 by other author
Fuzzy logic traffic signal controller enhancement based on aggressive driver behavior classification
The rise in population worldwide and especially in Egypt, together with the increase in the number of vehicles present serious complications regarding traffic congestion and road safety. The elementary solution towards improving congestion is to expand road capacities by building new lanes. This, however, requires time and effort and therefore new methodologies are being implemented. Intelligent transportation systems (ITS) try to approach traffic congestion through the application of computational and engineering techniques. Traffic signal control is a branch of intelligent transportation systems which focuses on improving traffic signal conditions. A traffic signal controllers’ main objective is to improve this assignment in a way which reduces delays. This research proposes a new approach to enhancing traffic signal control and reducing delays of a single intersection, through the integration of an aggressive driving behavior classifier. Previous approaches dealt with traffic control and driver behavior separately, and therefore their successful integration is a new challenging area in the field. Multiple experiment sets were conducted to provide an indication to the effectiveness of our approach. Firstly, an aggressive driver behavior classifier using feed-forward neural network was successfully built utilizing Virginia Tech 100-car naturalistic driving study data. Its performance was compared against long short-term memory recurrent neural networks and support vector machines, and it resulted in better performance as shown by the area under the curve. To the best of our knowledge, this classifier is the first of its kind to be built on this 100-car study data. Secondly, a representation of aggressive driving behavior was constructed in the simulated environment, based on real life data and statistics. Finally, Mamdani’s fuzzy logic controller was modified to accommodate for the integration of the aggressive behavior classifier. The integration results were encouraging and yielded significant improvements at higher traffic flow volumes when compared against the built Mamdani’s controller. The results are promising and provide an initial step towards the integration of driver behavior classification and traffic signal control
Online Domain Adaptation for Multi-Object Tracking
Automatically detecting, labeling, and tracking objects in videos depends
first and foremost on accurate category-level object detectors. These might,
however, not always be available in practice, as acquiring high-quality large
scale labeled training datasets is either too costly or impractical for all
possible real-world application scenarios. A scalable solution consists in
re-using object detectors pre-trained on generic datasets. This work is the
first to investigate the problem of on-line domain adaptation of object
detectors for causal multi-object tracking (MOT). We propose to alleviate the
dataset bias by adapting detectors from category to instances, and back: (i) we
jointly learn all target models by adapting them from the pre-trained one, and
(ii) we also adapt the pre-trained model on-line. We introduce an on-line
multi-task learning algorithm to efficiently share parameters and reduce drift,
while gradually improving recall. Our approach is applicable to any linear
object detector, and we evaluate both cheap "mini-Fisher Vectors" and expensive
"off-the-shelf" ConvNet features. We quantitatively measure the benefit of our
domain adaptation strategy on the KITTI tracking benchmark and on a new dataset
(PASCAL-to-KITTI) we introduce to study the domain mismatch problem in MOT.Comment: To appear at BMVC 201
Towards holistic scene understanding:Semantic segmentation and beyond
This dissertation addresses visual scene understanding and enhances
segmentation performance and generalization, training efficiency of networks,
and holistic understanding. First, we investigate semantic segmentation in the
context of street scenes and train semantic segmentation networks on
combinations of various datasets. In Chapter 2 we design a framework of
hierarchical classifiers over a single convolutional backbone, and train it
end-to-end on a combination of pixel-labeled datasets, improving
generalizability and the number of recognizable semantic concepts. Chapter 3
focuses on enriching semantic segmentation with weak supervision and proposes a
weakly-supervised algorithm for training with bounding box-level and
image-level supervision instead of only with per-pixel supervision. The memory
and computational load challenges that arise from simultaneous training on
multiple datasets are addressed in Chapter 4. We propose two methodologies for
selecting informative and diverse samples from datasets with weak supervision
to reduce our networks' ecological footprint without sacrificing performance.
Motivated by memory and computation efficiency requirements, in Chapter 5, we
rethink simultaneous training on heterogeneous datasets and propose a universal
semantic segmentation framework. This framework achieves consistent increases
in performance metrics and semantic knowledgeability by exploiting various
scene understanding datasets. Chapter 6 introduces the novel task of part-aware
panoptic segmentation, which extends our reasoning towards holistic scene
understanding. This task combines scene and parts-level semantics with
instance-level object detection. In conclusion, our contributions span over
convolutional network architectures, weakly-supervised learning, part and
panoptic segmentation, paving the way towards a holistic, rich, and sustainable
visual scene understanding.Comment: PhD Thesis, Eindhoven University of Technology, October 202
Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles
Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems.In this thesis, lidar-based obstacle detection and recognition in agricultural environments has been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles.For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen, as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal.The proposed methods serve as a first step towards full autonomy for agricultural vehicles. The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain, when accurate distinctions are made between obstacles and processable vegetation. Future research in the domain has further been facilitated with the release of the multi-modal obstacle dataset, FieldSAFE
- …