Application and evaluation of direct sparse visual odometry in marine vessels
With the international community pushing for a computer-vision-based alternative to the laws requiring a look-out on marine vehicles, there is now significant motivation to provide digital navigation solutions using these envisioned mandatory visual sensors. This paper explores the monocular direct sparse odometry algorithm applied to a typical marine environment. The method uses a single camera to estimate a vessel's motion and position over time, which are then compared to ground truth to establish feasibility as both a local and a global navigation system. While it was inconsistent in accurately estimating vessel position, it could consistently estimate the vessel's orientation in the majority of the situations the vessel was tasked with. It is therefore shown that monocular direct sparse odometry is partially suitable as a standalone navigation system and is a strong base for a multi-sensor solution.
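The feasibility comparison described above hinges on aligning a monocular trajectory, which is known only up to scale, with ground truth before measuring error. A minimal sketch of that evaluation step, using a Umeyama-style similarity alignment followed by the root-mean-square absolute trajectory error (the function name and array shapes are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def absolute_trajectory_error(estimated, ground_truth):
    """RMS translational error after aligning the estimated trajectory
    to ground truth with a similarity transform (rotation, translation,
    and scale -- scale is needed because monocular VO is scale-ambiguous)."""
    est = np.asarray(estimated, dtype=float)   # (N, 3) estimated positions
    gt = np.asarray(ground_truth, dtype=float)  # (N, 3) ground-truth positions
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g                # centre both trajectories
    U, S, Vt = np.linalg.svd(G.T @ E)           # cross-covariance SVD
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
    R = U @ D @ Vt                              # best-fit rotation
    s = np.trace(np.diag(S) @ D) / np.sum(E ** 2)  # best-fit scale
    aligned = s * (R @ E.T).T + mu_g            # estimated points in gt frame
    return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))
```

A perfectly similarity-transformed copy of the ground truth aligns exactly and yields an error near zero; a real monocular estimate leaves a residual that quantifies feasibility.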
OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving
Surround-view fisheye cameras are commonly deployed in automated driving for 360° near-field sensing around the vehicle. This work presents a multi-task visual perception network operating on unrectified fisheye images to enable the vehicle to sense its surrounding environment. It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection. We demonstrate that the jointly trained model performs better than the respective single-task versions. Our multi-task model has a shared encoder, providing a significant computational advantage, and synergized decoders where tasks support each other. We propose a novel camera-geometry-based adaptation mechanism to encode the fisheye distortion model both at training and inference time. This was crucial to enable training on the WoodScape dataset, comprised of data from different parts of the world collected by 12 different cameras mounted on three different cars with different intrinsics and viewpoints. Given that bounding boxes are not a good representation for distorted fisheye images, we also extend object detection to use a polygon with non-uniformly sampled vertices. We additionally evaluate our model on standard automotive datasets, namely KITTI and Cityscapes. We obtain state-of-the-art results on KITTI for the depth estimation and pose estimation tasks and competitive performance on the other tasks. We perform extensive ablation studies on various architecture choices and task-weighting methodologies. A short video at https://youtu.be/xbSjZ5OfPes provides qualitative results.
Comment: Camera-ready version accepted for RA-L and ICRA 2021 publication
Vehicle Distance Detection Using Monocular Vision and Machine Learning
With the development of new cutting-edge technology, autonomous vehicles (AVs) have become a central topic across the automotive industry. For an AV to be used safely on public roads, it needs to perceive its surrounding environment and make decisions in real time. A fully autonomous vehicle is not yet available for general public use, but advanced driver assistance systems (ADAS) have already been integrated into everyday vehicles, and these systems are predicted to evolve and combine into the fully autonomous vehicles of the future. This thesis focuses on combining ADAS with artificial intelligence (AI) models. Since neural networks (NNs) can be unpredictable on many occasions, the main aspect of this thesis is determining which neural network architecture is most accurate at perceiving the distance between vehicles, and hence studying whether AI can safely be used as a central processor for an AV. The ADAS created in this thesis relies on monocular vision and machine learning. A dataset of 200,000 images was used to train a neural network (NN) model that detects whether an image contains a license plate with 96.75% accuracy. A sliding window scans sub-sections of each image; when a sub-section is classified as a license plate, the algorithm stores that sub-image. The stored sub-images are run through a heatmap threshold to minimize false detections. Upon detecting a license plate, the final algorithm estimates the distance to the vehicle carrying that plate and outputs the result to the user. This process achieves distance estimates accurate to within 1 meter. This ADAS is intended to be usable by the public and easily integrated into future AV systems.
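The distance step described above reduces to similar triangles: given the camera focal length in pixels and the known physical width of a license plate, depth follows from the plate's detected pixel width. A minimal sketch (the 0.52 m EU plate width and all names are illustrative assumptions, not taken from the thesis):

```python
def distance_from_plate_width(focal_px, plate_width_m, plate_width_px):
    """Pinhole similar-triangles estimate: Z = f * W / w.

    focal_px       -- camera focal length in pixels
    plate_width_m  -- real-world plate width in metres (e.g. 0.52 for EU plates)
    plate_width_px -- width of the detected plate in the image, in pixels
    """
    return focal_px * plate_width_m / plate_width_px
```

For example, a 0.5 m plate spanning 50 pixels under a 1000-pixel focal length yields a 10 m estimate; accuracy degrades at long range, where a one-pixel detection error changes the estimate substantially.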
A Comparison of Monocular Visual SLAM and Visual Odometry Methods Applied to 3D Reconstruction
This work was supported by the SDAS Research Group (www.sdas-group.com accessed on 16 June 2023).
Pure monocular 3D reconstruction is a complex problem that has attracted the research community's interest due to the affordability and availability of RGB sensors. SLAM, VO, and SfM are disciplines formulated to solve the 3D reconstruction problem and estimate the camera's ego-motion, and many methods have been proposed. However, most of these methods have not been evaluated on large datasets under various motion patterns, have not been tested under the same metrics, and most have not been evaluated following a taxonomy, making their comparison and selection difficult. In this research, we compared ten publicly available SLAM and VO methods following a taxonomy, including one method for each category of the primary taxonomy, three machine-learning-based methods, and two updates of the best methods, to identify the advantages and limitations of each category and to test whether the addition of machine learning or updates to those methods improved them significantly. We evaluated each algorithm using the TUM-Mono dataset and benchmark, and performed an inferential statistical analysis to identify significant differences through its metrics. The results determined that the sparse-direct methods significantly outperformed the rest of the taxonomy, and that fusing them with machine learning techniques significantly enhanced the geometric-based methods' performance from different perspectives.
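The inferential analysis mentioned above typically reduces to a paired test on per-sequence metrics of two methods. A minimal stdlib-only sketch of a Wilcoxon signed-rank test with the large-sample normal approximation (the function name and interface are illustrative; the paper does not specify its exact test in this abstract):

```python
import math

def wilcoxon_signed_rank(errors_a, errors_b):
    """Paired Wilcoxon signed-rank test (normal approximation) on
    per-sequence error metrics of two methods. Returns (W+, p_two_sided).
    Suitable for roughly n >= 10 pairs; zero differences are discarded."""
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0                     # identical samples: no evidence
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                            # assign average ranks over ties
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mu = n * (n + 1) / 4                    # mean of W+ under H0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p = 1 - math.erf(abs(z) / math.sqrt(2))  # two-sided p-value
    return w_plus, p
```

If one method's per-sequence error is consistently higher, W+ concentrates at one extreme and the p-value falls below the usual significance threshold, which is the kind of evidence used to rank taxonomy categories.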
Surround-view Fisheye BEV-Perception for Valet Parking: Dataset, Baseline and Distortion-insensitive Multi-task Framework
Surround-view fisheye perception in valet parking scenes is fundamental and crucial in autonomous driving. Environmental conditions in parking lots, such as imperfect lighting and opacity, differ from those in common public datasets and substantially impact perception performance. Most existing networks trained on public datasets generalize suboptimally to these valet parking scenes and are further affected by fisheye distortion. In this article, we introduce a new large-scale fisheye dataset called the Fisheye Parking Dataset (FPD) to promote research on diverse real-world surround-view parking cases. Notably, our compiled FPD exhibits excellent characteristics for different surround-view perception tasks. In addition, we propose a real-time distortion-insensitive multi-task framework, the Fisheye Perception Network (FPNet), which improves surround-view fisheye BEV perception through an enhanced fisheye distortion operation and lightweight multi-task designs. Extensive experiments validate the effectiveness of our approach and the dataset's exceptional generalizability.
Comment: 12 pages, 11 figures