1,066 research outputs found

    Application and evaluation of direct sparse visual odometry in marine vessels

    Get PDF
    With the international community pushing for a computer vision based option to the laws requiring a look-out for marine vehicles, there is now a significant motivation to provide digital solutions for navigation using these envisioned mandatory visual sensors. This paper explores the monocular direct sparse odometry algorithm when applied to a typical marine environment. The method uses a single camera to estimate a vessel\u27s motion and position over time and is then compared to ground truth to establish feasibility as both a local and global navigation system. Whilst it was inconsistent in accurately estimating vessel position, it was found that it could consistently estimate the vessel\u27s orientation in the majority of the situations the vessel was tasked with. It is therefore shown that monocular direct sparse odometry is partially suitable as a standalone navigation system and is a strong base for a multi-sensor solution

    OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving

    Full text link
    Surround View fisheye cameras are commonly deployed in automated driving for 360\deg{} near-field sensing around the vehicle. This work presents a multi-task visual perception network on unrectified fisheye images to enable the vehicle to sense its surrounding environment. It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection. We demonstrate that the jointly trained model performs better than the respective single task versions. Our multi-task model has a shared encoder providing a significant computational advantage and has synergized decoders where tasks support each other. We propose a novel camera geometry based adaptation mechanism to encode the fisheye distortion model both at training and inference. This was crucial to enable training on the WoodScape dataset, comprised of data from different parts of the world collected by 12 different cameras mounted on three different cars with different intrinsics and viewpoints. Given that bounding boxes is not a good representation for distorted fisheye images, we also extend object detection to use a polygon with non-uniformly sampled vertices. We additionally evaluate our model on standard automotive datasets, namely KITTI and Cityscapes. We obtain the state-of-the-art results on KITTI for depth estimation and pose estimation tasks and competitive performance on the other tasks. We perform extensive ablation studies on various architecture choices and task weighting methodologies. A short video at https://youtu.be/xbSjZ5OfPes provides qualitative results.Comment: Camera ready version accepted for RA-L and ICRA 2021 publicatio

    Vehicle Distance Detection Using Monocular Vision and Machine Learning

    Get PDF
    With the development of new cutting-edge technology, autonomous vehicles (AVs) have become the main topic in the majority of the automotive industries. For an AV to be safely used on the public roads it needs to be able to perceive its surrounding environment and calculate decisions within real-time. A perfect AV still does not exist for the majority of public use, but advanced driver assistance systems (ADAS) have been already integrated into everyday vehicles. It is predicted that these systems will evolve to work together to become a fully AV of the future. This thesis’ main focus is the combination of ADAS with artificial intelligence (AI) models. Since neural networks (NNs) could be unpredictable at many occasions, the main aspect of this thesis is the research of which neural network architecture will be most accurate in perceiving distance between vehicles. Hence, the study of integration of ADAS with AI, and studying whether AI can safely be used as a central processor for AV needs resolution. The created ADAS in this thesis mainly focuses on using monocular vision and machine training. A dataset of 200,000 images was used to train a neural network (NN) model, which accurately detect whether an image is a license plate or not by 96.75% accuracy. A sliding window reads whether a sub-section of an image is a license plate; the process achieved if it is, and the algorithm stores that sub-section image. The sub-images are run through a heatmap threshold to help minimize false detections. Upon detecting the license plate, the final algorithm determines the distance of the vehicle of the license plate detected. It then calculates the distance and outputs the data to the user. This process achieves results with up to a 1-meter distance accuracy. This ADAS has been aimed to be useable by the public, and easily integrated into future AV systems

    A Comparison of Monocular Visual SLAM and Visual Odometry Methods Applied to 3D Reconstruction

    Get PDF
    This work was supported by the SDAS Research Group (www.sdas-group.com accessed on 16 June 2023).Pure monocular 3D reconstruction is a complex problem that has attracted the research community's interest due to the affordability and availability of RGB sensors. SLAM, VO, and SFM are disciplines formulated to solve the 3D reconstruction problem and estimate the camera's ego-motion; so, many methods have been proposed. However, most of these methods have not been evaluated on large datasets and under various motion patterns, have not been tested under the same metrics, and most of them have not been evaluated following a taxonomy, making their comparison and selection difficult. In this research, we performed a comparison of ten publicly available SLAM and VO methods following a taxonomy, including one method for each category of the primary taxonomy, three machine-learning-based methods, and two updates of the best methods to identify the advantages and limitations of each category of the taxonomy and test whether the addition of machine learning or updates on those methods improved them significantly. Thus, we evaluated each algorithm using the TUM-Mono dataset and benchmark, and we performed an inferential statistical analysis to identify the significant differences through its metrics. The results determined that the sparse-direct methods significantly outperformed the rest of the taxonomy, and fusing them with machine learning techniques significantly enhanced the geometric-based methods' performance from different perspectives.SDAS Research Grou

    Surround-view Fisheye BEV-Perception for Valet Parking: Dataset, Baseline and Distortion-insensitive Multi-task Framework

    Full text link
    Surround-view fisheye perception under valet parking scenes is fundamental and crucial in autonomous driving. Environmental conditions in parking lots perform differently from the common public datasets, such as imperfect light and opacity, which substantially impacts on perception performance. Most existing networks based on public datasets may generalize suboptimal results on these valet parking scenes, also affected by the fisheye distortion. In this article, we introduce a new large-scale fisheye dataset called Fisheye Parking Dataset(FPD) to promote the research in dealing with diverse real-world surround-view parking cases. Notably, our compiled FPD exhibits excellent characteristics for different surround-view perception tasks. In addition, we also propose our real-time distortion-insensitive multi-task framework Fisheye Perception Network (FPNet), which improves the surround-view fisheye BEV perception by enhancing the fisheye distortion operation and multi-task lightweight designs. Extensive experiments validate the effectiveness of our approach and the dataset's exceptional generalizability.Comment: 12 pages, 11 figure
    • …
    corecore