
    The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping

    Full text link
    Many tasks performed by autonomous vehicles, such as road marking detection, object tracking, and path planning, are simpler in a bird's-eye view. Hence, Inverse Perspective Mapping (IPM) is often applied to remove the perspective effect from a vehicle's front-facing camera and to remap its images into a 2D domain, resulting in a top-down view. Unfortunately, this leads to unnatural blurring and stretching of objects at further distances, due to the resolution of the camera, which limits applicability. In this paper, we present an adversarial learning approach for generating a significantly improved IPM from a single camera image in real time. The generated bird's-eye-view images contain sharper features (e.g. road markings) and a more homogeneous illumination, while (dynamic) objects are automatically removed from the scene, thus revealing the underlying road layout in an improved fashion. We demonstrate our framework using real-world data from the Oxford RobotCar Dataset and show that scene understanding tasks directly benefit from our boosted IPM approach. Comment: equal contribution of first two authors, 8 full pages, 6 figures, accepted at IV 201
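    As a point of reference for what this paper improves on, a minimal sketch of classical homography-based IPM is given below; the point correspondences, output size, and file names are placeholder assumptions rather than values from the paper.

        # Classical homography-based IPM: warp a front-facing camera image into a
        # top-down view. The four source/destination correspondences are placeholders;
        # in practice they come from the camera's extrinsic calibration.
        import cv2
        import numpy as np

        def classical_ipm(front_view, src_pts, dst_pts, out_size=(400, 600)):
            """Warp a front-facing camera image into a bird's-eye view."""
            H, _ = cv2.findHomography(np.float32(src_pts), np.float32(dst_pts))
            return cv2.warpPerspective(front_view, H, out_size)

        if __name__ == "__main__":
            img = cv2.imread("front_camera.png")  # hypothetical input frame
            # Image-plane corners of a known rectangular road patch (placeholder pixels).
            src = [(520, 460), (760, 460), (1100, 680), (180, 680)]
            # Where that patch should land in the bird's-eye image (pixels on a metric grid).
            dst = [(100, 0), (300, 0), (300, 600), (100, 600)]
            cv2.imwrite("ipm_output.png", classical_ipm(img, src, dst))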

    Improving Autonomous Vehicle Mapping and Navigation in Work Zones Using Crowdsourcing Vehicle Trajectories

    Full text link
    Prevalent solutions for Connected and Autonomous Vehicle (CAV) mapping include high-definition maps (HD maps) and real-time Simultaneous Localization and Mapping (SLAM). Both methods rely only on the vehicle itself (onboard sensors or embedded maps) and cannot adapt well to temporarily changed drivable areas such as work zones. Navigating CAVs in such areas relies heavily on how the vehicle defines drivable areas based on perception information. Improving perception accuracy and ensuring the correct interpretation of perception results are challenging in these situations. This paper presents a prototype that introduces crowdsourced trajectory information into the mapping process to enhance the CAV's understanding of the drivable area and traffic rules. A Gaussian Mixture Model (GMM) is applied to construct the temporarily changed drivable area and occupancy grid map (OGM) based on crowdsourced trajectories. The proposed method is compared with SLAM without any human driving information. Our method adapts well to the downstream path planning and vehicle control modules, and the CAV did not violate driving rules, which a pure SLAM method failed to achieve. Comment: Presented at TRBAM. Journal version in progress
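    A hedged sketch of the core idea follows: fit a Gaussian mixture to crowdsourced trajectory points and threshold its density into a coarse drivable-area grid. The grid resolution, component count, and density threshold below are illustrative assumptions, not values from the paper.

        # Fit a GMM to crowdsourced trajectory points and derive a drivable-area grid.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def drivable_grid(xy, x_range, y_range, cell=0.5, n_components=8, log_density_thresh=-4.0):
            """xy: (N, 2) trajectory samples in a local map frame (meters)."""
            gmm = GaussianMixture(n_components=n_components, covariance_type="full").fit(xy)
            xs = np.arange(*x_range, cell)
            ys = np.arange(*y_range, cell)
            gx, gy = np.meshgrid(xs, ys)
            cells = np.column_stack([gx.ravel(), gy.ravel()])
            log_density = gmm.score_samples(cells).reshape(gy.shape)
            return log_density > log_density_thresh        # True = likely drivable cell

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            # Synthetic trajectories following a laterally shifted lane through a work zone.
            t = rng.uniform(0.0, 50.0, size=(2000, 1))
            pts = np.hstack([t, 3.0 + 0.5 * np.sin(t / 10.0) + 0.3 * rng.standard_normal(t.shape)])
            grid = drivable_grid(pts, (0.0, 50.0), (0.0, 8.0))
            print("drivable cells:", int(grid.sum()), "of", grid.size)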

    Context Exploitation in Data Fusion

    Get PDF
    Complex and dynamic environments constitute a challenge for existing tracking algorithms. For this reason, modern solutions try to utilize any available information that could help to constrain, improve, or explain the measurements. So-called Context Information (CI) is understood as information that surrounds an element of interest, knowledge of which may help in understanding the (estimated) situation and in reacting to it. However, context discovery and exploitation are still largely unexplored research topics. Until now, context has been extensively exploited as a parameter in system and measurement models, which has led to the development of numerous approaches for linear or non-linear constrained estimation and target tracking. More specifically, spatial or static context is the most common source of ambient information, i.e. features, utilized for recursive enhancement of the state variables in either the prediction or the measurement update of the filters. In the case of multiple-model estimators, context can be related not only to the state but also to a certain mode of the filter. Common practice for multiple-model scenarios is to represent states and context as a joint distribution of Gaussian mixtures; these approaches are commonly referred to as joint tracking and classification. Alternatively, the usefulness of context has also been demonstrated in aiding measurement data association. The process of formulating a hypothesis that assigns a particular measurement to a track is traditionally governed by empirical knowledge of the noise characteristics of the sensors and the operating environment, i.e. the probability of detection, false alarms, and clutter noise, and it can be further enhanced by conditioning on context. We believe that interactions between the environment and the object can be classified into actions, activities, and intents, and formed into structured graphs with contextual links translated into arcs. By learning the environment model, we will be able to predict the target's future actions based on its past observations. The probability of the target's future action can be utilized in the fusion process to adjust the tracker's confidence in measurements. By incorporating contextual knowledge of the environment, in the form of a likelihood function, into the filter measurement update step, we have been able to reduce the uncertainties of the tracking solution and improve the consistency of the track. The promising results demonstrate that the fusion of CI brings a significant performance improvement in comparison to regular tracking approaches
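    The last point, feeding a contextual likelihood into the measurement update, can be illustrated with a small sketch: a particle-filter weight update in which the sensor likelihood is multiplied by a toy "on-road" context prior. The road model and noise values are illustrative assumptions, not the system described above.

        # Particle-filter measurement update that conditions on spatial context:
        # weights combine the sensor likelihood with an "on-road" likelihood.
        import numpy as np

        def context_likelihood(particles):
            """Toy spatial context: the target is far more likely to be on a road near y = 0."""
            on_road = np.abs(particles[:, 1]) < 2.0
            return np.where(on_road, 1.0, 0.05)

        def measurement_update(particles, weights, z, meas_std=1.5):
            """particles: (N, 2) positions; z: measured (x, y); returns re-normalized weights."""
            d = np.linalg.norm(particles - z, axis=1)
            sensor_likelihood = np.exp(-0.5 * (d / meas_std) ** 2)
            weights = weights * sensor_likelihood * context_likelihood(particles)
            return weights / weights.sum()

        if __name__ == "__main__":
            rng = np.random.default_rng(1)
            particles = rng.normal([10.0, 0.0], [5.0, 5.0], size=(500, 2))
            weights = np.full(500, 1.0 / 500)
            weights = measurement_update(particles, weights, z=np.array([12.0, 0.5]))
            print("posterior mean estimate:", (weights[:, None] * particles).sum(axis=0))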

    An Explicit Method for Fast Monocular Depth Recovery in Corridor Environments

    Full text link
    Monocular cameras are extensively employed in indoor robotics, but their performance in visual odometry, depth estimation, and related applications is limited due to the absence of scale information. Depth estimation refers to the process of estimating a dense depth map from the corresponding input image; existing research mostly addresses this issue through deep learning-based approaches, yet their inference speed is slow, leading to poor real-time capabilities. To tackle this challenge, we propose an explicit method for rapid monocular depth recovery specifically designed for corridor environments, leveraging the principles of nonlinear optimization. We adopt a virtual camera assumption to make full use of the prior geometric features of the scene. The depth estimation problem is transformed into an optimization problem by minimizing the geometric residual. Furthermore, a novel depth plane construction technique is introduced to categorize spatial points based on their possible depths, facilitating swift depth estimation in enclosed structural scenarios such as corridors. We also propose a new corridor dataset, named Corr_EH_z, which contains images of a variety of corridors captured by a UGV camera. An exhaustive set of experiments in different corridors demonstrates the efficacy of the proposed algorithm. Comment: 10 pages, 8 figures. arXiv admin note: text overlap with arXiv:2111.08600 by other authors
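    A hedged sketch of the general idea, not the paper's exact formulation: floor pixels are back-projected onto the corridor floor plane, and the pitch of a virtual level camera is estimated by nonlinear least squares on a geometric residual, here the known spacing of floor-tile joints. The intrinsics, camera height, pixel picks, and spacing are illustrative assumptions.

        # Depth of corridor floor pixels via ray/floor-plane intersection; the camera pitch
        # is recovered by minimizing a geometric residual with nonlinear least squares.
        import numpy as np
        from scipy.optimize import least_squares

        K = np.array([[600.0, 0.0, 320.0],
                      [0.0, 600.0, 240.0],
                      [0.0, 0.0, 1.0]])   # assumed pinhole intrinsics
        CAM_HEIGHT = 1.2                  # assumed camera height above the floor (m)

        def floor_depth(pixels, pitch):
            """Forward distance of floor pixels for a given camera pitch (rad)."""
            rays = np.linalg.inv(K) @ np.vstack([pixels.T, np.ones(len(pixels))])
            c, s = np.cos(pitch), np.sin(pitch)
            R = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])  # level the rays (sketch convention)
            r = R @ rays
            lam = CAM_HEIGHT / r[1]       # scale so each ray meets the plane y = CAM_HEIGHT
            return lam * r[2]

        def residual(pitch, pixels, joint_spacing):
            z = floor_depth(pixels, pitch[0])
            return np.diff(z) - joint_spacing    # successive joints should be evenly spaced

        if __name__ == "__main__":
            # Pixels of successive floor-tile joints along the corridor centerline (placeholders).
            px = np.array([[320.0, 453.0], [320.0, 429.0], [320.0, 410.0], [320.0, 396.0]])
            sol = least_squares(residual, x0=[0.0], args=(px, 0.6))   # 0.6 m spacing assumed
            print("estimated pitch (rad):", sol.x[0])
            print("recovered depths (m):", floor_depth(px, sol.x[0]))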

    Multi-Lane Perception Using Feature Fusion Based on GraphSLAM

    Full text link
    An extensive, precise, and robust recognition and modeling of the environment is a key factor for the next generation of Advanced Driver Assistance Systems and for the development of autonomous vehicles. In this paper, a real-time approach for the perception of multiple lanes on highways is proposed. Lane markings detected by camera systems and observations of other traffic participants provide the input data for the algorithm. The information is accumulated and fused using GraphSLAM, and the result constitutes the basis for a multilane clothoid model. To allow the incorporation of additional information sources, input data is processed in a generic format. Evaluation of the method is performed by comparing real data, collected with an experimental vehicle on highways, to a ground truth map. The results show that the ego and adjacent lanes are robustly detected with high quality up to a distance of 120 m. In comparison to serial lane detection, an increase in the detection range of the ego lane and a continuous perception of neighboring lanes are achieved. The method can potentially be utilized for the longitudinal and lateral control of self-driving vehicles.
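    The clothoid representation mentioned above can be sketched in a few lines: curvature varies linearly with arc length, and the centerline follows by numerically integrating the heading. The parameter values below are illustrative, not fitted to any data set.

        # Sample a clothoid lane model: kappa(s) = c0 + c1 * s, integrated in small steps.
        import numpy as np

        def clothoid_points(x0, y0, theta0, c0, c1, length, step=0.5):
            """Return (x, y) samples of a clothoid starting at (x0, y0) with heading theta0."""
            s = np.arange(0.0, length, step)
            theta = theta0 + c0 * s + 0.5 * c1 * s ** 2
            x = x0 + np.cumsum(step * np.cos(theta))
            y = y0 + np.cumsum(step * np.sin(theta))
            return np.column_stack([x, y])

        if __name__ == "__main__":
            ego_lane = clothoid_points(0.0, 0.0, 0.0, c0=1e-3, c1=-2e-5, length=120.0)
            left_lane = ego_lane + np.array([0.0, 3.5])   # crude parallel offset, 3.5 m lane width
            print("ego lane endpoint:", ego_lane[-1])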

    Online Monocular Lane Mapping Using Catmull-Rom Spline

    Full text link
    In this study, we introduce an online monocular lane mapping approach that relies solely on a single camera and odometry for generating spline-based maps. Our proposed technique models the lane association process as an assignment problem using a bipartite graph, and assigns weights to the edges by incorporating Chamfer distance, pose uncertainty, and lateral sequence consistency. Furthermore, we meticulously design control point initialization, spline parameterization, and optimization to progressively create, expand, and refine splines. In contrast to prior research that assessed performance using self-constructed datasets, our experiments are conducted on the openly accessible OpenLane dataset. The experimental outcomes reveal that our approach enhances lane association and odometry precision, as well as overall lane map quality. We have open-sourced our code for this project. Comment: Accepted by IROS202
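    A hedged sketch of the association step described above: the cost between a newly detected lane and an existing map spline is taken as a symmetric Chamfer distance between their sampled polylines, and matches are chosen by solving the bipartite assignment. The pose-uncertainty and lateral-sequence terms are omitted here, and the gating threshold is an assumption.

        # Lane association posed as bipartite assignment with a Chamfer-distance cost.
        import numpy as np
        from scipy.optimize import linear_sum_assignment

        def chamfer(a, b):
            """Symmetric Chamfer distance between two (N, 2) and (M, 2) point sets."""
            d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
            return d.min(axis=1).mean() + d.min(axis=0).mean()

        def associate(detections, map_lanes, gate=3.0):
            """Return (detection_idx, map_lane_idx) pairs whose cost passes the gate."""
            cost = np.array([[chamfer(det, lane) for lane in map_lanes] for det in detections])
            rows, cols = linear_sum_assignment(cost)
            return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < gate]

        if __name__ == "__main__":
            xs = np.linspace(0.0, 30.0, 30)
            map_lanes = [np.column_stack([xs, np.full_like(xs, y)]) for y in (0.0, 3.5)]
            detections = [np.column_stack([xs, 3.4 + 0.05 * np.sin(xs)]),    # matches map lane 1
                          np.column_stack([xs, 7.1 + 0.05 * np.cos(xs)])]    # unmatched: new lane
            print(associate(detections, map_lanes))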

    An Inverse Perspective Mapping Approach using Monocular Camera of Pepper Humanoid Robot to Determine the Position of Other Moving Robot in Plane

    Get PDF
    This article presents a method for determining the position of an object or moving robot in the plane while the camera is moving dynamically. An Inverse Perspective Mapping (IPM) approach has been embedded in a monocular camera on the head of the Pepper humanoid robot (SoftBank Robotics) for real-time position determination of another object or robot in the plane. While the Pepper head is moving, it is difficult to determine the position of, or distance to, objects in front of the robot with any degree of certainty. By applying IPM, a linear relationship between the IPM frame and the world frame becomes the key element for determining the position of an object while the head is static; when the head orientation changes, the IPM is modified to adapt the linear relationship between both frames. The proposed method is thus based on the extraction of an accurate bird's-eye view. The method includes Image Acquisition, IPM Filtering, a Detection Phase, Region of Interest Selection, and Pixel Remapping.
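    A hedged sketch of the geometric core, under assumed intrinsics and camera height rather than Pepper's calibrated values: a pixel is mapped to ground-plane coordinates by intersecting its viewing ray with the floor, and the mapping is simply recomputed with the current head pitch whenever the head moves.

        # Map an image pixel to floor coordinates; the mapping adapts to the head pitch.
        import numpy as np

        K = np.array([[550.0, 0.0, 320.0],
                      [0.0, 550.0, 240.0],
                      [0.0, 0.0, 1.0]])   # assumed pinhole intrinsics
        CAM_HEIGHT = 1.15                 # assumed height of the head camera above the floor (m)

        def pixel_to_ground(u, v, head_pitch):
            """Ground-plane (forward, lateral) position of pixel (u, v) for the given head pitch."""
            ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
            c, s = np.cos(head_pitch), np.sin(head_pitch)
            R = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])   # compensate head tilt
            r = R @ ray
            if r[1] <= 0:                 # ray never reaches the floor (at or above the horizon)
                return None
            lam = CAM_HEIGHT / r[1]
            return lam * r[2], lam * r[0]  # forward and lateral distance on the floor (m)

        if __name__ == "__main__":
            for pitch_deg in (0.0, 5.0, 10.0):       # same pixel under different head tilts
                print(pitch_deg, pixel_to_ground(320.0, 360.0, np.radians(pitch_deg)))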