42 research outputs found

    Dense Point-Cloud Representation of a Scene using Monocular Vision

    We present a three-dimensional (3-D) reconstruction system designed to support various autonomous navigation applications. The system presented focuses on the 3-D reconstruction of a scene using only a single moving camera. Utilizing video frames captured at different points in time allows us to determine the depths of a scene. In this way, the system can be used to construct a point-cloud model of its unknown surroundings. We present the step-by-step methodology and analysis used in developing the 3-D reconstruction technique. We present a reconstruction framework that generates a primitive point cloud, which is computed based on feature matching and depth triangulation analysis. To populate the reconstruction, we utilized optical flow features to create an extremely dense representation model. With the third algorithmic modification, we introduce the addition of the preprocessing step of nonlinear single-image super resolution. With this addition, the depth accuracy of the point cloud, which relies on precise disparity measurement, has significantly increased. Our final contribution is an additional postprocessing step designed to filter noise points and mismatched features unveiling the complete dense point-cloud representation (DPR) technique. We measure the success of DPR by evaluating the visual appeal, density, accuracy, and computational expense and compare with two state-of-the-art techniques
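
    The abstract notes that depth accuracy relies on precise disparity measurement. As a rough illustration only, and not the authors' method, the sketch below assumes the textbook rectified two-view relation Z = f * B / d, where the focal length f (pixels), baseline B (metres), and disparity d (pixels) are hypothetical values chosen for the example. It shows how a half-pixel disparity error shifts the triangulated depth.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in metres for a rectified two-view pair: Z = f * B / d.

    Assumption: the two frames are rectified and separated by a known,
    purely sideways baseline. A real monocular pipeline must also
    estimate the camera motion between frames.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical numbers: 700 px focal length, 0.1 m of camera motion.
exact = depth_from_disparity(700.0, 0.1, 7.0)   # disparity measured exactly
noisy = depth_from_disparity(700.0, 0.1, 6.5)   # same point, half-pixel error
depth_error_m = noisy - exact
```

    Because depth varies with the inverse of disparity, the same sub-pixel error costs more depth accuracy for distant points, which is one reason finer disparity measurement can help.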

    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Simultaneous Localization and Mapping (SLAM) consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?

    Soft, Round, High Resolution Tactile Fingertip Sensors for Dexterous Robotic Manipulation

    High resolution tactile sensors are often bulky and have shape profiles that make them awkward for use in manipulation. This becomes important when using such sensors as fingertips for dexterous multi-fingered hands, where boxy or planar fingertips limit the available set of smooth manipulation strategies. High resolution optical based sensors such as GelSight have until now been constrained to relatively flat geometries due to constraints on illumination geometry. Here, we show how to construct a rounded fingertip that utilizes a form of light piping for directional illumination. Our sensors can replace the standard rounded fingertips of the Allegro hand. They can capture high resolution maps of the contact surfaces, and can be used to support various dexterous manipulation tasks

    Precise Depth Image Based Real-Time 3D Difference Detection

    3D difference detection is the task of verifying whether the 3D geometry of a real object exactly corresponds to a 3D model of this object. This thesis introduces real-time 3D difference detection with a hand-held depth camera. In contrast to previous works, with the proposed approach, geometric differences can be detected in real time and from arbitrary viewpoints. Therefore, the scan position of the 3D difference detection can be changed on the fly, during the 3D scan. Thus, the user can move the scan position closer to the object to inspect details or to bypass occlusions. The main research questions addressed by this thesis are: Q1: How can 3D differences be detected in real time and from arbitrary viewpoints using a single depth camera? Q2: Extending the first question, how can 3D differences be detected with a high precision? Q3: Which accuracy can be achieved with concrete setups of the proposed concept for real time, depth image based 3D difference detection? This thesis answers Q1 by introducing a real-time approach for depth image based 3D difference detection. The real-time difference detection is based on an algorithm which maps the 3D measurements of a depth camera onto an arbitrary 3D model in real time by fusing computer vision (depth imaging and pose estimation) with a computer graphics based analysis-by-synthesis approach. Then, this thesis answers Q2 by providing solutions for enhancing the 3D difference detection accuracy, both by precise pose estimation and by reducing depth measurement noise. A precise variant of the 3D difference detection concept is proposed, which combines two main aspects. First, the precision of the depth camera’s pose estimation is improved by coupling the depth camera with a very precise coordinate measuring machine. Second, measurement noise of the captured depth images is reduced and missing depth information is filled in by extending the 3D difference detection with 3D reconstruction.
The accuracy of the proposed 3D difference detection is quantified by a quantitative evaluation. This provides an answer to Q3. The accuracy is evaluated both for the basic setup and for the variants that focus on a high precision. The quantitative evaluation using real-world data covers both the accuracy which can be achieved with a time-of-flight camera (SwissRanger 4000) and with a structured light depth camera (Kinect). With the basic setup and the structured light depth camera, differences of 8 to 24 millimeters can be detected from one meter measurement distance. With the enhancements proposed for precise 3D difference detection, differences of 4 to 12 millimeters can be detected from one meter measurement distance using the same depth camera. By solving the challenges described by the three research questions, this thesis provides a solution for precise real-time 3D difference detection based on depth images. With the approach proposed in this thesis, dense 3D differences can be detected in real time and from arbitrary viewpoints using a single depth camera. Furthermore, by coupling the depth camera with a coordinate measuring machine and by integrating 3D reconstruction in the 3D difference detection, 3D differences can be detected in real time and with a high precision

    CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation

    Realistic simulation is key to enabling safe and scalable development of self-driving vehicles. A core component is simulating the sensors so that the entire autonomy system can be tested in simulation. Sensor simulation involves modeling traffic participants, such as vehicles, with high quality appearance and articulated geometry, and rendering them in real time. The self-driving industry has typically employed artists to build these assets. However, this is expensive, slow, and may not reflect reality. Instead, reconstructing assets automatically from sensor data collected in the wild would provide a better path to generating a diverse and large set with good real-world coverage. Nevertheless, current reconstruction approaches struggle on in-the-wild sensor data, due to its sparsity and noise. To tackle these issues, we present CADSim, which combines part-aware object-class priors via a small set of CAD models with differentiable rendering to automatically reconstruct vehicle geometry, including articulated wheels, with high-quality appearance. Our experiments show our method recovers more accurate shapes from sparse data compared to existing approaches. Importantly, it also trains and renders efficiently. We demonstrate our reconstructed vehicles in several applications, including accurate testing of autonomy perception systems. Comment: CoRL 2022. Project page: https://waabi.ai/cadsim

    Augmentation of Visual Odometry using Radar

    As UAVs become viable for more applications, pose estimation continues to be critical. All UAVs need to know where they are at all times, in order to avoid disaster. However, in the event that UAVs are deployed in an area with poor visual conditions, such as in many disaster scenarios, many localization algorithms have difficulties working. This thesis presents VIL-DSO, a visual odometry method as a pose estimation solution, combining several different algorithms in order to improve pose estimation and provide metric scale. This thesis also presents a method for automatically determining an accurate physical transform between radar and camera data, which in turn allows for the projection of radar information into the image plane. Finally, this thesis presents EVIL-DSO, a method for localization that fuses visual-inertial odometry with radar information. The proposed EVIL-DSO algorithm uses radar information projected into the image plane in order to create a depth map for odometry to directly observe depth of features, which can then be used as part of the odometry algorithm to remove the need to perform costly depth estimations. Trajectory analysis of the proposed algorithm on outdoor data, compared to differential GPS data, shows that the proposed algorithm is more accurate in terms of root-mean-square error, as well as having a lower percentage of scale error. Runtime analysis shows that the proposed algorithm updates more frequently than other, similar, algorithms

    Neural radiance fields in the industrial and robotics domain: applications, research opportunities and use cases

    The proliferation of technologies, such as extended reality (XR), has increased the demand for high-quality three-dimensional (3D) graphical representations. Industrial 3D applications encompass computer-aided design (CAD), finite element analysis (FEA), scanning, and robotics. However, current methods employed for industrial 3D representations suffer from high implementation costs and reliance on manual human input for accurate 3D modeling. To address these challenges, neural radiance fields (NeRFs) have emerged as a promising approach for learning 3D scene representations based on provided training 2D images. Despite a growing interest in NeRFs, their potential applications in various industrial subdomains are still unexplored. In this paper, we deliver a comprehensive examination of NeRF industrial applications while also providing direction for future research endeavors. We also present a series of proof-of-concept experiments that demonstrate the potential of NeRFs in the industrial domain. These experiments include NeRF-based video compression techniques and using NeRFs for 3D motion estimation in the context of collision avoidance. In the video compression experiment, our results show compression savings up to 48% and 74% for resolutions of 1920x1080 and 300x168, respectively. The motion estimation experiment used a 3D animation of a robotic arm to train Dynamic-NeRF (D-NeRF) and achieved an average disparity-map peak signal-to-noise ratio (PSNR) of 23 dB and a structural similarity index measure (SSIM) of 0.97
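
    For readers unfamiliar with the PSNR figure quoted above, here is a minimal sketch of how PSNR is commonly computed between a reference map and an estimate. It is not the paper's evaluation code. The function name, the nested-list input format, and the default peak value of 255 (8-bit data) are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D maps.

    Inputs are lists of rows. max_value is the largest possible sample
    value (assumed here to be 255, i.e. 8-bit data).
    """
    ref = [v for row in reference for v in row]
    est = [v for row in estimate for v in row]
    if not ref or len(ref) != len(est):
        raise ValueError("inputs must be non-empty and the same size")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical maps
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 1x2 maps whose samples each differ by 10 grey levels.
example_db = psnr([[100, 100]], [[110, 90]])
```

    Higher values mean the estimate is closer to the reference, and identical inputs give an infinite PSNR.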

    From Capture to Display: A Survey on Volumetric Video

    Volumetric video, which offers immersive viewing experiences, is gaining increasing prominence. With its six degrees of freedom, it provides viewers with greater immersion and interactivity compared to traditional videos. Despite their potential, volumetric video services pose significant challenges. This survey conducts a comprehensive review of the existing literature on volumetric video. We first provide a general framework of volumetric video services, followed by a discussion on prerequisites for volumetric video, encompassing representations, open datasets, and quality assessment metrics. Then we delve into the current methodologies for each stage of the volumetric video service pipeline, detailing capturing, compression, transmission, rendering, and display techniques. Lastly, we explore various applications enabled by this pioneering technology and we present an array of research challenges and opportunities in the domain of volumetric video services. This survey aspires to provide a holistic understanding of this burgeoning field and shed light on potential future research trajectories, aiming to bring the vision of volumetric video to fruition. Comment: Submitte

    Plenoptic Signal Processing for Robust Vision in Field Robotics

    This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter which focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time-series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications