
    Design of an Active Stereo Vision 3D Scene Reconstruction System Based on the Linear Position Sensor Module

    Active vision systems and passive vision systems currently exist for three-dimensional (3D) scene reconstruction. Active systems use a laser that interacts with the scene. Passive systems implement stereo vision, using two cameras and geometry to reconstruct the scene. Each type of system has advantages and disadvantages in resolution, speed, and scene depth. It may be possible to combine the advantages of both systems as well as new hardware technologies such as position sensitive devices (PSDs) and field programmable gate arrays (FPGAs) to create a real-time, mid-range 3D scene reconstruction system. Active systems usually reconstruct long-range scenes so that a measurable amount of time can pass for the laser to travel to the scene and back. Passive systems usually reconstruct close-range scenes but must overcome the correspondence problem. If PSDs are placed in a stereo vision configuration and a laser is directed at the scene, the correspondence problem can be eliminated. The laser can scan the entire scene as the PSDs continually pick up points, and the scene can be reconstructed. By eliminating the correspondence problem, much of the computation time of stereo vision is removed, allowing larger scenes, possibly at mid-range, to be modeled. To give good resolution at a real-time frame rate, points would have to be recorded very quickly. PSDs are analog devices that give the position of a light spot and have very fast response times. The cameras in the system can be replaced by PSDs to help achieve real-time refresh rates and better resolution. A contribution of this thesis is to design a 3D scene reconstruction system by placing two PSDs in a stereo vision configuration and to use FPGAs to perform calculations to achieve real-time frame rates of mid-range scenes. The linear position sensor module (LPSM) made by Noah Corp is based on a PSD and outputs a position in terms of voltage. The LPSM is characterized for this application by testing it with different power lasers while also varying environment variables such as background light, scene type, and scene distance. It is determined that the LPSM is sensitive to red-wavelength lasers. When the laser is reflected off diffuse surfaces, it must output at least 500 mW to be picked up by the LPSM, and the scene must be within 15 inches, or the power intensity will not meet the intensity requirements of the LPSM. The establishment of these performance boundaries is a contribution of the thesis, along with characterizing and testing the LPSM as a vision sensor in the proposed scene reconstruction system. Once performance boundaries are set, the LPSM is used to model calibrated objects. LPSM sensitivity to power intensity changes seems to cause considerable error. The change in power appears to be a function of depth due to the dispersion of the laser beam. The model is improved by using a correction factor to find the position of the light spot. Using a better-focused laser may improve the results. Another option is to place two PSDs in the same configuration and test whether the intensity problem is intrinsic to all PSDs or unique to the LPSM.
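
    The triangulation step such a design relies on can be illustrated with a short sketch. The following is a minimal example, assuming a rectified parallel sensor configuration with a known baseline and focal length; the function and parameter names are illustrative and not taken from the thesis.

    ```python
    # Minimal sketch: triangulating the laser spot seen by two position
    # sensors (PSDs) in a rectified stereo configuration.  The geometry
    # (baseline, focal length) and the names are illustrative assumptions,
    # not values from the thesis.

    def triangulate_psd(x_left, x_right, baseline_m, focal_m):
        """Return (X, Z) of the laser spot in the left-sensor frame.

        x_left, x_right : spot positions on the two PSDs (metres, on-sensor)
        baseline_m      : distance between the two sensors (metres)
        focal_m         : effective focal length of each lens (metres)
        """
        disparity = x_left - x_right
        if disparity <= 0:
            raise ValueError("finite depth requires positive disparity")
        Z = focal_m * baseline_m / disparity   # depth from similar triangles
        X = x_left * Z / focal_m               # lateral position
        return X, Z

    # Example: 0.2 m baseline, 25 mm lenses, 1 mm disparity -> 5 m depth.
    print(triangulate_psd(0.003, 0.002, 0.20, 0.025))
    ```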

    Active Image-based Modeling with a Toy Drone

    Image-based modeling techniques can now generate photo-realistic 3D models from images. But it is up to users to provide high-quality images with good coverage and view overlap, which makes the data capturing process tedious and time consuming. We seek to automate data capturing for image-based modeling. The core of our system is an iterative linear method to solve the multi-view stereo (MVS) problem quickly and plan the Next-Best-View (NBV) effectively. Our fast MVS algorithm enables online model reconstruction and quality assessment to determine the NBVs on the fly. We test our system with a toy unmanned aerial vehicle (UAV) in simulated, indoor and outdoor experiments. Results show that our system improves the efficiency of data acquisition and ensures the completeness of the final model.
    Comment: To be published at the International Conference on Robotics and Automation 2018, Brisbane, Australia. Project Page: https://huangrui815.github.io/active-image-based-modeling/ The author's personal page: http://www.sfu.ca/~rha55
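
    As a rough illustration of the NBV idea, the sketch below scores candidate poses by how many poorly reconstructed points each would observe. This is a common greedy heuristic, not the iterative linear method proposed in the paper, and all names and thresholds are assumptions.

    ```python
    # Illustrative next-best-view (NBV) selection: pick the candidate pose
    # that covers the most points whose current reconstruction quality is
    # below a threshold.  Generic heuristic only, not the paper's method.
    import numpy as np

    def select_nbv(candidate_poses, point_quality, visibility):
        """candidate_poses : list of pose labels
        point_quality   : (N,) per-point quality score in [0, 1]
        visibility      : (len(candidates), N) bool, True if pose sees point
        """
        poorly_covered = point_quality < 0.5        # points that still need views
        gains = visibility[:, poorly_covered].sum(axis=1)
        return candidate_poses[int(np.argmax(gains))]

    poses = ["above", "north", "east"]
    quality = np.array([0.9, 0.2, 0.3, 0.8])
    vis = np.array([[1, 1, 0, 1], [0, 1, 1, 0], [1, 0, 0, 1]], dtype=bool)
    print(select_nbv(poses, quality, vis))          # -> "north"
    ```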

    Fast and Accurate Camera Covariance Computation for Large 3D Reconstruction

    Estimating uncertainty of camera parameters computed in Structure from Motion (SfM) is an important tool for evaluating the quality of the reconstruction and guiding the reconstruction process. Yet, the quality of the estimated parameters of large reconstructions has rarely been evaluated due to the computational challenges. We present a new algorithm which employs the sparsity of the uncertainty propagation and speeds the computation up about ten times with respect to previous approaches. Our computation is accurate and does not use any approximations. We can compute uncertainties of thousands of cameras in tens of seconds on a standard PC. We also demonstrate that our approach can be effectively used for reconstructions of any size by applying it to smaller sub-reconstructions.
    Comment: ECCV 201
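
    For context, camera covariances in SfM are conventionally obtained by inverting the Gauss-Newton information matrix, reduced onto the camera parameters with a Schur complement over the point parameters. The toy dense sketch below shows only that algebra; the paper's contribution is an exact but far faster sparse computation, which this example does not reproduce, and the Jacobian here is random for illustration.

    ```python
    # Toy illustration of where camera covariances come from in SfM: the
    # inverse of the Gauss-Newton information matrix J^T J, reduced onto the
    # camera block with a Schur complement over the point parameters.
    import numpy as np

    def camera_covariance(J_cam, J_pts, sigma=1.0):
        """J_cam: (m, c) Jacobian wrt camera params, J_pts: (m, p) wrt points."""
        A = J_cam.T @ J_cam                        # camera-camera block
        B = J_cam.T @ J_pts                        # camera-point block
        C = J_pts.T @ J_pts                        # point-point block
        reduced = A - B @ np.linalg.inv(C) @ B.T   # Schur complement
        return sigma**2 * np.linalg.inv(reduced)   # covariance of camera params

    rng = np.random.default_rng(0)
    J = rng.standard_normal((50, 9))               # 50 residuals, 6 cam + 3 point params
    print(camera_covariance(J[:, :6], J[:, 6:]).shape)   # (6, 6)
    ```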

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
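
    A minimal sketch of the event representation described above (timestamp, pixel location, polarity) and of the simplest way to accumulate events into an image-like frame; the class and function are illustrative, not from any particular camera SDK or the survey itself.

    ```python
    # Each event carries a timestamp, a pixel location and a polarity (sign
    # of the brightness change).  Summing polarities over a short time
    # window is the simplest way to turn events back into a 2-D array that
    # frame-based code can consume.
    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Event:
        t: float        # timestamp in seconds (microsecond resolution in practice)
        x: int          # pixel column
        y: int          # pixel row
        polarity: int   # +1 brightness increase, -1 decrease

    def accumulate(events, height, width, t_start, t_end):
        """Sum polarities of events inside [t_start, t_end) into a 2-D frame."""
        frame = np.zeros((height, width), dtype=np.int32)
        for e in events:
            if t_start <= e.t < t_end:
                frame[e.y, e.x] += e.polarity
        return frame

    events = [Event(0.0001, 3, 2, +1), Event(0.0002, 3, 2, +1), Event(0.0003, 7, 5, -1)]
    print(accumulate(events, 8, 8, 0.0, 0.001)[2, 3])   # -> 2
    ```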

    An intelligent real time 3D vision system for robotic welding tasks

    MARWIN is a top-level robot control system that has been designed for automatic robot welding tasks. It extracts welding parameters and calculates robot trajectories directly from CAD models, which are then verified by real-time 3D scanning and registration. MARWIN's 3D computer vision provides a user-centred robot environment in which a task is specified by the user by simply confirming and/or adjusting suggested parameters and welding sequences. The focus of this paper is on describing a mathematical formulation for fast 3D reconstruction using structured light, together with the mechanical design and testing of the 3D vision system, and on showing how such technologies can be exploited in robot welding tasks.
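
    Structured-light reconstruction of this kind typically reduces to intersecting a camera ray with a calibrated light plane. The sketch below shows that textbook construction under assumed intrinsics and plane parameters; it is not MARWIN's own formulation, and all numbers are illustrative.

    ```python
    # Generic ray-plane triangulation used by structured-light scanners: the
    # 3D point lies where the camera ray through a pixel meets the calibrated
    # light plane n.X + d = 0 (camera frame).  Illustrative values only.
    import numpy as np

    def ray_plane_point(pixel, K, plane_n, plane_d):
        """pixel: (u, v); K: 3x3 intrinsics; plane_n, plane_d: plane parameters."""
        ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])   # ray direction
        t = -plane_d / (plane_n @ ray)                                 # intersection depth
        return t * ray                                                 # 3D point, camera frame

    K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
    n, d = np.array([1.0, 0.0, -0.5]), 0.4        # example light-plane parameters
    print(ray_plane_point((400, 240), K, n, d))   # -> [0.1, 0.0, 1.0]
    ```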

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.

    LiveCap: Real-time Human Performance Capture from Monocular Video

    We present the first real-time human performance capture approach that reconstructs dense, space-time coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to background-subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting non-linear optimization problems per frame are solved with specially tailored data-parallel Gauss-Newton solvers. In order to achieve real-time performance of over 25 Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture. It yields accuracy comparable to off-line performance capture techniques while being orders of magnitude faster.
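
    For readers unfamiliar with the solver, a single Gauss-Newton step has the form delta = -(J^T J)^{-1} J^T r. The sketch below applies it to a toy curve-fitting problem; the paper's solvers are specially tailored, data-parallel GPU implementations of this idea, which this small dense NumPy version does not attempt to capture.

    ```python
    # Illustrative Gauss-Newton iteration for a generic residual function;
    # the update solves (J^T J) delta = -J^T r at each step.
    import numpy as np

    def gauss_newton_step(residual_fn, jacobian_fn, params):
        r = residual_fn(params)
        J = jacobian_fn(params)
        delta = np.linalg.solve(J.T @ J, -J.T @ r)
        return params + delta

    # Toy problem: fit y = a * exp(b * x) to samples generated with a=2, b=0.5.
    x = np.linspace(0, 1, 20)
    y = 2.0 * np.exp(0.5 * x)
    def residual(p): return p[0] * np.exp(p[1] * x) - y
    def jacobian(p):
        return np.stack([np.exp(p[1] * x), p[0] * x * np.exp(p[1] * x)], axis=1)

    p = np.array([1.0, 0.0])
    for _ in range(10):
        p = gauss_newton_step(residual, jacobian, p)
    print(p)    # converges near [2.0, 0.5]
    ```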