
    NOVEL DENSE STEREO ALGORITHMS FOR HIGH-QUALITY DEPTH ESTIMATION FROM IMAGES

    This dissertation addresses the problem of inferring scene depth from a collection of calibrated images taken from different viewpoints via stereo matching. Although it has been investigated for decades, depth from stereo remains a long-standing challenge and a popular research topic for several reasons. First, to be of practical use for real-time applications such as autonomous driving, accurate depth estimation in real time is of great importance and is one of the core challenges in stereo. Second, for applications such as 3D reconstruction and view synthesis, high-quality depth estimation is crucial to achieving photorealistic results; however, due to matching ambiguities, accurate dense depth estimates are difficult to obtain. Last but not least, most stereo algorithms rely on identifying corresponding points among images and only work effectively when scenes are Lambertian; for non-Lambertian surfaces, the brightness-constancy assumption is no longer valid. This dissertation contributes three novel stereo algorithms motivated by the specific requirements and limitations imposed by different applications. To address high-speed depth estimation from images, we present a stereo algorithm that achieves high-quality results while maintaining real-time performance. We introduce an adaptive aggregation step in a dynamic-programming framework: matching costs are aggregated in the vertical direction using a computationally expensive weighting scheme based on color and distance proximity. We exploit the vector-processing capability and parallelism of commodity graphics hardware to speed this process up by over two orders of magnitude. To address high-accuracy depth estimation, we present a stereo model that makes use of constraints from points with known depths, referred to in the stereo literature as Ground Control Points (GCPs).
Our formulation explicitly models the influence of GCPs in a Markov Random Field. A novel regularization prior is naturally integrated into a global inference framework in a principled way using Bayes' rule. Our probabilistic framework allows GCPs to be obtained from various modalities and provides a natural way to integrate information from multiple sensors. To address non-Lambertian reflectance, we introduce a new invariant for stereo correspondence that allows completely arbitrary scene reflectance (bidirectional reflectance distribution functions, or BRDFs). This invariant can be used to formulate a rank constraint on stereo matching when the scene is observed under several lighting configurations in which only the lighting intensity varies.
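The color-and-distance cost aggregation described in the first contribution can be sketched as follows. This is a minimal illustration in the spirit of adaptive support weights, not the dissertation's exact formulation; the function name, window radius, and gamma parameters are assumptions chosen for the example.

```python
import numpy as np

def aggregate_vertical(cost, image, radius=3, gamma_c=10.0, gamma_s=7.0):
    """Aggregate per-pixel matching costs along the vertical direction
    using adaptive support weights based on color and spatial proximity.

    cost  : (H, W) raw matching cost for one disparity hypothesis
    image : (H, W, 3) reference image used to compute color similarity
    NOTE: illustrative sketch only; parameters are assumed, not from the text.
    """
    H, W = cost.shape
    out = np.zeros_like(cost, dtype=np.float64)
    img = image.astype(np.float64)
    for y in range(H):
        num = np.zeros(W)
        den = np.zeros(W)
        for dy in range(-radius, radius + 1):
            yy = min(max(y + dy, 0), H - 1)
            dc = np.linalg.norm(img[yy] - img[y], axis=1)   # color proximity
            w = np.exp(-dc / gamma_c - abs(dy) / gamma_s)   # joint weight
            num += w * cost[yy]
            den += w
        out[y] = num / den
    return out
```

The per-pixel weight loop is the "computationally expensive" part the abstract refers to; it is embarrassingly parallel across pixels, which is why GPU vector processing yields the reported speedup.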

    Panoramic Stereovision and Scene Reconstruction

    With the advancement of research in robotics and computer vision, an increasing number of applications require understanding a scene in three dimensions, and a variety of systems have been deployed for this purpose. This thesis explores a novel 3D imaging technique based on catadioptric cameras in a stereoscopic arrangement. A secondary system stabilizes the setup in the event that the cameras are misaligned during operation. The system offers a stark advantage as a cost-effective alternative to current state-of-the-art 3D imaging systems. The compromise lies in the quality of depth estimation, which can be overcome with a different imager and calibration. The result is a panoramic disparity map generated by the system.

    Structured light assisted real-time stereo photogrammetry for robotics and automation. Novel implementation of stereo matching

    In this Master's thesis project, a novel implementation of a stereo-matching-based method is proposed, together with an exhaustive analysis of state-of-the-art algorithms in the field. Both standard and deep-learning-based methods have been extensively investigated to provide useful insights for the designed implementation. The work is structured as follows. First, a research phase was carried out to quickly test the proposed strategy. Subsequently, a first implementation of the algorithm was designed and tested on the Middlebury 2014 dataset, one of the most widely used datasets in computer vision. At this stage, numerous tests were completed and various changes were made to the algorithm pipeline to improve the final result. Finally, after that research phase, the actual method was designed and tested on real-environment images obtained from the stereo device developed by the company at which this work was produced. That stereo device is a fundamental element of the project: the designed algorithm is based on the data produced by its cameras. The main function of the system designed by LaDiMo is to make the stereo-matching-based procedure simultaneously fast and accurate, as one of the main requirements of the project was an algorithm with the potential for real-time performance. This was achieved by one of the two methods created: a lightweight implementation that strongly exploits the information coming from the LaDiMo device to provide accurate results while keeping the computational time short. At the end of this Master's thesis, images showing the main outcomes are presented, along with a discussion of further improvements planned for the project. The method implemented, being unoptimized, only demonstrates the potential for a real-time implementation, which could be achieved through an efficient refactoring of the main pipeline.
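For readers unfamiliar with the baseline the Middlebury benchmark is typically used to evaluate, a minimal stereo-matching example follows. This is a generic winner-take-all block-matching sketch, not the thesis's or LaDiMo's method; the function name and parameters are assumptions.

```python
import numpy as np

def sad_disparity(left, right, max_disp=16, window=5):
    """Basic winner-take-all block matching on a rectified grayscale pair:
    for each pixel, pick the disparity minimizing the sum of absolute
    differences (SAD) over a square window. Illustrative sketch only."""
    H, W = left.shape
    r = window // 2
    costs = np.full((max_disp, H, W), np.inf)
    L = left.astype(np.float64)
    R = right.astype(np.float64)
    for d in range(max_disp):
        # left pixel x corresponds to right pixel x - d on the same scanline
        diff = np.abs(L[:, d:] - R[:, : W - d])
        pad = np.pad(diff, r, mode="edge")
        sad = np.zeros_like(diff)
        for dy in range(-r, r + 1):        # box sum over the window
            for dx in range(-r, r + 1):
                sad += pad[r + dy : r + dy + diff.shape[0],
                           r + dx : r + dx + diff.shape[1]]
        costs[d, :, d:] = sad
    return np.argmin(costs, axis=0)        # (H, W) disparity map
```

Real implementations replace the inner box-sum loops with integral images or separable filters; the structure of the cost volume and the winner-take-all step are the same.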

    Automatic Dense 3D Scene Mapping from Non-overlapping Passive Visual Sensors for Future Autonomous Systems

    The ever-increasing demand for higher levels of autonomy in robots and vehicles means there is an ever greater need for such systems to be aware of their surroundings. Whilst solutions already exist for creating 3D scene maps, many are based on active scanning devices such as laser scanners and depth cameras that are expensive, unwieldy, or do not function well under certain environmental conditions. As a result, passive cameras are a favoured sensor due to their low cost, small size, and ability to work in a range of lighting conditions. In this work we address some of the remaining research challenges in 3D mapping around a moving platform. We build on prior work in dense stereo imaging and Stereo Visual Odometry (SVO), and extend Structure from Motion (SfM) to create a pipeline optimised for on-vehicle sensing. Using forward-facing stereo cameras, we apply state-of-the-art SVO and dense stereo techniques to map the scene in front of the vehicle. Given the significant amount of prior research in dense stereo, we addressed the issue of selecting an appropriate method by creating a novel evaluation technique. Visual 3D mapping of dynamic scenes from a moving platform results in duplicated scene objects, so we extend prior work on mapping by introducing a generalised dynamic-object-removal process. Unlike other approaches that rely on computationally expensive segmentation or detection, our method utilises existing data from the mapping stage and the findings from our dense stereo evaluation. We introduce a new SfM approach that exploits our platform's motion to create a novel dense mapping process that exceeds the 3D data generation rate of state-of-the-art alternatives. Finally, we combine dense stereo, SVO, and our SfM approach to automatically align point clouds from non-overlapping views into a rotationally and scale-consistent global 3D model.
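The dense-stereo stage of such a pipeline ultimately turns each disparity map into 3D points before alignment. A minimal sketch of that back-projection step, using the standard rectified-stereo relation Z = f·B/d, is shown below; the function name and parameters are assumptions, not this thesis's code.

```python
import numpy as np

def disparity_to_points(disp, focal_px, baseline_m, cx, cy):
    """Back-project a dense disparity map into a 3D point cloud using the
    standard rectified-stereo relation Z = f * B / d.
    Returns an (N, 3) array of points in the left camera frame.
    Illustrative sketch; parameter names are assumed."""
    ys, xs = np.nonzero(disp > 0)          # skip invalid (zero) disparities
    d = disp[ys, xs].astype(np.float64)
    Z = focal_px * baseline_m / d          # depth from disparity
    X = (xs - cx) * Z / focal_px           # pinhole back-projection
    Y = (ys - cy) * Z / focal_px
    return np.column_stack([X, Y, Z])
```

Per-frame clouds produced this way are then registered (here, via SVO poses and the SfM alignment the abstract describes) into the global model.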

    Tuning the Computational Effort: An Adaptive Accuracy-aware Approach Across System Layers

    This thesis introduces a novel methodology for realizing accuracy-aware systems, helping designers integrate accuracy awareness into their designs. It proposes an adaptive accuracy-aware approach that spans system layers, addressing current challenges in the domain by combining and tuning accuracy-aware methods on different layers. To widen the scope of accuracy-aware computing, including approximate computing, to other domains, the thesis presents innovative accuracy-aware methods and techniques for different system layers. The required tuning is handled by a configuration layer that adjusts the available knobs of the accuracy-aware methods integrated into a system.

    Vertical Optimizations of Convolutional Neural Networks for Embedded Systems

    The abstract is in the attachment.

    Object Detection with Deep Learning to Accelerate Pose Estimation for Automated Aerial Refueling

    Remotely piloted aircraft (RPAs) cannot currently refuel during flight because the latency between the pilot and the aircraft is too great to safely perform aerial refueling maneuvers. An automated aerial refueling (AAR) system removes this limitation by allowing the tanker to directly control the RPA. The first step in creating an AAR system is for the tanker to quickly find the relative position and orientation (pose) of the approaching aircraft. Previous work at AFIT demonstrates that stereo camera systems provide robust pose estimation capability. This thesis first extends that work by examining the effect of camera resolution on the quality of pose estimation. Next, it demonstrates a deep learning approach to accelerate the pose estimation process. The results show that this pose estimation process is precise and fast enough to safely perform AAR.

    Stereo uparivanje iz video isječka (Stereo Matching from a Video Clip)

    This paper proposes a novel method for stereo matching based on a combination of active and passive stereo 3D reconstruction approaches. A laser line is used to scan the reconstructed scene, and a stereo camera pair is used for image acquisition. Each image pixel is scanned at a specific time, so the intensity-over-time patterns of corresponding pixels are highly correlated. This yields a highly confident and accurate disparity map and also allows the reconstruction of poorly textured as well as extremely textured surfaces, which are very hard to handle with conventional passive stereo approaches. Occluded regions are also detected successfully. The method is not computationally intensive and can be used to turn a smartphone into a practical 3D scanner, as presented in this work.
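The core idea of matching pixels by their intensity-over-time profiles can be sketched as follows: normalize each pixel's temporal profile and take dot products along the scanline, so the best normalized cross-correlation gives the disparity. This is an illustrative reconstruction of the principle, not the paper's implementation; the function name and parameters are assumptions.

```python
import numpy as np

def temporal_match(left_seq, right_seq, max_disp=16):
    """For each pixel in the left view, find the pixel on the same rectified
    scanline in the right view whose intensity-over-time profile is most
    correlated (normalized cross-correlation), giving a disparity map.

    left_seq, right_seq : (T, H, W) image stacks captured while a laser
    line sweeps the scene. Illustrative sketch only."""
    T, H, W = left_seq.shape

    def normalize(seq):
        # zero-mean, unit-norm profiles so a dot product is an NCC score
        z = seq.astype(np.float64) - seq.mean(axis=0)
        n = np.linalg.norm(z, axis=0)
        return z / np.where(n > 0, n, 1.0)

    L, R = normalize(left_seq), normalize(right_seq)
    disp = np.zeros((H, W), dtype=int)
    best = np.full((H, W), -np.inf)
    for d in range(max_disp):
        # NCC between left pixel x and right pixel x - d, all pixels at once
        ncc = np.einsum('thw,thw->hw', L[:, :, d:], R[:, :, : W - d])
        improved = ncc > best[:, d:]
        disp[:, d:][improved] = d
        best[:, d:][improved] = ncc[improved]
    return disp
```

Because the laser sweep gives each scene point a distinctive temporal signature, this correlation is sharp even on textureless surfaces, which is exactly the advantage the abstract claims over passive matching.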

    Towards Real-time Remote Processing of Laparoscopic Video

    Laparoscopic surgery is a minimally invasive technique in which surgeons insert a small video camera into the patient's body to visualize internal organs and use small tools to perform procedures. However, the benefit of small incisions comes with the disadvantage of limited visualization of subsurface tissues. Image-guided surgery (IGS) uses pre-operative and intra-operative images to map subsurface structures and can reduce the limitations of laparoscopic surgery. One particular laparoscopic system is the daVinci-si robotic surgical vision system. Its video streams generate approximately 360 megabytes of data per second, demonstrating a trend toward increased data sizes in medicine, primarily due to higher-resolution video cameras and imaging equipment. Real-time processing of this large stream of data on a bedside PC, in a single- or dual-node setup, may be challenging, and a high-performance computing (HPC) environment is not typically available at the point of care. To process this data on remote HPC clusters at the typical rate of 30 frames per second (fps), each 11.9 MB (1080p) video frame must be processed by a server and returned within the time the frame is displayed, i.e. 1/30th of a second. The ability to acquire, process, and visualize data in real time is essential for the performance of complex tasks as well as for minimizing risk to the patient. We have implemented and compared the performance of compression, segmentation, and registration algorithms on Clemson's Palmetto supercomputer using dual Nvidia graphics processing units (GPUs) per node and the compute unified device architecture (CUDA) programming model. We developed three separate applications that run simultaneously: video acquisition, image processing, and video display. The image processing application allows several algorithms to run simultaneously on different cluster nodes and transfers images through the message passing interface (MPI). Our segmentation and registration algorithms achieved acceleration factors of roughly 2 and 8 times, respectively. To achieve a higher frame rate, we also resized images, reducing the overall processing time. As a result, using a high-speed network to access GPU-equipped computing clusters and running these algorithms in parallel will improve surgical procedures by providing real-time processing of laparoscopic medical image data.
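The data rates quoted above can be checked with a short back-of-the-envelope calculation, assuming two (stereo) 1080p streams at 24 bits per pixel; the exact pixel format of the daVinci-si feed is an assumption here, chosen because it reproduces the quoted figures.

```python
# Back-of-the-envelope check of the quoted data rates, assuming two
# (stereo) 1080p streams at 3 bytes per pixel (an assumption).
width, height, bytes_per_px, channels = 1920, 1080, 3, 2
frame_bytes = width * height * bytes_per_px * channels
frame_mib = frame_bytes / 2**20     # ~11.9 MiB per stereo frame
fps = 30
rate_mib_s = frame_mib * fps        # ~356 MiB/s, matching the ~360 MB/s quoted
budget_ms = 1000 / fps              # ~33.3 ms to process and return each frame
print(round(frame_mib, 1), round(rate_mib_s), round(budget_ms, 1))
```

Under this assumption the per-frame figure (11.9 MB) and the aggregate rate (roughly 360 MB/s) are mutually consistent, and the 1/30 s round-trip budget follows directly from the display rate.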