28 research outputs found

    Correcting Decalibration of Stereo Cameras in Self-Driving Vehicles

    Full text link
    We address the problem of optical decalibration in mobile stereo camera setups, especially in the context of autonomous vehicles. In real-world conditions, an optical system is subject to various sources of anticipated and unanticipated mechanical stress (vibration, rough handling, collisions). Mechanical stress changes the geometry between the cameras that make up the stereo pair, and as a consequence the pre-calculated epipolar geometry is no longer valid. Our method is based on optimization of the camera geometry parameters and plugs directly into the output of the stereo matching algorithm. It is therefore able to recover calibration parameters from image pairs obtained with a decalibrated stereo system using minimal additional computing resources. The number of successfully recovered depth pixels serves as the objective function, which we aim to maximize. Our simulation confirms that the method can run continuously in parallel with stereo estimation and thus help keep the system calibrated in real time. Results confirm that the method recovers all the parameters except for the baseline distance, which scales the absolute depth readings. However, that scaling factor could be uniquely determined using any kind of absolute range-finding method (e.g., a single-beam time-of-flight sensor). Comment: 8 pages, 9 figures
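
    A minimal sketch of the idea behind this objective, using OpenCV and SciPy (the function and parameter names are illustrative assumptions, not the authors' code): a candidate inter-camera rotation is applied through stereo rectification, a block matcher computes disparity, and the fraction of pixels with a valid disparity is what an off-the-shelf optimizer then maximizes.

```python
import cv2
import numpy as np
from scipy.optimize import minimize

def valid_depth_fraction(rot_vec, K1, D1, K2, D2, T, img_left, img_right, matcher):
    """Objective: fraction of pixels with a valid disparity after rectifying the
    pair with a candidate inter-camera rotation (Rodrigues vector rot_vec)."""
    R, _ = cv2.Rodrigues(np.asarray(rot_vec, dtype=np.float64))
    size = (img_left.shape[1], img_left.shape[0])
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, size, R, T)
    m1x, m1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, size, cv2.CV_32FC1)
    m2x, m2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, size, cv2.CV_32FC1)
    rect_l = cv2.remap(img_left, m1x, m1y, cv2.INTER_LINEAR)
    rect_r = cv2.remap(img_right, m2x, m2y, cv2.INTER_LINEAR)
    # SGBM returns fixed-point disparities scaled by 16; invalid pixels are <= 0.
    disp = matcher.compute(rect_l, rect_r).astype(np.float32) / 16.0
    return float(np.count_nonzero(disp > 0)) / disp.size

# Hypothetical usage: maximize the valid-pixel fraction over the rotation offset.
# matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
# res = minimize(lambda r: -valid_depth_fraction(r, K1, D1, K2, D2, T, imL, imR, matcher),
#                x0=np.zeros(3), method="Nelder-Mead")
```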

    An Efficient Calibration Method for a Stereo Camera System with Heterogeneous Lenses Using an Embedded Checkerboard Pattern

    Get PDF
    We present two simple approaches to calibrate a stereo camera setup with heterogeneous lenses: a wide-angle fish-eye lens and a narrow-angle lens on the left and right sides, respectively. Instead of using a conventional black-and-white checkerboard pattern, we design an embedded checkerboard pattern by combining two differently colored patterns. In both approaches, we split the captured stereo images into RGB channels and extract the R and inverted G channels from the left and right camera images, respectively. In our first approach, we take the checkerboard pattern as the world coordinate system and calculate the left and right transformation matrices relative to it. We use these two transformation matrices to estimate the relative pose of the right camera by multiplying the inverse of the left transformation with the right one. In the second approach, we calculate a planar homography to identify common object points in the left-right image pairs and process them with the well-known Zhang camera calibration method. We analyze the robustness of the two approaches by comparing reprojection errors and image rectification results. Experimental results show that the second method is more accurate than the first
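
    A rough sketch of the first approach, assuming standard OpenCV calls and a pinhole camera with a standard distortion model (a true fish-eye lens would use the cv2.fisheye module); the names and composition convention are illustrative, not the paper's code: each camera's board-to-camera transform is estimated from its colour channel, and the right camera's relative pose follows by composing the right transform with the inverse of the left one.

```python
import cv2
import numpy as np

def relative_pose_from_checkerboard(img_left, img_right, K_l, D_l, K_r, D_r,
                                    pattern_size, square_size):
    # Channel separation: red channel for the left view, inverted green for the
    # right (the colour-embedded pattern makes each board visible per channel).
    gray_l = img_left[:, :, 2]            # OpenCV stores images as BGR
    gray_r = 255 - img_right[:, :, 1]
    ok_l, corners_l = cv2.findChessboardCorners(gray_l, pattern_size)
    ok_r, corners_r = cv2.findChessboardCorners(gray_r, pattern_size)
    if not (ok_l and ok_r):
        raise RuntimeError("checkerboard not detected in both views")
    # 3D board points in the checkerboard (world) coordinate system.
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0],
                           0:pattern_size[1]].T.reshape(-1, 2) * square_size
    # Pose of the board in each camera frame.
    _, rvec_l, tvec_l = cv2.solvePnP(objp, corners_l, K_l, D_l)
    _, rvec_r, tvec_r = cv2.solvePnP(objp, corners_r, K_r, D_r)
    T_l, T_r = np.eye(4), np.eye(4)
    T_l[:3, :3], _ = cv2.Rodrigues(rvec_l); T_l[:3, 3] = tvec_l.ravel()
    T_r[:3, :3], _ = cv2.Rodrigues(rvec_r); T_r[:3, 3] = tvec_r.ravel()
    # Relative pose of the right camera w.r.t. the left, composing
    # board-to-camera transforms: T_rel = T_r * inv(T_l).
    return T_r @ np.linalg.inv(T_l)
```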

    Laser and Camera Intercalibration Techniques for Multi-Sensorized Vehicles

    Get PDF
    This thesis addresses the extrinsic calibration of the active and passive sensors used on modern intelligent vehicles to obtain a rich perception of the surrounding environment. An in-depth analysis of the intercalibration procedure was conducted with respect to data fusion accuracy. Several laser and camera intercalibration procedures are presented, and a new method based on a triangular calibration target is detailed. Finally, a calibration procedure is proposed and tested on different prototypes (e.g., the BRAiVE and VIAC vehicles) with different sensor suites

    Calibration of non-conventional imaging systems

    Get PDF

    Enhancing 3D Visual Odometry with Single-Camera Stereo Omnidirectional Systems

    Full text link
    We explore low-cost solutions for efficiently improving the 3D pose estimation of a single camera moving in an unfamiliar environment. The visual odometry (VO) task -- as it is called when computer vision is used to estimate egomotion -- is of particular interest to mobile robots as well as to humans with visual impairments. The payload capacity of small robots such as micro-aerial vehicles (drones) requires portable perception equipment, which is constrained by size, weight, energy consumption, and processing power. Using a single camera as the passive sensor for the VO task satisfies these requirements and motivates the solutions presented in this thesis. To achieve the portability goal with a single off-the-shelf camera, we take two approaches. The first, and the most extensively studied here, revolves around an unorthodox camera-mirror configuration (catadioptrics) achieving a stereo omnidirectional system (SOS). The second approach relies on expanding the visual features from the scene into higher dimensionalities to track the pose of a conventional camera in a photogrammetric fashion. The first goal entails several interdependent challenges, which we address as part of this thesis: SOS design, projection model, an adequate calibration procedure, and application to VO. We show several practical advantages of the single-camera SOS due to its complete 360-degree stereo views, which other conventional 3D sensors lack because of their limited field of view. Since our omnidirectional stereo (omnistereo) views are captured by a single camera, a truly instantaneous pair of panoramic images is available for 3D perception tasks. Finally, we address the VO problem as a direct multichannel tracking approach, which increases the pose estimation accuracy of the baseline method (i.e., using only grayscale or color information), with photometric error minimization at the heart of the "direct" tracking algorithm. Currently, this solution has been tested on standard monocular cameras, but it could also be applied to an SOS. We believe the challenges we have attempted to solve had not previously been considered with the level of detail needed to successfully perform VO with a single camera in both real-life and simulated scenes
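
    A toy sketch of a multichannel photometric residual (purely illustrative: it uses a 2D translational warp instead of the full SE(3) warp described in the thesis, and assumes float images of shape H x W x C):

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.ndimage import map_coordinates

def multichannel_photometric_residual(p, ref, cur):
    """Photometric residuals stacked over all channels for a candidate warp p.
    Here p = (dx, dy) is a toy 2D translation standing in for an SE(3) warp."""
    h, w, c = ref.shape
    ys, xs = np.mgrid[0:h, 0:w]
    residuals = []
    for ch in range(c):
        # Sample the current image at the warped pixel locations (bilinear).
        warped = map_coordinates(cur[:, :, ch], [ys + p[1], xs + p[0]], order=1)
        residuals.append((warped - ref[:, :, ch]).ravel())
    return np.concatenate(residuals)

# Hypothetical usage with two float images ref_img and cur_img (H x W x C):
# est = least_squares(multichannel_photometric_residual, x0=np.zeros(2),
#                     args=(ref_img, cur_img))
```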

    Analysis of human motion with vision systems: kinematic and dynamic parameters estimation

    Get PDF
    This work presents a multi-camera motion capture system able to digitize, measure, and analyse human motion. A key feature of the system is an easily wearable garment printed with a colour-coded pattern. The pattern of coloured markers allows simultaneous reconstruction of the shape and motion of the subject. With the information gathered we can also estimate both kinematic and dynamic motion parameters. In the framework of this research we developed algorithms to design the colour-coded pattern, perform 3D shape reconstruction, estimate kinematic and dynamic motion parameters, and calibrate the multi-camera system. We paid particular attention to estimating the uncertainty of the kinematic parameters, also comparing the results with those obtained with commercial systems. The work also presents an overview of some real-world applications in which the developed system has been used as a measurement tool
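
    As a simple illustration of one kinematic parameter such a system can deliver (a hypothetical example, not the authors' pipeline), a joint angle can be computed from three reconstructed 3D marker positions:

```python
import numpy as np

def joint_angle(p_prox, p_joint, p_dist):
    """Joint angle (degrees) defined by three reconstructed 3D marker positions:
    proximal segment marker, joint centre, and distal segment marker."""
    u = p_prox - p_joint
    v = p_dist - p_joint
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

# e.g. an elbow angle from shoulder, elbow and wrist marker trajectories:
# angles = [joint_angle(s, e, w) for s, e, w in zip(shoulder, elbow, wrist)]
```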

    Low-cost techniques for patient positioning in percutaneous radiotherapy using an optical imaging system

    Get PDF
    Patient positioning is an important part of radiation therapy, which is one of the main treatments for malignant tissue in the human body. Currently, the most common patient positioning methods expose healthy tissue of the patient's body to additional dangerous radiation. Other non-invasive positioning methods are either not very accurate or too costly for an average hospital. In this thesis, we explore the possibility of developing a system comprised of affordable hardware and advanced computer vision algorithms that facilitates patient positioning. Our algorithms are based on the use of affordable RGB-D sensors, image features, ArUco planar markers, and other geometry registration methods. Furthermore, we take advantage of consumer-level computing hardware to make our systems widely accessible. More specifically, we avoid approaches that require dedicated GPU hardware for general-purpose computing, since they are more costly. In different publications, we explore the use of the mentioned tools to increase the accuracy of reconstruction and localization of the patient in their pose. We also take into account the visualization of the patient's target position with respect to their current position in order to assist the person who performs patient positioning. Furthermore, we make use of augmented reality in conjunction with a real-time 3D tracking algorithm for better interaction between the program and the operator. We also solve more fundamental problems concerning ArUco markers that could be used in the future to improve our systems. These include high-quality multi-camera calibration and mapping using ArUco markers, as well as detection of these markers in event cameras, which are very useful in the presence of fast camera movement. In the end, we conclude that it is possible to increase the accuracy of 3D reconstruction and localization by combining current computer vision algorithms and fiducial planar markers with RGB-D sensors. This is reflected in the low error we have achieved in our experiments for patient positioning, pushing forward the state of the art for this application
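
    A minimal sketch of ArUco marker detection and pose estimation with OpenCV (requires the aruco module from opencv-contrib and OpenCV >= 4.7 for the ArucoDetector class; the dictionary, marker size, and calibration inputs are illustrative assumptions, not the thesis implementation):

```python
import cv2
import numpy as np

def marker_poses(image, K, dist, marker_length=0.05):
    """Detect ArUco markers and estimate the pose of each one via solvePnP."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    corners, ids, _ = detector.detectMarkers(image)
    # 3D corners of a square marker of side marker_length, centred at the origin,
    # listed in the same order as the detected image corners (TL, TR, BR, BL).
    half = marker_length / 2.0
    obj = np.array([[-half,  half, 0], [ half,  half, 0],
                    [ half, -half, 0], [-half, -half, 0]], dtype=np.float32)
    poses = {}
    if ids is not None:
        for marker_id, c in zip(ids.ravel(), corners):
            ok, rvec, tvec = cv2.solvePnP(obj, c.reshape(-1, 2), K, dist)
            if ok:
                poses[int(marker_id)] = (rvec, tvec)
    return poses
```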

    Analysis and Modelling of Dynamic Three-Dimensional Scenes Using a Time-of-Flight Camera

    Get PDF
    Many applications in computer vision require the automatic analysis and reconstruction of static and dynamic scenes, which makes the automatic analysis of three-dimensional scenes an intensively investigated area. Most approaches focus on the reconstruction of rigid geometry, because the reconstruction of non-rigid geometry is far more challenging and requires that three-dimensional data be available at high frame rates. Rigid scene analysis is used, for example, in autonomous navigation, in surveillance, and for the conservation of cultural heritage. The analysis and reconstruction of non-rigid geometry, on the other hand, opens up many more possibilities, and not only for the above-mentioned applications. In the production of media content for television or cinema, the analysis, recording, and playback of full 3D content can be used to generate new views of real scenes or to replace real actors with animated artificial characters. The most important requirement for the analysis of dynamic content is the availability of reliable three-dimensional scene data. Stereo methods have mostly been used to compute the depth of scene points, but these methods are computationally expensive and do not provide sufficient quality in real time. In recent years the so-called Time-of-Flight cameras have left the prototype stage and are now capable of delivering dense depth information in real time at reasonable quality and price. This thesis investigates the suitability of these cameras for dynamic three-dimensional scene analysis. Before a Time-of-Flight camera can be used to analyze three-dimensional scenes it has to be calibrated internally and externally. Moreover, Time-of-Flight cameras suffer from systematic depth measurement errors due to their operating principle. This thesis proposes an approach to estimate all necessary parameters in a single calibration step. The reconstruction of rigid environments and objects is then investigated and solutions for these tasks are presented. The reconstruction of dynamic scenes and the generation of novel views of dynamic scenes are achieved by introducing a volumetric data structure that stores and fuses the depth measurements and their change over time. Finally, a Mixed Reality system is presented in which the contributions of this thesis are brought together. This system is able to combine real and artificial scene elements with correct mutual occlusion, mutual shadowing, and physical interaction. This thesis shows that Time-of-Flight cameras are, under certain conditions, a suitable choice for the analysis of rigid as well as non-rigid scenes. It contains important contributions to the necessary steps of calibration, preprocessing of depth data, and reconstruction and analysis of three-dimensional scenes
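
    A compact, generic sketch of volumetric depth fusion with a running weighted average over time (a KinectFusion-style TSDF update, offered purely to illustrate the concept; the grid layout, pinhole projection, and parameter names are assumptions, not the thesis implementation):

```python
import numpy as np

def fuse_depth_into_tsdf(tsdf, weights, depth, K, cam_pose, voxel_size, trunc=0.05):
    """One fusion step: integrate a depth image into a truncated signed distance
    volume using a running weighted average over time."""
    res = tsdf.shape
    flat_t = tsdf.reshape(-1).astype(np.float64).copy()
    flat_w = weights.reshape(-1).astype(np.float64).copy()
    # World coordinates of every voxel centre (grid axis-aligned at the origin).
    ii, jj, kk = np.meshgrid(*[np.arange(r) for r in res], indexing="ij")
    pts_w = np.stack([ii, jj, kk], axis=-1).reshape(-1, 3) * voxel_size
    # Transform voxel centres into the camera frame and project with pinhole K.
    T = np.linalg.inv(cam_pose)                       # world -> camera
    pts_c = pts_w @ T[:3, :3].T + T[:3, 3]
    z = pts_c[:, 2]
    front = z > 1e-6
    h, w = depth.shape
    u = np.full(z.shape, -1, dtype=int)
    v = np.full(z.shape, -1, dtype=int)
    u[front] = np.round(pts_c[front, 0] / z[front] * K[0, 0] + K[0, 2]).astype(int)
    v[front] = np.round(pts_c[front, 1] / z[front] * K[1, 1] + K[1, 2]).astype(int)
    valid = front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    # Signed distance along the viewing ray, truncated to [-trunc, trunc].
    sdf = np.zeros_like(z)
    sdf[valid] = depth[v[valid], u[valid]] - z[valid]
    upd = valid & (sdf > -trunc)
    new_vals = np.clip(sdf[upd] / trunc, -1.0, 1.0)
    # Running weighted average fuses successive frames and their change over time.
    flat_t[upd] = (flat_t[upd] * flat_w[upd] + new_vals) / (flat_w[upd] + 1.0)
    flat_w[upd] += 1.0
    return flat_t.reshape(res), flat_w.reshape(res)
```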

    Robot Assisted Object Manipulation for Minimally Invasive Surgery

    Get PDF
    Robotic systems have an increasingly important role in facilitating minimally invasive surgical treatments. In robot-assisted minimally invasive surgery, surgeons remotely control instruments from a console to perform operations inside the patient. However, despite the advanced technological status of surgical robots, fully autonomous systems with decision-making capabilities are not yet available. In 2017, Yang et al. proposed a structure to classify research efforts toward the autonomy achievable with surgical robots. Six levels were identified: no autonomy, robot assistance, task autonomy, conditional autonomy, high autonomy, and full autonomy. All the commercially available platforms in robot-assisted surgery are still at level 0 (no autonomy). Although increasing the level of autonomy remains an open challenge, its adoption could potentially introduce multiple benefits, such as decreasing surgeons’ workload and fatigue and achieving a consistent quality of procedures. Ultimately, allowing surgeons to interpret the ample and intelligent information from the system will enhance the surgical outcome and reflect positively on both patients and society. Three main capabilities are required to introduce automation into surgery: the surgical robot must move with high precision, have motion planning capabilities, and understand the surgical scene. Besides these main factors, depending on the type of surgery, other aspects might play a fundamental role, such as compliance and stiffness. This thesis addresses three technological challenges encountered when trying to achieve the aforementioned goals, in the specific case of robot-object interaction. First, how to overcome the inaccuracy of cable-driven systems when executing fine and precise movements. Second, how to plan different tasks in dynamically changing environments. Lastly, how the understanding of a surgical scene can be used to solve more than one manipulation task. To address the first challenge, a control scheme relying on accurate calibration is implemented to execute the pick-up of a surgical needle. Regarding the planning of surgical tasks, two approaches are explored: learning from demonstration to pick and place a surgical object, and a gradient-based approach to trigger a smoother object repositioning phase during intraoperative procedures. Finally, to improve scene understanding, this thesis focuses on developing a simulation environment where multiple tasks can be learned from the surgical scene and then transferred to the real robot. Experiments proved that automation of the pick-and-place task for different surgical objects is possible. The robot was able to autonomously pick up a suturing needle, position a surgical device for intraoperative ultrasound scanning, and manipulate soft tissue for intraoperative organ retraction. Although the automation of surgical subtasks has been demonstrated in this work, several challenges remain open, such as the ability of the developed algorithms to generalise over different environmental conditions and different patients

    User-oriented markerless augmented reality framework based on 3D reconstruction and loop closure detection

    Get PDF
    An augmented reality (AR) system needs to track the user's view to perform accurate augmentation registration. The present research proposes a conceptual markerless, natural-feature-based AR framework whose process is divided into two stages: an offline database training session for the application developers, and an online AR tracking and display session for the end users. In the offline session, two types of 3D reconstruction application, RGBD-SLAM and SfM, are integrated into the development framework for building the reference template of a target environment. The performance and applicable conditions of these two methods are presented in this thesis, and application developers can choose which method to apply depending on their development needs. A general development user interface is provided to the developer for interaction, including a simple GUI tool for augmentation configuration. The proposal also applies a Bag of Words strategy to enable rapid loop-closure detection in the online session, efficiently querying the user's view against the trained database to locate the user pose. The rendering and display of the augmentation is currently implemented within an OpenGL window, which is one aspect of the research worthy of future detailed investigation and development
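
    A bare-bones illustration of a Bag of Words retrieval step of this kind (SIFT features clustered with cv2.kmeans; the vocabulary size, similarity measure, and function names are assumptions for the sketch, not the framework's actual implementation):

```python
import cv2
import numpy as np

def build_vocabulary(train_images, k=200):
    """Offline session: cluster SIFT descriptors from the training images into
    k visual words (the vocabulary)."""
    sift = cv2.SIFT_create()
    descs = []
    for img in train_images:
        _, d = sift.detectAndCompute(img, None)
        if d is not None:
            descs.append(d)
    descs = np.vstack(descs).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 50, 1e-3)
    _, _, vocab = cv2.kmeans(descs, k, None, criteria, 3, cv2.KMEANS_PP_CENTERS)
    return vocab

def bow_histogram(img, vocab):
    """Describe an image as a normalized histogram of its nearest visual words."""
    sift = cv2.SIFT_create()
    _, d = sift.detectAndCompute(img, None)
    if d is None:
        return np.zeros(len(vocab), np.float32)
    dists = np.linalg.norm(d[:, None, :] - vocab[None, :, :], axis=2)
    hist = np.bincount(np.argmin(dists, axis=1), minlength=len(vocab)).astype(np.float32)
    return hist / (np.linalg.norm(hist) + 1e-9)

# Online session (hypothetical): the best-matching keyframe "closes the loop"
# and provides the coarse user pose used for augmentation registration.
# scores = [float(bow_histogram(query, vocab) @ h) for h in keyframe_histograms]
# best_keyframe = int(np.argmax(scores))
```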