
    Reliable localization methods for intelligent vehicles based on environment perception

    In the recent past, autonomous vehicles and Intelligent Transport Systems (ITS) were seen as a potential future of transportation. Today, thanks to the technological advances of recent years, the feasibility of such systems is no longer in question. Some of these autonomous driving technologies already share our roads, and commercial vehicles include more Advanced Driver-Assistance Systems (ADAS) every year. As a result, transportation is becoming more efficient and the roads are considerably safer. One of the fundamental pillars of an autonomous system is self-localization. An accurate and reliable estimate of the vehicle's pose in the world is essential for navigation. For outdoor vehicles, the Global Navigation Satellite System (GNSS) is the predominant localization system. However, these systems are far from perfect: their performance degrades in environments with limited satellite visibility, and their dependence on the environment can make them unreliable when it changes. Accordingly, the goal of this thesis is to exploit perception of the environment to enhance localization systems in intelligent vehicles, with special attention to their reliability. To this end, the thesis presents several contributions. First, a study on exploiting 3D semantic information in LiDAR odometry is presented, providing insights into how each type of element in the scene contributes to the odometry output. The experimental results were obtained on a public dataset and validated on a real-world platform. Second, a method to estimate the localization error from landmark detections is proposed, which is then exploited by a landmark placement optimization algorithm. This method, validated in a simulation environment, determines a set of landmarks such that the localization error never exceeds a predefined limit. Finally, a cooperative localization algorithm based on a Genetic Particle Filter is proposed, which uses vehicle detections to enhance the estimate provided by GNSS. The proposed method is validated with multiple experiments in different simulation environments.
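    The cooperative step described above can be pictured with a plain particle filter that re-weights pose hypotheses by both a GNSS fix and a range to a detected neighboring vehicle. The sketch below is illustrative only, assuming Gaussian noise models and made-up parameters; it is not the thesis's Genetic Particle Filter.

```python
import numpy as np

rng = np.random.default_rng(0)

def pf_update(particles, gnss_xy, gnss_std, neighbor_xy, measured_range, range_std):
    """One weight-and-resample step of a plain particle filter fusing a GNSS
    fix with a range measurement to a detected neighboring vehicle."""
    # Likelihood of each particle under the GNSS fix (isotropic Gaussian).
    d_gnss = np.linalg.norm(particles - gnss_xy, axis=1)
    weights = np.exp(-0.5 * (d_gnss / gnss_std) ** 2)
    # Likelihood under the measured range to the detected neighbor.
    d_neigh = np.linalg.norm(particles - neighbor_xy, axis=1)
    weights *= np.exp(-0.5 * ((d_neigh - measured_range) / range_std) ** 2)
    weights += 1e-12                      # guard against all-zero weights
    weights /= weights.sum()
    # Resample and add a little jitter to keep particle diversity.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx] + rng.normal(0.0, 0.2, particles.shape)

# Toy usage: true position near (10, 5); GNSS is noisy, a neighbor sits at (20, 5).
particles = rng.normal([10.0, 5.0], 3.0, size=(500, 2))
particles = pf_update(particles, gnss_xy=np.array([11.0, 4.5]), gnss_std=2.0,
                      neighbor_xy=np.array([20.0, 5.0]), measured_range=10.2,
                      range_std=0.5)
print(particles.mean(axis=0))             # fused position estimate
```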

    Evaluation of automated decisionmaking methodologies and development of an integrated robotic system simulation

    A generic computer simulation for manipulator systems (ROBSIM) was implemented, and the specific technologies necessary to increase the role of automation in various missions were developed. The specific items developed were: (1) capability for definition of a manipulator system consisting of multiple arms, load objects, and an environment; (2) capability for kinematic analysis, requirements analysis, and response simulation of manipulator motion; (3) postprocessing options such as graphic replay of simulated motion and manipulator parameter plotting; (4) investigation and simulation of various control methods including manual force/torque and active compliance control; (5) evaluation and implementation of three obstacle avoidance methods; (6) video simulation and edge detection; and (7) software simulation validation.
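    As a rough illustration of the kind of kinematic analysis such a simulation performs, the sketch below computes the forward kinematics and Jacobian of a planar two-link arm; the two-link structure and link lengths are assumptions for illustration, not ROBSIM's actual manipulator model.

```python
import numpy as np

def planar_2link_fk(theta1, theta2, l1=1.0, l2=0.8):
    """Forward kinematics of a planar two-link arm: joint angles -> end-effector (x, y)."""
    x = l1 * np.cos(theta1) + l2 * np.cos(theta1 + theta2)
    y = l1 * np.sin(theta1) + l2 * np.sin(theta1 + theta2)
    return np.array([x, y])

def planar_2link_jacobian(theta1, theta2, l1=1.0, l2=0.8):
    """Jacobian relating joint velocities to end-effector velocity."""
    s1, c1 = np.sin(theta1), np.cos(theta1)
    s12, c12 = np.sin(theta1 + theta2), np.cos(theta1 + theta2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

# Toy usage: end-effector position and Jacobian at a sample joint configuration.
print(planar_2link_fk(np.pi / 4, np.pi / 6))
print(planar_2link_jacobian(np.pi / 4, np.pi / 6))
```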

    External multi-modal imaging sensor calibration for sensor fusion: A review

    Multi-modal data fusion has gained popularity due to its diverse applications, leading to an increased demand for external sensor calibration. Although several calibration solutions have been proven, none fully satisfies all the evaluation criteria, including accuracy, automation, and robustness. This review therefore aims to contribute to this growing field by examining recent research on multi-modal imaging sensor calibration and proposing future research directions. The literature review comprehensively explains the characteristics and conditions of different multi-modal external calibration methods, including traditional motion-based calibration and feature-based calibration. Target-based calibration and targetless calibration, the two types of feature-based calibration, are discussed in detail. Furthermore, the paper highlights systematic calibration as an emerging research direction. Finally, the review identifies the crucial factors for evaluating calibration methods and provides a comprehensive discussion of their applications, with the aim of providing valuable insights to guide future research. Future research should focus primarily on the capability of online targetless calibration and systematic multi-modal sensor calibration.
    Ministerio de Ciencia, Innovación y Universidades | Ref. PID2019-108816RB-I0
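    For a concrete flavor of target-based extrinsic calibration, the sketch below estimates the rigid transform between two sensor frames from corresponding 3D points (for example, board corners observed by both a camera and a LiDAR) using the standard Kabsch/SVD solution; the point sets and the noise-free setup are assumptions for illustration only.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) with dst ≈ R @ src + t, computed
    with the Kabsch/SVD method from corresponding 3D points (rows of src, dst)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

# Toy usage: recover a known rotation/translation from 8 noiseless correspondences.
rng = np.random.default_rng(1)
pts_lidar = rng.uniform(-1, 1, size=(8, 3))
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
t_true = np.array([0.2, -0.1, 0.5])
pts_camera = pts_lidar @ R_true.T + t_true
R_est, t_est = rigid_transform(pts_lidar, pts_camera)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```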

    On-line control of active camera networks

    Large networks of cameras have been increasingly employed to capture dynamic events for tasks such as surveillance and training. When using active (pan-tilt-zoom) cameras to capture events distributed throughout a large area, human control becomes impractical and unreliable. This has led to the development of automated approaches for on-line camera control. I introduce a new approach that consists of a stochastic performance metric and a constrained optimization method. The metric quantifies the uncertainty in the state of multiple points on each target. It uses state-space methods with stochastic models of the target dynamics and camera measurements. It can account for static and dynamic occlusions, accommodate requirements specific to the algorithm used to process the images, and incorporate other factors that can affect its results. The optimization explores the space of camera configurations over time under constraints associated with the cameras, the predicted target trajectories, and the image processing algorithm. While an exhaustive exploration of this parameter space is intractable, through careful complexity analysis and application domain observations I have identified appropriate alternatives for reducing the space. Specifically, I reduce the spatial dimension of the search by dividing the optimization problem into subproblems, and then optimizing each subproblem independently. I reduce the temporal dimension of the search by using empirically-based heuristics inside each subproblem. The result is a tractable optimization that explores an appropriate subspace of the parameters, while attempting to minimize the risk of excluding the global optimum. The approach can be applied to conventional surveillance tasks (e.g., tracking or face recognition), as well as tasks employing more complex computer vision methods (e.g., markerless motion capture or 3D reconstruction). I present the results of experimental simulations of two such scenarios, using controlled and natural (unconstrained) target motions, employing simulated and real target tracks, in realistic scenes, and with realistic camera networks.
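    One simple way to picture such a stochastic performance metric is the trace of a Kalman-style posterior covariance computed for each candidate camera configuration, with the configuration chosen to minimize it. The sketch below uses that stand-in; the state, measurement models, and candidate set are assumptions, not the dissertation's actual metric or optimization.

```python
import numpy as np

def posterior_trace(P_prior, H, R):
    """Trace of the posterior covariance after a linear measurement update,
    used here as a simple uncertainty score (smaller is better)."""
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    P_post = (np.eye(P_prior.shape[0]) - K @ H) @ P_prior
    return np.trace(P_post)

# Toy usage: choose between two candidate camera "configurations", each modelled
# by a different measurement matrix and noise; pick the one minimizing the trace.
P = np.diag([4.0, 4.0, 1.0, 1.0])                 # prior covariance on [x, y, vx, vy]
candidates = {
    "cam_A": (np.array([[1.0, 0, 0, 0]]), np.array([[0.5]])),            # observes x only
    "cam_B": (np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0]]), 0.8 * np.eye(2)),  # observes x, y
}
best = min(candidates, key=lambda c: posterior_trace(P, *candidates[c]))
print(best)
```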

    Relating Multimodal Imagery Data in 3D

    This research develops and improves the fundamental mathematical approaches and techniques required to relate imagery and imagery-derived multimodal products in 3D. Image registration, in a 2D sense, will always be limited by the 3D effects of viewing geometry on the target. Therefore, effects such as occlusion, parallax, shadowing, and terrain/building elevation can often be mitigated with even a modest amount of 3D target modeling. Additionally, the imaged scene may appear radically different based on the sensed modality of interest; this is evident from the differences in visible, infrared, polarimetric, and radar imagery of the same site. This thesis develops a 'model-centric' approach to relating multimodal imagery in a 3D environment. By correctly modeling a site of interest, both geometrically and physically, it is possible to remove or mitigate some of the most difficult challenges associated with multimodal image registration. To accomplish this, the mathematical framework necessary to relate imagery to geometric models is thoroughly examined. Since geometric models may need to be generated to apply this 'model-centric' approach, this research develops methods to derive 3D models from imagery and LIDAR data. Of critical note is the implementation of complementary techniques for relating multimodal imagery that utilize the geometric model in concert with physics-based modeling to simulate scene appearance under diverse imaging scenarios. Finally, the often neglected final phase of mapping localized image registration results back to the world coordinate system model for data archival is addressed. In short, once a target site is properly modeled, both geometrically and physically, it is possible to orient the 3D model to the same viewing perspective as a captured image to enable proper registration. If done accurately, the synthetic model's physical appearance can simulate the imaged modality of interest while simultaneously removing the 3D ambiguity between the model and the captured image. Once registered, the captured image can then be archived as a texture map on the geometric site model. In this way, the 3D information that was lost when the image was acquired can be regained and properly related with other datasets for data fusion and analysis.
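    At the core of relating imagery to a geometric model is the camera model that maps model points to pixels. The sketch below shows a basic pinhole projection of 3D model points into an image; the intrinsic matrix and pose are assumed values for illustration, not those of any sensor discussed in the thesis.

```python
import numpy as np

def project_points(X_world, K, R, t):
    """Project 3D world points into pixel coordinates with a pinhole camera:
    x ~ K [R | t] X.  X_world is (N, 3); returns (N, 2) pixel coordinates."""
    X_cam = X_world @ R.T + t                  # world frame -> camera frame
    x = X_cam @ K.T                            # camera frame -> homogeneous pixels
    return x[:, :2] / x[:, 2:3]                # perspective division

# Toy usage: a 640x480 camera at the origin looking down the world z-axis.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
model_pts = np.array([[0.0, 0.0, 5.0], [0.5, -0.2, 4.0]])
print(project_points(model_pts, K, R, t))
```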

    Automated visual assembly inspection

    This chapter has discussed an intelligent assembly inspection system that uses a multiscale algorithm to detect errors in assemblies after being trained on synthetic CAD images of correctly assembled products. It was shown how the CAD information of an assembly, along with fast rendering techniques on specialized graphics machines, can be used to automate the placement of work-cell cameras and lights. The current emphasis in the manufacturing industry on concurrent engineering will only increase the value of this integration between a product's CAD model and its manufacturing inspection.
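    A minimal sketch of the multiscale comparison idea follows: the captured image is compared against a rendered CAD reference at several pyramid levels, where coarser levels suppress pixel noise while part-sized defects persist. The pyramid construction, threshold, and toy images are assumptions for illustration, not the chapter's actual algorithm.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (one crude pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def multiscale_error(captured, reference, levels=3, threshold=0.5):
    """Flag an assembly error if, at any pyramid level, some region of the
    captured image differs strongly from the rendered CAD reference."""
    for _ in range(levels):
        captured, reference = downsample(captured), downsample(reference)
        if np.max(np.abs(captured - reference)) > threshold:
            return True
    return False

# Toy usage: a missing part (bright square) is flagged; mild sensor noise is not.
rng = np.random.default_rng(0)
reference = np.zeros((64, 64))
reference[16:48, 16:48] = 1.0                       # the expected part
captured_ok = reference + rng.normal(0.0, 0.05, reference.shape)
captured_missing = np.zeros((64, 64))               # part absent
print(multiscale_error(captured_ok, reference), multiscale_error(captured_missing, reference))
```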

    GRASP News: Volume 9, Number 1

    The past year at the GRASP Lab has been an exciting and productive period. As always, innovation and technical advancement arising from past research have led to unexpected questions and fertile areas for new research. New robots, new mobile platforms, new sensors and cameras, and new personnel have all contributed to the breathtaking pace of change. Perhaps the most significant change is the trend towards multi-disciplinary projects, most notably the multi-agent project (see inside for details on this, and all the other new and on-going projects). This issue of GRASP News covers the developments for the year 1992 and the first quarter of 1993.

    Scene Reconstruction Beyond Structure-from-Motion and Multi-View Stereo

    Image-based 3D reconstruction has become a robust technology for recovering accurate and realistic models of real-world objects and scenes. A common pipeline for 3D reconstruction is to first apply Structure-from-Motion (SfM), which recovers relative poses for the input images and sparse geometry for the scene, and then apply Multi-view Stereo (MVS), which estimates a dense depth map for each image. While this two-stage process is quite effective in many 3D modeling scenarios, there are limits to what can be reconstructed. This dissertation focuses on three particular scenarios where the SfM+MVS pipeline fails and introduces new approaches to accomplish each reconstruction task. First, I introduce a novel method to recover dense surface reconstructions of endoscopic video. In this setting, SfM can generally provide sparse surface structure, but the lack of surface texture as well as complex, changing illumination often causes MVS to fail. To overcome these difficulties, I introduce a method that utilizes SfM both to guide surface reflectance estimation and to regularize shading-based depth reconstruction. I also introduce models of reflectance and illumination that improve the final result. Second, I introduce an approach for augmenting 3D reconstructions from large-scale Internet photo-collections by recovering the 3D position of transient objects --- specifically, people --- in the input imagery. Since no two images can be assumed to capture the same person in the same location, the typical triangulation constraints enjoyed by SfM and MVS cannot be directly applied. I introduce an alternative method to approximately triangulate people who stood in similar locations, aided by a height distribution prior and visibility constraints provided by SfM. The scale of the scene, gravity direction, and per-person ground-surface normals are also recovered. Finally, I introduce the concept of using crowd-sourced imagery to create living 3D reconstructions --- visualizations of real places that include dynamic representations of transient objects. A key difficulty here is that SfM+MVS pipelines often poorly reconstruct ground surfaces given Internet images. To address this, I introduce a volumetric reconstruction approach that leverages scene scale and person placements. Crowd simulation is then employed to add virtual pedestrians to the space and bring the reconstruction "to life."
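    The triangulation constraint at the heart of SfM and MVS can be shown in a few lines: given two camera projection matrices and the pixel observations of the same point, a linear (DLT) solve recovers its 3D position. The intrinsics and camera poses below are assumed for illustration only.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two camera projection
    matrices P1, P2 (3x4) and its pixel observations x1, x2 = (u, v)."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]               # dehomogenize

# Toy usage: two cameras 1 m apart along x, both looking down +z, observing (0.2, 0.1, 4).
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.2, 0.1, 4.0, 1.0])
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))    # approximately [0.2, 0.1, 4.0]
```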

    From motion capture to interactive virtual worlds : towards unconstrained motion-capture algorithms for real-time performance-driven character animation

    This dissertation takes performance-driven character animation as a representative application and advances motion-capture algorithms and animation methods to meet its high demands. Existing approaches either offer coarse resolution and a restricted capture volume, require expensive and complex multi-camera systems, or rely on intrusive suits and controllers. For motion capture, set-up time is reduced by using fewer cameras, accuracy is increased despite occlusions and general environments, initialization is automated, and free roaming is enabled by egocentric cameras. For animation, increased robustness enables the use of low-cost sensor input, custom control-gesture definition is guided to support novice users, and animation expressiveness is increased. The main contributions are: 1) an analytic and differentiable visibility model for pose optimization under strong occlusions, 2) a volumetric contour model for automatic actor initialization in general scenes, 3) a method to annotate and augment image-pose databases automatically, 4) the utilization of unlabeled examples for character control, and 5) the generalization and disambiguation of cyclical gestures for faithful character animation. In summary, the whole process of human motion capture, processing, and application to animation is advanced. These advances on the state of the art have the potential to improve many interactive applications, within and outside virtual reality.
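    To illustrate why a visibility term matters in pose optimization, the sketch below evaluates a reprojection objective in which joints flagged as occluded are down-weighted, so a grossly wrong 2D detection on an occluded joint barely affects the fit. The projection model, weights, and toy skeleton are assumptions for illustration; this is not the dissertation's analytic visibility model.

```python
import numpy as np

def project(X, f=500.0, c=(320.0, 240.0)):
    """Pinhole projection of 3D points (N, 3) to pixels, camera at the origin."""
    return f * X[:, :2] / X[:, 2:3] + np.array(c)

def weighted_reprojection_energy(translation, joints_local, detections_2d, visibility):
    """Visibility-weighted reprojection error: joints flagged as occluded
    (visibility near 0) contribute little to the fitting objective."""
    residual = project(joints_local + translation) - detections_2d
    return float(np.sum(visibility * np.sum(residual ** 2, axis=1)))

# Toy usage: 3 joints; the last one is occluded and its 2D detection is wrong.
joints = np.array([[0.0, 0.0, 0.0], [0.3, 0.1, 0.0], [-0.2, 0.4, 0.1]])
t_true = np.array([0.1, -0.2, 4.0])
detections = project(joints + t_true)
detections[2] += 80.0                         # gross error on the occluded joint
visibility = np.array([1.0, 1.0, 0.05])       # down-weight the occluded joint
print(weighted_reprojection_energy(t_true, joints, detections, visibility))        # lower
print(weighted_reprojection_energy(t_true + 0.3, joints, detections, visibility))  # higher
```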