
    Multi-task near-field perception for autonomous driving using surround-view fisheye cameras

    The formation of eyes led to the big bang of evolution. The dynamics changed from a primitive organism waiting for food to come into contact with it to organisms that actively seek out food using visual sensors. The human eye is one of the most sophisticated developments of evolution, but it still has defects. Over millions of years, humans have evolved a biological perception algorithm capable of driving cars, operating machinery, piloting aircraft, and navigating ships. Automating these capabilities for computers is critical for various applications, including self-driving cars, augmented reality, and architectural surveying. In the context of self-driving cars, near-field visual perception covers the environment in a range of 0-10 meters with 360° coverage around the vehicle. It is a critical decision-making component in the development of safer automated driving. Recent advances in computer vision and deep learning, in conjunction with high-quality sensors such as cameras and LiDARs, have fueled mature visual perception solutions; until now, however, far-field perception has been the primary focus. Another significant issue is the limited processing power available for developing real-time applications. Because of this bottleneck, there is frequently a trade-off between performance and run-time efficiency. To address these issues, we concentrate on the following topics: 1) developing near-field perception algorithms with high performance and low computational complexity for various visual perception tasks, such as geometric and semantic tasks, using convolutional neural networks; 2) using multi-task learning to overcome computational bottlenecks by sharing initial convolutional layers between tasks and developing optimization strategies that balance the tasks.
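
    As a sketch of the multi-task idea described above (a shared initial convolutional encoder feeding task-specific heads, with a weighted combination of task losses), the following PyTorch snippet illustrates the structure. All layer sizes, module names, and loss weights are illustrative assumptions rather than the architecture used in this work.

        # Minimal multi-task sketch: a shared convolutional encoder feeding
        # task-specific heads (e.g. a semantic and a geometric task).
        # All layer sizes and names are illustrative assumptions.
        import torch
        import torch.nn as nn

        class SharedEncoder(nn.Module):
            def __init__(self):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                )

            def forward(self, x):
                return self.features(x)

        class MultiTaskNet(nn.Module):
            def __init__(self, num_classes=10):
                super().__init__()
                self.encoder = SharedEncoder()                 # shared between tasks
                self.seg_head = nn.Conv2d(64, num_classes, 1)  # semantic task
                self.depth_head = nn.Conv2d(64, 1, 1)          # geometric task

            def forward(self, x):
                f = self.encoder(x)
                return self.seg_head(f), self.depth_head(f)

        # A simple fixed-weight combination of task losses; real task-balancing
        # strategies would adapt these weights during training.
        def total_loss(seg_loss, depth_loss, w_seg=1.0, w_depth=1.0):
            return w_seg * seg_loss + w_depth * depth_loss

        # Usage: one forward pass producing both task outputs.
        seg_out, depth_out = MultiTaskNet()(torch.randn(1, 3, 64, 64))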

    Autonomous navigation and mapping of mobile robots based on 2D/3D cameras combination

    Due to the steadily increasing demand for systems that support everyday living, there is currently great interest in autonomous systems. Autonomous systems are used in houses, offices, and museums as well as in factories. They are able to operate in several kinds of applications, such as cleaning, household assistance, transportation, security, education, and shop assistance, because they can be used to control the processing time and to provide precise and reliable output. One research field of autonomous systems is mobile robot navigation and map generation. That means the mobile robot should carry out its tasks autonomously while generating a map of the environment that it uses to navigate. The main issue is that the mobile robot has to explore an unknown environment and generate a three-dimensional map of it without any additional reference information. The mobile robot has to estimate its position and pose within that map, and it must be able to find distinguishable objects; therefore, the selected sensors and the registration algorithm play a significant role. Sensors that can provide both depth and image data are still deficient. A new 3D sensor, the Photonic Mixer Device (PMD), captures the surrounding scene in real time at a high frame rate and delivers both depth and grayscale data. However, higher-quality three-dimensional exploration of the environment requires surface details and textures that can only be obtained from a high-resolution CCD camera. This work therefore presents mobile robot exploration using the combination of a CCD and a PMD camera to create a three-dimensional map of the environment. In addition, a high-performance algorithm for real-time 3D mapping and pose estimation based on the Simultaneous Localization and Mapping (SLAM) technique is presented. The autonomously operating mobile robot should also handle tasks such as recognizing objects in its environment in order to accomplish various practical missions. Visual data from the CCD camera not only provides high-resolution texture for the depth data but is also used for object recognition. The Iterative Closest Point (ICP) algorithm uses two point clouds to determine the rotation and translation between two scans. Finally, the evaluation of the correspondences and the reconstruction of the map to represent the real environment are also covered in this thesis.
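
    The rotation-and-translation step at the heart of ICP can be sketched as follows: given corresponding points from two scans, the rigid transform that aligns them is recovered with an SVD-based (Kabsch) solution. This is a minimal illustration under that assumption, not the implementation from the thesis; a full ICP loop would also re-establish nearest-neighbour correspondences and iterate until convergence.

        # One alignment step of ICP: recover rotation R and translation t that
        # map src onto dst, given already-matched point pairs.
        import numpy as np

        def rigid_align(src, dst):
            """src, dst: (N, 3) arrays of corresponding 3D points."""
            src_c = src - src.mean(axis=0)
            dst_c = dst - dst.mean(axis=0)
            H = src_c.T @ dst_c                      # cross-covariance matrix
            U, _, Vt = np.linalg.svd(H)
            R = Vt.T @ U.T
            if np.linalg.det(R) < 0:                 # guard against reflections
                Vt[-1, :] *= -1
                R = Vt.T @ U.T
            t = dst.mean(axis=0) - R @ src.mean(axis=0)
            return R, t                              # dst ~= R @ src + t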

    Vision Sensors and Edge Detection

    The Vision Sensors and Edge Detection book reflects a selection of recent developments in the area of vision sensors and edge detection. The book has two sections. The first section presents vision sensors, with applications to panoramic vision sensors, wireless vision sensors, and automated vision sensor inspection; the second presents image processing techniques such as image measurements, image transformations, filtering, and parallel computing.
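
    As a minimal illustration of the kind of filtering-based edge detection the second section deals with, the following snippet applies Gaussian smoothing followed by the Canny detector using OpenCV; the file name and threshold values are placeholder assumptions.

        # Simple edge-detection example: smooth, then run the Canny detector.
        # "input.png" and the thresholds are placeholders.
        import cv2

        img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
        blurred = cv2.GaussianBlur(img, (5, 5), 1.4)       # suppress noise first
        edges = cv2.Canny(blurred, threshold1=50, threshold2=150)
        cv2.imwrite("edges.png", edges)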

    Multiperspective mosaics and layered representation for scene visualization

    This thesis documents the efforts made to implement multiperspective mosaicking for the purpose of mosaicking undervehicle and roadside sequences. For the undervehicle sequences, the goal is to create a large, high-resolution mosaic that may be used to quickly inspect the entire scene shot by a camera making a single pass underneath the vehicle. Several constraints are placed on the video data in order to support the assumption that the entire scene in the sequence lies on a single plane. Therefore, a single mosaic is used to represent a single video sequence, and phase correlation is used to perform motion analysis in this case. For the roadside video sequences, the scene is assumed to be composed of several planar layers rather than a single plane, and layer extraction techniques are implemented in order to perform this decomposition. Instead of phase correlation, the Lucas-Kanade motion tracking algorithm is used for motion analysis in order to create dense motion maps. Using these motion maps, spatial support for each layer is determined based on a pre-initialized layer model. By separating the pixels in the scene into motion-specific layers, it is possible to sample each element in the scene correctly while performing multiperspective mosaicking. It is also possible to fill in many gaps in the mosaics caused by occlusions, hence creating more complete representations of the objects of interest. The results are several mosaics, with each mosaic representing a single planar layer of the scene.
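
    The phase-correlation step used for motion analysis in the undervehicle case can be sketched as follows: the dominant translation between two frames shows up as a peak in the inverse FFT of the normalized cross-power spectrum. This is a minimal sketch assuming same-sized grayscale frames; a practical implementation would add windowing and sub-pixel peak refinement.

        # Estimate the dominant 2D shift between two frames via phase correlation.
        import numpy as np

        def phase_correlation(frame_a, frame_b):
            Fa = np.fft.fft2(frame_a)
            Fb = np.fft.fft2(frame_b)
            cross_power = Fa * np.conj(Fb)
            cross_power /= np.abs(cross_power) + 1e-12     # keep phase only
            corr = np.fft.ifft2(cross_power).real
            dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
            # Map the peak location to signed shifts (handle wrap-around).
            if dy > frame_a.shape[0] // 2:
                dy -= frame_a.shape[0]
            if dx > frame_a.shape[1] // 2:
                dx -= frame_a.shape[1]
            return dx, dy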

    Robust ego-localization using monocular visual odometry


    Automatic Dense 3D Scene Mapping from Non-overlapping Passive Visual Sensors for Future Autonomous Systems

    The ever-increasing demand for higher levels of autonomy for robots and vehicles means there is an ever greater need for such systems to be aware of their surroundings. Whilst solutions already exist for creating 3D scene maps, many are based on active scanning devices such as laser scanners and depth cameras that are either expensive, unwieldy, or do not function well under certain environmental conditions. As a result, passive cameras are a favoured sensor due to their low cost, small size, and ability to work in a range of lighting conditions. In this work we address some of the remaining research challenges within the problem of 3D mapping around a moving platform. We utilise prior work in dense stereo imaging and Stereo Visual Odometry (SVO) and extend Structure from Motion (SfM) to create a pipeline optimised for on-vehicle sensing. Using forward-facing stereo cameras, we apply state-of-the-art SVO and dense stereo techniques to map the scene in front of the vehicle. Given the significant amount of prior research in dense stereo, we address the issue of selecting an appropriate method by creating a novel evaluation technique. Visual 3D mapping of dynamic scenes from a moving platform results in duplicated scene objects, so we extend the prior work on mapping by introducing a generalized dynamic object removal process. Unlike other approaches that rely on computationally expensive segmentation or detection, our method utilises existing data from the mapping stage and the findings from our dense stereo evaluation. We introduce a new SfM approach that exploits our platform motion to create a novel dense mapping process that exceeds the 3D data generation rate of state-of-the-art alternatives. Finally, we combine dense stereo, SVO, and our SfM approach to automatically align point clouds from non-overlapping views and create a rotationally and scale-consistent global 3D model.
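
    The dense stereo stage described above can be sketched with OpenCV: a disparity map is computed from a rectified stereo pair and converted to metric depth via Z = f * B / d. The matcher settings, file names, focal length, and baseline below are placeholder assumptions, not values from this work.

        # Dense stereo sketch: disparity from a rectified pair, then depth.
        import cv2
        import numpy as np

        left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
        right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

        matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128,
                                        blockSize=5)
        disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM scale

        focal_px, baseline_m = 700.0, 0.12            # assumed calibration values
        valid = disparity > 0
        depth = np.zeros_like(disparity)
        depth[valid] = focal_px * baseline_m / disparity[valid]   # Z = f * B / d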

    Stereo-Camera–LiDAR Calibration for Autonomous Driving

    Perception is one of the key factors for successful self-driving. According to recent studies on perception development, 3D range scanners combined with stereo camera vision are the most utilized sensors in autonomous vehicle perception systems. To enable accurate perception, the sensors must be calibrated before the sensor data can be fused. Calibration minimizes measurement errors caused by the non-idealities of individual sensors and errors caused by the transformation between different sensor frames. This thesis presents camera-LiDAR calibration, synchronisation, and data fusion techniques. It can be argued that the quality of the data is more important to the calibration than the actual optimization algorithms; therefore, one challenge addressed in this thesis is accurate data collection with different calibration targets and result validation with different optimization algorithms. We estimate the effect of the vehicle windshield on camera calibration and show that the error caused by the windshield can be decreased by using more complex distortion models than the standard model. Synchronisation is required to ensure that the sensors provide measurements at the same time; the sensor data used in this thesis was synchronised using an external trigger signal from a GNSS receiver. The camera-LiDAR extrinsic calibration was performed using synchronised 3D-2D (LiDAR points and camera pixels) and 3D-3D (LiDAR points and stereo camera) point correspondences. This comparison demonstrates that the best method to estimate the camera-LiDAR extrinsic parameters is to use 3D-2D point correspondences. Moreover, a comparison between camera-based and LiDAR 3D reconstruction is presented. Because the sensors have different viewpoints, some data points are occluded; therefore, we propose a camera-LiDAR occlusion handling algorithm to remove occluded points. The quality of the calibration is demonstrated visually by fusing and aligning the LiDAR point cloud and the image.
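
    The 3D-2D approach mentioned above can be sketched as a Perspective-n-Point problem: LiDAR points on a calibration target and their corresponding image pixels are passed to a PnP solver to obtain the LiDAR-to-camera rotation and translation. The correspondence files and intrinsic parameters below are placeholder assumptions, not the data or calibration values from this thesis.

        # Camera-LiDAR extrinsics from 3D-2D correspondences via PnP.
        import cv2
        import numpy as np

        lidar_pts = np.load("target_points_lidar.npy")   # (N, 3) points in LiDAR frame
        pixel_pts = np.load("target_points_image.npy")   # (N, 2) matching image pixels
        K = np.array([[700.0, 0.0, 640.0],
                      [0.0, 700.0, 360.0],
                      [0.0, 0.0, 1.0]])                  # assumed camera intrinsics
        dist = np.zeros(5)                               # or the calibrated distortion

        ok, rvec, tvec = cv2.solvePnP(lidar_pts.astype(np.float32),
                                      pixel_pts.astype(np.float32), K, dist)
        R, _ = cv2.Rodrigues(rvec)                       # LiDAR -> camera rotation
        # [R | tvec] now maps LiDAR points into the camera frame.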

    Aircraft Attitude Estimation Using Panoramic Images

    This thesis investigates the problem of reliably estimating attitude from panoramic imagery in cluttered environments. Accurate attitude is an essential input to the stabilisation systems of autonomous aerial vehicles. A new camera system which combines a CCD camera, ultraviolet (UV) filters and a panoramic mirror-lens is designed. Drawing on biological inspiration from the ocelli organ possessed by certain insects, UV-filtered images are used to enhance the contrast between the sky and ground and mitigate the effect of the sun. A novel method for real-time horizon-based attitude estimation using panoramic images is developed that is capable of estimating an aircraft's pitch and roll at low altitude in the presence of the sun, clouds and occluding features such as trees and buildings. Also proposed is a new method for panoramic sky/ground thresholding, consisting of a horizon-tracking and a sun-tracking system, which works effectively even when the horizon line is difficult to detect by normal thresholding methods due to flares and other effects caused by the presence of the sun in the image. An algorithm is developed for estimating the attitude from a three-dimensional mapping of the horizon projected onto a 3D plane. The use of optic flow to determine pitch and roll rates is investigated using the panoramic image and the image interpolation algorithm (I2A). Two sensor fusion techniques, an Extended Kalman Filter (EKF) and Artificial Neural Networks (ANNs), are used to fuse unfiltered measurements from inertial sensors and the vision system. The EKF estimates the gyroscope biases as well as the attitude, while the ANN fuses the optic flow and horizon-based attitude to provide smooth attitude estimates. The results obtained from the different parts of the research are tested and validated through simulations and real flight tests.
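
    A toy illustration of the horizon-based idea on an ordinary perspective image (the thesis works with panoramic geometry and UV filtering, which this sketch does not reproduce): threshold the image into sky and ground, find the sky/ground boundary in each column, and fit a line whose slope gives the roll angle. The fixed threshold and the assumption that sky pixels are brighter are simplifications.

        # Toy horizon-based roll estimation on a perspective grayscale image.
        import numpy as np

        def estimate_roll(gray_image, sky_threshold=128):
            sky = gray_image > sky_threshold              # bright pixels treated as sky
            cols, boundary_rows = [], []
            for col in range(gray_image.shape[1]):
                ground_rows = np.flatnonzero(~sky[:, col])
                if ground_rows.size:                      # first ground pixel = horizon
                    cols.append(col)
                    boundary_rows.append(ground_rows[0])
            slope, _ = np.polyfit(cols, boundary_rows, 1) # fit the horizon line
            return np.degrees(np.arctan(slope))           # roll angle in degrees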