364 research outputs found

    Vision-Based Localization Algorithm Based on Landmark Matching, Triangulation, Reconstruction, and Comparison

    No full text
    Many generic position-estimation algorithms are vulnerable to ambiguity introduced by nonunique landmarks. In addition, the available high-dimensional image data is not fully used when these techniques are extended to vision-based localization. This paper presents the landmark matching, triangulation, reconstruction, and comparison (LTRC) global localization algorithm, which is reasonably immune to ambiguous landmark matches. It extracts natural landmarks for the (rough) matching stage before generating a list of possible position estimates through triangulation. Reconstruction and comparison then rank the possible estimates. The LTRC algorithm has been implemented in an interpreted language on a robot equipped with a panoramic vision system. Empirical data shows a remarkable improvement in accuracy when compared with the established random sample consensus method. LTRC is also robust against inaccurate map data.
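    The triangulation stage mentioned in the abstract admits a compact linear formulation. The following is a minimal sketch (not the paper's implementation) under the simplifying assumption that the robot's heading is known and absolute bearings to two uniquely matched landmarks with known map positions are available; all values are illustrative.

```python
import numpy as np

def triangulate_position(landmarks, bearings):
    """Estimate a 2D robot position from absolute bearings to known landmarks.

    Each bearing constrains the robot to lie on a line through the landmark:
    (L - p) is parallel to d, i.e. p_x*d_y - p_y*d_x = L_x*d_y - L_y*d_x.
    With two or more landmarks this is solved in a least-squares sense.
    """
    A, b = [], []
    for (lx, ly), theta in zip(landmarks, bearings):
        dx, dy = np.cos(theta), np.sin(theta)
        A.append([dy, -dx])
        b.append(lx * dy - ly * dx)
    p, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return p

# Illustrative map and measurements (hypothetical values).
landmarks = [(4.0, 6.0), (0.0, 5.0)]
bearings = [np.arctan2(4.0, 3.0), np.arctan2(3.0, -1.0)]  # robot at (1, 2)
print(triangulate_position(landmarks, bearings))          # approx. [1. 2.]
```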

    Intuitive 3D Maps for MAV Terrain Exploration and Obstacle Avoidance

    Get PDF
    Recent developments have shown that Micro Aerial Vehicles (MAVs) are nowadays capable of autonomously taking off at one point and landing at another using only a single camera as an exteroceptive sensor. During the flight and landing phases, however, the MAV and its user have little knowledge of the overall terrain and potential obstacles. In this paper we present a new solution for real-time dense 3D terrain reconstruction. This can be used for efficient unmanned MAV terrain exploration and yields a solid base for standard autonomous obstacle avoidance algorithms and path planners. Our approach is based on a textured 3D mesh built on sparse 3D point features of the scene. We use the same feature points to localize and control the vehicle in 3D space as we do for building the 3D terrain reconstruction mesh. This enables us to reconstruct the terrain without significant additional cost and thus in real time. Experiments show that the MAV is easily guided through an unknown, GPS-denied environment. Obstacles are recognized in the iteratively built 3D terrain reconstruction and are thus well avoided.
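    As a rough illustration of how a terrain mesh can be built on top of sparse 3D feature points (this is only a sketch of the general idea, not the authors' textured-mesh pipeline; the point data is synthetic), the points can be triangulated in the ground plane and the resulting facets treated as terrain surface:

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical sparse 3D feature points from a SLAM map (x, y, z).
points = np.random.default_rng(0).uniform(0.0, 10.0, size=(200, 3))

# Triangulate in the ground (x, y) plane; each simplex indexes three points
# that form one terrain facet of the mesh.
tri = Delaunay(points[:, :2])
faces = tri.simplices            # (n_faces, 3) vertex indices

# A path planner / obstacle-avoidance module could query the terrain facet
# under a position by locating its containing triangle.
query = np.array([[5.0, 5.0]])
face_id = tri.find_simplex(query)[0]
print("facet under query point:", faces[face_id])
```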

    Event-based Vision: A Survey

    Get PDF
    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location, and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available, and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
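    To make the event data model concrete, here is a minimal sketch (not tied to any particular camera or library from the survey) of how a stream of (t, x, y, polarity) events can be accumulated into a frame-like representation for downstream frame-based vision code:

```python
import numpy as np

def accumulate_events(events, height, width, t0, t1):
    """Sum event polarities per pixel over the time window [t0, t1).

    `events` is a structured array with fields t (seconds), x, y (pixels)
    and p (polarity, +1 or -1), mimicking the asynchronous output described
    above. The result is a signed 'event frame'.
    """
    frame = np.zeros((height, width), dtype=np.int32)
    sel = events[(events["t"] >= t0) & (events["t"] < t1)]
    np.add.at(frame, (sel["y"], sel["x"]), sel["p"])
    return frame

# Tiny illustrative stream: three events at microsecond-scale timestamps.
dtype = [("t", "f8"), ("x", "i4"), ("y", "i4"), ("p", "i4")]
events = np.array([(0.000001, 3, 2, +1),
                   (0.000004, 3, 2, +1),
                   (0.000009, 7, 5, -1)], dtype=dtype)
print(accumulate_events(events, height=10, width=10, t0=0.0, t1=0.00001))
```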

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

    Get PDF
    No abstract available

    Plane-based 3D Mapping for Structured Indoor Environment

    Get PDF
    Three-dimensional (3D) mapping deals with the problem of building a map of the unknown environments explored by a mobile robot. In contrast to 2D maps, 3D maps contain richer information about the visited places. Besides enabling robot navigation in 3D, a 3D map of the robot's surroundings can be of great importance for higher-level robotic tasks, like scene interpretation and object interaction or manipulation, as well as for visualization purposes in general, which are required in surveillance, urban search and rescue, surveying, and other applications. Hence, the goal of this thesis is to develop a system capable of reconstructing the surrounding environment of a mobile robot as a three-dimensional map. The Microsoft Kinect camera is a novel sensor that captures dense depth images along with RGB images at a high frame rate. It has recently dominated the stage of 3D robotic sensing, as it is low-cost and low-power. In this work, it is used as the exteroceptive sensor to obtain 3D point clouds of the surrounding environment, while the wheel odometry of the robot is used to initialize the search for correspondences between different observations. As a single 3D point cloud generated by the Microsoft Kinect sensor is composed of many tens of thousands of data points, it is necessary to compress the raw data to process them efficiently. The method chosen in this work is a feature-based representation which simplifies the 3D mapping procedure. The chosen features are planar surfaces and orthogonal corners, based on the fact that indoor environments are designed such that walls, ground floors, pillars, and other major parts of the building structure can be modeled as planar surface patches that are parallel or perpendicular to each other; orthogonal corners serve as higher-level features that are more distinguishable in indoor environments. The main idea in this thesis is to obtain spatial constraints between pairwise frames by building correspondences between the extracted vertical plane features and corner features. A plane matching algorithm is presented that maximizes a similarity metric between a pair of planes within a search space to determine correspondences between planes; the corner matching builds on the plane matching results. The estimated spatial constraints form the edges of a pose graph, referred to as the graph-based SLAM front-end. In order to build a map, however, a robot must be able to recognize places that it has previously visited. Limitations in sensor processing, coupled with environmental ambiguity, make this difficult. This thesis describes a loop closure detection algorithm that compresses point clouds into viewpoint feature histograms, inspired by their strong recognition ability. The estimated roto-translation between detected loop frames is added to the graph to represent this newly discovered constraint. Due to estimation errors, the estimated edges form a trajectory that is not globally consistent. With the aid of a linear pose graph optimization algorithm, the most likely configuration of the robot poses can be estimated given the edges of the graph, referred to as the SLAM back-end. Finally, the 3D map is retrieved by attaching each acquired point cloud to the corresponding pose estimate. The approach is validated through different experiments with a mobile robot in an indoor environment.
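    As an illustration of the kind of plane features described above (a generic least-squares sketch, not the thesis's extraction or matching algorithm; thresholds and the synthetic patch are illustrative), a planar patch can be summarized by a fitted normal and distance, and two such patches compared with a simple similarity test:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through a point patch: returns (unit normal, d)
    for the plane n . x = d."""
    centroid = points.mean(axis=0)
    # The right singular vector of the smallest singular value of the centered
    # points is the direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, float(normal @ centroid)

def plane_similarity(p1, p2, angle_tol=np.deg2rad(10.0), dist_tol=0.1):
    """Crude similarity test between two (normal, d) plane features."""
    (n1, d1), (n2, d2) = p1, p2
    same_dir = abs(n1 @ n2) > np.cos(angle_tol)
    close = abs(abs(d1) - abs(d2)) < dist_tol
    return same_dir and close

# Synthetic wall patch: the z = 0 plane plus a little sensor noise.
rng = np.random.default_rng(1)
patch = np.c_[rng.uniform(0, 2, 500), rng.uniform(0, 2, 500),
              rng.normal(0, 0.005, 500)]
print(fit_plane(patch))   # normal approx. (0, 0, +/-1), d approx. 0
```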

    Combined Learned and Classical Methods for Real-Time Visual Perception in Autonomous Driving

    Full text link
    Autonomy, robotics, and Artificial Intelligence (AI) are among the main defining themes of next-generation societies. Among the most important applications of these technologies is driving automation, which spans from different Advanced Driver Assistance Systems (ADAS) to fully self-driving vehicles. Driving automation promises to reduce accidents, increase safety, and increase access to mobility for more people, such as the elderly and the handicapped. However, one of the main challenges facing autonomous vehicles is robust perception, which enables safe interaction and decision making. Of the many sensors available to perceive the environment, each with its own capabilities and limitations, vision is by far one of the main sensing modalities: cameras are cheap and can provide rich information about the observed scene. Therefore, this dissertation develops a set of visual perception algorithms with a focus on autonomous driving as the target application area. The dissertation starts by addressing the problem of real-time motion estimation of an agent using only the visual input from a camera attached to it, a problem known as visual odometry. The visual odometry algorithm can achieve low drift rates over long traveled distances, made possible by the innovative local mapping approach used. This visual odometry algorithm was then combined with my multi-object detection and tracking system. The tracking system operates in a tracking-by-detection paradigm using an object detector based on convolutional neural networks (CNNs). The combined system can therefore detect and track other traffic participants both in the image domain and in the 3D world frame while simultaneously estimating vehicle motion, a necessary requirement for obstacle avoidance and safe navigation. Finally, the operational range of traditional monocular cameras was expanded with the capability to infer depth and thus replace stereo and RGB-D cameras. This is accomplished through a single-stream convolutional neural network which outputs both depth prediction and semantic segmentation. Semantic segmentation is the process of classifying each pixel in an image and is an important step toward scene understanding. A literature survey, algorithm descriptions, and comprehensive evaluations on real-world datasets are presented. Ph.D. dissertation, College of Engineering & Computer Science, University of Michigan. https://deepblue.lib.umich.edu/bitstream/2027.42/153989/1/Mohamed Aladem Final Dissertation.pdf
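    For readers unfamiliar with the visual odometry problem mentioned above, the following is a minimal frame-to-frame sketch built from standard OpenCV components; it is not the dissertation's local-mapping-based algorithm, and the image paths and intrinsics are placeholders:

```python
import cv2
import numpy as np

# Placeholder camera intrinsics and two consecutive grayscale frames.
K = np.array([[718.856, 0.0, 607.193],
              [0.0, 718.856, 185.216],
              [0.0, 0.0, 1.0]])
img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# 1. Detect and describe features in both frames.
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# 2. Match descriptors (brute force with cross-check).
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 3. Estimate the essential matrix with RANSAC and recover the relative pose.
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                  prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
print("rotation:\n", R, "\ntranslation direction:", t.ravel())
```

    Chaining such relative poses frame after frame is what accumulates drift; the dissertation's local mapping approach is aimed precisely at keeping that drift low.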

    Accurate SLAM With Application For Aerial Path Planning

    Get PDF
    This thesis focuses on the operation of Micro Aerial Vehicles (MAVs) in previously unexplored, GPS-denied environments. For this purpose, a refined Simultaneous Localization And Mapping (SLAM) algorithm using a laser range scanner is developed, capable of producing a map of the traversed environment and estimating the position of the MAV within the evolving map. The algorithm's accuracy is quantitatively assessed using several dedicated metrics, showing significant advantages over current methods. Repeatability and robustness are shown using a set of 12 repeated experiments in a benchmark scenario. The SLAM algorithm is primarily based on an innovative scan matching approach, dubbed Perimeter Based Polar Scan Matching (PB-PSM), which introduces a maximum overlap term into the cost function. This term, along with a tailored cost minimization technique, is found to yield highly accurate solutions when matching pairs of range scans. The algorithm is extensively tested on both ground and aerial platforms, in indoor as well as outdoor scenarios, using both in-house and previously published datasets and several different laser scanners. The SLAM algorithm is then coupled with a global A* path planner and applied on a single-rotor helicopter performing targeted flight missions with a pilot-in-the-loop implementation. Targeted flight is defined as navigating to a goal position specified by its relative distance from a known initial position. It differs from the more common task of mapping in that it may not rely on loop closure opportunities to smooth out errors and optimize the generated map; the importance of position estimate accuracy therefore increases dramatically. The complete algorithm is then used for targeted flight experiments with a pilot in the loop. The algorithm presents the pilot with nothing but heading information; in order to further prevent the pilot from interfering with the obstacle avoidance task, the evolving map and position are not shown to the human pilot. Furthermore, the scenario is augmented with artificial (invisible) obstacles, apparent only to the path planner, so the pilot has to adhere to the path planner's directions in order to reach the goal while avoiding all obstacles. The resulting paths show the helicopter successfully avoiding both real and artificial obstacles while following the planned path to the goal.
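    To illustrate the general idea of adding an overlap term to a scan-matching cost (the exact PB-PSM formulation is not reproduced here; this is only a hedged sketch with illustrative parameters), a candidate rigid transform can be scored by the residuals of the points it brings into overlap, penalized when the overlapping fraction is small:

```python
import numpy as np
from scipy.spatial import cKDTree

def match_cost(ref_scan, cur_scan, dx, dy, dtheta,
               overlap_radius=0.2, overlap_weight=1.0):
    """Score a candidate transform of `cur_scan` (Nx2) against `ref_scan`.

    Cost = mean residual of overlapping points - weight * overlap fraction,
    so minimizing it favors both small residuals and large overlap.
    """
    c, s = np.cos(dtheta), np.sin(dtheta)
    R = np.array([[c, -s], [s, c]])
    moved = cur_scan @ R.T + np.array([dx, dy])

    dist, _ = cKDTree(ref_scan).query(moved)
    overlapping = dist < overlap_radius
    if not overlapping.any():
        return np.inf
    overlap_fraction = overlapping.mean()
    return dist[overlapping].mean() - overlap_weight * overlap_fraction

# Toy example: the current scan is the reference shifted by (0.5, 0.0).
ref = np.random.default_rng(2).uniform(-5, 5, size=(360, 2))
cur = ref - np.array([0.5, 0.0])
print(match_cost(ref, cur, 0.5, 0.0, 0.0))   # low cost at the true offset
print(match_cost(ref, cur, 0.0, 0.0, 0.0))   # higher cost elsewhere
```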

    Camera Marker Networks for Pose Estimation and Scene Understanding in Construction Automation and Robotics.

    Full text link
    The construction industry faces challenges that include high rates of workplace injuries and fatalities, stagnant productivity, and skill shortages. Automation and Robotics in Construction (ARC) has been proposed in the literature as a potential solution that makes machinery easier to collaborate with, facilitates better decision-making, or enables autonomous behavior. However, there are two primary technical challenges in ARC: 1) unstructured and featureless environments; and 2) differences between the as-designed and the as-built. It is therefore impossible to directly replicate on construction sites the conventional automation methods adopted in industries such as manufacturing. In particular, two fundamental problems, pose estimation and scene understanding, must be addressed to realize the full potential of ARC. This dissertation proposes a pose estimation and scene understanding framework that addresses the identified research gaps by exploiting cameras, markers, and planar structures to mitigate the identified technical challenges. A fast plane extraction algorithm is developed for efficient modeling and understanding of built environments. A marker registration algorithm is designed for robust, accurate, cost-efficient, and rapidly reconfigurable pose estimation in unstructured and featureless environments. Camera marker networks are then established for unified and systematic design, estimation, and uncertainty analysis in larger-scale applications. The proposed algorithms' efficiency has been validated through comprehensive experiments. Specifically, the speed, accuracy, and robustness of the fast plane extraction and the marker registration have been demonstrated to be superior to existing state-of-the-art algorithms. These algorithms have also been implemented in two groups of ARC applications, themselves of significant social and economic value, to demonstrate the proposed framework's effectiveness. The first group concerns in-situ robotic machinery, including an autonomous manipulator for assembling digital architecture designs on construction sites to help improve productivity and quality, and an intelligent guidance and monitoring system for articulated machinery such as excavators to help improve safety. The second group emphasizes human-machine interaction to make ARC more effective, including a mobile Building Information Modeling and way-finding platform with discrete location recognition to increase indoor facility management efficiency, and a 3D scanning and modeling solution for rapid and cost-efficient dimension checking and concise as-built modeling. Ph.D. dissertation, Civil Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113481/1/cforrest_1.pd
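    As a rough sketch of the marker-based pose estimation idea (not the dissertation's marker registration algorithm; the marker size, intrinsics, and detected corner pixels are illustrative placeholders), the camera pose relative to a single square fiducial marker can be recovered from its four detected corners:

```python
import cv2
import numpy as np

# Known 3D corner coordinates of a square marker of side 0.20 m, expressed
# in the marker's own coordinate frame (z = 0 on the marker plane).
side = 0.20
object_corners = np.array([[0.0,  0.0,  0.0],
                           [side, 0.0,  0.0],
                           [side, side, 0.0],
                           [0.0,  side, 0.0]], dtype=np.float64)

# Placeholder intrinsics and detected corner pixels (from a marker detector).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
image_corners = np.array([[310.0, 230.0], [390.0, 232.0],
                          [388.0, 310.0], [308.0, 308.0]], dtype=np.float64)

# Solve the Perspective-n-Point problem: marker-to-camera rotation/translation.
ok, rvec, tvec = cv2.solvePnP(object_corners, image_corners, K, None)
R, _ = cv2.Rodrigues(rvec)
print("marker pose in the camera frame:\nR =\n", R, "\nt =", tvec.ravel())
```

    Chaining many such marker-camera observations into a network is what enables the unified estimation and uncertainty analysis described in the abstract.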

    Exploitation of time-of-flight (ToF) cameras

    Get PDF
    This technical report reviews the state of the art in the field of ToF cameras: their advantages, their limitations, and their present-day applications, sometimes in combination with other sensors. Even though ToF cameras provide neither higher resolution nor a larger ambiguity-free range than other range map estimation systems, advantages such as registered depth and intensity data at a high frame rate, compact design, low weight, and reduced power consumption have motivated their use in numerous areas of research. In robotics, these areas range from mobile robot navigation and map building to vision-based human motion capture and gesture recognition, showing a particularly great potential in object modeling and recognition.
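    The ambiguity-free range mentioned above follows from the standard phase-measurement principle of continuous-wave ToF cameras; a small sketch of the textbook relations (not specific to any camera covered in the report) is:

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def tof_depth(phase_rad, modulation_hz):
    """Depth from the measured phase shift of a continuous-wave ToF camera:
    d = c * phi / (4 * pi * f_mod)."""
    return C * phase_rad / (4.0 * np.pi * modulation_hz)

def ambiguity_free_range(modulation_hz):
    """Maximum unambiguous depth, reached when the phase wraps at 2*pi:
    d_max = c / (2 * f_mod)."""
    return C / (2.0 * modulation_hz)

# A typical 20 MHz modulation frequency gives roughly 7.5 m of unambiguous range.
print(ambiguity_free_range(20e6))   # ~7.49 m
print(tof_depth(np.pi, 20e6))       # half of that, ~3.75 m
```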