    Utilization and experimental evaluation of occlusion aware kernel correlation filter tracker using RGB-D

    Unlike deep learning, which requires large training datasets, correlation filter-based trackers such as the Kernelized Correlation Filter (KCF) exploit implicit properties of the tracked image patches (circulant matrices) to train in real time. Despite their practical use in tracking, a better theoretical, mathematical, and experimental understanding of the fundamentals of KCF is still needed. This thesis first details a working prototype of the tracker and investigates its effectiveness in real-time applications, with supporting visualizations. We further address several drawbacks of the tracker, namely occlusions, scale changes, object rotation, out-of-view targets and model drift, with our novel RGB-D kernel correlation tracker. We also study the use of a particle filter to improve the tracker's accuracy. Our results are evaluated experimentally a) on a standard dataset and b) in real time using a Microsoft Kinect V2 sensor. We believe this work lays the basis for a better understanding of the effectiveness of kernel-based correlation filter trackers and helps define some of their possible advantages in tracking.
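    The circulant-matrix property mentioned above lets KCF train and evaluate a kernel ridge regressor over all cyclic shifts of a patch with a few FFTs. The following is a minimal sketch of that idea, not the thesis implementation; the Gaussian-kernel bandwidth, regularization value, and any patch preprocessing (cosine windowing, feature extraction) are illustrative assumptions.

```python
import numpy as np

def gaussian_correlation(x, z, sigma=0.5):
    """Kernel correlation of two equally sized 2D patches, evaluated against
    every cyclic shift at once via the FFT (the circulant-matrix trick)."""
    cross = np.real(np.fft.ifft2(np.fft.fft2(x) * np.conj(np.fft.fft2(z))))
    dist = np.maximum((x ** 2).sum() + (z ** 2).sum() - 2.0 * cross, 0.0)
    return np.exp(-dist / (sigma ** 2 * x.size))

def train(x, y, lam=1e-4):
    """Dual ridge regression in the Fourier domain:
    alpha_hat = F(y) / (F(k_xx) + lambda)."""
    k = gaussian_correlation(x, x)
    return np.fft.fft2(y) / (np.fft.fft2(k) + lam)

def detect(alpha_hat, model_x, z):
    """Response map over all cyclic shifts of the new patch z; the peak
    location gives the estimated translation of the target."""
    k = gaussian_correlation(z, model_x)
    return np.real(np.fft.ifft2(alpha_hat * np.fft.fft2(k)))
```

    Here x would be a preprocessed grayscale patch around the target and y a Gaussian-shaped regression label centered on it; because everything stays in the Fourier domain, both training and detection cost only a handful of FFTs per frame.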

    Sobi: An Interactive Social Service Robot for Long-Term Autonomy in Open Environments

    Long-term autonomy in service robotics is a current research topic, especially for dynamic, large-scale environments that change over time. We present Sobi, a mobile service robot developed as an interactive guide for open environments, such as public places with indoor and outdoor areas. The robot will serve as a platform for environmental modeling and human-robot interaction. Its main hardware and software components, which we freely license as a documented open source project, are presented. Another key focus is Sobi’s monitoring system for long-term autonomy, which restores system components in a targeted manner in order to extend the total system lifetime without unplanned intervention. We demonstrate first results of the long-term autonomous capabilities in a 16-day indoor deployment, in which the robot patrols a total of 66.6 km with an average of 5.5 hours of travel time per weekday, charging autonomously in between. In a user study with 12 participants, we evaluate the appearance and usability of the user interface, which allows users to interactively query information about the environment and directions.
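    The abstract does not detail how Sobi's monitoring system restores components, so the following is only a generic watchdog sketch of the idea of targeted restarts (restart the failed component, not the whole robot). The component names, launch commands, and timeout value are hypothetical.

```python
import subprocess
import time

# Hypothetical components and launch commands, purely for illustration.
COMPONENTS = {
    "navigation": "python3 -m nav_stack",
    "user_interface": "python3 -m ui_server",
}
HEARTBEAT_TIMEOUT = 10.0  # seconds without a heartbeat before restarting

processes = {}
last_heartbeat = {name: time.time() for name in COMPONENTS}

def restart(name):
    """Restart only the failed component instead of the whole system."""
    proc = processes.get(name)
    if proc is not None and proc.poll() is None:
        proc.terminate()
        proc.wait(timeout=5)
    processes[name] = subprocess.Popen(COMPONENTS[name].split())
    last_heartbeat[name] = time.time()

def monitor_step(heartbeats):
    """heartbeats: dict mapping component name -> timestamp of last heartbeat."""
    last_heartbeat.update(heartbeats)
    for name in COMPONENTS:
        if time.time() - last_heartbeat[name] > HEARTBEAT_TIMEOUT:
            restart(name)
```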

    Robot Mapping and Navigation in Real-World Environments

    Robots can perform various tasks, such as mapping hazardous sites, taking part in search-and-rescue scenarios, or delivering goods and people. Robots operating in the real world face many challenges on the way to completing their mission. Essential capabilities for such robots are mapping, localization and navigation. Solving all of these tasks robustly is difficult because the components are usually interconnected: a robot that starts without any knowledge of the environment must simultaneously build a map, localize itself in it, analyze its surroundings and plan a path to efficiently explore an unknown environment. These tasks also depend heavily on the sensors used by the robot and on the type of environment in which it operates. For example, an RGB camera can be used in an outdoor scene for computing visual odometry or detecting dynamic objects, but becomes much less useful in an environment with too little light for cameras to operate. The software that controls the robot must seamlessly process all the data coming from the different sensors, which often leads to systems tailored to a particular robot and a particular set of sensors. In this thesis, we challenge this concept by developing and implementing methods for a typical robot navigation pipeline that work seamlessly with different types of sensors in both indoor and outdoor environments. With the emergence of range-sensing RGBD and LiDAR sensors, there is an opportunity to build a single system that operates robustly indoors and outdoors equally well, thereby extending the application areas of mobile robots. The techniques presented in this thesis are designed to be used with both RGBD and LiDAR sensors, without adaptation to individual sensor models, by relying on a range image representation, and they provide methods for navigation and scene interpretation in both static and dynamic environments. For a static world, we present a number of approaches that address the core components of a typical robot navigation pipeline. At the core of building a consistent map of the environment with a mobile robot lies point cloud matching. To this end, we present a method for photometric point cloud matching that treats RGBD and LiDAR sensors in a uniform fashion and accurately registers point clouds at the frame rate of the sensor; this method serves as a building block for the rest of the mapping pipeline. In addition to the matching algorithm, we present a method for traversability analysis of the currently observed terrain that guides an autonomous robot to the safe parts of its surroundings. A source of danger when navigating difficult-to-access sites is that the robot may fail to build a correct map of the environment, which dramatically impacts its ability to navigate robustly towards its goal; it is therefore important for the robot to detect such situations and find its way home without relying on any map. To address this challenge, we present a method for analyzing the quality of the map the robot has built so far and for safely returning the robot to its starting point if the map is found to be inconsistent. Scenes in dynamic environments are vastly different from those in static ones: objects can be moving, so static traversability estimates are no longer sufficient. With the approaches developed in this thesis, we aim to identify distinct objects and track them to aid navigation and scene understanding. We target these challenges with a method for clustering a scene captured by a LiDAR scanner and with a similarity measure between clustered objects that aids tracking. All methods presented in this thesis support real-time robot operation, rely on RGBD or LiDAR sensors, and have been tested on real robots in real-world environments and on real-world datasets. All approaches have been published in peer-reviewed conference papers and journal articles, and most of the presented contributions have been released publicly as open source software.
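    The range image representation mentioned above is what lets the same methods consume both RGBD and LiDAR data. Below is a small sketch of projecting a point cloud into a spherical range image; it is a generic illustration rather than the thesis's code, and the image resolution and vertical field of view are illustrative values for a 64-beam scanner (RGBD clouds would typically use a pinhole projection instead).

```python
import numpy as np

def project_to_range_image(points, h=64, w=1024, fov_up=15.0, fov_down=-15.0):
    """Project an N x 3 point cloud (sensor frame) into an h x w spherical
    range image, keeping the closest return per pixel."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                                         # azimuth
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0)) # elevation

    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = (0.5 * (1.0 - yaw / np.pi) * w).astype(int)                     # column
    v = ((fov_up_r - pitch) / (fov_up_r - fov_down_r) * h).astype(int)  # row
    u, v = np.clip(u, 0, w - 1), np.clip(v, 0, h - 1)

    image = np.full((h, w), np.inf)
    np.minimum.at(image, (v, u), r)  # closest range wins per pixel
    return image
```

    Once the cloud is an image, neighborhood lookups for matching, traversability analysis, and clustering become cheap constant-time pixel operations.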

    Multiple human tracking in RGB-depth data: A survey

    Multiple human tracking (MHT) is a fundamental task in many computer vision applications. Appearance-based approaches, primarily formulated on RGB data, are constrained and affected by problems arising from occlusions and/or illumination variations. In recent years, the arrival of cheap RGB-depth devices has led to many new approaches to MHT, many of which integrate colour and depth cues to improve every stage of the process. In this survey, the authors present the common processing pipeline of these methods and review their methodology based on (a) how they implement this pipeline and (b) what role depth plays within each stage of it. They identify and introduce existing, publicly available benchmark datasets and software resources that fuse colour and depth data for MHT. Finally, they present a brief comparative evaluation of the performance of those works that have applied their methods to these datasets.

    Automatic and adaptable registration of live RGBD video streams sharing partially overlapping views

    In this thesis, we introduce DeReEs-4V, an algorithm for unsupervised and automatic registration of two video streams captured by depth-sensing cameras. DeReEs-4V receives two RGBD video streams from two depth-sensing cameras arbitrarily located in an indoor space, whose captured scenes share at least 25% overlap. The motivation of this research is to employ multiple depth-sensing cameras to enlarge the field of view and acquire more complete and accurate 3D information about the environment. A typical way to combine multiple views from different cameras is manual calibration, but this process is time-consuming, may require technical knowledge, and has to be repeated whenever the location or position of the cameras changes. In this research, we demonstrate how DeReEs-4V registration can be used to find the transformation of the view of one camera with respect to the other at interactive rates. Our algorithm automatically finds the 3D transformation to match the views from the two cameras, requires no human interference, and is robust to camera movements during capture. To validate this approach, a thorough examination of the system performance under different scenarios is presented. The system presented here supports any application that might benefit from the wider field of view provided by the combined scene from both cameras, including applications in 3D telepresence, gaming, people tracking, videoconferencing and computer vision.
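    The abstract does not spell out how DeReEs-4V computes the camera-to-camera transformation, but pipelines of this kind ultimately reduce to a least-squares rigid alignment of matched 3D points from the overlapping region. The sketch below shows that standard Kabsch/Umeyama step under the assumption that correspondences have already been found (e.g. from keypoint matches) and would in practice be wrapped in an outlier-rejection loop such as RANSAC.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) aligning src to dst, where src and
    dst are N x 3 arrays of corresponding 3D points from the two cameras."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)             # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t   # maps a point p in camera 1's frame to R @ p + t in camera 2's frame
```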

    Camera Marker Networks for Pose Estimation and Scene Understanding in Construction Automation and Robotics.

    The construction industry faces challenges that include high workplace injuries and fatalities, stagnant productivity, and skill shortage. Automation and Robotics in Construction (ARC) has been proposed in the literature as a potential solution that makes machinery easier to collaborate with, facilitates better decision-making, or enables autonomous behavior. However, there are two primary technical challenges in ARC: 1) unstructured and featureless environments; and 2) differences between the as-designed and the as-built. It is therefore impossible to directly replicate on construction sites the conventional automation methods adopted in industries such as manufacturing. In particular, two fundamental problems, pose estimation and scene understanding, must be addressed to realize the full potential of ARC. This dissertation proposes a pose estimation and scene understanding framework that addresses the identified research gaps by exploiting cameras, markers, and planar structures to mitigate the identified technical challenges. A fast plane extraction algorithm is developed for efficient modeling and understanding of built environments. A marker registration algorithm is designed for robust, accurate, cost-efficient, and rapidly reconfigurable pose estimation in unstructured and featureless environments. Camera marker networks are then established for unified and systematic design, estimation, and uncertainty analysis in larger scale applications. The proposed algorithms' efficiency has been validated through comprehensive experiments. Specifically, the speed, accuracy and robustness of the fast plane extraction and the marker registration have been demonstrated to be superior to existing state-of-the-art algorithms. These algorithms have also been implemented in two groups of ARC applications to demonstrate the proposed framework's effectiveness, wherein the applications themselves have significant social and economic value. The first group is related to in-situ robotic machinery, including an autonomous manipulator for assembling digital architecture designs on construction sites to help improve productivity and quality; and an intelligent guidance and monitoring system for articulated machinery such as excavators to help improve safety. The second group emphasizes human-machine interaction to make ARC more effective, including a mobile Building Information Modeling and way-finding platform with discrete location recognition to increase indoor facility management efficiency; and a 3D scanning and modeling solution for rapid and cost-efficient dimension checking and concise as-built modeling.
    PhD dissertation, Civil Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113481/1/cforrest_1.pd
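    To make the plane-extraction problem above concrete, the sketch below shows a generic RANSAC plane fit on a point cloud. It is not the dissertation's fast plane extraction algorithm, only a baseline illustration; the iteration count and inlier threshold are illustrative values.

```python
import numpy as np

def ransac_plane(points, iters=200, threshold=0.02, rng=None):
    """Fit one plane to an N x 3 point cloud with RANSAC and return the
    boolean inlier mask; threshold is in the cloud's length unit (e.g. m)."""
    rng = np.random.default_rng() if rng is None else rng
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                       # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal @ sample[0]
        dist = np.abs(points @ normal + d)    # point-to-plane distances
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers
```

    Fast methods for built environments typically replace this random sampling with structured region growing over organized depth data, which is where the speed gains come from.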

    Robust perception of humans for mobile robots RGB-depth algorithms for people tracking, re-identification and action recognition

    Human perception is one of the most important skills for a mobile robot sharing its workspace with humans. This is not only true for navigation, because people have to be avoided differently than other obstacles, but also because mobile robots must be able to truly interact with humans. In the near future, we can imagine that robots will be more and more present in every house and will perform services useful to the well-being of humans. For this purpose, robust people tracking algorithms must be exploited, and person re-identification techniques play an important role in allowing robots to recognize a person after a full occlusion or after long periods of time. Moreover, robots must be able to recognize what humans are doing in order to react accordingly, helping them if needed or also learning from them. This thesis tackles these problems by proposing approaches that combine algorithms based on both RGB and depth information, which can be obtained with recently introduced consumer RGB-D sensors. Our key contribution to people detection and tracking research is a depth-clustering method that makes it possible to apply a robust image-based people detector to only a small subset of possible detection windows, thus decreasing the number of false detections while reaching high computational efficiency. We also advance person re-identification research by proposing two techniques exploiting depth-based skeletal tracking algorithms: one is targeted at short-term re-identification and creates a compact yet discriminative signature of people by computing features at skeleton keypoints, which are highly repeatable and semantically meaningful; the other extracts long-term features, such as 3D shape, and compares people by matching the corresponding 3D point clouds acquired with an RGB-D sensor. To account for the fact that people are articulated rather than rigid objects, it exploits 3D skeleton information to warp people's point clouds to a standard pose, thus making them directly comparable by means of least-squares fitting. Finally, we describe an extension of flow-based action recognition methods to the RGB-D domain which computes the motion of people's 3D points over time by exploiting joint color and depth information, and recognizes human actions by classifying gridded descriptors of 3D flow. A further contribution of this thesis is the creation of a number of new RGB-D datasets which allow different algorithms to be compared on data acquired by consumer RGB-D sensors. All these datasets have been publicly released in order to foster research in these fields.
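    The thesis's depth-clustering contribution restricts an expensive image-based people detector to a few plausible windows. The following is a much-simplified, hypothetical illustration of that idea: group pixels into coarse depth bands, keep connected regions whose metric height is plausible for a person, and return their bounding boxes as candidate windows. The focal length, depth-band step, and height limits are illustrative assumptions, not values from the thesis.

```python
import numpy as np
from scipy import ndimage

def candidate_windows(depth, fx=525.0, min_height=1.0, max_height=2.2, step=0.3):
    """depth: H x W depth image in meters (0 = invalid). Returns a list of
    (x0, y0, x1, y1) boxes on which an image-based detector should be run."""
    boxes = []
    valid = depth > 0
    for near in np.arange(0.5, 8.0, step):                # coarse depth bands
        band = valid & (depth >= near) & (depth < near + step)
        labels, _ = ndimage.label(band)                   # connected components
        for region in ndimage.find_objects(labels):
            if region is None:
                continue
            rows, cols = region
            # Pinhole approximation: metric height ~ pixel height * depth / fx.
            height_m = (rows.stop - rows.start) * near / fx
            if min_height <= height_m <= max_height:
                boxes.append((cols.start, rows.start, cols.stop, rows.stop))
    return boxes
```

    Running the detector only on these person-sized regions is what yields the reported reduction in false detections and computation compared with sliding a window over the whole image.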