
    EARL: Eye-on-Hand Reinforcement Learner for Dynamic Grasping with Active Pose Estimation

    In this paper, we explore the dynamic grasping of moving objects through active pose tracking and reinforcement learning for hand-eye coordination systems. Most existing vision-based robotic grasping methods implicitly assume target objects are stationary or moving predictably. Grasping unpredictably moving objects presents a unique set of challenges. For example, a pre-computed robust grasp can become unreachable or unstable as the target object moves, and motion planning must also be adaptive. In this work, we present a new approach, Eye-on-hAnd Reinforcement Learner (EARL), for enabling coupled Eye-on-Hand (EoH) robotic manipulation systems to perform real-time active pose tracking and dynamic grasping of novel objects without explicit motion prediction. EARL readily addresses many thorny issues in automated hand-eye coordination, including fast tracking of 6D object pose from vision, learning a control policy for a robotic arm to track a moving object while keeping the object in the camera's field of view, and performing dynamic grasping. We demonstrate the effectiveness of our approach in extensive experiments validated on multiple commercial robotic arms in both simulation and complex real-world tasks. Comment: presented at IROS 2023; corresponding author Siddarth Jai
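
    The abstract does not spell out the policy's interface, but the flavor of a hand-eye tracking controller can be sketched. Below is a minimal, hypothetical stand-in: a proportional visual-servo step that keeps the target centered in the wrist camera while closing range, the kind of behavior a learned EARL-style policy would subsume. All names, frames, and gains are assumptions, not the paper's API.

```python
# Hypothetical stand-in for a learned eye-on-hand tracking policy: a
# proportional visual-servo step. Frames, gains, and offsets are assumptions.
import numpy as np

def visual_servo_step(obj_pos_cam, gain=0.5, max_speed=0.2):
    """obj_pos_cam: (3,) object position in the wrist-camera frame [m].
    Returns an end-effector velocity command [m/s] that re-centers the
    object in the field of view while closing to a pre-grasp offset."""
    desired = np.array([0.0, 0.0, 0.25])  # centered, 25 cm in front of camera
    error = obj_pos_cam - desired
    v = gain * error                      # drive the tracking error to zero
    speed = np.linalg.norm(v)
    if speed > max_speed:                 # clamp for safety near the arm
        v *= max_speed / speed
    return v

# Example: object drifting right of center and too far away
print(visual_servo_step(np.array([0.10, -0.02, 0.40])))
```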

    Non-Rigid Multi-Body Tracking in RGBD Streams

    To efficiently collect training data for an off-the-shelf object detector, we consider the problem of segmenting and tracking non-rigid objects in RGBD sequences by introducing the spatio-temporal matrix, with very few assumptions: no prior object model and no stationary sensor. The spatio-temporal matrix encodes not only spatial associations between multiple objects, but also component-level spatio-temporal associations that allow the correction of falsely segmented objects in the presence of various types of interaction among multiple objects. Extensive experiments on complex human/animal body motions with occlusions and body-part motions demonstrate that our approach substantially improves tracking robustness and segmentation accuracy.
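
    The construction of the spatio-temporal matrix is not given in the abstract; one plausible, much-simplified reading is a frame-to-frame association matrix built from segment mask overlap. The sketch below is an illustrative assumption, not the paper's method.

```python
# A minimal sketch of a frame-to-frame association matrix built from mask
# overlap (IoU); the paper's spatio-temporal matrix is richer, so treat the
# construction below as an illustrative assumption.
import numpy as np

def association_matrix(masks_t, masks_t1, iou_thresh=0.3):
    """masks_t, masks_t1: lists of boolean HxW arrays (segments at t, t+1).
    Returns a binary matrix A where A[i, j] = 1 links segment i to segment j."""
    A = np.zeros((len(masks_t), len(masks_t1)), dtype=np.uint8)
    for i, m0 in enumerate(masks_t):
        for j, m1 in enumerate(masks_t1):
            inter = np.logical_and(m0, m1).sum()
            union = np.logical_or(m0, m1).sum()
            if union and inter / union >= iou_thresh:
                A[i, j] = 1
    return A
```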

    Dynamic object tracking and classification from a moving platform

    This thesis presents an object-level SLAM system capable of tracking objects across frames and classifying them as stationary or moving. The system combines two open-source algorithms, Mask R-CNN and ORB-SLAM. Mask R-CNN provides instance-level object detection and segmentation, while ORB-SLAM provides keypoint detection, camera tracking, and local mapping. A typical SLAM system assumes a static environment and treats dynamic objects in the scene as outliers. By using object-level information from Mask R-CNN, we extend the capability to recognize and track dynamic objects. The system uses only monocular images as input, enabling numerous potentially low-cost applications without the need to calibrate multiple sensors or use a stereo rig. This system gives a mobile agent the capability to understand, and potentially interact with, its dynamic environment.
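
    As a rough illustration of how Mask R-CNN output can gate a keypoint-based SLAM front end, the sketch below uses torchvision's pretrained Mask R-CNN and OpenCV ORB features to discard keypoints that land on detected people (treated here, by assumption, as the dynamic class). The thesis integrates with ORB-SLAM internally; this standalone filter only approximates that idea.

```python
# A minimal sketch (not the thesis code): suppress ORB keypoints that fall on
# dynamic objects before they would reach a SLAM front end. Assumes COCO
# classes, with only 'person' (label 1) treated as dynamic.
import cv2
import numpy as np
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

def static_keypoints(bgr_image, score_thresh=0.7):
    rgb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = model([tensor])[0]
    # Union of masks for confident detections of the dynamic class
    dynamic = np.zeros(bgr_image.shape[:2], dtype=bool)
    for mask, label, score in zip(out["masks"], out["labels"], out["scores"]):
        if label.item() == 1 and score.item() > score_thresh:
            dynamic |= mask[0].numpy() > 0.5
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    kps = cv2.ORB_create(2000).detect(gray, None)
    # Keep only keypoints that do not land on a dynamic object
    return [kp for kp in kps if not dynamic[int(kp.pt[1]), int(kp.pt[0])]]
```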

    Pixel-Level Deep Multi-Dimensional Embeddings for Homogeneous Multiple Object Tracking

    The goal of Multiple Object Tracking (MOT) is to locate multiple objects and keep track of their individual identities and trajectories given a sequence of (video) frames. A popular approach to MOT is tracking by detection, consisting of two processing components: detection (identification of objects of interest in individual frames) and data association (connecting data from multiple frames). This work addresses the detection component by introducing a method based on semantic instance segmentation, i.e., assigning labels to all visible pixels such that they are unique among different instances. Modern tracking methods are often built around Convolutional Neural Networks (CNNs) and additional, explicitly defined post-processing steps. This work introduces two detection methods that incorporate multi-dimensional embeddings. We train deep CNNs to produce easily clusterable embeddings for semantic instance segmentation and to enable object detection through pose estimation. The use of embeddings allows the method to identify per-pixel instance membership for both tasks. Our method specifically targets applications that require long-term tracking of homogeneous targets using a stationary camera. Furthermore, this method was developed and evaluated on a livestock tracking application, which presents exceptional challenges that generalized tracking methods are not equipped to solve. This is largely because contemporary datasets for multiple object tracking lack properties that are specific to livestock environments. These include a high degree of visual similarity between targets, complex physical interactions, long-term inter-object occlusions, and a fixed-cardinality set of targets. For the reasons stated above, our method is developed and tested with the livestock application in mind; specifically, group-housed pigs are evaluated in this work. Our method reliably detects pigs in a group-housed environment on the publicly available dataset with 99% precision and 95% using pose estimation, and achieves 80% accuracy when using semantic instance segmentation at a 50% IoU threshold. Results demonstrate our method's ability to achieve consistent identification and tracking of group-housed livestock, even in cases where the targets are occluded and despite the fact that they lack uniquely identifying features. The pixel-level embeddings used by the proposed method are thoroughly evaluated in order to demonstrate their properties and behaviors when applied to real data. Adviser: Lance C. Pére
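
    One common way to turn per-pixel embeddings into instance labels, consistent with the "easily clusterable" framing above, is density-based clustering of foreground embeddings. The sketch below uses DBSCAN; the embedding dimension, foreground-mask source, and clustering parameters are assumptions, not the thesis's actual pipeline.

```python
# A minimal sketch of converting per-pixel embeddings into instance labels by
# clustering; embedding dimension and DBSCAN parameters are assumptions.
import numpy as np
from sklearn.cluster import DBSCAN

def embeddings_to_instances(emb, fg_mask, eps=0.5, min_samples=50):
    """emb: HxWxD float array of per-pixel embeddings.
    fg_mask: HxW boolean foreground mask (e.g., from a segmentation head).
    Returns an HxW int array: -1 background/noise, 0..K-1 instance ids."""
    labels = np.full(fg_mask.shape, -1, dtype=np.int32)
    pts = emb[fg_mask]                    # (N, D) foreground embeddings
    if len(pts) == 0:
        return labels
    ids = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
    labels[fg_mask] = ids                 # DBSCAN noise points stay at -1
    return labels
```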

    Tracking Object Existence From an Autonomous Patrol Vehicle

    An autonomous vehicle patrols a large region, during which an algorithm receives measurements of detected potential objects within its sensor range. The goal of the algorithm is to track all objects in the region over time. This problem differs from traditional multi-target tracking scenarios because the region of interest is much larger than the sensor range, so coverage relies on the movement of the sensor through the region. The goal is to know whether anything has changed between visits to the same location. In particular, two kinds of alert conditions must be detected: (1) a previously detected object has disappeared and (2) a new object has appeared in a location already checked. For the time an object is within sensor range, it can be assumed to remain stationary, changing position only between visits. The problem is difficult because the upstream object detection processing is likely to make many errors, resulting in heavy clutter (false positives) and missed detections (false negatives), and because only noisy, bearings-only measurements are available. This work has three main goals: (1) Associate incoming measurements with known objects or mark them as new objects or false positives, as appropriate. For this, a multiple hypothesis tracker was adapted to this scenario. (2) Localize the objects using multiple bearings-only measurements to provide estimates of global position (e.g., latitude and longitude). A nonlinear Kalman filter extension provides these 2D position estimates from the 1D measurements. (3) Calculate the probability that a suspected object truly exists (in the estimated position) and determine whether alert conditions have been triggered (for new or disappeared objects). The concept of a probability of existence was created, and a new Bayesian method for updating this probability at each time step was developed. A probabilistic multiple hypothesis approach is chosen because of its superiority in handling the uncertainty arising from errors in sensors and upstream processes. However, traditional target tracking methods typically assume a stationary detection volume of interest, whereas in this case one must account for seeing only a small portion of the region of interest and recognize when an alert situation has occurred. To track object existence inside and outside the vehicle's sensor range, a probability of existence was defined for each hypothesized object, and this value was updated at every time step in a Bayesian manner based on expected characteristics of the sensor and object and on whether that object was detected in the most recent time step. Then this value feeds into a sequential probability ratio test (SPRT) to determine the status of the object (suspected, confirmed, or deleted). Alerts are sent upon selected status transitions. Additionally, in order to track objects that move in and out of sensor range and update the probability of existence appropriately, a variable probability of detection has been defined, and the hypothesis probability equations have been re-derived to accommodate this change. Unsupervised object tracking is a pervasive issue in automated perception systems. This work could apply to any mobile platform (ground vehicle, sea vessel, air vehicle, or orbiter) that intermittently revisits regions of interest and needs to determine whether anything interesting has changed.
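
    The abstract describes a Bayesian update of each object's probability of existence with a detection probability that varies with sensor coverage, followed by an SPRT-based status decision. A hedged sketch of such an update follows; the exact likelihood model (clutter density, thresholds) in the thesis may differ, and the simple probability-threshold status rule below stands in for a full likelihood-ratio SPRT.

```python
# A hedged sketch of a per-step Bayesian update on an object's probability of
# existence; p_d varies with sensor coverage (0 when out of range), as in the
# abstract. The thesis's actual likelihood model may differ.
def update_existence(p, detected, p_d, p_fa=0.05):
    """p: prior probability the object exists.
    detected: whether a measurement was associated with it this step.
    p_d: probability of detecting the object if it exists (0 out of range).
    p_fa: probability of a clutter-only detection if it does not exist."""
    if detected:
        num, den = p * p_d, p * p_d + (1 - p) * p_fa
    else:
        num, den = p * (1 - p_d), p * (1 - p_d) + (1 - p) * (1 - p_fa)
    return num / den if den > 0 else p

# Threshold-based status decision (thresholds assumed), standing in for SPRT
def status(p, confirm=0.95, delete=0.05):
    return "confirmed" if p >= confirm else "deleted" if p <= delete else "suspected"
```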

    Effectiveness of "rescue saccades" on the accuracy of tracking multiple moving targets: An eye-tracking study on the effects of target occlusions

    Occlusion is one of the main challenges in tracking multiple moving objects. In almost all real-world scenarios, a moving object or a stationary obstacle occludes targets partially or completely, for a short or long time, during their movement. A previous study (Zelinsky & Todor, 2010) reported that subjects make timely saccades toward the object in danger of being occluded. Observers make these so-called "rescue saccades" to prevent target swapping. In this study, we examined whether these saccades are helpful. To this aim, we used recorded videos of zebrafish larvae swimming freely in a circular container as stimuli. We considered two main types of occlusion: object-object occlusions that naturally exist in the videos, and object-occluder occlusions created by adding a stationary doughnut-shaped occluder in some videos. Four different scenarios were studied: (1) no occlusions, (2) only object-object occlusions, (3) only object-occluder occlusions, and (4) both object-object and object-occluder occlusions. For each condition, two set sizes (two and four targets) were used. Participants' eye movements were recorded during tracking, and rescue saccades were extracted afterward. The results showed that rescue saccades are helpful in handling object-object occlusions but had no reliable effect on tracking through object-occluder occlusions. The presence of occlusions generally increased visual sampling of the scenes; nevertheless, tracking accuracy declined due to occlusion.

    Illumination invariant stationary object detection

    A real-time system is presented for detecting and tracking moving objects that become stationary in a restricted zone. A new pixel classification method based on the segmentation history image is used to identify stationary objects in the scene. These objects are then tracked using a novel adaptive edge-orientation-based tracking method. Experimental results show that the tracking technique achieves more than a 95% detection success rate, even when objects are partially occluded. The tracking results, together with the historic edge maps, are analysed to remove objects that are no longer stationary or were falsely identified as foreground regions because of sudden changes in the illumination conditions. The technique has been tested on over 7 h of video recorded at different locations and times of day, both outdoors and indoors. The results obtained are compared with other available state-of-the-art methods.
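
    The paper's segmentation-history update rule is not reproduced here, but the core idea, flagging pixels that remain foreground over a long streak of frames as stationary objects, can be sketched as follows. The frame-rate and streak-length parameters are assumptions.

```python
# A minimal sketch of a segmentation-history-style pixel classifier: pixels
# that stay foreground for many consecutive frames are flagged as stationary.
# The paper's exact update rule is not given here; this is an assumption.
import numpy as np

class StationaryDetector:
    def __init__(self, shape, stationary_frames=150):  # e.g. ~5 s at 30 fps
        self.history = np.zeros(shape, dtype=np.int32)
        self.threshold = stationary_frames

    def update(self, fg_mask):
        """fg_mask: HxW boolean foreground mask from background subtraction.
        Returns a boolean mask of pixels classed as stationary foreground."""
        self.history[fg_mask] += 1    # grow streaks where foreground persists
        self.history[~fg_mask] = 0    # reset where the background reappears
        return self.history >= self.threshold
```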