
    Scene understanding by robotic interactive perception

    This thesis presents a novel and generic visual architecture for scene understanding by robotic interactive perception. The proposed visual architecture is fully integrated into autonomous systems performing object perception and manipulation tasks, and uses interaction with the scene in order to improve scene understanding substantially over non-interactive models. Specifically, this thesis presents two experimental validations of an autonomous system interacting with the scene: firstly, an autonomous gaze control model is investigated, where the vision sensor directs its gaze to satisfy a scene exploration task; secondly, autonomous interactive perception is investigated, where objects in the scene are repositioned by robotic manipulation. The proposed visual architecture for scene understanding involving perception and manipulation tasks has four components: 1) a reliable vision system; 2) camera and hand-eye calibration to integrate the vision system into an autonomous robot's kinematic frame chain; 3) a visual model performing perception tasks and providing the knowledge required for interaction with the scene; and finally, 4) a manipulation model which, using knowledge received from the perception model, chooses an appropriate action (from a set of simple actions) to satisfy a manipulation task. This thesis presents contributions for each of the aforementioned components. Firstly, a portable active binocular robot vision architecture that integrates a number of visual behaviours is presented. This active vision architecture has the ability to verge, localise, recognise and simultaneously identify multiple target object instances. The portability and functional accuracy of the proposed vision architecture is demonstrated by carrying out both qualitative and comparative analyses using different robot hardware configurations, feature extraction techniques and scene perspectives. Secondly, a camera and hand-eye calibration methodology for integrating an active binocular robot head within a dual-arm robot is described. For this purpose, the forward kinematic model of the active robot head is derived and the methodology for calibrating and integrating the robot head is described in detail. A rigid calibration methodology has been implemented to provide a closed-form hand-to-eye calibration chain, and this has been extended with a mechanism that allows the camera external parameters to be updated dynamically for optimal 3D reconstruction, meeting the requirements of robotic tasks such as grasping and manipulating rigid and deformable objects. Experimental results show that the robot head achieves an overall accuracy of better than 0.3 millimetres while recovering the 3D structure of a scene. In addition, a comparative study between current RGB-D cameras and our active stereo head within two dual-arm robotic test-beds is reported that demonstrates the accuracy and portability of our proposed methodology. Thirdly, this thesis proposes a visual perception model for the task of category-wise object sorting, based on Gaussian Process (GP) classification, that is capable of recognising object categories from point cloud data. In this approach, Fast Point Feature Histogram (FPFH) features are extracted from point clouds to describe the local 3D shape of objects, and a Bag-of-Words coding method is used to obtain an object-level vocabulary representation.
Multi-class Gaussian Process classification is employed to provide a probability estimate of the identity of each object and serves the key role of modelling perception confidence in the interactive perception cycle. The interaction stage is responsible for invoking the appropriate action skills as required to confirm the identity of an observed object with high confidence as a result of executing multiple perception-action cycles. The recognition accuracy of the proposed perception model has been validated on simulated input data using both Support Vector Machine (SVM) and GP-based multi-class classifiers. Results obtained during this investigation demonstrate that by using a GP-based classifier, it is possible to obtain true positive classification rates of up to 80%. Experimental validation of the above semi-autonomous object sorting system shows that the proposed GP-based interactive sorting approach outperforms random sorting by up to 30% when applied to scenes comprising configurations of household objects. Finally, a fully autonomous visual architecture is presented that has been developed to accommodate manipulation skills, allowing an autonomous system to interact with the scene by object manipulation. This proposed visual architecture consists mainly of two stages: 1) a perception stage, which is a modified version of the aforementioned visual interaction model, and 2) an interaction stage, which performs a set of ad-hoc actions relying on the information received from the perception stage. More specifically, the interaction stage simply reasons over the information (class label and associated probabilistic confidence score) received from the perception stage to choose one of the following two actions: 1) if an object class has been identified with high confidence, the object is removed from the scene and placed in the designated basket/bin for that particular class; 2) if an object class has been identified with lower confidence, then, inspired by the human behaviour of inspecting doubtful objects, an action is chosen to investigate that object further, confirming its identity by capturing more images from different views in isolation. The perception stage then processes these views, so that multiple perception-action/interaction cycles take place. From an application perspective, the task of autonomous category-based object sorting is performed and the experimental design for the task is described in detail.
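
    A minimal sketch of the GP-based perception-action cycle described above, written with scikit-learn. The confidence threshold, the vocabulary size, and the extract_bow_histogram helper are illustrative assumptions, not details taken from the thesis (which quantises FPFH features into a Bag-of-Words vocabulary):

```python
# Hypothetical sketch: multi-class GP classification driving a sort-or-inspect
# decision; data shapes and thresholds are assumed, not from the thesis.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X_train = rng.random((60, 50))     # stand-in BoW histograms: (n_objects, vocab_size)
y_train = rng.integers(0, 4, 60)   # four hypothetical object categories

gp = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0)).fit(X_train, y_train)

CONF_THRESHOLD = 0.8               # assumed confidence cut-off

def extract_bow_histogram(cloud):
    """Assumed helper: FPFH extraction + vector quantisation against a codebook."""
    return rng.random(50)          # placeholder descriptor

def perception_action_cycle(views):
    """Classify an object; keep requesting views until the GP is confident."""
    label = None
    for view in views:                              # one segmented point cloud per view
        probs = gp.predict_proba(extract_bow_histogram(view).reshape(1, -1))[0]
        label = int(np.argmax(probs))
        if probs[label] >= CONF_THRESHOLD:
            return label, "place_in_bin"            # confident: sort the object
    return label, "inspect_further"                 # still unsure: capture more views
```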

    Vision-Based Observation Models for Lower Limb 3D Tracking with a Moving Platform

    Tracking and understanding human gait is an important step towards improving elderly mobility and safety. This thesis presents a vision-based tracking system that estimates the 3D pose of a wheeled walker user's lower limbs with cameras mounted on the moving walker. The tracker estimates 3D poses from images of the lower limbs in the coronal plane in a dynamic, uncontrolled environment. It employs a probabilistic approach based on particle filtering with three different camera setups: a monocular RGB camera, binocular RGB cameras, and a depth camera. For the RGB cameras, observation likelihoods are designed to compare the colors and gradients of each frame with manually extracted initial templates. Two strategies are also investigated for handling appearance changes of the tracking target: increasing the number of templates and using different color representations. For the depth camera, two observation likelihoods are developed: the first works directly in 3D space, while the second works in the projected image space. Experiments are conducted to evaluate the performance of the tracking system with different users for all three camera setups. It is demonstrated that the trackers with the RGB cameras produce results with higher error compared to the depth camera, and that the strategies for handling appearance change generally improve tracking accuracy. The tracker with the depth sensor, on the other hand, successfully tracks the 3D poses of users over entire video sequences and is robust against unfavorable conditions such as partial occlusion, missing observations, and a deformable tracking target.
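
    The particle-filter structure underlying these trackers can be sketched as below; the state dimensionality, motion noise, and the stubbed depth likelihood are illustrative assumptions rather than the thesis's actual observation models:

```python
# Skeleton of a particle filter for lower-limb pose tracking; the observation
# model here is a stub standing in for the depth-based likelihoods above.
import numpy as np

N_PARTICLES, STATE_DIM = 500, 12   # assumed: e.g. joint parameters of both legs

def predict(particles, motion_std=0.02):
    """Diffuse particles with a simple random-walk motion model."""
    return particles + np.random.normal(0.0, motion_std, particles.shape)

def depth_log_likelihood(particle, depth_frame):
    """Stub: score a limb-pose hypothesis against the measured depth image."""
    return -np.sum(particle ** 2)  # placeholder; a real model renders and compares

def update(particles, depth_frame):
    """Weight particles by observation likelihood, then resample."""
    logw = np.array([depth_log_likelihood(p, depth_frame) for p in particles])
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = np.random.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

def track(depth_frames):
    particles = np.zeros((N_PARTICLES, STATE_DIM))  # init from manual templates
    for frame in depth_frames:
        particles = update(predict(particles), frame)
        yield particles.mean(axis=0)                # per-frame pose estimate
```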

    Stereo Vision System for Remotely Operated Robots

    An Architecture for Online Affordance-based Perception and Whole-body Planning

    The DARPA Robotics Challenge Trials held in December 2013 provided a landmark demonstration of dexterous mobile robots executing a variety of tasks aided by a remote human operator using only data from the robot's sensor suite transmitted over a constrained, field-realistic communications link. We describe the design considerations, architecture, implementation and performance of the software that Team MIT developed to command and control an Atlas humanoid robot. Our design emphasized human interaction with an efficient motion planner, where operators expressed desired robot actions in terms of affordances fit using perception and manipulated in a custom user interface. We highlight several important lessons we learned while developing our system on a highly compressed schedule.
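
    As a rough illustration of the affordance-centric workflow described above, the sketch below shows how a fitted affordance might be represented and handed to a planner. The Affordance class and plan_whole_body function are hypothetical stand-ins, not Team MIT's actual interfaces:

```python
# Hypothetical data structure for an operator-adjustable affordance and a
# stand-in planning call; none of these names come from the paper.
from dataclasses import dataclass, field

@dataclass
class Affordance:
    """A parametric object model fit to sensor data, adjustable by the operator."""
    name: str                                    # e.g. "valve", "door", "hose"
    pose: list                                   # 6-DoF pose fit from perception
    params: dict = field(default_factory=dict)   # e.g. {"radius": 0.15}

def plan_whole_body(affordance, action):
    """Stand-in: turn an operator-level action on an affordance into a plan."""
    target = affordance.params.get("grasp_point", affordance.pose)
    return {"action": action, "target": target}  # placeholder for a trajectory

valve = Affordance("valve", pose=[1.2, 0.0, 1.0, 0.0, 0.0, 0.0],
                   params={"radius": 0.15})
print(plan_whole_body(valve, "turn_ccw"))
```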

    Toward Simulation-Based Training Validation Protocols: Exploring 3d Stereo with Incremental Rehearsal and Partial Occlusion to Instigate and Modulate Smooth Pursuit and Saccade Responses in Baseball Batting

    “Keeping your eye on the ball” is a long-standing tenet in baseball batting. And yet, there are no protocols for objectively conditioning, measuring, and/or evaluating eye-on-ball coordination performance relative to baseball-pitch trajectories. Although video games and other virtual simulation technologies offer alternatives for training and obtaining objective measures, baseball batting instruction has relied on traditional eye-pitch coordination exercises with qualitative “face validation”, statistics of whole-task batting performance, and/or subjective batter-interrogation methods, rather than on direct, quantitative eye-movement performance evaluations. Further, protocols for validating transfer-of-training (ToT) for video games and other simulation-based training have not been established in general, nor for eye-movement training specifically. An exploratory research study was conducted to consider the ecological and ToT validity of a part-task, virtual-fastball simulator implemented in 3D stereo, with a rotary pitching machine standing as proxy for the live-pitch referent. The virtual-fastball and live-pitch simulation couple was designed to facilitate objective eye-movement response measures to live and virtual stimuli. The objective measures 1) served to assess the ecological validity of virtual fastballs, 2) informed the characterization and comparison of eye-movement strategies employed by expert and novice batters, 3) enabled a treatment protocol relying on repurposed incremental-rehearsal and partial-occlusion methods intended to instigate and modulate strategic eye movements, and 4) revealed whether the simulation-based treatment resulted in positive (or negative) ToT in the real task. Results indicated that live fastballs consistently elicited different saccade onset time responses than virtual fastballs. Saccade onset times for live fastballs were consistent with catch-up saccades that follow the smooth-pursuit maximum velocity threshold of approximately 40-70°/sec, while saccade onset times for virtual fastballs lagged on the order of 13%. More experienced batters employed more deliberate and timely combinations of smooth pursuit and catch-up saccades than less experienced batters, enabling them to position their eyes to meet the ball near the front edge of home plate. Smooth pursuit and saccade modulation from treatment was inconclusive from virtual-pitch pre- and post-treatment comparisons, but comparisons of live-pitch pre- and post-treatment measures indicate ToT improvements. Lagging saccade onset times for virtual pitches suggest possible accommodative-vergence impairment due to the accommodation-vergence conflict inherent in 3D stereo displays.
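
    One simple way to flag catch-up saccade onset in a gaze trace, using the smooth-pursuit velocity ceiling quoted above, is sketched below; the sampling rate and the exact threshold are assumptions for illustration, not the study's measurement protocol:

```python
# Illustrative saccade-onset detector: a catch-up saccade is flagged where gaze
# velocity exceeds the smooth-pursuit ceiling (~40-70 deg/s cited above).
import numpy as np

def saccade_onset(gaze_deg, fs=240.0, pursuit_ceiling=70.0):
    """Return the first sample index where gaze velocity exceeds the ceiling."""
    velocity = np.abs(np.gradient(gaze_deg) * fs)   # deg/s from successive samples
    above = np.flatnonzero(velocity > pursuit_ceiling)
    return int(above[0]) if above.size else None

# Toy trace: slow pursuit ramp, then an 8-degree jump at sample 120
t = np.arange(240)
trace = np.where(t < 120, 0.1 * t, 0.1 * t + 8.0)
print(saccade_onset(trace))                          # -> index near 120
```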

    High-precision grasping and placing for mobile robots

    This work presents a manipulation system for multiple labware items in life science laboratories using H20 mobile robots. The H20 robot is equipped with a Kinect V2 sensor to identify the required labware on the workbench and estimate its position. Local feature recognition based on the SURF algorithm is used, and the recognition process is performed both for the labware to be grasped and for the workbench holder. Different grippers and labware containers are designed to manipulate different weights of labware and to realize safe transportation.
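
    A minimal sketch of SURF-based template matching like that described above, using OpenCV; the image paths and thresholds are placeholders, and SURF ships only in opencv-contrib builds (it is patented and absent from stock OpenCV wheels):

```python
# Hedged sketch: locate a labware template in a workbench image with SURF
# features, FLANN matching, and Lowe's ratio test.
import cv2

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # needs opencv-contrib

template = cv2.imread("labware_template.png", cv2.IMREAD_GRAYSCALE)  # placeholder
scene = cv2.imread("workbench.png", cv2.IMREAD_GRAYSCALE)            # placeholder

kp_t, des_t = surf.detectAndCompute(template, None)
kp_s, des_s = surf.detectAndCompute(scene, None)

# FLANN kd-tree matching with the ratio test to keep distinctive matches only
flann = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
good = [m for m, n in flann.knnMatch(des_t, des_s, k=2)
        if m.distance < 0.7 * n.distance]
print(f"{len(good)} good matches")   # many matches -> labware found in the scene
```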

    Aerospace Medicine and Biology. A continuing bibliography with indexes

    This bibliography lists 244 reports, articles, and other documents introduced into the NASA scientific and technical information system in February 1981. Aerospace medicine and aerobiology topics are included, as are listings for physiological factors, astronaut performance, control theory, artificial intelligence, and cybernetics.

    Deictic primitives for general purpose navigation

    A visually based deictic primitive, used as an elementary command set for general-purpose navigation, was investigated. It was shown that a simple 'follow your eyes' scenario is sufficient for tracking a moving target. Limits on velocity and acceleration were enforced, and the response of the mechanical systems was modeled, so that realistic robot paths were produced during the simulation. Scientists could remotely command a planetary rover to go to a particular rock formation that may be of interest; similarly, an expert in plant maintenance could obtain diagnostic information remotely by using deictic primitives on a mobile robot. Since only visual cues are used in the deictic primitives, we could imagine that the exact same control software could be used for all of these applications.
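
    In the same spirit, a toy 'follow your eyes' controller can be written as a proportional command toward the fixated point, with the velocity and acceleration limits mentioned above enforced at each tick; all gains and limits here are illustrative:

```python
# Toy gaze-following controller with clipped velocity and acceleration;
# the gain and limit values are assumptions, not from the original work.
import numpy as np

V_MAX, A_MAX, DT, GAIN = 1.0, 0.5, 0.05, 0.8   # m/s, m/s^2, s, proportional gain

def step(position, velocity, gaze_target):
    """One control tick: head toward the gaze target within the limits."""
    desired_v = GAIN * (gaze_target - position)
    dv = np.clip(desired_v - velocity, -A_MAX * DT, A_MAX * DT)
    velocity = np.clip(velocity + dv, -V_MAX, V_MAX)
    return position + velocity * DT, velocity

pos, vel = np.zeros(2), np.zeros(2)
target = np.array([2.0, 1.0])                   # where the operator is looking
for _ in range(200):
    pos, vel = step(pos, vel, target)
print(pos)                                      # converges toward the target
```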

    The Active Stereo Probe: The Design and Implementation of an Active Videometrics System

    This thesis describes research leading to the design and development of the Active Stereo Probe (ASP): an active vision based videometrics system. The ASP espouses both definitions of active vision by integrating structured illumination with a steerable binocular camera platform (or head). However, the primary function of the ASP is to recover quantitative 3D surface models of a scene from stereo images captured from the system's stereo pair of CCD video cameras. Stereo matching is performed using a development of Zhengping and Mowforth's Multiple Scale Signal Matcher (MSSM). The performance of the original MSSM algorithm was dramatically improved, both in terms of speed of execution and dynamic range, by completely re-implementing it using an efficient scale-space pyramid image representation. A range of quantitative performance tests for stereo matchers was developed, and these were applied to the newly developed MSSM stereo matcher to verify its suitability for use in the ASP. The performance of the stereo matcher is further improved by employing the ASP's structured illumination device to bathe the imaged scene in textured light. Few previously reported dynamic binocular camera heads have been able to perform any type of quantitative vision task. It is argued here that this failure has arisen mainly from the rudimentary nature of the design process applied to previous heads. Therefore, in order to address this problem, a new rigorous approach, suitable for the design of both dynamic and static stereo vision systems, was devised. This approach relies extensively upon system modelling as part of the design process. In order to support this new design approach, a general mathematical model of stereo imaging systems was developed and implemented within a software simulator. This simulator was then applied to the analysis of the requirements of the ASP and the MSSM stereo matcher. A specification for the imaging and actuation components of the ASP was hence obtained that was predicted to meet its performance requirements. This led directly to the fabrication of the completed ASP sensor head. The developed approach and model have subsequently been used successfully for the design of several other quantitative stereo vision systems. A vital requirement of any vision system that is intended to perform quantitative measurement is calibration. A novel calibration scheme was devised for the ASP by adopting advanced techniques from the field of photogrammetry and adapting them for use in the context of a dynamic computer vision system. The photogrammetric technique known as the Direct Linear Transform (DLT) was used successfully in the implementation of the first, static stage of this calibration scheme. A significant aspect of the work reported in this thesis is the importance given to integrating the components developed for the ASP, i.e. the sensor head, the stereo matching software and the calibration software, into a complete videometric system. The success of this approach is demonstrated by the high quality of the 3D surface models obtained using the integrated videometric system that was developed.
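
    The static stage of the calibration scheme rests on the Direct Linear Transform; a compact sketch of the standard DLT estimation of a 3x4 projection matrix from known 3D-2D correspondences is given below (the formulation is textbook DLT, and the synthetic test data are ours, not the thesis's):

```python
# Standard DLT: recover P (up to scale) such that x ~ P X, from >= 6
# noise-free 3D-2D point correspondences, via the SVD null space.
import numpy as np

def dlt(points_3d, points_2d):
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 4)          # smallest singular vector, up to scale

# Sanity check on synthetic data: project with a known P, then recover it
P_true = np.hstack([np.eye(3), [[0.1], [0.2], [2.0]]])
pts3d = np.random.rand(8, 3)
proj = np.hstack([pts3d, np.ones((8, 1))]) @ P_true.T
pts2d = proj[:, :2] / proj[:, 2:]
P_est = dlt(pts3d, pts2d)
print(P_est / P_est[-1, -1] * P_true[-1, -1])   # equals P_true up to sign/scale
```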