676 research outputs found

    CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation

    Full text link
    We address the important problem of generalizing robotic rearrangement to clutter without any explicit object models. We first generate over 650K cluttered scenes - orders of magnitude more than prior work - in diverse everyday environments, such as cabinets and shelves. We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture. CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation, and predicts collisions for SE(3) object poses in the scene. Our representation has a fast inference speed of 7 microseconds per query with nearly 20% higher performance than baseline approaches in challenging environments. We use this collision model in conjunction with a Model Predictive Path Integral (MPPI) planner to generate collision-free trajectories for picking and placing in clutter. CabiNet also predicts waypoints, computed from the scene's signed distance field (SDF), that allows the robot to navigate tight spaces during rearrangement. This improves rearrangement performance by nearly 35% compared to baselines. We systematically evaluate our approach, procedurally generate simulated experiments, and demonstrate that our approach directly transfers to the real world, despite training exclusively in simulation. Robot experiment demos in completely unknown scenes and objects can be found at this http https://cabinet-object-rearrangement.github.i

    Tactile Perception And Visuotactile Integration For Robotic Exploration

    Get PDF
    As the close perceptual sibling of vision, the sense of touch has historically received less than deserved attention in both human psychology and robotics. In robotics, this may be attributed to at least two reasons. First, it suffers from the vicious cycle of immature sensor technology, which causes industry demand to be low, and then there is even less incentive to make existing sensors in research labs easy to manufacture and marketable. Second, the situation stems from a fear of making contact with the environment, avoided in every way so that visually perceived states do not change before a carefully estimated and ballistically executed physical interaction. Fortunately, the latter viewpoint is starting to change. Work in interactive perception and contact-rich manipulation are on the rise. Good reasons are steering the manipulation and locomotion communities’ attention towards deliberate physical interaction with the environment prior to, during, and after a task. We approach the problem of perception prior to manipulation, using the sense of touch, for the purpose of understanding the surroundings of an autonomous robot. The overwhelming majority of work in perception for manipulation is based on vision. While vision is a fast and global modality, it is insufficient as the sole modality, especially in environments where the ambient light or the objects therein do not lend themselves to vision, such as in darkness, smoky or dusty rooms in search and rescue, underwater, transparent and reflective objects, and retrieving items inside a bag. Even in normal lighting conditions, during a manipulation task, the target object and fingers are usually occluded from view by the gripper. Moreover, vision-based grasp planners, typically trained in simulation, often make errors that cannot be foreseen until contact. As a step towards addressing these problems, we present first a global shape-based feature descriptor for object recognition using non-prehensile tactile probing alone. Then, we investigate in making the tactile modality, local and slow by nature, more efficient for the task by predicting the most cost-effective moves using active exploration. To combine the local and physical advantages of touch and the fast and global advantages of vision, we propose and evaluate a learning-based method for visuotactile integration for grasping

    A Hybrid Visual Control Scheme to Assist the Visually Impaired with Guided Reaching Tasks

    Get PDF
    In recent years, numerous researchers have been working towards adapting technology developed for robotic control to use in the creation of high-technology assistive devices for the visually impaired. These types of devices have been proven to help visually impaired people live with a greater degree of confidence and independence. However, most prior work has focused primarily on a single problem from mobile robotics, namely navigation in an unknown environment. In this work we address the issue of the design and performance of an assistive device application to aid the visually-impaired with a guided reaching task. The device follows an eye-in-hand, IBLM visual servoing configuration with a single camera and vibrotactile feedback to the user to direct guided tracking during the reaching task. We present a model for the system that employs a hybrid control scheme based on a Discrete Event System (DES) approach. This approach avoids significant problems inherent in the competing classical control or conventional visual servoing models for upper limb movement found in the literature. The proposed hybrid model parameterizes the partitioning of the image state-space that produces a variable size targeting window for compensatory tracking in the reaching task. The partitioning is created through the positioning of hypersurface boundaries within the state space, which when crossed trigger events that cause DES-controller state transition that enable differing control laws. A set of metrics encompassing, accuracy (DD), precision (θe\theta_{e}), and overall tracking performance (ψ\psi) are also proposed to quantity system performance so that the effect of parameter variations and alternate controller configurations can be compared. To this end, a prototype called \texttt{aiReach} was constructed and experiments were conducted testing the functional use of the system and other supporting aspects of the system behaviour using participant volunteers. Results are presented validating the system design and demonstrating effective use of a two parameter partitioning scheme that utilizes a targeting window with additional hysteresis region to filtering perturbations due to natural proprioceptive limitations for precise control of upper limb movement. Results from the experiments show that accuracy performance increased with the use of the dual parameter hysteresis target window model (0.91≤D≤10.91 \leq D \leq 1, μ(D)=0.9644\mu(D)=0.9644, σ(D)=0.0172\sigma(D)=0.0172) over the single parameter fixed window model (0.82≤D≤0.980.82 \leq D \leq 0.98, μ(D)=0.9205\mu(D)=0.9205, σ(D)=0.0297\sigma(D)=0.0297) while the precision metric, θe\theta_{e}, remained relatively unchanged. In addition, the overall tracking performance metric produces scores which correctly rank the performance of the guided reaching tasks form most difficult to easiest

    Exploring Natural User Abstractions For Shared Perceptual Manipulator Task Modeling & Recovery

    Get PDF
    State-of-the-art domestic robot assistants are essentially autonomous mobile manipulators capable of exerting human-scale precision grasps. To maximize utility and economy, non-technical end-users would need to be nearly as efficient as trained roboticists in control and collaboration of manipulation task behaviors. However, it remains a significant challenge given that many WIMP-style tools require superficial proficiency in robotics, 3D graphics, and computer science for rapid task modeling and recovery. But research on robot-centric collaboration has garnered momentum in recent years; robots are now planning in partially observable environments that maintain geometries and semantic maps, presenting opportunities for non-experts to cooperatively control task behavior with autonomous-planning agents exploiting the knowledge. However, as autonomous systems are not immune to errors under perceptual difficulty, a human-in-the-loop is needed to bias autonomous-planning towards recovery conditions that resume the task and avoid similar errors. In this work, we explore interactive techniques allowing non-technical users to model task behaviors and perceive cooperatively with a service robot under robot-centric collaboration. We evaluate stylus and touch modalities that users can intuitively and effectively convey natural abstractions of high-level tasks, semantic revisions, and geometries about the world. Experiments are conducted with \u27pick-and-place\u27 tasks in an ideal \u27Blocks World\u27 environment using a Kinova JACO six degree-of-freedom manipulator. Possibilities for the architecture and interface are demonstrated with the following features; (1) Semantic \u27Object\u27 and \u27Location\u27 grounding that describe function and ambiguous geometries (2) Task specification with an unordered list of goal predicates, and (3) Guiding task recovery with implied scene geometries and trajectory via symmetry cues and configuration space abstraction. Empirical results from four user studies show our interface was much preferred than the control condition, demonstrating high learnability and ease-of-use that enable our non-technical participants to model complex tasks, provide effective recovery assistance, and teleoperative control

    Workshop on "Robotic assembly of 3D MEMS".

    No full text
    Proceedings of a workshop proposed in IEEE IROS'2007.The increase of MEMS' functionalities often requires the integration of various technologies used for mechanical, optical and electronic subsystems in order to achieve a unique system. These different technologies have usually process incompatibilities and the whole microsystem can not be obtained monolithically and then requires microassembly steps. Microassembly of MEMS based on micrometric components is one of the most promising approaches to achieve high-performance MEMS. Moreover, microassembly also permits to develop suitable MEMS packaging as well as 3D components although microfabrication technologies are usually able to create 2D and "2.5D" components. The study of microassembly methods is consequently a high stake for MEMS technologies growth. Two approaches are currently developped for microassembly: self-assembly and robotic microassembly. In the first one, the assembly is highly parallel but the efficiency and the flexibility still stay low. The robotic approach has the potential to reach precise and reliable assembly with high flexibility. The proposed workshop focuses on this second approach and will take a bearing of the corresponding microrobotic issues. Beyond the microfabrication technologies, performing MEMS microassembly requires, micromanipulation strategies, microworld dynamics and attachment technologies. The design and the fabrication of the microrobot end-effectors as well as the assembled micro-parts require the use of microfabrication technologies. Moreover new micromanipulation strategies are necessary to handle and position micro-parts with sufficiently high accuracy during assembly. The dynamic behaviour of micrometric objects has also to be studied and controlled. Finally, after positioning the micro-part, attachment technologies are necessary

    Development of Cognitive Capabilities in Humanoid Robots

    Get PDF
    Merged with duplicate record 10026.1/645 on 03.04.2017 by CS (TIS)Building intelligent systems with human level of competence is the ultimate grand challenge for science and technology in general, and especially for the computational intelligence community. Recent theories in autonomous cognitive systems have focused on the close integration (grounding) of communication with perception, categorisation and action. Cognitive systems are essential for integrated multi-platform systems that are capable of sensing and communicating. This thesis presents a cognitive system for a humanoid robot that integrates abilities such as object detection and recognition, which are merged with natural language understanding and refined motor controls. The work includes three studies; (1) the use of generic manipulation of objects using the NMFT algorithm, by successfully testing the extension of the NMFT to control robot behaviour; (2) a study of the development of a robotic simulator; (3) robotic simulation experiments showing that a humanoid robot is able to acquire complex behavioural, cognitive, and linguistic skills through individual and social learning. The robot is able to learn to handle and manipulate objects autonomously, to cooperate with human users, and to adapt its abilities to changes in internal and environmental conditions. The model and the experimental results reported in this thesis, emphasise the importance of embodied cognition, i.e. the humanoid robot's physical interaction between its body and the environment

    Integrated visual perception architecture for robotic clothes perception and manipulation

    Get PDF
    This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. This proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, as clothing approximates a non-rigid 2-manifold in 3-space, that can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult to understand, identify uniquely, and interact with by machine. From an applications perspective, and as part of EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author’s knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation method- ology where visual and tactile feedback (in the form of surface wrinkledness captured by the high accuracy depth sensor i.e. CloPeMa stereo head or the predictive confidence modelled by Gaussian Processing) serve as the halting criteria in the flattening and sorting tasks, respectively. From scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding, and Topology Spatial Distance to describe and quantify generic landmarks (wrinkles and folds). The essence of this proposed architecture comprises 3D generic surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasp- ing approach achieves on-average 84.7% accuracy; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on-average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state-of-the-art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally the Gaussian Process based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state-of-the-art of robot clothes perception and manipulation

    Scene understanding by robotic interactive perception

    Get PDF
    This thesis presents a novel and generic visual architecture for scene understanding by robotic interactive perception. This proposed visual architecture is fully integrated into autonomous systems performing object perception and manipulation tasks. The proposed visual architecture uses interaction with the scene, in order to improve scene understanding substantially over non-interactive models. Specifically, this thesis presents two experimental validations of an autonomous system interacting with the scene: Firstly, an autonomous gaze control model is investigated, where the vision sensor directs its gaze to satisfy a scene exploration task. Secondly, autonomous interactive perception is investigated, where objects in the scene are repositioned by robotic manipulation. The proposed visual architecture for scene understanding involving perception and manipulation tasks has four components: 1) A reliable vision system, 2) Camera-hand eye calibration to integrate the vision system into an autonomous robot’s kinematic frame chain, 3) A visual model performing perception tasks and providing required knowledge for interaction with scene, and finally, 4) A manipulation model which, using knowledge received from the perception model, chooses an appropriate action (from a set of simple actions) to satisfy a manipulation task. This thesis presents contributions for each of the aforementioned components. Firstly, a portable active binocular robot vision architecture that integrates a number of visual behaviours are presented. This active vision architecture has the ability to verge, localise, recognise and simultaneously identify multiple target object instances. The portability and functional accuracy of the proposed vision architecture is demonstrated by carrying out both qualitative and comparative analyses using different robot hardware configurations, feature extraction techniques and scene perspectives. Secondly, a camera and hand-eye calibration methodology for integrating an active binocular robot head within a dual-arm robot are described. For this purpose, the forward kinematic model of the active robot head is derived and the methodology for calibrating and integrating the robot head is described in detail. A rigid calibration methodology has been implemented to provide a closed-form hand-to-eye calibration chain and this has been extended with a mechanism to allow the camera external parameters to be updated dynamically for optimal 3D reconstruction to meet the requirements for robotic tasks such as grasping and manipulating rigid and deformable objects. It is shown from experimental results that the robot head achieves an overall accuracy of fewer than 0.3 millimetres while recovering the 3D structure of a scene. In addition, a comparative study between current RGB-D cameras and our active stereo head within two dual-arm robotic test-beds is reported that demonstrates the accuracy and portability of our proposed methodology. Thirdly, this thesis proposes a visual perception model for the task of category-wise objects sorting, based on Gaussian Process (GP) classification that is capable of recognising objects categories from point cloud data. In this approach, Fast Point Feature Histogram (FPFH) features are extracted from point clouds to describe the local 3D shape of objects and a Bag-of-Words coding method is used to obtain an object-level vocabulary representation. Multi-class Gaussian Process classification is employed to provide a probability estimate of the identity of the object and serves the key role of modelling perception confidence in the interactive perception cycle. The interaction stage is responsible for invoking the appropriate action skills as required to confirm the identity of an observed object with high confidence as a result of executing multiple perception-action cycles. The recognition accuracy of the proposed perception model has been validated based on simulation input data using both Support Vector Machine (SVM) and GP based multi-class classifiers. Results obtained during this investigation demonstrate that by using a GP-based classifier, it is possible to obtain true positive classification rates of up to 80\%. Experimental validation of the above semi-autonomous object sorting system shows that the proposed GP based interactive sorting approach outperforms random sorting by up to 30\% when applied to scenes comprising configurations of household objects. Finally, a fully autonomous visual architecture is presented that has been developed to accommodate manipulation skills for an autonomous system to interact with the scene by object manipulation. This proposed visual architecture is mainly made of two stages: 1) A perception stage, that is a modified version of the aforementioned visual interaction model, 2) An interaction stage, that performs a set of ad-hoc actions relying on the information received from the perception stage. More specifically, the interaction stage simply reasons over the information (class label and associated probabilistic confidence score) received from perception stage to choose one of the following two actions: 1) An object class has been identified with high confidence, so remove from the scene and place it in the designated basket/bin for that particular class. 2) An object class has been identified with less probabilistic confidence, since from observation and inspired from the human behaviour of inspecting doubtful objects, an action is chosen to further investigate that object in order to confirm the object’s identity by capturing more images from different views in isolation. The perception stage then processes these views, hence multiple perception-action/interaction cycles take place. From an application perspective, the task of autonomous category based objects sorting is performed and the experimental design for the task is described in detail
    • …
    corecore