
    PoseFusion: Robust Object-in-Hand Pose Estimation with SelectLSTM

    Accurate estimation of the relative pose between an object and a robot hand is critical for many manipulation tasks. However, most existing object-in-hand pose datasets use two-finger grippers and assume that the object remains fixed in the hand without any relative movement, which is not representative of real-world scenarios. To address this issue, we propose a 6D object-in-hand pose dataset collected by teleoperating an anthropomorphic Shadow Dexterous Hand. Our dataset comprises RGB-D images, proprioception, and tactile data, covering diverse grasping poses, finger contact states, and object occlusions. To overcome the significant hand occlusion and limited tactile sensor contact encountered in real-world scenarios, we propose PoseFusion, a hybrid multimodal fusion approach that integrates information from the visual and tactile perception channels. PoseFusion generates three candidate object poses from three estimators (tactile-only, visual-only, and visuo-tactile fusion), which are then filtered by a SelectLSTM network to select the optimal pose, avoiding inferior fused poses caused by modality collapse. Extensive experiments demonstrate the robustness and advantages of our framework. All data and code are available on the project website: https://elevenjiang1.github.io/ObjectInHand-Dataset
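
    The abstract does not detail the SelectLSTM architecture, so the following is a minimal PyTorch sketch of the selection idea only: the three candidate poses (treated here as 7-vectors, translation plus quaternion) are stacked over time and an LSTM scores which estimator to trust at each step. All names, dimensions, and the soft-selection head are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a SelectLSTM-style selector (not the authors' code).
# Per frame there are three pose candidates (tactile, visual, visuo-tactile
# fusion), each a 7-vector (xyz translation + wxyz quaternion); the LSTM
# emits one logit per estimator at every time step.
import torch
import torch.nn as nn

class SelectLSTM(nn.Module):
    def __init__(self, pose_dim=7, n_candidates=3, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(pose_dim * n_candidates, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_candidates)

    def forward(self, candidates):
        # candidates: (batch, time, n_candidates, pose_dim)
        b, t, n, d = candidates.shape
        feats, _ = self.lstm(candidates.reshape(b, t, n * d))
        weights = self.head(feats).softmax(dim=-1)  # (batch, time, n_candidates)
        # Soft blend for training; a weighted sum of quaternions is only a
        # rough proxy -- hard argmax selection at inference avoids the issue.
        return (weights.unsqueeze(-1) * candidates).sum(dim=2)

poses = torch.randn(2, 10, 3, 7)   # dummy candidate poses
print(SelectLSTM()(poses).shape)   # torch.Size([2, 10, 7])
```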

    Global position system sensor model for robotics simulator

    Today there is an acute problem of automatic navigation for different types of robots, unmanned vehicles, and people. The increasing number of robotic vehicles requires them to navigate their environment more accurately. Developing precise navigation algorithms requires a large amount of raw sensor data, and some situations are impossible to test in the real world. Simulation is a promising method for studying such systems. The aim of this work is to develop a simulation model of a universal global position system (GPS) sensor, together with configurable models of atmospheric effects, to reproduce real GPS receiver measurements in a normal environment. To achieve this goal, we studied GPS receivers and their modeling. The sensor-modeling problem is considered for the GPS system and the Earth's atmosphere, but the results can easily be adapted to other systems, such as GLONASS and GALILEO. As the simulation package for the environment we chose Unreal Engine 4 because of its precise physical simulation, which allows the model to be integrated directly into the environment. Using the Unreal Engine package we developed and tested models of the atmosphere and the GPS receiver. The configurability of the models allowed us to test their compliance with the real environment. The resulting agreement with a real GPS receiver is over 95%.
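
    As one illustration of the measurement model such a simulator needs, the sketch below corrupts the true position with a slowly varying atmospheric bias (a first-order Gauss-Markov process) plus white receiver noise. The structure and all magnitudes are assumptions for illustration only and are not taken from the paper.

```python
# Illustrative sketch of a simulated GPS measurement model (not the paper's
# implementation): the true position is corrupted by a slowly varying
# atmospheric bias plus white receiver noise. All magnitudes are assumed.
import numpy as np

rng = np.random.default_rng(0)

def gps_measurement(true_pos_m, atm_bias_m, dt=1.0,
                    bias_tau_s=300.0, bias_sigma_m=2.0, noise_sigma_m=1.5):
    """Return one noisy fix and the updated atmospheric bias state."""
    # First-order Gauss-Markov process for ionospheric/tropospheric delay.
    alpha = np.exp(-dt / bias_tau_s)
    atm_bias_m = (alpha * atm_bias_m
                  + np.sqrt(1 - alpha**2) * rng.normal(0.0, bias_sigma_m, 3))
    fix = true_pos_m + atm_bias_m + rng.normal(0.0, noise_sigma_m, 3)
    return fix, atm_bias_m

bias = np.zeros(3)
for _ in range(5):
    fix, bias = gps_measurement(np.array([10.0, 20.0, 0.0]), bias)
    print(fix)
```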

    A time-varying Kalman filter for low-acceleration attitude estimation

    CC BY-NC-ND 4.0 https://creativecommons.org/licenses/by-nc-nd/4.0/
    Abstract: This work presents an attitude estimator (AE) based on a time-varying Kalman filter (TVKF), adapted to cases where a low-acceleration assumption can be applied. This filter is an extended version of a previously published time-varying Kalman filter attitude estimator (TVKAE). A comparative analysis of the accuracies of the two estimators is provided, and the efficiencies of both filters are also compared with those of other published AEs. The results show that the new AE achieves the best overall performance, followed by the original one. This research has been financed by the Xunta de Galicia and the European Regional Development Funds through grant EDC431C-2021/39, and by the Spanish Ministry of Education and Science under grants PID2021-126220OB-100 and TED2021-129847B-I0.
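
    To make the low-acceleration assumption concrete: when the body hardly accelerates, the accelerometer output is dominated by gravity, so tilt angles derived from it can serve as a direct attitude observation. The sketch below shows one predict/update cycle of a simple roll/pitch Kalman filter built on that assumption; it is a minimal illustration, not the paper's TVKF.

```python
# Minimal numpy sketch (not the paper's TVKF): a Kalman filter on roll and
# pitch. The gyro drives the prediction; under the low-acceleration
# assumption the accelerometer measures only gravity, so tilt angles
# derived from it serve as the attitude observation.
import numpy as np

def kf_step(x, P, gyro_rp, accel, dt, q_var=1e-4, r_var=1e-2):
    # Predict: integrate gyro roll/pitch rates (shared scalar covariance
    # per axis, a deliberate simplification).
    x = x + gyro_rp * dt
    P = P + q_var * dt
    # Measure: tilt from gravity (valid only when body acceleration ~ 0).
    ax, ay, az = accel
    z = np.array([np.arctan2(ay, az),                  # roll
                  np.arctan2(-ax, np.hypot(ay, az))])  # pitch
    # Update with an identity measurement model.
    K = P / (P + r_var)
    x = x + K * (z - x)
    P = (1 - K) * P
    return x, P

x, P = np.zeros(2), 1.0  # [roll, pitch] in rad, scalar covariance per axis
x, P = kf_step(x, P, gyro_rp=np.array([0.01, -0.02]),
               accel=np.array([0.1, 0.2, 9.7]), dt=0.01)
print(x, P)
```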

    Bio-inspired retinal optic flow perception in robotic navigation

    This thesis concerns the bio-inspired visual perception of motion, with emphasis on locomotion, targeting robotic systems. By continuously registering moving visual features, the human retina creates the sensation of a visual flow cue. The interpretation of these cues forms a low-level motion percept known as retinal optic flow. Retinal optic flow is often mentioned and credited in human locomotor research, but so far only in theory and in simulated environments. Reconstructing retinal optic flow fields using existing optic flow estimation methods and experimental data from naive test subjects provides further insight into how it interacts with intermittent control behavior and dynamic gazing. Retinal optic flow is successfully demonstrated in a vehicular steering task and further supports the idea that humans may use such perception to aid their ability to correct their steering during navigation. To achieve the reconstruction and estimation of retinal optic flow, a set of optic flow estimators was fairly and systematically evaluated on the criteria of run-time predictability, reliability, and performance accuracy. A formalized benchmarking methodology using containerization technology was developed to generate the results. Furthermore, the readiness of road vehicles for the adoption of modern robotic software and related software processes was investigated, with special emphasis on real-time computing and on introducing containerization and the microservice design paradigm. This enables continuous integration, continuous deployment, and continuous experimentation to aid further development and research. With the method of estimating retinal optic flow and its interaction with intermittent control, a more complete vision-based bionic steering control model can be proposed and tested in a live robotic system.
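
    Dense optic flow fields of the kind reconstructed in the thesis can be computed with off-the-shelf estimators; the snippet below runs OpenCV's Farnebäck method on a synthetically shifted frame pair. It is a generic example of one such estimator, not the specific set benchmarked in the thesis.

```python
# Minimal example of dense optic flow estimation with OpenCV's Farneback
# method, the kind of off-the-shelf estimator such a benchmark compares.
import cv2
import numpy as np

prev = np.random.randint(0, 255, (120, 160), dtype=np.uint8)  # stand-in frame
curr = np.roll(prev, 2, axis=1)                               # shifted copy

flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

print(flow.shape)           # (120, 160, 2): per-pixel (dx, dy)
print(flow[..., 0].mean())  # dominant horizontal motion, roughly 2 px
```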

    TRIDENT: A Framework for Autonomous Underwater Intervention

    TRIDENT is a STREP project recently approved by the European Commission, whose proposal was submitted to ICT Call 4 of the 7th Framework Programme. The project proposes a new methodology for multipurpose underwater intervention tasks. To that end, a cooperative team formed by an Autonomous Surface Craft and an Intervention Autonomous Underwater Vehicle will be used. The proposed methodology splits the mission into two stages, devoted mainly to survey and intervention tasks, respectively. The project brings together research skills specific to marine environments: navigation and mapping for underwater robotics, multi-sensory perception, intelligent control architectures, vehicle-manipulator systems, and dexterous manipulation. TRIDENT is a three-year project planned to start in the first months of 2010. This work is partially supported by the European Commission through the FP7-ICT2009-248497 project.

    Towards Collaborative Simultaneous Localization and Mapping: a Survey of the Current Research Landscape

    Motivated by the tremendous progress we have witnessed in recent years, this paper presents a survey of the scientific literature on the topic of Collaborative Simultaneous Localization and Mapping (C-SLAM), also known as multi-robot SLAM. With fleets of self-driving cars on the horizon and the rise of multi-robot systems in industrial applications, we believe that Collaborative SLAM will soon become a cornerstone of future robotic applications. In this survey, we introduce the basic concepts of C-SLAM and present a thorough literature review. We also outline the major challenges and limitations of C-SLAM in terms of robustness, communication, and resource management. We conclude by exploring the area's current trends and promising research avenues. Comment: 44 pages, 3 figures

    Multimodal Brain-Computer Interface for In-Vehicle Driver Cognitive Load Measurement: Dataset and Baselines

    In this paper, we introduce a novel driver cognitive load assessment dataset, CL-Drive, which contains electroencephalogram (EEG) signals along with other physiological signals such as electrocardiography (ECG) and electrodermal activity (EDA), as well as eye-tracking data. The data were collected from 21 subjects driving in an immersive vehicle simulator under various driving conditions designed to induce different levels of cognitive load. The tasks consisted of 9 complexity levels of 3 minutes each, and each driver reported their subjective cognitive load every 10 seconds throughout the experiment; these self-reports serve as the ground-truth labels in the dataset. We also provide benchmark classification results for different machine learning and deep learning models, for both binary and ternary label distributions, under two evaluation schemes, namely 10-fold cross-validation and leave-one-subject-out (LOSO). We trained our models on both hand-crafted features and raw data. Comment: 13 pages, 8 figures, 11 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice.
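
    As a concrete rendering of the LOSO scheme mentioned above, the sketch below uses scikit-learn's LeaveOneGroupOut with subject IDs as the groups. Features, labels, and the classifier are random placeholders, not the paper's models or data.

```python
# Sketch of the leave-one-subject-out (LOSO) protocol, using scikit-learn.
# Features, labels, and the classifier are placeholders; the paper
# benchmarks several ML/DL models on EEG/ECG/EDA and eye-tracking data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

n_subjects, windows_per_subject, n_features = 21, 50, 16
X = np.random.randn(n_subjects * windows_per_subject, n_features)
y = np.random.randint(0, 2, len(X))                 # binary load label
groups = np.repeat(np.arange(n_subjects), windows_per_subject)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         groups=groups, cv=LeaveOneGroupOut())
print(f"LOSO accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```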

    Differentiable world programs

    Modern artificial intelligence (AI) has created exciting new opportunities for building intelligent robots. In particular, gradient-based learning architectures (deep neural networks) have tremendously improved 3D scene understanding in terms of perception, reasoning, and action. However, these advancements have undermined the appeal of many "classical" techniques developed over the last few decades. We postulate that a blend of "classical" and "learned" methods is the most promising path to developing flexible, interpretable, and actionable models of the world: a necessity for intelligent embodied agents. The central question of this dissertation is: "What is the ideal way to combine classical techniques with gradient-based learning architectures for a rich understanding of the 3D world?" This understanding enables a multitude of applications that fundamentally impact how embodied agents perceive and interact with their environment.
    This dissertation, dubbed "differentiable world programs", unifies efforts from multiple closely related but currently disjoint fields, including robotics, computer vision, computer graphics, and AI. Our first contribution, gradSLAM, is a fully differentiable dense simultaneous localization and mapping (SLAM) system. By enabling gradient computation through otherwise non-differentiable components such as nonlinear least squares optimization, ray casting, visual odometry, and dense mapping, gradSLAM opens up new avenues for integrating classical 3D reconstruction and deep learning. Our second contribution, taskography, proposes a task-conditioned sparsification of large 3D scenes encoded as 3D scene graphs. This enables classical planners to match (and surpass) state-of-the-art learning-based planners by focusing computation on task-relevant scene attributes. Our third and final contribution, gradSim, is a fully differentiable simulator that composes differentiable physics and graphics engines to enable physical parameter estimation and visuomotor control, solely from videos or a still image.
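
    The core trick behind a differentiable SLAM system such as gradSLAM can be shown in miniature: unroll an iterative least-squares solver in an autodiff framework so that gradients flow through the optimization itself. The toy PyTorch example below fits a line with unrolled Gauss-Newton steps and backpropagates through the solve to the input observations; it is a didactic sketch, not gradSLAM.

```python
# Toy illustration (not gradSLAM itself) of differentiating *through* an
# optimizer: unrolled Gauss-Newton steps fit y = a*x + b, and a downstream
# loss backpropagates through the solver to the observations y.
import torch

x = torch.linspace(0, 1, 20)
y = (2.0 * x + 0.5 + 0.01 * torch.randn(20)).requires_grad_()

theta = torch.zeros(2)                           # parameters [a, b]
J = torch.stack([x, torch.ones_like(x)], dim=1)  # residual Jacobian
for _ in range(3):                               # unrolled Gauss-Newton steps
    r = J @ theta - y                            # residuals
    step = torch.linalg.solve(J.T @ J, J.T @ r)  # normal-equations solve
    theta = theta - step

loss = theta.sum()        # stand-in for any downstream objective
loss.backward()           # gradients flow through the unrolled solver
print(theta.detach(), y.grad[:3])
```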