PoseFusion: Robust Object-in-Hand Pose Estimation with SelectLSTM
Accurate estimation of the relative pose between an object and a robot hand
is critical for many manipulation tasks. However, most of the existing
object-in-hand pose datasets use two-finger grippers and also assume that the
object remains fixed in the hand without any relative movements, which is not
representative of real-world scenarios. To address this issue, a 6D
object-in-hand pose dataset is proposed using a teleoperation method with an
anthropomorphic Shadow Dexterous hand. Our dataset comprises RGB-D images,
proprioception and tactile data, covering diverse grasping poses, finger
contact states, and object occlusions. To overcome the significant hand
occlusion and limited tactile sensor contact in real-world scenarios, we
propose PoseFusion, a hybrid multi-modal fusion approach that integrates the
information from visual and tactile perception channels. PoseFusion generates
three candidate object poses from three estimators (tactile only, visual only,
and visuo-tactile fusion), which are then filtered by a SelectLSTM network to
select the optimal pose, avoiding inferior fusion poses resulting from modality
collapse. Extensive experiments demonstrate the robustness and advantages of
our framework. All data and code are available on the project website:
https://elevenjiang1.github.io/ObjectInHand-Dataset
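The selection stage described above, three candidate poses filtered by a learned selector, can be sketched minimally. Everything here (pose values, scores, the scoring itself) is hypothetical: in the paper the scores come from the SelectLSTM network, not a fixed vector.

```python
import numpy as np

def select_pose(candidates, scores):
    """Pick the candidate pose with the highest selector score.

    candidates: (3, 7) array of poses (xyz + quaternion), one per
    estimator (tactile-only, visual-only, visuo-tactile fusion).
    scores: (3,) selector confidences, a stand-in for the SelectLSTM
    output described in the abstract.
    """
    return candidates[int(np.argmax(scores))]

# Hypothetical candidates from the three estimators.
poses = np.array([
    [0.10,  0.02,  0.30,  0, 0, 0, 1],   # tactile-only
    [0.11,  0.01,  0.31,  0, 0, 0, 1],   # visual-only
    [0.105, 0.015, 0.305, 0, 0, 0, 1],   # visuo-tactile fusion
])
scores = np.array([0.2, 0.3, 0.5])
best = select_pose(poses, scores)
```

Selecting among discrete candidates, instead of always averaging modalities, is what lets the pipeline avoid inferior fused poses when one modality collapses.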
Global position system sensor model for robotics simulator
Today there is an acute problem of automatic navigation for different types of robots, unmanned vehicles, and people. The growing number of robotic vehicles requires increasingly accurate navigation in the environment. Developing algorithms for precise navigation requires a large amount of raw sensor data, and some situations are impossible to test in the real world. Simulation is therefore a promising method for studying such systems. The aim of this work is to develop a simulation model of a universal Global Positioning System (GPS) sensor, together with configurable models of atmospheric effects, to reproduce real GPS receiver measurements in a normal environment. To achieve this goal, a study of GPS receivers and their modeling was carried out. The sensor-modeling problem is considered for the GPS system and the Earth's atmosphere, but the results can easily be adapted to other systems, such as GLONASS and Galileo. As the simulation environment we chose Unreal Engine 4 because of its precise physical simulation, which allows the sensor model to be integrated directly into the environment. Using Unreal Engine, we developed and tested models of the atmosphere and of a GPS receiver. The configurability of the models allowed us to verify their compliance with the real environment. The resulting agreement with a real GPS receiver is over 95%.
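A toy version of such a GPS sensor model might perturb the true position with a slowly varying atmospheric bias plus white measurement noise. The function name and all parameter values below are illustrative, not taken from the paper.

```python
import random

def simulate_gps_fix(true_lat, true_lon, sigma_m=2.0,
                     atmos_bias_m=1.5, seed=0):
    """Toy GPS receiver model: true position plus an atmospheric
    bias and white measurement noise, converted from metres to
    degrees. Parameters are illustrative only."""
    rng = random.Random(seed)
    deg_per_m = 1.0 / 111_320.0  # approx. metres-to-degrees at the equator
    bias = rng.gauss(0.0, atmos_bias_m)   # shared atmospheric error
    lat = true_lat + (bias + rng.gauss(0.0, sigma_m)) * deg_per_m
    lon = true_lon + (bias + rng.gauss(0.0, sigma_m)) * deg_per_m
    return lat, lon

lat, lon = simulate_gps_fix(55.75, 37.62)
```

Keeping the bias and noise terms separate mirrors the paper's split between a configurable atmosphere model and the receiver model proper.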
A time-varying Kalman filter for low-acceleration attitude estimation
CC BY-NC-ND 4.0
https://creativecommons.org/licenses/by-nc-nd/4.0/
[Abstract]: This work presents an attitude estimator (AE) based on a time-varying Kalman filter (TVKF), adapted to cases where a low-acceleration assumption can be applied. The filter is an extended version of a previously published time-varying Kalman filter attitude estimator (TVKAE). A comparative analysis of the accuracies of the two estimators is provided, and the efficiencies of both filters are also compared with those of other published AEs. The results show that the new AE achieves the best overall performance, followed by the original one.
Xunta de Galicia; EDC431C-2021/39. This research has been financed by the Xunta de Galicia and the European Regional Development Funds through grant EDC431C-2021/39, and by the Spanish Ministry of Education and Science under grants PID2021-126220OB-100 and TED2021-129847B-I0
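As an illustration of the kind of filtering involved, a scalar Kalman filter for a single attitude angle under the low-acceleration assumption (so the accelerometer reading approximates the gravity direction) might look like the sketch below. This is a minimal stand-in, not the paper's TVKF or TVKAE; the noise parameters are invented.

```python
def kf_attitude_step(angle, P, gyro_rate, accel_angle, dt,
                     q=0.01, r=0.1):
    """One predict/update step of a scalar Kalman filter for one
    attitude angle. q and r are illustrative process and measurement
    noise variances."""
    # Predict: integrate the gyro rate.
    angle = angle + gyro_rate * dt
    P = P + q
    # Update: correct with the accelerometer-derived angle,
    # valid only while acceleration is low.
    K = P / (P + r)
    angle = angle + K * (accel_angle - angle)
    P = (1.0 - K) * P
    return angle, P

angle, P = 0.0, 1.0
for _ in range(50):
    angle, P = kf_attitude_step(angle, P, gyro_rate=0.0,
                                accel_angle=0.2, dt=0.01)
```

With a stationary sensor the estimate converges to the accelerometer-derived angle while the covariance settles to a small steady-state value.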
Bio-inspired retinal optic flow perception in robotic navigation
This thesis concerns the bio-inspired visual perception of motion, with emphasis on locomotion in robotic systems. By continuously registering moving visual features on the human retina, a sensation of a visual flow cue is created. The interpretation of these cues forms a low-level motion perception known as retinal optic flow. Retinal optic flow is often mentioned and credited in human locomotor research, but so far only in theory and in simulated environments. Reconstructing retinal optic flow fields using existing optic-flow estimation methods and experimental data from naive test subjects provides further insight into how it interacts with intermittent control behavior and dynamic gazing. The retinal optic flow is successfully demonstrated in a vehicular steering task and further supports the idea that humans may use such perception to correct their steering during navigation.

To achieve the reconstruction and estimation of retinal optic flow, a set of optic flow estimators was fairly and systematically evaluated against the criteria of run-time predictability, reliability, and accuracy. A formalized benchmarking methodology using containerization technology was developed to generate the results. Furthermore, the readiness of road vehicles for adopting modern robotic software and related software processes was investigated, with special emphasis on real-time computing and on introducing the containerization and microservice design paradigms. This enabled continuous integration, continuous deployment, and continuous experimentation in order to aid further development and research. With the method of estimating retinal optic flow and its interaction with intermittent control, a more complete vision-based bionic steering control model is to be proposed and tested in a live robotic system.
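The optic-flow estimation underlying such reconstructions can be illustrated with a single global Lucas-Kanade least-squares step, a deliberately minimal stand-in for the dense estimators benchmarked in the thesis. The synthetic frames below are invented for the example.

```python
import numpy as np

def lk_translation(I0, I1):
    """Estimate a global (dx, dy) translation between two grayscale
    frames with one Lucas-Kanade least-squares step, solving
    Ix*dx + Iy*dy = -It over all pixels."""
    Iy, Ix = np.gradient(I0.astype(float))   # axis 0 = rows (y), axis 1 = cols (x)
    It = I1.astype(float) - I0.astype(float)
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (dx, dy), *_ = np.linalg.lstsq(A, b, rcond=None)
    return dx, dy

# Synthetic smooth frame shifted right by exactly one pixel.
x = np.linspace(0, 2 * np.pi, 64)
frame0 = np.sin(x)[None, :] * np.ones((64, 1))
frame1 = np.sin(x - x[1])[None, :] * np.ones((64, 1))
dx, dy = lk_translation(frame0, frame1)
```

A single global solve like this only recovers camera-frame translation; dense per-pixel estimators of the kind evaluated in the thesis solve a regularized version of the same brightness-constancy equation at every pixel.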
TRIDENT: A Framework for Autonomous Underwater Intervention
TRIDENT is a STREP project recently approved by the European Commission whose proposal
was submitted to the ICT call 4 of the 7th Framework Program. The project proposes a new methodology
for multipurpose underwater intervention tasks. To that end, a cooperative team formed with an
Autonomous Surface Craft and an Intervention Autonomous Underwater Vehicle will be used. The
proposed methodology splits the mission into two stages, devoted mainly to survey and intervention tasks,
respectively. The project brings together research skills specific to the marine environments in navigation
and mapping for underwater robotics, multi-sensory perception, intelligent control architectures, vehicle-manipulator systems, and dexterous manipulation. TRIDENT is a three-year project, planned to start in the first months of 2010.
This work is partially supported by the European Commission through the FP7-ICT2009-248497 project.
Towards Collaborative Simultaneous Localization and Mapping: a Survey of the Current Research Landscape
Motivated by the tremendous progress we witnessed in recent years, this paper
presents a survey of the scientific literature on the topic of Collaborative
Simultaneous Localization and Mapping (C-SLAM), also known as multi-robot SLAM.
With fleets of self-driving cars on the horizon and the rise of multi-robot
systems in industrial applications, we believe that Collaborative SLAM will
soon become a cornerstone of future robotic applications. In this survey, we
introduce the basic concepts of C-SLAM and present a thorough literature
review. We also outline the major challenges and limitations of C-SLAM in terms
of robustness, communication, and resource management. We conclude by exploring
the area's current trends and promising research avenues.
Comment: 44 pages, 3 figures
Multimodal Brain-Computer Interface for In-Vehicle Driver Cognitive Load Measurement: Dataset and Baselines
Through this paper, we introduce a novel driver cognitive load assessment
dataset, CL-Drive, which contains Electroencephalogram (EEG) signals along with
other physiological signals such as Electrocardiography (ECG) and Electrodermal
Activity (EDA) as well as eye tracking data. The data was collected from 21
subjects while driving in an immersive vehicle simulator, in various driving
conditions, to induce different levels of cognitive load in the subjects. The
tasks consisted of 9 complexity levels for 3 minutes each. Each driver reported
their subjective cognitive load every 10 seconds throughout the experiment. The
dataset contains the subjective cognitive load recorded as ground truth. In
this paper, we also provide benchmark classification results for different
machine learning and deep learning models for both binary and ternary label
distributions. We followed two evaluation schemes, namely 10-fold cross-validation and
leave-one-subject-out (LOSO). We have trained our models on both hand-crafted
features as well as on raw data.
Comment: 13 pages, 8 figures, 11 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice.
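The leave-one-subject-out (LOSO) protocol mentioned above can be sketched as a simple split generator. The subject IDs and sample counts below are hypothetical, and the dataset's actual loaders are not assumed.

```python
import numpy as np

def loso_splits(subject_ids):
    """Yield (subject, train_idx, test_idx) triples for
    leave-one-subject-out evaluation: each fold holds out every
    sample belonging to one subject."""
    subject_ids = np.asarray(subject_ids)
    for s in np.unique(subject_ids):
        test = np.where(subject_ids == s)[0]
        train = np.where(subject_ids != s)[0]
        yield s, train, test

# Hypothetical tiny dataset: 6 samples from 3 subjects.
subjects = [1, 1, 2, 2, 3, 3]
folds = list(loso_splits(subjects))
```

Unlike 10-fold cross-validation, LOSO guarantees that no subject's physiological signals appear in both train and test sets, which is the stricter measure of cross-subject generalization.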
Differentiable world programs
Modern artificial intelligence (AI) has created exciting new opportunities for building intelligent robots. In particular, gradient-based learning architectures (deep neural networks) have tremendously improved 3D scene understanding in terms of perception, reasoning, and action.
However, these advancements have undermined many ``classical'' techniques developed over the last few decades.
We postulate that a blend of ``classical'' and ``learned'' methods is the most promising path to developing flexible, interpretable, and actionable models of the world: a necessity for intelligent embodied agents.
``What is the ideal way to combine classical techniques with gradient-based learning architectures for a rich understanding of the 3D world?'' is the central question in this dissertation. This understanding enables a multitude of applications that fundamentally impact how embodied agents perceive and interact with their environment. This dissertation, dubbed ``differentiable world programs'', unifies efforts from multiple closely-related but currently-disjoint fields including robotics, computer vision, computer graphics, and AI.
Our first contribution---gradSLAM---is a fully differentiable dense simultaneous localization and mapping (SLAM) system. By enabling gradient computation through otherwise non-differentiable components such as nonlinear least squares optimization, ray casting, visual odometry, and dense mapping, gradSLAM opens up new avenues for integrating classical 3D reconstruction and deep learning.
Our second contribution---taskography---proposes a task-conditioned sparsification of large 3D scenes encoded as 3D scene graphs. This enables classical planners to match (and surpass) state-of-the-art learning-based planners by focusing computation on task-relevant scene attributes.
Our third and final contribution---gradSim---is a fully differentiable simulator that composes differentiable physics and graphics engines to enable physical parameter estimation and visuomotor control, solely from videos or a still image.
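The core idea of differentiating through a reconstruction objective can be illustrated with a toy example: gradient-descent alignment of two 2-D point sets by a translation. gradSLAM applies this principle to full dense SLAM with far richer components; the points, learning rate, and iteration count here are invented.

```python
import numpy as np

def align_translation(src, dst, lr=0.2, iters=100):
    """Gradient-descent alignment of two 2-D point sets by a
    translation t, minimizing the mean squared point-to-point error.
    A toy analogue of differentiating through a reconstruction loss."""
    t = np.zeros(2)
    for _ in range(iters):
        residual = src + t - dst             # per-point error
        grad = 2.0 * residual.mean(axis=0)   # d(mean sq. error)/dt
        t -= lr * grad
    return t

src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = src + np.array([0.5, -0.3])            # ground-truth shift
t = align_translation(src, dst)
```

Because the loss is differentiable in the pose parameter, the same estimate could instead be driven by gradients flowing from a downstream learned objective, which is the avenue gradSLAM opens for components such as odometry and mapping.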