53 research outputs found

    Rapid Pose Label Generation through Sparse Representation of Unknown Objects

    Full text link
    Deep Convolutional Neural Networks (CNNs) have been successfully deployed on robots for 6-DoF object pose estimation through visual perception. However, obtaining labeled data at the scale required for the supervised training of CNNs is a difficult task, exacerbated if the object is novel and a 3D model is unavailable. To this end, this work presents an approach for rapidly generating real-world, pose-annotated RGB-D data for unknown objects. Our method not only circumvents the need for a prior 3D object model (textured or otherwise) but also bypasses complicated setups of fiducial markers, turntables, and sensors. With the help of a human user, we first source minimalistic labelings of an ordered set of arbitrarily chosen keypoints over a set of RGB-D videos. Then, by solving an optimization problem, we combine these labels under a world frame to recover a sparse, keypoint-based representation of the object. The sparse representation leads to the development of a dense model and the pose labels for each image frame in the set of scenes. We show that the sparse model can also be used efficiently for scaling to a large number of new scenes. We demonstrate the practicality of the generated labeled dataset by training a pipeline for 6-DoF object pose estimation and a pixel-wise segmentation network.
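    The fusion step can be pictured with a simplified sketch: if the per-frame camera poses are already known (e.g., from a SLAM front-end), combining the human-labeled keypoints into a world-frame sparse model reduces to a least-squares average of the transformed observations. The names and this simplification are illustrative assumptions; the paper's optimization is richer and recovers the representation jointly across scenes:

```python
import numpy as np

def fuse_keypoints(obs, cam_poses):
    """Fuse per-frame keypoint labels into a world-frame sparse model.

    obs:       dict frame_id -> (K, 3) keypoints in that camera's frame
               (rows of NaN for keypoints not labeled in that frame)
    cam_poses: dict frame_id -> (4, 4) world-from-camera transform
    Returns a (K, 3) array of world-frame keypoint estimates; a keypoint
    never labeled in any frame comes back as NaN.
    """
    K = next(iter(obs.values())).shape[0]
    sums, counts = np.zeros((K, 3)), np.zeros(K)
    for fid, pts in obs.items():
        T = cam_poses[fid]
        valid = ~np.isnan(pts).any(axis=1)
        # rigid transform of the labeled keypoints into the world frame
        pw = (T[:3, :3] @ pts[valid].T).T + T[:3, 3]
        sums[valid] += pw
        counts[valid] += 1
    return sums / counts[:, None]
```

    With the poses fixed, this average is the closed-form solution of the per-keypoint least-squares problem, which is why only a few labeled keypoints per video suffice to anchor a sparse model.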

    Musculoskeletal Estimation Using Inertial Measurement Units and Single Video Image

    Get PDF
    We address the problem of estimating the physical burden on a human body, which translates to monitoring and estimating the muscle tensions and joint reaction forces of a musculoskeletal model in real time. The system should minimize the discomfort generated by any sensors that need to be fixed on the user. Our system combines 3D pose estimation from vision with IMU sensors. We aim to minimize the number of IMUs fixed to the subject while compensating for the remaining lack of information with vision.
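    As a rough illustration of the vision/IMU trade-off described above, a complementary filter can blend a drift-prone but high-rate gyro integration with an absolute but noisier vision-based orientation for a single body segment. This is a generic sketch under assumed inputs, not the authors' estimator:

```python
import numpy as np

def quat_mul(q, r):
    """Hamilton product of two (w, x, y, z) quaternions."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def complementary_fuse(q_prev, gyro, q_vision, dt, alpha=0.98):
    """One filter step for one body segment.

    q_prev:   previous orientation quaternion (w, x, y, z)
    gyro:     IMU angular velocity in rad/s, shape (3,)
    q_vision: orientation of the segment from the video-based 3D pose
    alpha:    trust in the gyro integration; the drift-free vision
              estimate contributes the remaining 1 - alpha
    """
    # first-order gyro integration: q <- q + 0.5 * dt * q * (0, omega)
    omega = np.concatenate([[0.0], gyro])
    q_gyro = q_prev + 0.5 * dt * quat_mul(q_prev, omega)
    q_gyro /= np.linalg.norm(q_gyro)
    # quaternions double-cover rotations; keep the two estimates aligned
    if np.dot(q_gyro, q_vision) < 0:
        q_vision = -q_vision
    # linear blend (a good approximation of slerp for nearby orientations)
    q = alpha * q_gyro + (1.0 - alpha) * q_vision
    return q / np.linalg.norm(q)
```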

    TransFusionOdom: Interpretable Transformer-based LiDAR-Inertial Fusion Odometry Estimation

    Full text link
    Multi-modal sensor fusion is a commonly used approach to enhance the performance of odometry estimation, which is a fundamental module for mobile robots. However, how to perform fusion among different modalities in a supervised sensor fusion odometry estimation task remains a challenging issue. Simple operations, such as element-wise summation and concatenation, are not capable of assigning adaptive attentional weights to incorporate different modalities efficiently, which makes it difficult to achieve competitive odometry results. Recently, the Transformer architecture has shown potential for multi-modal fusion tasks, particularly in the domain of vision with language. In this work, we propose an end-to-end supervised Transformer-based LiDAR-Inertial fusion framework (namely TransFusionOdom) for odometry estimation. The multi-attention fusion module demonstrates different fusion approaches for homogeneous and heterogeneous modalities to address the overfitting problem that can arise from blindly increasing the complexity of the model. Additionally, to interpret the learning process of the Transformer-based multi-modal interactions, a general visualization approach is introduced to illustrate the interactions between modalities. Moreover, exhaustive ablation studies evaluate different multi-modal fusion strategies to verify the performance of the proposed fusion strategy. A synthetic multi-modal dataset is made public to validate the generalization ability of the proposed fusion strategy, which also works for other combinations of modalities. The quantitative and qualitative odometry evaluations on the KITTI dataset verify that the proposed TransFusionOdom achieves superior performance compared with other related works. (Comment: Submitted to IEEE Sensors Journal with some modifications. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.)
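    To make the attention-based weighting concrete, here is a minimal PyTorch sketch of cross-attention fusion between LiDAR and IMU feature tokens. The module, token shapes, and pose head are illustrative assumptions, not the TransFusionOdom architecture itself:

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Cross-attention fusion of LiDAR and IMU tokens for pose regression."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, 6)  # 3-DoF translation + 3-DoF rotation

    def forward(self, lidar_tokens, imu_tokens):
        # lidar_tokens: (B, N, dim), imu_tokens: (B, M, dim).
        # LiDAR tokens query the IMU tokens, so the attention weights act
        # as adaptive, data-dependent fusion weights rather than the fixed
        # ones implied by summation or concatenation.
        fused, weights = self.attn(lidar_tokens, imu_tokens, imu_tokens)
        fused = self.norm(fused + lidar_tokens)  # residual connection
        pose = self.head(fused.mean(dim=1))      # pool tokens, regress pose
        return pose, weights

# usage sketch:
# pose, w = AttentionFusion()(torch.randn(2, 64, 256), torch.randn(2, 16, 256))
```

    Returning the attention weights alongside the pose is one simple way to support the kind of modality-interaction visualization the abstract describes.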

    Conversion of ethanol to propylene over HZSM-5(Ga) co-modified with lanthanum and phosphorous

    Get PDF
    Conversion of ethanol to propylene was carried out over HZSM-5(Ga) co-modified with lanthanum and phosphorus (La/P/HZSM-5(Ga)). The propylene yield was strongly dependent on both the La/Ga and P/Ga ratios, and the highest value of ca. 29 C-% was obtained at a P/Ga ratio of 1 and a La/Ga ratio of 0.4. FT-IR, 31P MAS NMR, and 71Ga MAS NMR measurements demonstrate that the introduced lanthanum reacts with the pre-introduced phosphorus to regenerate some of the Brønsted acid sites (Si(OH)Ga); accordingly, the Brønsted acid sites are homogeneously distributed within the zeolite framework. In addition, the catalytic stability as well as the catalytic activity of HZSM-5(Ga) was effectively enhanced by co-modification with lanthanum and phosphorus, owing to the suppression of both carbonaceous deposition and the elimination of gallium from the zeolite framework.

    Pose Space Surface Manipulation

    Get PDF
    Example-based mesh deformation techniques produce natural and realistic shapes by learning the space of deformations from examples. However, skeleton-based methods cannot manipulate a global mesh structure naturally, whereas mesh-based approaches based on a translational control do not allow the user to edit a local mesh structure intuitively. This paper presents an example-driven mesh editing framework that achieves both global and local pose manipulations. The proposed system is built on a surface deformation method based on a two-step linear optimization technique and achieves direct manipulation of a model surface using translational and rotational controls. With the translational control, the user can easily create a model in natural poses. The rotational control can adjust the local pose intuitively by bending and twisting. We encode example deformations with a rotation-invariant mesh representation that handles large rotations in the examples. To incorporate example deformations, we infer a pose from the handle translations/rotations and perform pose space interpolation, thereby avoiding involved nonlinear optimization. With the two-step linear approach combined with the proposed multiresolution deformation method, we can edit models at interactive rates without losing important deformation effects such as muscle bulging.
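    The pose space interpolation step can be sketched as radial-basis-function blending: the pose inferred from the handle edits selects weights over the examples, and the example deformations are mixed with those weights before the linear surface solve. The Gaussian kernel and the names below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def rbf_weights(example_poses, query_pose, sigma=1.0):
    """Pose-space blend weights via Gaussian radial basis functions.

    example_poses: (E, P) pose parameters of the E example shapes
    query_pose:    (P,)  pose inferred from the user's handle edits
    Returns E non-negative weights that sum to 1.
    """
    d2 = np.sum((example_poses - query_pose) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return w / w.sum()

def blend_deformations(example_feats, weights):
    """Blend per-example deformation features (e.g., rotation-invariant
    mesh coordinates) with the pose-space weights; the blended features
    then feed the linear surface reconstruction."""
    return np.tensordot(weights, example_feats, axes=1)
```

    Because the interpolation happens in pose space rather than over vertex positions, each edit still reduces to linear solves, which is consistent with the interactive rates the abstract reports.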

    A Study on Learned Feature Maps Toward Direct Visual Servoing

    No full text
    Direct Visual Servoing (DVS) is a technique used in robotics and computer vision where visual information, typically the brightness of camera pixels, is used directly to control the motion of a robot. DVS is known for its ability to achieve accurate positioning, thanks to the redundancy of information, all without needing to rely on geometric features. In this paper, we introduce a novel approach in which pixel brightness is replaced with learned feature maps as the visual information for the servoing loop. The aim of this paper is to present a procedure to extract, transform, and integrate deep neural network feature maps so that they can replace brightness in a DVS control loop.
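    For context, the classical DVS control loop maps the feature error through the pseudo-inverse of an interaction matrix to a 6-DoF camera velocity; the same loop applies whether the features are raw intensities or, as proposed here, learned feature maps. A minimal sketch, assuming an interaction matrix L is available (classically derived from image gradients and depth):

```python
import numpy as np

def dvs_control(features, features_ref, L, lam=0.5):
    """One iteration of a direct visual servoing control law.

    features:     current visual features flattened to shape (N,)
                  (pixel intensities classically; learned maps here)
    features_ref: the same features at the desired pose, shape (N,)
    L:            (N, 6) interaction matrix relating feature changes
                  to the camera velocity
    lam:          control gain
    Returns a 6-DoF velocity command (vx, vy, vz, wx, wy, wz).
    """
    error = features - features_ref
    # The Moore-Penrose pseudo-inverse maps the dense feature error
    # onto the 6 velocity components; the redundancy (N >> 6) is what
    # gives DVS its positioning accuracy.
    return -lam * np.linalg.pinv(L) @ error
```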