180 research outputs found

    Learning Language-Conditioned Deformable Object Manipulation with Graph Dynamics

    Multi-task learning of deformable object manipulation is a challenging problem in robot manipulation. Most previous works address this problem in a goal-conditioned way and use goal images to specify different tasks, which limits multi-task learning performance and cannot generalize to new tasks. We therefore use language instructions to specify deformable object manipulation tasks and propose a learning framework. We first design a unified Transformer-based architecture that understands multi-modal data and outputs picking and placing actions. In addition, we introduce the visible connectivity graph to tackle the nonlinear dynamics and complex configurations of deformable objects. Both simulated and real experiments demonstrate that the proposed method is effective and generalizes to unseen instructions and tasks. Compared with the state-of-the-art method, our method achieves higher success rates (87.2% on average) and has a 75.6% shorter inference time. We also demonstrate that our method performs well in real-world experiments. Comment: submitted to ICRA 202

    Effects of transgenic Cry1Ac + CpTI cotton on non-target mealybug pest Ferrisia virgata and its predator Cryptolaemus montrouzieri

    Recently, several invasive mealybugs (Hemiptera: Pseudococcidae) have rapidly spread to Asia and have become a serious threat to the production of cotton, including transgenic cotton. Thus far, studies have mainly focused on the effects of mealybugs on non-transgenic cotton, without fully considering their effects on transgenic cotton and trophic interactions. Therefore, investigating the potential effects of mealybugs on transgenic cotton and their key natural enemies is vitally important. A first study on the effects of transgenic cotton on a non-target mealybug, Ferrisia virgata (Cockerell) (Hemiptera: Pseudococcidae), was performed by comparing its development, survival and body weight on transgenic cotton leaves expressing Cry1Ac (Bt toxin) + CpTI (Cowpea Trypsin Inhibitor) with those on its near-isogenic non-transgenic line. Furthermore, the development, survival, body weight, fecundity, adult longevity and feeding preference of the mealybug predator Cryptolaemus montrouzieri Mulsant (Coleoptera: Coccinellidae) were assessed when it was fed F. virgata maintained on transgenic cotton. To investigate the potential transfer of Cry1Ac and CpTI proteins via the food chain, protein levels in cotton leaves, mealybugs and ladybirds were quantified. Experimental results showed that F. virgata could infest this bivalent transgenic cotton. No significant differences were observed in the physiological parameters of the predator C. montrouzieri offered F. virgata reared on transgenic cotton or its near-isogenic line. Cry1Ac and CpTI proteins were detected in transgenic cotton leaves, but neither protein was present at detectable levels in the mealybug or its predator when reared on transgenic cotton leaves. Our bioassays indicated that transgenic cotton poses a negligible risk to the predatory coccinellid C. montrouzieri via its prey, the mealybug F. virgata.

    Scene Graph for Embodied Exploration in Cluttered Scenario

    The ability to handle objects in cluttered environments has long been anticipated by the robotics community. However, most works focus merely on manipulation rather than on rendering the hidden semantic information in cluttered objects. In this work, we introduce the scene graph for embodied exploration in cluttered scenarios to address this problem. To validate our method in cluttered scenarios, we adopt Manipulation Question Answering (MQA) tasks as our test benchmark, which require an embodied robot to have active exploration ability and semantic understanding of vision and language. As a general solution framework for the task, we propose an imitation learning method to generate manipulations for exploration. Meanwhile, a VQA model based on a dynamic scene graph is adopted to comprehend the series of RGB frames captured by the manipulator's wrist camera at every manipulation step and to answer questions in our framework. Experiments on the MQA dataset with different interaction requirements demonstrate that our proposed framework is effective for the MQA task, a representative of tasks in cluttered scenarios.

    MQA: Answering the Question via Robotic Manipulation

    In this paper, we propose a novel task -- Manipulation Question Answering (MQA), where the robot is required to find the answer to the question by actively exploring the environment via manipulation. A framework consisting of a QA model and a manipulation model is proposed to solve this problem. For the QA model, we adopt the method of Visual Question Answering (VQA). For the manipulation model, a Deep Q Network (DQN) model is proposed to generate manipulations. By manipulating objects, the robot can continuously explore the bin until the answer to the question is found. In addition, a novel simulation dataset that contains a variety of object models, complicated scenarios and corresponding question-answer pairs is established. Extensive experiments have been conducted to validate the effectiveness of the proposed framework.

    Gated Attention Coding for Training High-performance and Efficient Spiking Neural Networks

    Spiking neural networks (SNNs) are emerging as an energy-efficient alternative to traditional artificial neural networks (ANNs) due to their unique spike-based event-driven nature. Coding is crucial in SNNs as it converts external input stimuli into spatio-temporal feature sequences. However, most existing deep SNNs rely on direct coding, which generates weak spike representations and lacks the temporal dynamics inherent in human vision. Hence, we introduce Gated Attention Coding (GAC), a plug-and-play module that leverages the multi-dimensional gated attention unit to efficiently encode inputs into powerful representations before feeding them into the SNN architecture. GAC functions as a preprocessing layer that does not disrupt the spike-driven nature of the SNN, making it amenable to efficient neuromorphic hardware implementation with minimal modifications. Through a theoretical analysis based on an observer model, we demonstrate that GAC's attention mechanism improves temporal dynamics and coding efficiency. Experiments on the CIFAR10/100 and ImageNet datasets demonstrate that GAC achieves state-of-the-art accuracy with remarkable efficiency. Notably, we improve top-1 accuracy by 3.10% on CIFAR100 with only 6 time steps and by 1.07% on ImageNet while reducing energy usage to 66.9% of that of previous works. To the best of our knowledge, this is the first work to explore an attention-based dynamic coding scheme in deep SNNs, with exceptional effectiveness and efficiency on large-scale datasets. Comment: 12 pages, 7 figures

    Vectorial structure of a hard-edged-diffracted four-petal Gaussian beam in the far field

    Based on the vector angular spectrum method, the stationary phase method, and the fact that a circular aperture function can be expanded into a finite sum of complex Gaussian functions, the analytical vectorial structure of a four-petal Gaussian beam (FPGB) diffracted by a circular aperture is derived in the far field. The energy flux distributions and the diffraction effect introduced by the aperture are studied and illustrated graphically. Moreover, the influence of the f-parameter and the truncation parameter on the nonparaxiality is demonstrated in detail. In addition, the analytical formulas obtained in this paper reduce to the unapertured case when the truncation parameter tends to infinity. This work helps strengthen the understanding of the vectorial properties of the FPGB diffracted by a circular aperture.

    Analytical vectorial structure of non-paraxial four-petal Gaussian beams in the far field

    The analytical vectorial structure of non-paraxial four-petal Gaussian beams (FPGBs) in the far field has been studied based on the vector angular spectrum method and the stationary phase method. In terms of the analytical electromagnetic representations of the TE and TM terms, the energy flux distributions of the TE term, the TM term, and the whole beam are each derived in the far field. According to our investigation, the FPGBs can evolve into a number of small petals in the far field. The number of petals is determined by the order of the input beam. The physical pictures of the FPGBs are well illustrated by the vectorial structure, which helps strengthen the understanding of the vectorial properties of the FPGBs.