153 research outputs found

    Study of the Effects of Ionic Liquids as Electrolyte Additive for Redox Flow Batteries

    Get PDF
    Master's thesis (Master of Science)

    Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction

    Full text link
    We study the problem of learning goal-conditioned policies in Minecraft, a popular, widely accessible yet challenging open-ended environment for developing human-level multi-task agents. We first identify two main challenges of learning such policies: 1) the indistinguishability of tasks from the state distribution, due to the vast scene diversity, and 2) the non-stationary nature of environment dynamics caused by partial observability. To tackle the first challenge, we propose a Goal-Sensitive Backbone (GSB) for the policy to encourage the emergence of goal-relevant visual state representations. To tackle the second challenge, the policy is further equipped with an adaptive horizon prediction module that helps alleviate the learning uncertainty brought by the non-stationary dynamics. Experiments on 20 Minecraft tasks show that our method significantly outperforms the best baseline so far; on many of them, we double the performance. Our ablation and exploratory studies then explain how our approach beats the counterparts and also unveil a surprising bonus of zero-shot generalization to new scenes (biomes). We hope our agent can help shed light on learning goal-conditioned, multi-task agents in challenging, open-ended environments like Minecraft.
    Comment: Accepted by CVPR 2023.
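    As a rough illustration of the two ideas, the sketch below (PyTorch) shows a policy whose convolutional backbone is modulated by the goal embedding at several stages and which carries an auxiliary head predicting the remaining horizon to goal completion. The FiLM-style goal fusion, all module names and the layer sizes are our own assumptions for illustration, not the paper's exact architecture.

```python
# Minimal sketch of a goal-conditioned policy with a goal-modulated
# visual backbone and an auxiliary horizon-prediction head.
# The FiLM-style fusion and all sizes are illustrative assumptions.
import torch
import torch.nn as nn

class GoalModulatedBlock(nn.Module):
    def __init__(self, channels, goal_dim):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        # Goal embedding predicts per-channel scale and shift.
        self.film = nn.Linear(goal_dim, 2 * channels)

    def forward(self, x, goal):
        h = torch.relu(self.conv(x))
        scale, shift = self.film(goal).chunk(2, dim=-1)
        return h * scale[..., None, None] + shift[..., None, None]

class GoalConditionedPolicy(nn.Module):
    def __init__(self, goal_dim=128, n_actions=36):
        super().__init__()
        self.stem = nn.Conv2d(3, 64, 8, stride=4)
        self.blocks = nn.ModuleList(
            [GoalModulatedBlock(64, goal_dim) for _ in range(3)])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.action_head = nn.Linear(64, n_actions)
        # Auxiliary head predicting remaining steps to goal completion,
        # intended to absorb uncertainty from non-stationary dynamics.
        self.horizon_head = nn.Linear(64, 1)

    def forward(self, obs, goal):
        h = torch.relu(self.stem(obs))
        for blk in self.blocks:
            h = blk(h, goal)
        z = self.pool(h).flatten(1)
        return self.action_head(z), self.horizon_head(z)
```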

    Making Sense of Audio Vibration for Liquid Height Estimation in Robotic Pouring

    Get PDF
    In this paper, we focus on the challenging perception problem in robotic pouring. Most existing approaches leverage either visual or haptic information, but these techniques may generalize poorly to opaque containers or offer limited measurement precision. To overcome these drawbacks, we propose to use audio vibration sensing and design a deep neural network, PouringNet, to predict the liquid height from the audio fragment recorded during the robotic pouring task. PouringNet is trained on our collected real-world pouring dataset with multimodal sensing data, which contains more than 3000 recordings of audio, force feedback, video and trajectory data of a human hand performing the pouring task. Each record represents a complete pouring procedure. We conduct several evaluations of PouringNet with our dataset and robotic hardware. The results demonstrate that PouringNet generalizes well across different liquid containers, positions of the audio receiver, initial liquid heights and types of liquid, and enables more robust and accurate audio-based perception for robotic pouring.
    Comment: See the project page for video, code and dataset: https://lianghongzhuo.github.io/AudioPouring
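    A minimal sketch of the core regression idea, assuming a log-mel front end and a recurrent encoder; the layer sizes, the 16 kHz sample rate and the torchaudio front end are illustrative choices, not details taken from the paper.

```python
# Illustrative sketch of audio-based liquid height regression in the
# spirit of PouringNet: a mel spectrogram of the pouring sound is
# encoded by a GRU that regresses the current liquid height.
import torch
import torch.nn as nn
import torchaudio

class AudioHeightRegressor(nn.Module):
    def __init__(self, n_mels=64, hidden=128):
        super().__init__()
        self.melspec = torchaudio.transforms.MelSpectrogram(
            sample_rate=16000, n_mels=n_mels)   # assumed sample rate
        self.gru = nn.GRU(n_mels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)        # scalar height (e.g., cm)

    def forward(self, waveform):           # waveform: (batch, samples)
        spec = self.melspec(waveform)      # (batch, n_mels, frames)
        spec = spec.clamp(min=1e-6).log()  # log-mel features
        _, h_n = self.gru(spec.transpose(1, 2))
        return self.head(h_n[-1])          # height from final state
```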

    GROOT: Learning to Follow Instructions by Watching Gameplay Videos

    Full text link
    We study the problem of building a controller that can follow open-ended instructions in open-world environments. We propose to follow reference videos as instructions, which offer expressive goal specifications while eliminating the need for expensive text-gameplay annotations. A new learning framework is derived that allows learning such instruction-following controllers from gameplay videos while producing a video instruction encoder that induces a structured goal space. We implement our agent GROOT in a simple yet effective encoder-decoder architecture based on causal transformers. We evaluate GROOT against open-world counterparts and human players on the proposed Minecraft SkillForge benchmark. The Elo ratings clearly show that GROOT is closing the human-machine gap and exhibits a 70% win rate over the best generalist agent baseline. Qualitative analysis of the induced goal space further demonstrates some interesting emergent properties, including goal composition and complex gameplay behavior synthesis. The project page is available at https://craftjarvis-groot.github.io
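    The sketch below illustrates one way such an encoder-decoder could be wired up: a transformer encodes the reference-video features into a single goal vector, and a causally masked transformer decodes per-step actions from the agent's observation stream conditioned on that goal. The dimensions, the mean-pooled goal readout and the concatenation-based fusion are all assumptions for illustration, not GROOT's actual design.

```python
# Minimal sketch of an instruction-following encoder-decoder agent:
# video -> goal embedding -> causally decoded per-step action logits.
import torch
import torch.nn as nn

class VideoInstructionEncoder(nn.Module):
    def __init__(self, frame_dim=512, goal_dim=256):
        super().__init__()
        layer = nn.TransformerEncoderLayer(frame_dim, nhead=8,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.to_goal = nn.Linear(frame_dim, goal_dim)

    def forward(self, video_feats):         # (batch, frames, frame_dim)
        h = self.encoder(video_feats)
        return self.to_goal(h.mean(dim=1))  # one goal vector per video

class CausalPolicyDecoder(nn.Module):
    def __init__(self, obs_dim=512, goal_dim=256, n_actions=36):
        super().__init__()
        self.fuse = nn.Linear(obs_dim + goal_dim, obs_dim)
        layer = nn.TransformerEncoderLayer(obs_dim, nhead=8,
                                           batch_first=True)
        self.decoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(obs_dim, n_actions)

    def forward(self, obs_feats, goal):     # obs: (batch, T, obs_dim)
        g = goal.unsqueeze(1).expand(-1, obs_feats.size(1), -1)
        x = self.fuse(torch.cat([obs_feats, g], dim=-1))
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        h = self.decoder(x, mask=mask)      # causal: no future peeking
        return self.head(h)                 # per-step action logits
```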

    Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

    Full text link
    We investigate the challenge of task planning for multi-task embodied agents in open-world environments. Two main difficulties are identified: 1) executing plans in an open-world environment (e.g., Minecraft) necessitates accurate and multi-step reasoning due to the long-term nature of tasks, and 2) since vanilla planners do not consider how easily the current agent can achieve a given sub-task when ordering parallel sub-goals within a complicated plan, the resulting plan can be inefficient or even infeasible. To this end, we propose "Describe, Explain, Plan and Select" (DEPS), an interactive planning approach based on Large Language Models (LLMs). DEPS facilitates better error correction of the initial LLM-generated plan by integrating a description of the plan-execution process and providing a self-explanation of feedback when failures are encountered during the extended planning phases. Furthermore, it includes a goal selector, a trainable module that ranks parallel candidate sub-goals by the estimated number of steps to completion, consequently refining the initial plan. Our experiments mark the milestone of the first zero-shot multi-task agent that can robustly accomplish 70+ Minecraft tasks and nearly double the overall performance. Further testing reveals our method's general effectiveness in popularly adopted non-open-ended domains as well (i.e., ALFWorld and tabletop manipulation). The ablation and exploratory studies detail how our design beats the counterparts and provide a promising update on the ObtainDiamond grand challenge with our approach. The code is released at https://github.com/CraftJarvis/MC-Planner.
    Comment: NeurIPS 2023.
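    Schematically, the interactive loop could look like the following sketch. Here `llm`, `describe`, `selector` and the environment API are placeholders standing in for the components the abstract describes; none of these signatures come from the released code.

```python
# Schematic sketch of a Describe-Explain-Plan-Select style loop.
# All objects passed in are hypothetical placeholders.
def deps_episode(task, env, llm, describe, selector, max_rounds=5):
    plan = llm(f"Plan sub-goals for: {task}")           # initial plan
    for _ in range(max_rounds):
        # Select: rank parallel sub-goals by estimated completion steps.
        ordered = selector.rank(plan, env.current_state())
        ok, trace = env.execute(ordered)
        if ok:
            return True
        # Describe: summarize how execution unfolded and where it failed.
        report = describe(trace)
        # Explain + re-plan: ask the LLM to diagnose the failure and
        # revise the plan before the next round.
        plan = llm(f"Task: {task}\nExecution report: {report}\n"
                   f"Explain the failure and give a corrected plan.")
    return False
```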

    Interferometric inverse synthetic aperture radar experiment using an interferometric linear frequency modulated continuous wave millimetre-wave radar

    Get PDF
    D. Felguera-Martín¹, J.-T. González-Partida¹, P. Almorox-González¹, M. Burgos-García¹ and B.-P. Dorta-Naranjo²
    ¹Grupo de Microondas y Radar, Departamento de Señales, Sistemas y Radiocomunicaciones, Universidad Politécnica de Madrid, Ciudad Universitaria s/n, Madrid, Spain
    ²Departamento de Señales y Comunicaciones, Universidad de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain

    An interferometric linear frequency modulated continuous wave (LFMCW) millimetre-wave radar is presented, along with the results of an experiment conducted to study the feasibility of using it in a future millimetre-wave interferometric inverse synthetic aperture radar (InISAR) system. First, a description of the radar is given. Then, the signal processing chain is described, with special attention to the phase unwrapping technique. The interferometric phase is obtained by unwrapping the prominent target's phase in each antenna using a sliding-frame processing technique; cell-migration issues in this method are also addressed. Simulations were carried out to illustrate and assess the processing chain and to show the effects of multipath echoes on the height measurement. In the real experiment, the range, speed and height of a moving target were tracked over consecutive inverse synthetic aperture radar (ISAR) image frames, verifying the performance of the whole system
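    As a toy illustration of the interferometric step, the sketch below unwraps each antenna's phase history for the prominent scatterer and maps the phase difference to height via the textbook small-angle single-pass relation φ ≈ 2πBh/(λR). The 94 GHz carrier and 10 cm baseline in the example are arbitrary assumptions, not the parameters of this radar, and the relation is a simplification of the paper's processing chain.

```python
# Simplified numerical sketch: unwrap each antenna's phase history for
# the prominent target across ISAR frames, then map the interferometric
# phase difference to height with a small-angle approximation.
import numpy as np

def target_height(phase_ant1, phase_ant2, wavelength, baseline, rng):
    """phase_ant1/2: wrapped phase (rad) of the prominent target,
    one sample per ISAR frame; rng: target range in metres."""
    # Unwrap each antenna's phase along the frame axis so the
    # difference is not corrupted by independent 2*pi jumps.
    p1 = np.unwrap(phase_ant1)
    p2 = np.unwrap(phase_ant2)
    interf_phase = p1 - p2
    # Small-angle single-pass relation: phi ~ 2*pi*B*h / (lambda*R).
    return wavelength * rng * interf_phase / (2 * np.pi * baseline)

# Example with synthetic data: a target at 3 m height and 500 m range,
# a hypothetical 94 GHz radar (lambda ~ 3.2 mm), 10 cm vertical baseline.
lam, B, R, h_true = 3.2e-3, 0.10, 500.0, 3.0
phi = 2 * np.pi * B * h_true / (lam * R)
p1 = np.angle(np.exp(1j * (np.linspace(0, 40, 64) + phi)))  # wrapped
p2 = np.angle(np.exp(1j * np.linspace(0, 40, 64)))
print(target_height(p1, p2, lam, B, R).mean())  # ~3.0
```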