Following High-level Navigation Instructions on a Simulated Quadcopter with Imitation Learning
We introduce a method for following high-level navigation instructions by
mapping directly from images, instructions and pose estimates to continuous
low-level velocity commands for real-time control. The Grounded Semantic
Mapping Network (GSMN) is a fully-differentiable neural network architecture
that builds an explicit semantic map in the world reference frame by
incorporating a pinhole camera projection model within the network. The
information stored in the map is learned from experience, while the
local-to-world transformation is computed explicitly. We train the model using
DAggerFM, a modified variant of DAgger that trades tabular convergence
guarantees for improved training speed and memory use. We test GSMN in virtual
environments on a realistic quadcopter simulator and show that incorporating
explicit mapping and grounding modules allows GSMN to outperform strong neural
baselines and nearly match the performance of an expert policy. Finally, we analyze
the learned map representations and show that using an explicit map leads to an
interpretable instruction-following model.
Comment: To appear in Robotics: Science and Systems (RSS), 201
Goethite Mineral Dissolution to Probe the Chemistry of Radiolytic Water in Liquid-Phase Transmission Electron Microscopy
Liquid-Phase Transmission Electron Microscopy (LP-TEM) enables in situ observations of the dynamic behavior of materials in liquids at high spatial and temporal resolution. During LP-TEM, incident electrons decompose water molecules into highly reactive species. Consequently, the chemistry of the irradiated aqueous solution is strongly altered, impacting the reactions to be observed. However, the short lifetime of these reactive species prevents their direct study. Here, the morphological changes of goethite during its dissolution are used as a marker system to evaluate the influence of radiation on the changes in solution chemistry. At low electron flux density, the morphological changes are equivalent to those observed under bulk acidic conditions, but the rate of dissolution is higher. In contrast, at higher electron fluxes, the morphological evolution does not correspond to a unique acidic dissolution process. Combined with kinetic simulations of the steady state concentrations of generated reactive species in the aqueous medium, the results provide a unique insight into the redox and acidity interplay during radiation-induced chemical changes in LP-TEM. The results not only reveal beam-induced radiation chemistry via a nanoparticle indicator, but also open up new perspectives in the study of the dissolution process in industrial or natural settings.
Neural Fields for Robotic Object Manipulation from a Single Image
We present a unified and compact representation for object rendering, 3D
reconstruction, and grasp pose prediction that can be inferred from a single
image within a few seconds. We achieve this by leveraging recent advances in
the Neural Radiance Field (NeRF) literature that learn category-level priors
and fine-tune on novel objects with minimal data and time. Our insight is that
we can learn a compact shape representation and extract meaningful additional
information from it, such as grasping poses. We believe this to be the first
work to retrieve grasping poses directly from a NeRF-based representation using
a single viewpoint (RGB-only), rather than going through a secondary network
and/or representation. When compared to prior art, our method is two to three
orders of magnitude smaller while achieving comparable performance at view
reconstruction and grasping. Accompanying our method, we also propose a new
dataset of rendered shoes for training a sim-2-real NeRF method with grasping
poses for different widths of grippers.
Comment: Submitted to ICRA 202
Nucleation and Crystallization of Ferrous Phosphate Hydrate via an Amorphous Intermediate
The fundamental processes of nucleation and crystallization are widely observed in systems relevant to material synthesis and biomineralization; yet most often, their mechanism remains unclear. In this study, we unravel the discrete stages of nucleation and crystallization of Fe3(PO4)2·8H2O (vivianite). We experimentally monitored the formation and transformation from ions to solid products by employing correlated, time-resolved in situ and ex situ approaches. We show that vivianite crystallization occurs in distinct stages via a transient amorphous precursor phase. The metastable amorphous ferrous phosphate (AFEP) intermediate could be isolated and stabilized. We resolved the differences in bonding environments, structure, and symmetric changes of the Fe site during the transformation of AFEP to crystalline vivianite through synchrotron X-ray absorption spectroscopy at the Fe K-edge. This intermediate AFEP phase has a lower water content and less distorted local symmetry, compared to the crystalline end product vivianite. Our combined results indicate that a nonclassical, hydration-induced nucleation and transformation driven by the incorporation and rearrangement of water molecules and ions (Fe2+ and PO43–) within the AFEP is the dominating mechanism of vivianite formation at moderately high to low vivianite supersaturations (saturation index ≤ 10.19). We offer fundamental insights into the aqueous, amorphous-to-crystalline transformations in the Fe2+–PO4 system and highlight the different attributes of the AFEP, compared to its crystalline counterpart.
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Task planning can require defining myriad domain knowledge about the world in
which a robot needs to act. To ameliorate that effort, large language models
(LLMs) can be used to score potential next actions during task planning, and
even generate action sequences directly, given an instruction in natural
language with no additional domain information. However, such methods either
require enumerating all possible next steps for scoring, or generate free-form
text that may contain actions not possible on a given robot in its current
context. We present a programmatic LLM prompt structure that enables plan
generation functional across situated environments, robot capabilities, and
tasks. Our key insight is to prompt the LLM with program-like specifications of
the available actions and objects in an environment, as well as with example
programs that can be executed. We make concrete recommendations about prompt
structure and generation constraints through ablation experiments, demonstrate
state of the art success rates in VirtualHome household tasks, and deploy our
method on a physical robot arm for tabletop tasks. Website at
progprompt.github.i