119 research outputs found

    Neural Volumetric Memory for Visual Locomotion Control

    Legged robots have the potential to expand the reach of autonomy beyond paved roads. In this work, we consider the difficult problem of locomotion on challenging terrains using a single forward-facing depth camera. Due to the partial observability of the problem, the robot has to rely on past observations to infer the terrain currently beneath it. To solve this problem, we follow the paradigm in computer vision that explicitly models the 3D geometry of the scene and propose Neural Volumetric Memory (NVM), a geometric memory architecture that explicitly accounts for the SE(3) equivariance of the 3D world. NVM aggregates feature volumes from multiple camera views by first bringing them back to the ego-centric frame of the robot. We test the learned visual-locomotion policy on a physical robot and show that our approach, which explicitly introduces geometric priors during training, outperforms more naïve methods. We also include ablation studies and show that the representations stored in the neural volumetric memory capture sufficient geometric information to reconstruct the scene. Our project page with videos is https://rchalyang.github.io/NVM. Comment: CVPR 2023 Highlight.
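The step of bringing past observations back to the robot's current ego-centric frame can be illustrated with a rigid (SE(3)) change of coordinates. The following is a minimal sketch on a single 3D point, not the NVM architecture itself, which operates on learned feature volumes. The function names, the yaw-only rotation, and the example poses are all assumptions made for illustration.

```python
import math

def make_pose(yaw, tx, ty, tz):
    """Hypothetical pose helper: rotation about z by `yaw` radians plus a translation.

    The pose maps coordinates in the robot frame to world coordinates.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    return rotation, [tx, ty, tz]

def world_from_frame(pose, p):
    """Apply the rigid transform: R p + t."""
    R, t = pose
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def frame_from_world(pose, p):
    """Apply the inverse rigid transform: R^T (p - t)."""
    R, t = pose
    d = [p[i] - t[i] for i in range(3)]
    return [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]

def to_current_frame(point_past, pose_past, pose_current):
    """Re-express a point seen in a past robot frame in the current robot frame."""
    return frame_from_world(pose_current, world_from_frame(pose_past, point_past))

# Assumed example: the robot started at the origin, then moved 1 m along x
# and turned 90 degrees to its left.
pose_past = make_pose(0.0, 0.0, 0.0, 0.0)
pose_current = make_pose(math.pi / 2, 1.0, 0.0, 0.0)

# An obstacle seen 2 m straight ahead in the past frame.
point_now = to_current_frame([2.0, 0.0, 0.0], pose_past, pose_current)
print(point_now)  # approximately [0, -1, 0]: 1 m to the robot's right
```

Once every past observation is expressed in the same current frame, observations from different time steps can be combined directly, which is the role the abstract assigns to the ego-centric alignment.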

    Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM

    The central problem in cryo-electron microscopy (cryo-EM) is to recover the 3D structure from noisy 2D projection images, which requires estimating the missing projection angles (poses). Recent methods attempted to solve the 3D reconstruction problem with an autoencoder architecture, which suffers from the latent vector space sampling problem and frequently produces suboptimal pose inferences and inferior 3D reconstructions. Here we present an improved autoencoder architecture called ACE (Asymmetric Complementary autoEncoder), based on which we designed the ACE-EM method for cryo-EM 3D reconstructions. Compared to previous methods, ACE-EM reached higher pose space coverage within the same training time and boosted the reconstruction performance regardless of the choice of decoders. With this method, the Nyquist resolution (highest possible resolution) was reached for 3D reconstructions of both simulated and experimental cryo-EM datasets. Furthermore, ACE-EM is the only amortized inference method that reached the Nyquist resolution.

    Two elementary band representation model, Fermi surface nesting, and surface topological superconductivity in $A$V$_3$Sb$_5$ ($A = \text{K, Rb, Cs}$)

    The recently discovered vanadium-based Kagome metals $A$V$_3$Sb$_5$ ($A = \text{K, Rb, Cs}$) are of great interest for their interplay of charge density wave (CDW) order, band topology and superconductivity. In this paper, by identifying elementary band representations (EBRs), we construct a two-EBR graphene-Kagome model to capture the two low-energy van-Hove-singularity dispersions and, more importantly, the nontrivial band topology in these Kagome metals. This model consists of $A_g@3g$ (V-$d_{x^2-y^2/z^2}$, Kagome sites) and $A_2''@2d$ EBRs (Sb1-$p_z$, honeycomb sites). We have investigated the Fermi surface instability by calculating the electronic susceptibility $\chi(\mathbf{q})$. Prominent Fermi-surface nesting peaks are obtained at three L points, where the $z$ component of the nesting vector is closely related to the anticrossing point along M–L. The nesting peaks at L are consistent with the $2\times 2\times 2$ CDW reconstruction in these compounds. In addition, the sublattice-resolved bare susceptibility is calculated and similar sharp peaks are observed at the L points, indicating a strong antiferromagnetic fluctuation. Assuming a bulk $s$-wave superconducting pairing, helical surface states and a nontrivial superconducting gap are obtained on the (001) surface. Analogous to the FeTe$_{1-x}$Se$_x$ superconductor, our results establish another material realization of a stoichiometric superconductor with nontrivial band topology, providing a promising platform for studying exotic Majorana physics in condensed matter.
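For orientation, one commonly used static form of the bare electronic susceptibility (the Lindhard function, in the constant-matrix-element approximation) is given below. The abstract does not state which definition the authors use, so the band indices $n, m$, the normalization by the number of $\mathbf{k}$ points $N$, and the neglect of orbital matrix elements are assumptions here.

```latex
\[
\chi_0(\mathbf{q}) \;=\; -\frac{1}{N} \sum_{\mathbf{k}} \sum_{n,m}
\frac{f(\varepsilon_{n\mathbf{k}}) - f(\varepsilon_{m\mathbf{k}+\mathbf{q}})}
     {\varepsilon_{n\mathbf{k}} - \varepsilon_{m\mathbf{k}+\mathbf{q}}}
\]
```

Here $f$ is the Fermi-Dirac distribution and $\varepsilon_{n\mathbf{k}}$ is the energy of band $n$ at momentum $\mathbf{k}$. In this form, a peak in $\chi_0(\mathbf{q})$ signals that many states near the Fermi level are connected by the wave vector $\mathbf{q}$, which is the sense in which a nesting peak at L can be consistent with a CDW at that wave vector.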

    Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers

    We propose to address quadrupedal locomotion tasks using Reinforcement Learning (RL) with a Transformer-based model that learns to combine proprioceptive information and high-dimensional depth sensor inputs. While learning-based locomotion has made great advances using RL, most methods still rely on domain randomization for training blind agents that generalize to challenging terrains. Our key insight is that proprioceptive states only offer contact measurements for immediate reaction, whereas an agent equipped with visual sensory observations can learn to proactively maneuver through environments with obstacles and uneven terrain by anticipating changes in the environment many steps ahead. In this paper, we introduce LocoTransformer, an end-to-end RL method for quadrupedal locomotion that leverages a Transformer-based model for fusing proprioceptive states and visual observations. We evaluate our method in challenging simulated environments with different obstacles and uneven terrain. We show that our method obtains significant improvements over policies with only proprioceptive state inputs, and that Transformer-based models further improve generalization across environments. Our project page with videos is at https://RchalYang.github.io/LocoTransformer.