135 research outputs found
Neural Volumetric Memory for Visual Locomotion Control
Legged robots have the potential to expand the reach of autonomy beyond paved
roads. In this work, we consider the difficult problem of locomotion on
challenging terrains using a single forward-facing depth camera. Due to the
partial observability of the problem, the robot has to rely on past
observations to infer the terrain currently beneath it. To solve this problem,
we follow the paradigm in computer vision that explicitly models the 3D
geometry of the scene and propose Neural Volumetric Memory (NVM), a geometric
memory architecture that explicitly accounts for the SE(3) equivariance of the
3D world. NVM aggregates feature volumes from multiple camera views by first
bringing them back to the ego-centric frame of the robot. We test the learned
visual-locomotion policy on a physical robot and show that our approach, which
explicitly introduces geometric priors during training, offers superior
performance than more naïve methods. We also include ablation studies and
show that the representations stored in the neural volumetric memory capture
sufficient geometric information to reconstruct the scene. Our project page
with videos is https://rchalyang.github.io/NVM.
Comment: CVPR 2023 Highlight.
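The aggregation step described in this abstract (bringing feature volumes from several past views into the robot's current ego-centric frame before combining them) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the fixed cubic voxel grid, the known relative poses, the nearest-neighbour resampling, the plain averaging, and the names `voxel_centers`, `warp_volume`, and `aggregate` are all assumptions made for illustration.

```python
import numpy as np

def voxel_centers(size, extent=1.0):
    """Return (size**3, 3) voxel-center coordinates on the cube [-extent, extent]^3."""
    axis = np.linspace(-extent, extent, size)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    return grid.reshape(-1, 3)

def warp_volume(volume, rotation, translation, extent=1.0):
    """Resample a (C, S, S, S) feature volume from a past frame into the current frame.

    Assumed convention: p_past = rotation @ p_current + translation.
    Nearest-neighbour lookup; samples falling outside the past grid are zero.
    """
    channels, size = volume.shape[0], volume.shape[1]
    points_past = voxel_centers(size, extent) @ rotation.T + translation
    # Convert metric coordinates to integer voxel indices.
    index = np.rint((points_past + extent) / (2 * extent) * (size - 1)).astype(int)
    inside = np.all((index >= 0) & (index < size), axis=1)
    warped = np.zeros((channels, size ** 3), dtype=volume.dtype)
    i, j, k = index[inside].T
    warped[:, inside] = volume[:, i, j, k]
    return warped.reshape(channels, size, size, size)

def aggregate(volumes, poses, extent=1.0):
    """Average feature volumes after warping each one into the current frame."""
    warped = [warp_volume(v, R, t, extent) for v, (R, t) in zip(volumes, poses)]
    return np.mean(warped, axis=0)
```

With an identity pose the warp leaves a volume unchanged, so `aggregate([v, v], [(np.eye(3), np.zeros(3))] * 2)` returns `v`. A learned system would typically replace the averaging with a trainable fusion layer and estimate the poses rather than assume them.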
Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM
The central problem in cryo-electron microscopy (cryo-EM) is to recover the
3D structure from noisy 2D projection images which requires estimating the
missing projection angles (poses). Recent methods attempted to solve the 3D
reconstruction problem with the autoencoder architecture, which suffers from
the latent vector space sampling problem and frequently produces suboptimal
pose inferences and inferior 3D reconstructions. Here we present an improved
autoencoder architecture called ACE (Asymmetric Complementary autoEncoder),
based on which we designed the ACE-EM method for cryo-EM 3D reconstructions.
Compared to previous methods, ACE-EM reached higher pose space coverage within
the same training time and boosted the reconstruction performance regardless of
the choice of decoders. With this method, the Nyquist resolution (highest
possible resolution) was reached for 3D reconstructions of both simulated and
experimental cryo-EM datasets. Furthermore, ACE-EM is the only amortized
inference method that reached the Nyquist resolution.
Two elementary band representation model, Fermi surface nesting, and surface topological superconductivity in AV$_3$Sb$_5$ (A = K, Rb, Cs)
The recently discovered vanadium-based Kagome metals AV$_3$Sb$_5$ (A = K, Rb, Cs) are of great interest for the interplay of charge density
wave (CDW) order, band topology and superconductivity. In this paper, by
identifying elementary band representations (EBRs), we construct a two-EBR
graphene-Kagome model to capture the two low-energy van-Hove-singularity
dispersions and, more importantly, the nontrivial band topology in these Kagome
metals. This model consists of two EBRs: one from V orbitals on the Kagome sites and
one from Sb1 orbitals on the honeycomb sites. We have investigated the Fermi
surface instability by calculating the electronic susceptibility
$\chi(\mathbf{q})$. Prominent Fermi-surface nesting peaks are obtained at three
L points, where the $z$ component of the nesting vector is closely
related to the anticrossing point along M-L. The nesting peaks at L are
consistent with the CDW reconstruction in these compounds.
In addition, the sublattice-resolved bare susceptibility is calculated and
similar sharp peaks are observed at the L points, indicating a strong
antiferromagnetic fluctuation. Assuming a bulk $s$-wave superconducting
pairing, helical surface states and nontrivial superconducting gap are obtained
on the (001) surface. Analogous to the FeTe$_{1-x}$Se$_x$ superconductor, our
results establish another material realization of a stoichiometric
superconductor with nontrivial band topology, providing a promising platform
for studying exotic Majorana physics in condensed matter.
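The abstract does not state which form of the susceptibility is evaluated. As a point of reference only, a commonly used static bare (Lindhard) susceptibility for a multiband model, neglecting orbital matrix elements, is sketched below. The band indices $n, m$, the band energies $\varepsilon_{n\mathbf{k}}$, the Fermi function $f$, and the number of $\mathbf{k}$ points $N$ are notation assumed here rather than taken from the paper, and sign conventions vary between references.

```latex
\chi_0(\mathbf{q}) \;=\; -\frac{1}{N} \sum_{\mathbf{k}} \sum_{n,m}
  \frac{f(\varepsilon_{n\mathbf{k}}) - f(\varepsilon_{m\,\mathbf{k}+\mathbf{q}})}
       {\varepsilon_{n\mathbf{k}} - \varepsilon_{m\,\mathbf{k}+\mathbf{q}}}
```

In this form, a wave vector $\mathbf{q}$ that connects large, nearly parallel portions of the Fermi surface makes many denominators small at once, which is why nesting shows up as a peak in $\chi_0(\mathbf{q})$.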
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers
We propose to address quadrupedal locomotion tasks using Reinforcement
Learning (RL) with a Transformer-based model that learns to combine
proprioceptive information and high-dimensional depth sensor inputs. While
learning-based locomotion has made great advances using RL, most methods still
rely on domain randomization for training blind agents that generalize to
challenging terrains. Our key insight is that proprioceptive states only offer
contact measurements for immediate reaction, whereas an agent equipped with
visual sensory observations can learn to proactively maneuver environments with
obstacles and uneven terrain by anticipating changes in the environment many
steps ahead. In this paper, we introduce LocoTransformer, an end-to-end RL
method for quadrupedal locomotion that leverages a Transformer-based model for
fusing proprioceptive states and visual observations. We evaluate our method in
challenging simulated environments with different obstacles and uneven terrain.
We show that our method obtains significant improvements over policies with
only proprioceptive state inputs, and that Transformer-based models further
improve generalization across environments. Our project page with videos is at
https://RchalYang.github.io/LocoTransformer.
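The fusion idea in this abstract (letting proprioceptive and visual inputs attend to one another inside a Transformer) can be sketched as a single self-attention layer over a joint token sequence. This is a minimal illustration, not the LocoTransformer architecture: the single head, the one proprioceptive token, the externally supplied projection matrices, and the names `softmax` and `fuse_tokens` are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_tokens(proprio_token, visual_tokens, w_q, w_k, w_v):
    """Single-head self-attention over one proprioceptive token plus N visual tokens.

    proprio_token: (D,), visual_tokens: (N, D), w_q / w_k / w_v: (D, D_head).
    Returns the fused tokens (1 + N, D_head) and the attention matrix (1 + N, 1 + N).
    """
    tokens = np.concatenate([proprio_token[None, :], visual_tokens], axis=0)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    # Scaled dot-product attention: every token can attend to both modalities.
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return attn @ v, attn
```

Row 0 of the returned attention matrix shows how strongly the proprioceptive token draws on each visual token. A trained model would learn the projection matrices and stack several such layers.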