Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding
Understanding intrinsic patterns and predicting spatiotemporal
characteristics of cities require a comprehensive representation of urban
neighborhoods. Existing works relied on either inter- or intra-region
connectivities to generate neighborhood representations but failed to fully
utilize the informative yet heterogeneous data within neighborhoods. In this
work, we propose Urban2Vec, an unsupervised multi-modal framework which
incorporates both street view imagery and point-of-interest (POI) data to learn
neighborhood embeddings. Specifically, we use a convolutional neural network to
extract visual features from street view images while preserving geospatial
similarity. Furthermore, we model each POI as a bag-of-words containing its
category, rating, and review information. Analogous to document embedding in
natural language processing, we establish the semantic similarity between
neighborhood ("document") and the words from its surrounding POIs in the vector
space. By jointly encoding visual, textual, and geospatial information into the
neighborhood representation, Urban2Vec achieves performance better than
baseline models and comparable to fully-supervised methods in downstream
prediction tasks. Extensive experiments on three U.S. metropolitan areas also
demonstrate the model's interpretability, generalization capability, and
value in neighborhood similarity analysis.
Comment: To appear in Proceedings of the Thirty-Fourth AAAI Conference on
Artificial Intelligence (AAAI-20)
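The document-embedding analogy in the abstract above can be illustrated with a small sketch. The abstract does not state the training objective, so the triplet-style hinge loss below is an assumption chosen for illustration, and the function name, margin, and 2-D vectors are all hypothetical. The idea shown is only that a neighborhood vector should sit closer to words from its own POIs than to words sampled from elsewhere.

```python
import numpy as np

def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that reaches zero once the anchor is closer to the
    positive than to the negative by at least `margin` (Euclidean)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, margin + d_pos - d_neg)

# Hypothetical 2-D embeddings, for illustration only.
neighborhood = np.array([0.0, 0.0])   # the "document"
own_poi_word = np.array([0.1, 0.0])   # word from a POI in this neighborhood
other_word = np.array([3.0, 0.0])     # word sampled from elsewhere

# Well-placed embeddings incur no loss; swapping the roles is penalized.
loss_good = triplet_hinge_loss(neighborhood, own_poi_word, other_word)
loss_bad = triplet_hinge_loss(neighborhood, other_word, own_poi_word)
```

Minimizing such a loss over many (neighborhood, own word, other word) triples would pull a neighborhood toward the vocabulary of its surrounding POIs, which is the sense in which the neighborhood plays the role of a document.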
RL-ViGen: A Reinforcement Learning Benchmark for Visual Generalization
Visual Reinforcement Learning (Visual RL), coupled with high-dimensional
observations, has consistently confronted the long-standing challenge of
generalization. Despite the focus on algorithms aimed at resolving visual
generalization problems, we argue that the devil is in the existing benchmarks
as they are restricted to isolated tasks and generalization categories,
undermining a comprehensive evaluation of agents' visual generalization
capabilities. To bridge this gap, we introduce RL-ViGen: a novel Reinforcement
Learning Benchmark for Visual Generalization, which contains diverse tasks and
a wide spectrum of generalization types, thereby facilitating the derivation of
more reliable conclusions. Furthermore, RL-ViGen incorporates the latest
generalization visual RL algorithms into a unified framework, under which the
experiment results indicate that no single existing algorithm has prevailed
universally across tasks. Our aspiration is that RL-ViGen will serve as a
catalyst in this area, and lay a foundation for the future creation of
universal visual generalization RL agents suitable for real-world scenarios.
Access to our code and implemented algorithms is provided at
https://gemcollector.github.io/RL-ViGen/
LiCROM: Linear-Subspace Continuous Reduced Order Modeling with Neural Fields
Linear reduced-order modeling (ROM) simplifies complex simulations by
approximating the behavior of a system using a simplified kinematic
representation. Typically, ROM is trained on input simulations created with a
specific spatial discretization, and then serves to accelerate simulations with
the same discretization. This discretization-dependence is restrictive.
Becoming independent of a specific discretization would provide flexibility
to mix and match mesh resolutions, connectivity, and type (tetrahedral,
hexahedral) in training data; to accelerate simulations with novel
discretizations unseen during training; and to accelerate adaptive simulations
that temporally or parametrically change the discretization.
We present a flexible, discretization-independent approach to reduced-order
modeling. Like traditional ROM, we represent the configuration as a linear
combination of displacement fields. Unlike traditional ROM, our displacement
fields are continuous maps from every point on the reference domain to a
corresponding displacement vector; these maps are represented as implicit
neural fields.
With linear continuous ROM (LiCROM), our training set can include multiple
geometries undergoing multiple loading conditions, independent of their
discretization. This opens the door to novel applications of reduced order
modeling. We can now accelerate simulations that modify the geometry at
runtime, for instance via cutting, hole punching, and even swapping the entire
mesh. We can also accelerate simulations of geometries unseen during training.
We demonstrate one-shot generalization, training on a single geometry and
subsequently simulating various unseen geometries.
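The representation described in the abstract above, a configuration written as a linear combination of continuous displacement fields, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two analytic basis fields stand in for the implicit neural fields, and the names `basis_fields`, `deform`, and `q` are hypothetical.

```python
import numpy as np

def basis_fields(X):
    """Stand-in for the neural fields (an assumption for illustration).
    Maps reference points of shape (n, 3) to r = 2 displacement fields,
    returned with shape (n, 3, 2)."""
    U = np.zeros((X.shape[0], 3, 2))
    U[:, 0, 0] = X[:, 0]        # field 0: stretch along x
    U[:, 1, 1] = X[:, 0] ** 2   # field 1: bend in y, growing with x
    return U

def deform(X, q):
    """Deformed positions x(X) = X + sum_k q_k * U_k(X)."""
    return X + basis_fields(X) @ q

# Two different discretizations of the same reference segment.
coarse = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
fine = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]])

q = np.array([0.1, 0.2])        # one set of reduced coordinates
x_coarse = deform(coarse, q)
x_fine = deform(fine, q)
```

Because the basis is a function of position rather than a table of per-vertex values, the same reduced coordinates `q` can be evaluated on either point set, and points shared by both discretizations land in the same place.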
H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation
Human hands possess remarkable dexterity and have long served as a source of
inspiration for robotic manipulation. In this work, we propose a human
hand-informed visual representation learning framework to
solve difficult dexterous manipulation tasks (H-InDex)
with reinforcement learning. Our framework consists of three stages: (i)
pre-training representations with 3D human hand pose estimation, (ii) offline
adapting representations with self-supervised keypoint detection, and (iii)
reinforcement learning with exponential moving average BatchNorm. The last two
stages only modify parameters of the pre-trained representation in
total, ensuring the knowledge from pre-training is maintained to the full
extent. We empirically study 12 challenging dexterous manipulation tasks and
find that H-InDex largely surpasses strong baseline methods and the recent
visual foundation models for motor control. Code is available at
https://yanjieze.com/H-InDex .
Comment: NeurIPS 2023. Code and videos: https://yanjieze.com/H-InDe
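Stage (iii) in the abstract above mentions exponential moving average BatchNorm without spelling out the mechanism. One plausible reading, assumed here purely for illustration, is that the normalization statistics are tracked as an exponential moving average of batch statistics while the learned weights stay untouched. The class name, momentum value, and data below are hypothetical.

```python
import numpy as np

class EMABatchNorm:
    """Normalizes with running statistics that follow the data via an
    exponential moving average (illustrative sketch, not the paper's code)."""

    def __init__(self, num_features, momentum=0.01, eps=1e-5):
        self.mean = np.zeros(num_features)
        self.var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps

    def update(self, batch):
        # Blend the old running statistics with the current batch statistics.
        m = self.momentum
        self.mean = (1 - m) * self.mean + m * batch.mean(axis=0)
        self.var = (1 - m) * self.var + m * batch.var(axis=0)

    def __call__(self, batch):
        # Normalize with the running statistics, not the batch statistics.
        return (batch - self.mean) / np.sqrt(self.var + self.eps)

bn = EMABatchNorm(num_features=2, momentum=0.5)
batch = np.array([[2.0, 4.0], [4.0, 8.0]])
bn.update(batch)
normalized = bn(batch)
```

With a small momentum the statistics drift slowly toward the distribution seen during reinforcement learning, which changes very little of the pre-trained representation.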
In situ electron paramagnetic resonance spectroscopy using single nanodiamond sensors
An ultimate goal of electron paramagnetic resonance (EPR) spectroscopy is to
analyze molecular dynamics in the place where they occur, such as in a living cell.
Nanodiamonds (NDs) hosting nitrogen-vacancy (NV) centers are promising
EPR sensors for achieving this goal. However, ND-based EPR spectroscopy remains
elusive, due to the challenge of controlling NV centers without well-defined
orientations inside a flexible ND. Here, we show a generalized zero-field EPR
technique with spectra robust to the sensor's orientation. The key is applying
an amplitude modulation on the control field, which generates a series of
equidistant Floquet states with energy splitting being the
orientation-independent modulation frequency. We acquire the zero-field EPR
spectrum of vanadyl ions in aqueous glycerol solution with embedded single NDs,
paving the way towards in vivo EPR.