112 research outputs found
CASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters
We present CASE, an efficient and effective framework that learns
conditional Adversarial Skill Embeddings for physics-based characters. Our
physically simulated character can learn a diverse repertoire of skills while
providing controllability in the form of direct manipulation of the skills to
be performed. CASE divides the heterogeneous skill motions into distinct
subsets containing homogeneous samples for training a low-level conditional
model to learn a conditional behavior distribution. The skill-conditioned
imitation learning naturally offers explicit control over the character's
skills after training. The training process incorporates focal skill
sampling, skeletal residual forces, and element-wise feature masking to balance
diverse skills of varying complexities, mitigate dynamics mismatch to master
agile motions, and capture more general behavior characteristics, respectively.
Once trained, the conditional model can produce highly diverse and realistic
skills, outperforming state-of-the-art models, and can be repurposed in various
downstream tasks. In particular, the explicit skill control handle allows a
high-level policy or user to direct the character with desired skill
specifications, which we demonstrate is advantageous for interactive character
animation. Comment: SIGGRAPH Asia 202
Zero-shot Domain Adaptation for Neural Machine Translation with Retrieved Phrase-level Prompts
Domain adaptation is an important challenge for neural machine translation.
However, the traditional fine-tuning solution requires multiple extra training
runs and incurs a high cost. In this paper, we propose a non-tuning paradigm,
resolving domain adaptation with a prompt-based method. Specifically, we
construct a bilingual phrase-level database and retrieve relevant pairs from it
as a prompt for the input sentences. By utilizing Retrieved Phrase-level
Prompts (RePP), we effectively boost the translation quality. Experiments show
that our method improves domain-specific machine translation by 6.2 BLEU
points and improves the accuracy of translation constraints by 11.5%, without
additional training.
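The retrieve-then-prompt idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy phrase table, the substring-matching retrieval, the `src = tgt` prompt layout, and the ` ||| ` separator are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy bilingual phrase table (source phrase -> target phrase).
# Illustrative only; not data from the paper.
PHRASE_DB: Dict[str, str] = {
    "myocardial infarction": "Herzinfarkt",
    "blood pressure": "Blutdruck",
    "side effect": "Nebenwirkung",
}


def retrieve_phrase_pairs(sentence: str, db: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return phrase pairs whose source side occurs in the sentence,
    ordered by where they first appear (assumed retrieval rule: substring match)."""
    lowered = sentence.lower()
    hits = [(lowered.find(src), src, tgt) for src, tgt in db.items() if src in lowered]
    return [(src, tgt) for _, src, tgt in sorted(hits)]


def build_prompted_input(sentence: str, db: Dict[str, str], sep: str = " ||| ") -> str:
    """Prepend the retrieved phrase pairs to the sentence as a prompt.
    The prompt layout and separator are assumptions, not the paper's format."""
    pairs = retrieve_phrase_pairs(sentence, db)
    if not pairs:
        return sentence  # nothing retrieved: pass the sentence through unchanged
    prompt = " ; ".join(f"{src} = {tgt}" for src, tgt in pairs)
    return f"{prompt}{sep}{sentence}"


if __name__ == "__main__":
    print(build_prompted_input(
        "The patient had high blood pressure and a myocardial infarction.",
        PHRASE_DB,
    ))
```

In such a setup the prompted string would be fed to an already trained translation model, so no parameters are updated when the domain changes; only the phrase table is swapped.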
3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification
Object goal navigation (ObjectNav) in unseen environments is a fundamental
task for Embodied AI. Agents in existing works learn ObjectNav policies based
on 2D maps, scene graphs, or image sequences. Considering this task happens in
3D space, a 3D-aware agent can advance its ObjectNav capability via learning
from fine-grained spatial information. However, leveraging 3D scene
representation can be prohibitively impractical for policy learning in this
floor-level task, due to low sample efficiency and expensive computational
cost. In this work, we propose a framework for the challenging 3D-aware
ObjectNav based on two straightforward sub-policies. The two sub-policies,
namely a corner-guided exploration policy and a category-aware identification
policy, run simultaneously, using online-fused 3D points as
observation. Through extensive experiments, we show that this framework can
dramatically improve the performance in ObjectNav through learning from 3D
scene representation. Our framework achieves the best performance among all
modular-based methods on the Matterport3D and Gibson datasets, while requiring
(up to 30x) less computational cost for training. Comment: To appear in CVPR 202
Active spintronic-metasurface terahertz emitters with tunable chirality
The ability to manipulate the electric-field vector of broadband terahertz
waves is essential for applications of terahertz technologies in many areas,
and can open up new possibilities for nonlinear terahertz spectroscopy and
coherent control. Here, we propose a novel laser-driven terahertz emitter,
consisting of metasurface-patterned magnetic multilayer heterostructures. Such
hybrid terahertz emitters can combine the advantages of spintronic emitters,
which are ultrabroadband, efficient and flexible, with those of metasurfaces,
which offer the unique capability to manipulate terahertz waves with high precision
and a high degree of freedom. Taking a stripe-patterned metasurface as an example, we
demonstrate the generation of broadband terahertz waves with tunable chirality.
Based on experimental and theoretical studies, the interplay between the
laser-induced spintronic-origin currents and the metasurface-induced transient
charges/currents is investigated, revealing that the device functionality is
strongly influenced by both the light-matter interactions in individual
metasurface units and the dynamic coupling between them. Our work not only
offers a flexible, reliable and cost-effective solution for chiral terahertz
wave generation and manipulation, but also opens a new pathway to
metasurface-tailored spintronic devices for efficient vector-control of
electromagnetic waves in the terahertz regime.
Deep Reflection Prior
Reflections are a very common phenomenon in everyday photography and
distract attention from the scene behind the glass. The problem of
removing reflection artifacts is important but challenging due to its ill-posed
nature. Recent learning-based approaches have demonstrated a significant
improvement in removing reflections. However, these methods are limited as they
require a large number of synthetic reflection/clean image pairs for
supervision, at the risk of overfitting in the synthetic image domain. In this
paper, we propose a learning-based approach that captures the reflection
statistical prior for single image reflection removal. Our algorithm is driven
by optimizing the target with joint constraints enhanced between multiple input
images during the training stage, yet it removes reflections from only
a single input at evaluation time. Our framework predicts both
background and reflection with a one-branch deep neural network, which is
controlled by a latent code that indicates either the
background or reflection output. We demonstrate superior performance over the
state-of-the-art methods on a large range of real-world images. We further
provide insightful analysis behind the learned latent code, which may inspire
more future work.
Flexible generation of structured terahertz fields via programmable exchange-biased spintronic emitters
Structured light, particularly in the terahertz frequency range, holds
considerable potential for a diverse range of applications. However, the
generation and control of structured terahertz radiation pose major challenges.
In this work, we demonstrate a novel programmable spintronic emitter that can
flexibly generate a variety of structured terahertz waves. This is achieved
through the precise and high-resolution programming of the magnetization
pattern on the emitter surface, utilizing laser-assisted local field cooling of
an exchange-biased ferromagnetic heterostructure. Moreover, we outline a
generic design strategy for realizing specific complex structured terahertz
fields in the far field. Our device successfully demonstrates the generation of
terahertz waves with diverse structured polarization states, including
spatially separated circular polarizations, azimuthal or radial polarization
states, and a full Poincaré beam. This innovation opens a new avenue for
designing and generating structured terahertz radiation, with potential
applications in terahertz microscopy, communication, quantum information, and
light-matter interactions.
Single-cell multiomics of the human retina reveals hierarchical transcription factor collaboration in mediating cell type-specific effects of genetic variants on gene regulation
BACKGROUND: Systematic characterization of how genetic variation modulates gene regulation in a cell type-specific context is essential for understanding complex traits. To address this question, we profile gene expression and chromatin accessibility in cells from healthy retinae of 20 human donors through single-cell multiomics and genomic sequencing.
RESULTS: We map eQTL, caQTL, allele-specific expression, and allele-specific chromatin accessibility in major retinal cell types. By integrating these results, we identify and characterize regulatory elements and genetic variants that affect gene regulation in individual cell types. The majority of identified sc-eQTLs and sc-caQTLs display cell type-specific effects, while the cis-elements containing genetic variants with cell type-specific effects are often accessible in multiple cell types. Furthermore, the transcription factors whose binding sites are perturbed by genetic variants tend to have higher expression levels in the cell types where the variants exert their effects, compared to the cell types where the variants have no impact. We further validate our findings with high-throughput reporter assays. Lastly, we identify the enriched cell types, candidate causal variants and genes, and cell type-specific regulatory mechanisms underlying GWAS loci.
CONCLUSIONS: Overall, genetic effects on gene regulation are highly context dependent. Our results suggest that cell type-dependent genetic effects are driven by precise modulation of both trans-factor expression and chromatin accessibility of cis-elements. Our findings indicate that hierarchical collaboration among transcription factors plays a crucial role in mediating cell type-specific effects of genetic variants on gene regulation.
InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
In recent years, instruction-based image editing methods have garnered
significant attention. However, despite encompassing a wide
range of editing priors, these methods struggle with editing tasks
that are hard to describe accurately in language. We propose
InstructBrush, an inversion method for instruction-based image editing methods
to bridge this gap. It extracts editing effects from exemplar image pairs as
editing instructions, which are further applied for image editing. Two key
techniques are introduced into InstructBrush, Attention-based Instruction
Optimization and Transformation-oriented Instruction Initialization, to address
the limitations of the previous method in terms of inversion effects and
instruction generalization. To explore the ability of instruction inversion
methods to guide image editing in open scenarios, we establish a
Transformation-Oriented Paired Benchmark (TOP-Bench), which contains a rich set
of scenes and editing types. The creation of this benchmark paves the way for
further exploration of instruction inversion. Quantitatively and qualitatively,
our approach achieves superior performance in editing and is more semantically
consistent with the target editing effects. Comment: Project Page: https://royzhao926.github.io/InstructBrush