280 research outputs found
Statistical Analysis of a Posteriori Channel and Noise Distribution Based on HARQ Feedback
In response to a comment on one of our manuscripts, this work uses statistical
approaches to study the posterior channel and noise distributions conditioned
on the NACKs and ACKs of all previous transmissions in a HARQ system. Our main
result is that, unless the coherence interval (time or frequency) is large, as
in the block-fading assumption, the posterior distribution of the channel and
noise either remains almost identical to the prior distribution or mostly
follows the same class of distribution as the prior. In the latter case, the
difference between the posterior and prior distributions can be modeled as a
parameter mismatch, which has little impact on certain types of applications.
Comment: 15 pages, 2 figures, 4 tables
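As a rough illustration of the dependence on the coherence interval described in the abstract above, the following Python sketch compares two extremes by Monte Carlo simulation. It is not the paper's analysis: the Rayleigh fading model, the threshold-based NACK rule, the omission of noise, and the names `nack_conditioned_gain` and `threshold` are all assumptions made for this example.

```python
import numpy as np


def nack_conditioned_gain(n_trials=200_000, threshold=0.5, block_fading=True, seed=0):
    """Return (prior mean gain, mean retransmission gain given a first-round NACK).

    Toy model (assumption): the channel power gain |h|^2 is exponential with
    unit mean (Rayleigh fading), and a NACK is issued whenever the first-round
    gain falls below `threshold`. Noise is deliberately left out.
    """
    rng = np.random.default_rng(seed)
    gain_round1 = rng.exponential(1.0, n_trials)
    nack = gain_round1 < threshold

    if block_fading:
        # Long coherence interval: the retransmission sees the same channel.
        gain_round2 = gain_round1
    else:
        # Short coherence interval: the retransmission sees an independent channel.
        gain_round2 = rng.exponential(1.0, n_trials)

    return gain_round1.mean(), gain_round2[nack].mean()


prior_mean, block_mean = nack_conditioned_gain(block_fading=True)
_, independent_mean = nack_conditioned_gain(block_fading=False)
print(f"prior mean gain:                 {prior_mean:.3f}")
print(f"given NACK, block fading:        {block_mean:.3f}")
print(f"given NACK, independent fading:  {independent_mean:.3f}")
```

Under these assumptions, the retransmission gain conditioned on a NACK stays close to the prior mean when the channel is redrawn independently, and falls below the threshold when the same realization is reused.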
Your Room is not Private: Gradient Inversion Attack on Reinforcement Learning
Embodied Artificial Intelligence (AI), which empowers
robots to navigate, perceive, and engage within virtual environments, has
attracted significant attention, owing to remarkable advances in
computer vision and large language models. Privacy emerges as a pivotal concern
within the realm of embodied AI, as the robot accesses substantial personal
information. However, the issue of privacy leakage in embodied AI tasks,
particularly in relation to reinforcement learning algorithms, has not received
adequate consideration in research. This paper aims to address this gap by
proposing an attack on value-based and gradient-based algorithms
that uses gradient inversion to reconstruct states, actions, and
supervision signals. The choice of using gradients for the attack is motivated
by the fact that commonly employed federated learning techniques solely utilize
gradients computed based on private user data to optimize models, without
storing or transmitting the data to public servers. Nevertheless, these
gradients contain sufficient information to potentially expose private data. To
validate our approach, we conduct experiments on the AI2THOR simulator and
evaluate our algorithm on active perception, a prevalent task in embodied AI.
The experimental results demonstrate the effectiveness of our method in
successfully reconstructing all information from the data across 120 room
layouts.
Comment: 7 pages, 4 figures, 2 tables
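To make concrete why shared gradients can expose private inputs, the sketch below shows the simplest known case: a single linear layer with a bias term, where the weight gradient is the outer product of the bias gradient and the input. This is a toy illustration, not the attack proposed in the paper; the squared-error loss, the layer sizes, and the names `linear_layer_gradients` and `recover_input` are assumptions chosen for the example.

```python
import numpy as np


def linear_layer_gradients(W, b, x, target):
    """Gradients of 0.5 * ||W x + b - target||^2 with respect to W and b."""
    residual = W @ x + b - target
    grad_W = np.outer(residual, x)  # dL/dW = residual * x^T
    grad_b = residual               # dL/db = residual
    return grad_W, grad_b


def recover_input(grad_W, grad_b):
    """Recover x from the gradients alone, using grad_W = outer(grad_b, x)."""
    # Use the row with the largest bias gradient to avoid dividing by ~0.
    row = np.argmax(np.abs(grad_b))
    return grad_W[row] / grad_b[row]


rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
private_state = rng.normal(size=5)  # stands in for a private observation
target = rng.normal(size=3)

grad_W, grad_b = linear_layer_gradients(W, b, private_state, target)
reconstructed = recover_input(grad_W, grad_b)
print(np.allclose(reconstructed, private_state))
```

Deeper networks and image observations do not admit this closed form, which is why gradient inversion attacks generally optimize a dummy input until its gradients match the shared ones.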
Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning
As a pivotal component to attaining generalizable solutions in human
intelligence, reasoning provides great potential for reinforcement learning
(RL) agents' generalization towards varied goals by summarizing part-to-whole
arguments and discovering cause-and-effect relations. However, how to discover
and represent causalities remains a huge gap that hinders the development of
causal RL. In this paper, we augment Goal-Conditioned RL (GCRL) with a Causal
Graph (CG), a structure built upon the relations between objects and events. We
formulate the GCRL problem as variational likelihood maximization
with the CG as latent variables. To optimize the derived objective, we propose a
framework with theoretical performance guarantees that alternates between two
steps: using interventional data to estimate the posterior of the CG, and using
the CG to learn generalizable models and interpretable policies. Due to the lack of
public benchmarks that verify generalization capability under reasoning, we
design nine tasks and then empirically show the effectiveness of the proposed
method against five baselines on these tasks. Further theoretical analysis
shows that our performance improvement is attributed to the virtuous cycle of
causal discovery, transition modeling, and policy training, which aligns with
the experimental evidence in extensive ablation studies.
Comment: 28 pages, 5 figures, under review
Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation
Robustness has been extensively studied in reinforcement learning (RL) to
handle various forms of uncertainty such as random perturbations, rare events,
and malicious attacks. In this work, we consider one critical type of
robustness against spurious correlation, where different portions of the state
do not have causality but have correlations induced by unobserved confounders. These spurious
correlations are ubiquitous in real-world tasks, for instance, a self-driving
car usually observes heavy traffic in the daytime and light traffic at night
due to unobservable human activity. A model that learns such useless or even
harmful correlation could catastrophically fail when the confounder in the test
case deviates from the training one. Although well motivated, enabling robustness
against spurious correlation poses significant challenges since the uncertainty
set, shaped by the unobserved confounder and causal structure, is difficult to
characterize and identify. Existing robust algorithms that assume simple and
unstructured uncertainty sets are therefore inadequate to address this
challenge. To solve this issue, we propose Robust State-Confounded Markov
Decision Processes (RSC-MDPs) and theoretically demonstrate their superiority in
avoiding learning spurious correlations compared with other robust RL
counterparts. We also design an empirical algorithm to learn the robust optimal
policy for RSC-MDPs, which outperforms all baselines in eight realistic
self-driving and manipulation tasks.
Comment: Accepted to NeurIPS 202
Semantically Controllable Generation of Physical Scenes with Explicit Knowledge
Deep Generative Models (DGMs) are known for their superior capability in
generating realistic data. Extending purely data-driven approaches, recent
specialized DGMs may satisfy additional controllable requirements such as
embedding a traffic sign in a driving scene, by manipulating patterns
implicitly at the neuron or feature level. In this paper, we introduce
a novel method to incorporate domain knowledge explicitly in the
generation process to achieve semantically controllable scene generation. We
categorize our knowledge into two types to be consistent with the composition
of natural scenes, where the first type represents the property of objects and
the second type represents the relationship among objects. We then propose a
tree-structured generative model to learn complex scene representations, whose
nodes and edges naturally correspond to the two types of knowledge,
respectively. Knowledge can be explicitly integrated to enable semantically
controllable scene generation by imposing semantic rules on properties of nodes
and edges in the tree structure. We construct a synthetic example to illustrate
the controllability and explainability of our method in a clean setting. We
further extend the synthetic example to realistic autonomous vehicle driving
environments and conduct extensive experiments to show that our method
efficiently identifies adversarial traffic scenes against different
state-of-the-art 3D point cloud segmentation models while satisfying the traffic
rules specified as the explicit knowledge.
Comment: 14 pages, 6 figures. Under review