A state variable for crumpled thin sheets
Despite the apparent ease with which a sheet of paper is crumpled and tossed
away, crumpling dynamics are often considered a paradigm of complexity. This
complexity arises from the infinite number of configurations a disordered
crumpled sheet can take. Here we experimentally show that key aspects of
crumpling have a very simple description: the evolution of the damage in
crumpling dynamics can largely be described by a single global quantity, the
total length of all creases. We follow the evolution of the damage network in
repetitively crumpled elastoplastic sheets, and show that the dynamics of this
quantity are deterministic, and depend only on the instantaneous state of the
crease network and not at all on the crumpling history. We also show that this
global quantity captures the crumpling dynamics of a sheet crumpled for the
first time. This leads to a remarkable reduction in complexity, allowing a
description of a highly disordered system by a single state parameter. Similar
strategies may also be useful in analyzing other systems that evolve under
geometric and mechanical constraints, from faulting of tectonic plates to the
evolution of proteins.
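To make the claim concrete, here is a minimal numerical sketch, not the authors' model: if the total crease length ell evolves as d(ell)/dn = f(ell) for some rate law that depends only on the current state, then two sheets with different crumpling histories but the same instantaneous ell must follow identical trajectories from that point on. The rate law f below is a hypothetical placeholder chosen only to give saturating growth; all names and constants are illustrative.

```python
# Minimal sketch of a single-state-variable description of crumpling damage.
# ell = total crease length (arbitrary units); f is a HYPOTHETICAL rate law,
# not the authors' fitted form.
import numpy as np

def f(ell, c=100.0):
    """Damage-accumulation rate that depends only on the current state:
    growth slows as the crease network saturates."""
    return c / (1.0 + ell)

def evolve(ell0, n_crumples):
    """Integrate ell over repeated crumpling cycles."""
    ell = ell0
    trajectory = [ell]
    for _ in range(n_crumples):
        ell += f(ell)          # state-dependent increment, no history term
        trajectory.append(ell)
    return np.array(trajectory)

# Sheet A is crumpled 20 times; sheet B starts from A's state after 10
# cycles, i.e. a different history but the same instantaneous ell.
traj_a = evolve(0.0, 20)
traj_b = evolve(traj_a[10], 10)
print(np.allclose(traj_a[10:], traj_b))  # True: the history is irrelevant
```

Because the increment depends only on ell, the two trajectories coincide exactly; that memorylessness is what makes ell usable as a state variable.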
Decision-Focused Model-based Reinforcement Learning for Reward Transfer
Decision-focused (DF) model-based reinforcement learning has recently been
introduced as a powerful algorithm that can focus on learning the MDP dynamics
that are most relevant for obtaining high returns. While this approach
increases the agent's performance by directly optimizing the reward, it does so
by learning less accurate dynamics from a maximum likelihood perspective. We
demonstrate that when the reward function is defined by preferences over
multiple objectives, the DF model may be sensitive to changes in the objective
preferences. In this work, we develop the robust decision-focused (RDF)
algorithm, which leverages the non-identifiability of DF solutions to learn
models that maximize expected returns at training time while also transferring
to changes in the preference over multiple objectives. We demonstrate
the effectiveness of RDF on two synthetic domains and two healthcare
simulators, showing that it significantly improves the robustness of DF model
learning to changes in the reward function without compromising training-time
return.
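As a concrete illustration of this selection principle, the toy sketch below (not the authors' algorithm) assumes the reward is a preference-weighted sum of objectives, r_w = w . r_vec, and that several candidate models achieve near-equal decision-focused return at the training preference, the non-identifiability mentioned above. An RDF-style choice then keeps the candidates within a tolerance of the best training return and selects the one whose worst-case return over alternative preference weights is highest. The candidate models, returns, weights, and tolerance are all made up for illustration.

```python
import numpy as np

def scalarized_return(model, w):
    """Stub: expected return of the policy planned in `model` when the
    reward uses preference weights `w`. In practice this would come from
    planning/rollouts in the learned dynamics model."""
    return float(w @ model["objective_returns"])

# Three hypothetical candidate models, summarized by per-objective returns.
candidates = [
    {"name": "m1", "objective_returns": np.array([10.0, 1.0])},
    {"name": "m2", "objective_returns": np.array([8.0, 5.0])},
    {"name": "m3", "objective_returns": np.array([6.0, 6.5])},
]

w_train = np.array([0.9, 0.1])                           # training preference
test_ws = [np.array([0.5, 0.5]), np.array([0.1, 0.9])]   # possible shifts

# Step 1: keep models whose training return is near-optimal (within `tol`),
# i.e. the set of decision-focused solutions the return cannot tell apart.
train_returns = [scalarized_return(m, w_train) for m in candidates]
tol = 2.0  # assumed slack defining the near-optimal solution set
admissible = [m for m, r in zip(candidates, train_returns)
              if r >= max(train_returns) - tol]

# Step 2: among those, pick the model with the best worst-case return
# over the alternative preference weights.
robust = max(admissible,
             key=lambda m: min(scalarized_return(m, w) for w in test_ws))
print(robust["name"])  # -> m2: slightly worse at training, far more robust
```

Here m1 and m2 are both admissible at training time, but m2 degrades far less when the preference shifts, which is exactly the trade the robust selection makes.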
Learning to search efficiently for causally near-optimal treatments
Finding an effective medical treatment often requires a search by trial and
error. Making this search more efficient by minimizing the number of
unnecessary trials could lower both costs and patient suffering. We formalize
this problem as learning a policy for finding a near-optimal treatment in a
minimum number of trials using a causal inference framework. We give a
model-based dynamic programming algorithm which learns from observational data
while being robust to unmeasured confounding. To reduce time complexity, we
propose a greedy algorithm that bounds the near-optimality constraint. The
methods are evaluated on synthetic and real-world healthcare data and compared
to model-free reinforcement learning. We find that our methods compare
favorably to the model-free baseline while offering a more transparent
trade-off between search time and treatment efficacy.
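The sketch below illustrates the greedy idea in a deliberately simplified form: try the untested treatment with the highest estimated response probability, and stop at the first response or once every remaining treatment looks unlikely to help. It uses fixed marginal probability estimates and omits two things the paper handles: updating beliefs about the remaining treatments after each observed outcome, and adjusting the estimates for confounding in the observational data. All names and numbers are hypothetical.

```python
def greedy_treatment_search(effect_prob, delta, respond):
    """Try treatments in decreasing order of estimated response
    probability. Stop at the first response, or once every remaining
    treatment has estimated probability below `delta` of helping.

    effect_prob: dict treatment -> estimated P(patient responds)
    respond:     callable treatment -> bool (the simulated trial outcome)
    Returns (treatments tried in order, effective treatment or None)."""
    remaining = dict(effect_prob)
    tried = []
    while remaining:
        t = max(remaining, key=remaining.get)  # greedy: most promising next
        del remaining[t]
        tried.append(t)
        if respond(t):                         # effective treatment found
            return tried, t
        if all(p < delta for p in remaining.values()):
            break                              # unlikely anything left helps
    return tried, None

# Toy usage with made-up estimates and a simulated patient who responds to B.
probs = {"A": 0.6, "B": 0.5, "C": 0.2}
tried, found = greedy_treatment_search(probs, delta=0.1,
                                       respond=lambda t: t == "B")
print(tried, found)  # -> ['A', 'B'] B
```

The threshold `delta` plays the role of the relaxed near-optimality constraint: lowering it lengthens the search but raises the chance of finding the best treatment, which is the search-time versus efficacy trade-off described above.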