28 research outputs found

    A state variable for crumpled thin sheets

    Despite the apparent ease with which a sheet of paper is crumpled and tossed away, crumpling dynamics are often considered a paradigm of complexity. This complexity arises from the infinite number of configurations a disordered crumpled sheet can take. Here we experimentally show that key aspects of crumpling have a very simple description: the evolution of the damage in crumpling dynamics can largely be described by a single global quantity, the total length of all creases. We follow the evolution of the damage network in repetitively crumpled elastoplastic sheets and show that the dynamics of this quantity are deterministic, depending only on the instantaneous state of the crease network and not at all on the crumpling history. We also show that this global quantity captures the crumpling dynamics of a sheet crumpled for the first time. This leads to a remarkable reduction in complexity, allowing a description of a highly disordered system by a single state parameter. Similar strategies may also be useful in analyzing other systems that evolve under geometric and mechanical constraints, from faulting of tectonic plates to the evolution of proteins.
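    A minimal sketch of what such a one-state-variable description implies: if the total crease length evolves deterministically given only its current value, repeated crumpling can be modeled as iterating a single update function of that value. The saturating functional form and the numbers below are purely illustrative assumptions, not taken from the paper.

```python
import numpy as np

def crease_growth(total_length, rate=120.0, saturation=2000.0):
    """Hypothetical deterministic update: the crease length added per crumple
    depends only on the current total crease length (the state variable),
    not on the crumpling history. The functional form is illustrative."""
    return rate * max(0.0, 1.0 - total_length / saturation)

# Simulate repeated crumpling of one sheet starting from an uncreased state.
lengths = [0.0]
for cycle in range(30):
    lengths.append(lengths[-1] + crease_growth(lengths[-1]))

print(np.round(lengths[:5], 1))  # growth slows as the crease network saturates
```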

    Decision-Focused Model-based Reinforcement Learning for Reward Transfer

    Decision-focused (DF) model-based reinforcement learning has recently been introduced as a powerful algorithm that can focus on learning the MDP dynamics that are most relevant for obtaining high returns. While this approach increases the agent's performance by directly optimizing the reward, it does so by learning less accurate dynamics from a maximum likelihood perspective. We demonstrate that when the reward function is defined by preferences over multiple objectives, the DF model may be sensitive to changes in the objective preferences. In this work, we develop the robust decision-focused (RDF) algorithm, which leverages the non-identifiability of DF solutions to learn models that maximize expected returns while simultaneously transferring to changes in the preference over multiple objectives. We demonstrate the effectiveness of RDF on two synthetic domains and two healthcare simulators, showing that it significantly improves the robustness of DF model learning to changes in the reward function without compromising training-time return.
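    A minimal, hypothetical illustration of the non-identifiability the abstract refers to: two learned models can induce the same decision (and hence the same return) under the training preference weights, yet behave very differently once the preference vector shifts. The one-step setting and all numbers below are invented for illustration and are not from the paper.

```python
import numpy as np

# One-step toy problem with two actions and two objectives
# (e.g. efficacy and side-effect burden). Rows = actions, columns = objectives.
TRUE_REWARDS = np.array([[1.0, 0.2],
                         [0.6, 0.8]])

def best_action(model_rewards, preference):
    """Action with the highest preference-weighted (scalarized) predicted reward."""
    return int(np.argmax(model_rewards @ preference))

def realized_return(model_rewards, preference):
    """Return actually obtained when acting greedily on the model's predictions."""
    return float(TRUE_REWARDS[best_action(model_rewards, preference)] @ preference)

train_pref = np.array([0.9, 0.1])     # preference used during training
shifted_pref = np.array([0.3, 0.7])   # preference after deployment

# Two candidate learned models: both recommend the same (correct) action under
# the training preference, so a purely decision-focused objective cannot tell
# them apart, yet only one of them transfers to the shifted preference.
model_a = np.array([[1.0, 0.2], [0.6, 0.8]])   # close to the true rewards
model_b = np.array([[1.0, 0.9], [0.1, 0.1]])   # wrong, but same training decision

for name, model in [("model_a", model_a), ("model_b", model_b)]:
    print(name,
          realized_return(model, train_pref),    # identical at training time
          realized_return(model, shifted_pref))  # diverge after the shift
```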

    Learning to search efficiently for causally near-optimal treatments

    Finding an effective medical treatment often requires a search by trial and error. Making this search more efficient by minimizing the number of unnecessary trials could lower both costs and patient suffering. We formalize this problem as learning a policy for finding a near-optimal treatment in a minimum number of trials using a causal inference framework. We give a model-based dynamic programming algorithm which learns from observational data while being robust to unmeasured confounding. To reduce time complexity, we suggest a greedy algorithm which bounds the near-optimality constraint. The methods are evaluated on synthetic and real-world healthcare data and compared to model-free reinforcement learning. We find that our methods compare favorably to the model-free baseline while offering a more transparent trade-off between search time and treatment efficacy.
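    A minimal sketch of the kind of sequential search the abstract describes: treatments are trialed one at a time, with a greedy rule choosing the next candidate, until one works. The greedy rule, the static success probabilities, and the toy data below are illustrative assumptions; the paper's actual algorithm additionally handles unmeasured confounding and enforces a near-optimality bound.

```python
import numpy as np

def greedy_next_treatment(success_prob, tried):
    """Hypothetical greedy rule: among untried treatments, trial the one with the
    highest estimated success probability. A full method would re-estimate these
    probabilities from observational data after each failed trial and enforce a
    near-optimality constraint; both are omitted here for brevity."""
    candidates = [a for a in range(len(success_prob)) if a not in tried]
    return max(candidates, key=lambda a: success_prob[a])

def search(success_prob, outcomes):
    """Trial treatments one at a time until one succeeds or all have been tried."""
    tried = []
    while True:
        action = greedy_next_treatment(success_prob, tried)
        tried.append(action)
        if outcomes[action] == 1 or len(tried) == len(success_prob):
            return tried  # sequence of trials for this patient

# Toy example: four candidate treatments with estimated success probabilities,
# and (unknown to the policy) which of them would actually work for this patient.
trials = search(success_prob=np.array([0.7, 0.5, 0.3, 0.2]),
                outcomes=np.array([0, 1, 1, 0]))
print(trials)  # [0, 1]: the first choice fails, the second succeeds
```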