    Batch Policy Learning under Constraints

    When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected, off-policy, non-optimal behavior data; and (ii) how to mediate among competing objectives and constraints. We thus study the problem of batch policy learning under multiple constraints and offer a systematic solution. We first propose a flexible meta-algorithm that admits any batch reinforcement learning and online learning procedure as subroutines. We then present a specific algorithmic instantiation and provide performance guarantees for the main objective and all constraints. To certify constraint satisfaction, we propose a new and simple method for off-policy policy evaluation (OPE) and derive PAC-style bounds. Our algorithm achieves strong empirical results in different domains, including a challenging simulated car driving problem subject to multiple constraints such as lane keeping and smooth driving. We also show experimentally that our OPE method outperforms other popular OPE techniques on a standalone basis, especially in a high-dimensional setting.

    Development of a Finite Element Analysis Methodology for the Propagation of Delaminations in Composite Structures

    Analysing the collapse of skin-stiffened structures requires capturing the critical phenomenon of skin-stiffener separation, which can be considered analogous to interlaminar cracking. This paper presents the development of a numerical approach for simulating the propagation of interlaminar cracks in composite structures. A degradation methodology was applied in MSC.Marc that involved modelling the structure with shell layers connected by user-defined multiple point constraints (MPCs). User subroutines were written that apply the Virtual Crack Closure Technique (VCCT) to determine the onset of crack growth and modify the properties of the user-defined MPCs to simulate crack propagation. Methodologies for the release of failing MPCs are presented and discussed with reference to the VCCT assumption of self-similar crack growth. Numerical results applying the release methodologies are then compared with experimental results for a double cantilever beam specimen. Based on this comparison, recommendations are made for the future development of the degradation model, especially with reference to developing an approach for the collapse analysis of fuselage-representative structures.
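
To illustrate the kind of onset check the abstract above attributes to VCCT, here is a minimal sketch that assumes the standard mode I expression for a single node pair, G_I = F * du / (2 * da * b). It is not the paper's MSC.Marc user subroutine. The function names, the single-mode criterion G_I >= G_Ic, and all numerical values are hypothetical and chosen only for illustration.

```python
def vcct_mode_i_energy_release_rate(nodal_force, opening_displacement,
                                    element_length, width):
    """Mode I strain energy release rate from the standard VCCT expression.

    G_I = F * du / (2 * da * b), where F is the force holding the crack-tip
    node pair together, du is the relative opening displacement of the node
    pair just behind the crack tip, da is the element length along the crack
    direction and b is the width associated with the node pair.
    """
    if element_length <= 0 or width <= 0:
        raise ValueError("element_length and width must be positive")
    return nodal_force * opening_displacement / (2.0 * element_length * width)


def crack_growth_onset(g_i, g_ic):
    """Assumed single-mode criterion: growth starts once G_I reaches G_Ic."""
    return g_i >= g_ic


# Hypothetical values in N and mm, so G is in N/mm (equivalently kJ/m^2).
g_i = vcct_mode_i_energy_release_rate(
    nodal_force=10.0,           # N, hypothetical crack-tip constraint force
    opening_displacement=0.02,  # mm, hypothetical opening behind the tip
    element_length=1.0,         # mm, hypothetical element size
    width=20.0,                 # mm, hypothetical specimen width
)
release_constraint = crack_growth_onset(g_i, g_ic=0.25)  # hypothetical G_Ic
```

In a degradation scheme of the type described, a true result from the onset check would be the trigger for releasing the constraint at the crack tip. A mixed-mode criterion would need the mode II and mode III components as well, which this sketch leaves out.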