
    Safety-guided deep reinforcement learning via online gaussian process estimation

    An important facet of reinforcement learning (RL) has to do with how the agent goes about exploring the environment. Traditional exploration strategies typically focus on efficiency and ignore safety. However, for practical applications, ensuring safety of the agent during exploration is crucial since performing an unsafe action or reaching an unsafe state could result in irreversible damage to the agent. The main challenge of safe exploration is that characterizing the unsafe states and actions is difficult for large continuous state or action spaces and unknown environments. In this paper, we propose a novel approach to incorporate estimations of safety to guide exploration and policy search in deep reinforcement learning. By using a cost function to capture trajectory-based safety, our key idea is to formulate the state-action value function of this safety cost as a candidate Lyapunov function and extend control-theoretic results to approximate its derivative using online Gaussian Process (GP) estimation. We show how to use these statistical models to guide the agent in unknown environments to obtain high-performance control policies with provable stability certificates. Accepted manuscript.

    Derivation of equivalent linear properties of Bouc-Wen hysteretic systems for seismic response spectrum analysis via statistical linearization

    A newly proposed statistical linearization based formulation is used to derive effective linear properties (ELPs), namely damping ratio and natural frequency, for stochastically excited hysteretic oscillators involving the Bouc-Wen force-deformation phenomenological model. This is achieved by first using a frequency domain statistical linearization step to substitute a Bouc-Wen oscillator by a third order linear system. Next, this third order linear system is reduced to a second order linear oscillator characterized by a set of ELPs by enforcing equality of certain response statistics of the two linear systems. The proposed formulation is utilized in conjunction with quasi-stationary stochastic processes compatible with elastic response spectra commonly used to represent the input seismic action in earthquake resistant design of structures. Then, the derived ELPs are used to estimate the peak response of Bouc-Wen hysteretic oscillators without numerical integration of the nonlinear equation of motion; this is done in the context of linear response spectrum-based dynamic analysis. Numerical results pertaining to the elastic response spectrum of the current European aseismic code provisions (EC8) are presented to demonstrate the usefulness of the proposed approach. These results are supported by pertinent Monte Carlo simulations involving an ensemble of non-stationary EC8 spectrum compatible accelerograms. The proposed approach can hopefully be an effective tool in the preliminary aseismic design stages of yielding structures and structural members commonly represented by the Bouc-Wen hysteretic model within either a force-based or a displacement-based context.
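
    The last step described above, reading a peak response off an elastic spectrum once the ELPs are known, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the EC8 Type 1 elastic spectrum for ground type B, and the function names, the design acceleration `ag`, and the ELP values in the usage lines are all hypothetical. The statistical linearization that produces the ELPs is not reproduced here.

```python
import math

# Assumed for illustration: EC8 Type 1 elastic spectrum, ground type B
# (soil factor S and corner periods TB, TC, TD in seconds).
S, TB, TC, TD = 1.2, 0.15, 0.5, 2.0


def ec8_elastic_sa(period, damping_ratio, ag):
    """Elastic spectral acceleration Se(T), in the same units as ag.

    damping_ratio is a fraction of critical damping (0.05 means 5%).
    """
    if not 0.0 <= period <= 4.0:
        raise ValueError("the EC8 elastic spectrum is defined for 0 <= T <= 4 s")
    # Damping correction factor, with the code's lower bound of 0.55.
    eta = max(math.sqrt(10.0 / (5.0 + 100.0 * damping_ratio)), 0.55)
    if period <= TB:
        return ag * S * (1.0 + period / TB * (2.5 * eta - 1.0))
    if period <= TC:
        return ag * S * 2.5 * eta
    if period <= TD:
        return ag * S * 2.5 * eta * TC / period
    return ag * S * 2.5 * eta * TC * TD / period ** 2


def peak_displacement(omega_eq, zeta_eq, ag):
    """Peak displacement of the equivalent linear oscillator.

    omega_eq (rad/s) and zeta_eq are the effective linear properties;
    the spectral displacement is Se(T_eq) / omega_eq**2.
    """
    period = 2.0 * math.pi / omega_eq
    return ec8_elastic_sa(period, zeta_eq, ag) / omega_eq ** 2


# Hypothetical ELPs: effective period 0.8 s, effective damping 12%.
example_peak = peak_displacement(2.0 * math.pi / 0.8, 0.12, ag=2.0)
```

    Because the effective damping ratio enters only through the correction factor, a higher equivalent damping lowers the estimated peak for the same effective period.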

    Software timing analysis for complex hardware with survivability and risk analysis

    The increasing automation of safety-critical real-time systems, such as those in cars and planes, leads to more complex and performance-demanding on-board software and the subsequent adoption of multicores and accelerators. This causes software's execution time dispersion to increase due to variable-latency resources such as caches, NoCs, advanced memory controllers and the like. Statistical analysis has been proposed to model the Worst-Case Execution Time (WCET) of software running on such complex systems by providing reliable probabilistic WCET (pWCET) estimates. However, statistical models used so far, which are based on risk analysis, are overly pessimistic by construction. In this paper we prove that statistical survivability and risk analyses are equivalent in terms of tail analysis and, building upon survivability analysis theory, we show that Weibull tail models can be used to estimate pWCET distributions reliably and tightly. In particular, our methodology proves the correctness-by-construction of the approach, and our evaluation provides evidence about the tightness of the pWCET estimates obtained, which allow decreasing them reliably by 40% for a railway case study w.r.t. state-of-the-art exponential tails. This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven, and by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de Catalunya (contract 2014-SGR-1051). Peer Reviewed. Postprint (author's final draft).
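
    To make the contrast between exponential and Weibull tails concrete, the sketch below computes the execution-time bound at a target exceedance probability under each tail model. It is an illustration only, not the paper's methodology: the function name and every parameter value are hypothetical, the tail is assumed to follow P(X > x) = p_u * exp(-((x - u) / scale)^shape) above a threshold u, and no fitting procedure is shown. The exponential tail is the special case shape = 1.

```python
import math


def pwcet_bound(threshold, scale, shape, p_exceed_threshold, target_p):
    """Execution time x such that P(X > x) = target_p.

    Assumes the excess over `threshold` has a Weibull survival function:
        P(X > x) = p_exceed_threshold * exp(-((x - threshold) / scale) ** shape)
    With shape == 1 this reduces to an exponential tail.
    """
    if not 0.0 < target_p <= p_exceed_threshold <= 1.0:
        raise ValueError("need 0 < target_p <= p_exceed_threshold <= 1")
    if scale <= 0.0 or shape <= 0.0:
        raise ValueError("scale and shape must be positive")
    # Invert the survival function for x.
    log_ratio = math.log(p_exceed_threshold / target_p)
    return threshold + scale * log_ratio ** (1.0 / shape)


# Hypothetical numbers: 1% of runs exceed 1000 cycles, target 1e-12 per run.
exponential_bound = pwcet_bound(1000.0, 50.0, 1.0, 0.01, 1e-12)
weibull_bound = pwcet_bound(1000.0, 50.0, 2.0, 0.01, 1e-12)
```

    With the same scale, a shape parameter above 1 gives a lighter tail and therefore a lower bound at very small exceedance probabilities. In practice the two models would be fitted separately and would not share a scale, so the gap computed here is not comparable to the 40% figure reported in the abstract.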