788 research outputs found

    Privileged Knowledge Distillation for Sim-to-Real Policy Generalization

    Full text link
    Reinforcement Learning (RL) has recently achieved remarkable success in robotic control. However, most RL methods operate in simulated environments where privileged knowledge (e.g., dynamics, surroundings, terrains) is readily available. Conversely, in real-world scenarios, robot agents usually rely solely on local states (e.g., proprioceptive feedback of robot joints) to select actions, leading to a significant sim-to-real gap. Existing methods address this gap by either gradually reducing the reliance on privileged knowledge or performing a two-stage policy imitation. However, we argue that these methods are limited in their ability to fully leverage the privileged knowledge, resulting in suboptimal performance. In this paper, we propose a novel single-stage privileged knowledge distillation method called the Historical Information Bottleneck (HIB) to narrow the sim-to-real gap. In particular, HIB learns a privileged knowledge representation from historical trajectories by capturing the underlying changeable dynamic information. Theoretical analysis shows that the learned privileged knowledge representation helps reduce the value discrepancy between the oracle and learned policies. Empirical experiments on both simulated and real-world tasks demonstrate that HIB yields improved generalizability compared to previous methods.Comment: 22 page

    Numerical Solutions of 2-D Steady Incompressible Flow in a Driven Skewed Cavity

    Full text link
    The benchmark test case for non-orthogonal grid mesh, the "driven skewed cavity flow", first introduced by Demirdzic et al. (1992, IJNMF, 15, 329) for skew angles of alpha=30 and alpha=45, is reintroduced with a more variety of skew angles. The benchmark problem has non-orthogonal, skewed grid mesh with skew angle (alpha). The governing 2-D steady incompressible Navier-Stokes equations in general curvilinear coordinates are solved for the solution of driven skewed cavity flow with non-orthogonal grid mesh using a numerical method which is efficient and stable even at extreme skew angles. Highly accurate numerical solutions of the driven skewed cavity flow, solved using a fine grid (512x512) mesh, are presented for Reynolds number of 100 and 1000 for skew angles ranging between 15<alpha<165

    Adaptive optimal output regulation for wheel-legged robot Ollie: A data-driven approach

    Get PDF
    The dynamics of a robot may vary during operation due to both internal and external factors, such as non-ideal motor characteristics and unmodeled loads, which would lead to control performance deterioration and even instability. In this paper, the adaptive optimal output regulation (AOOR)-based controller is designed for the wheel-legged robot Ollie to deal with the possible model uncertainties and disturbances in a data-driven approach. We test the AOOR-based controller by forcing the robot to stand still, which is a conventional index to judge the balance controller for two-wheel robots. By online training with small data, the resultant AOOR achieves the optimality of the control performance and stabilizes the robot within a small displacement in rich experiments with different working conditions. Finally, the robot further balances a rolling cylindrical bottle on its top with the balance control using the AOOR, but it fails with the initial controller. Experimental results demonstrate that the AOOR-based controller shows the effectiveness and high robustness with model uncertainties and external disturbances

    Optimasi Portofolio Resiko Menggunakan Model Markowitz MVO Dikaitkan dengan Keterbatasan Manusia dalam Memprediksi Masa Depan dalam Perspektif Al-Qur`an

    Full text link
    Risk portfolio on modern finance has become increasingly technical, requiring the use of sophisticated mathematical tools in both research and practice. Since companies cannot insure themselves completely against risk, as human incompetence in predicting the future precisely that written in Al-Quran surah Luqman verse 34, they have to manage it to yield an optimal portfolio. The objective here is to minimize the variance among all portfolios, or alternatively, to maximize expected return among all portfolios that has at least a certain expected return. Furthermore, this study focuses on optimizing risk portfolio so called Markowitz MVO (Mean-Variance Optimization). Some theoretical frameworks for analysis are arithmetic mean, geometric mean, variance, covariance, linear programming, and quadratic programming. Moreover, finding a minimum variance portfolio produces a convex quadratic programming, that is minimizing the objective function ðð¥with constraintsð ð 𥠥 ðandð´ð¥ = ð. The outcome of this research is the solution of optimal risk portofolio in some investments that could be finished smoothly using MATLAB R2007b software together with its graphic analysis

    Measurement of the Splitting Function in &ITpp &ITand Pb-Pb Collisions at root&ITsNN&IT=5.02 TeV

    Get PDF
    Data from heavy ion collisions suggest that the evolution of a parton shower is modified by interactions with the color charges in the dense partonic medium created in these collisions, but it is not known where in the shower evolution the modifications occur. The momentum ratio of the two leading partons, resolved as subjets, provides information about the parton shower evolution. This substructure observable, known as the splitting function, reflects the process of a parton splitting into two other partons and has been measured for jets with transverse momentum between 140 and 500 GeV, in pp and PbPb collisions at a center-of-mass energy of 5.02 TeV per nucleon pair. In central PbPb collisions, the splitting function indicates a more unbalanced momentum ratio, compared to peripheral PbPb and pp collisions.. The measurements are compared to various predictions from event generators and analytical calculations.Peer reviewe

    Search for anomalous couplings in boosted WW/WZ -> l nu q(q)over-bar production in proton-proton collisions at root s=8TeV

    Get PDF
    Peer reviewe
    corecore