17 research outputs found
Efficient Exploration in Continuous-time Model-based Reinforcement Learning
Reinforcement learning algorithms typically consider discrete-time dynamics,
even though the underlying systems are often continuous in time. In this paper,
we introduce a model-based reinforcement learning algorithm that represents
continuous-time dynamics using nonlinear ordinary differential equations
(ODEs). We capture epistemic uncertainty using well-calibrated probabilistic
models, and use the optimistic principle for exploration. Our regret bounds
surface the importance of the measurement selection strategy (MSS), since in
continuous time we must decide not only how to explore, but also when to
observe the underlying system. Our analysis demonstrates that the regret is
sublinear when modeling ODEs with Gaussian Processes (GP) for common choices of
MSS, such as equidistant sampling. Additionally, we propose an adaptive,
data-dependent, practical MSS that, when combined with GP dynamics, also
achieves sublinear regret with significantly fewer samples. We showcase the
benefits of continuous-time modeling over its discrete-time counterpart, as
well as our proposed adaptive MSS over standard baselines, on several
applications.
Tuning Legged Locomotion Controllers via Safe Bayesian Optimization
This paper presents a data-driven strategy to streamline the deployment of
model-based controllers in legged robotic hardware platforms. Our approach
leverages a model-free safe learning algorithm to automate the tuning of
control gains, addressing the mismatch between the simplified model used in the
control formulation and the real system. This method substantially mitigates
the risk of hazardous interactions with the robot by sample-efficiently
optimizing parameters within a probably safe region. Additionally, we extend
the applicability of our approach to incorporate the different gait parameters
as contexts, leading to a safe, sample-efficient exploration algorithm capable
of tuning a motion controller for diverse gait patterns. We validate our method
through simulation and hardware experiments, where we demonstrate that the
algorithm obtains superior performance on tuning a model-based motion
controller for multiple gaits safely.
Comment: This paper has been accepted to the 2023 Conference on Robot Learning
(CoRL 2023). The first two authors contributed equally. The supplementary
video is available at https://youtu.be/zDBouUgegrU and the code
implementation is available at https://github.com/lasgroup/gosafeop
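The safe tuning loop can be caricatured as a SafeOpt-style sketch. This is a stand-in for the paper's actual method (available at the linked repository), not a reproduction of it: a GP models controller performance over a single gain, and only gains whose conservative lower confidence bound clears a safety threshold are ever evaluated on the "hardware". The objective, threshold, and hyperparameters are invented for illustration.

```python
import numpy as np

def rbf(a, b, ls=0.3):
    # squared-exponential kernel between 1-D gain arrays a and b
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_predict(x_train, y_train, x_query, noise=1e-4):
    # GP posterior mean and standard deviation at x_query
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_query, x_train)
    mu = Ks @ np.linalg.solve(K, y_train)
    v = np.linalg.solve(K, Ks.T)
    var = 1.0 - np.sum(Ks * v.T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

# hypothetical performance landscape over one control gain; values below
# the threshold stand for hazardous behaviour we must never sample
f = lambda g: np.exp(-(g - 0.6) ** 2 / 0.05)
threshold = 0.2
grid = np.linspace(0.0, 1.0, 201)

x = np.array([0.5])   # a known-safe initial gain is assumed given
y = f(x)
beta = 2.0            # confidence-bound width
for _ in range(8):
    mu, sd = gp_predict(x, y, grid)
    safe = mu - beta * sd > threshold      # conservative safe set
    # among safe gains, evaluate the most uncertain one (safe expansion)
    idx = np.flatnonzero(safe)[np.argmax(sd[safe])]
    x = np.append(x, grid[idx])
    y = np.append(y, f(grid[idx]))

print(float(x[np.argmax(y)]), float(y.max()))
```

The key design choice mirrored here is that exploration is restricted to the lower-confidence-bound safe set, so the safe region grows sample by sample instead of being probed blindly; the paper's contextual extension would additionally condition the GP on gait parameters.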
Dissertatio Inavgvralis De Obligatione Socii Innocentis In Delictis
Helmstedt, Univ., Jur. Diss., 1796. Qvam Avctoritate Illvstris Ivreconsvltorvm Ordinis In Academia Ivlia Carolina Pro Svmmis In Vtroqve Ivre Honoribvs Rite Obtinendis Die XII Aprilis MDCCLXXXXVI Proposvit Ioannes Henricvs Hübotter Hildesiensis. Imprint as given in the source: Helmstadii, Typis C. G. Fleckeisen, Acad. Typogr.
Learning policies for continuous control via transition models
It is doubtful that animals have perfect inverse models of their limbs (e.g., knowing exactly what muscle contraction must be applied at every joint to reach a particular location in space). In robot control, however, moving an arm's end-effector to a target position or along a target trajectory requires accurate forward and inverse models. Here we show that by learning the transition (forward) model from interaction, we can use it to drive the learning of an amortized policy. Hence, we revisit policy optimization in relation to the deep active inference framework and describe a modular neural network architecture that simultaneously learns the system dynamics from prediction errors and the stochastic policy that generates suitable continuous control commands to reach a desired reference position. We evaluate the model against a linear quadratic regulator baseline and conclude with additional steps to take toward human-like motor control.
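The two-stage idea, first learning a transition model from interaction and then training a policy through it, can be sketched on a toy linear plant. This is not the paper's architecture: the deep networks and active inference machinery are replaced by least squares and a scalar feedback gain, purely to show the control flow of "learn forward model, then amortize a policy against it".

```python
import numpy as np

rng = np.random.default_rng(0)

# unknown discrete-time plant x' = A x + B u (toy stand-in for a limb)
A_true, B_true = 0.9, 0.5

# 1) learn the transition (forward) model from random interaction data
xs = rng.normal(size=200)
us = rng.normal(size=200)
nxt = A_true * xs + B_true * us           # observed next states
Phi = np.stack([xs, us], axis=1)
A_hat, B_hat = np.linalg.lstsq(Phi, nxt, rcond=None)[0]

# 2) amortize a policy u = -k x by descending the rollout cost computed
#    through the *learned* model (numerical gradient keeps this short)
def rollout_cost(k, x0=1.0, T=20):
    x, c = x0, 0.0
    for _ in range(T):
        u = -k * x
        x = A_hat * x + B_hat * u         # imagined transition
        c += x * x + 0.01 * u * u         # drive state to 0, cheaply
    return c

k = 0.0
for _ in range(200):
    eps = 1e-4
    grad = (rollout_cost(k + eps) - rollout_cost(k - eps)) / (2 * eps)
    k -= 0.05 * grad

print(A_hat, B_hat, k)
```

The point of the sketch is that the policy never touches the true plant during optimization: all gradients come from rollouts of the learned forward model, which is the role the transition model plays in the paper's setup.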