10 research outputs found
Partially Randomised Crossover.
Available from STL Prague, CZ / NTK - National Technical Library (SIGLE, Czech Republic)
Genetic Algorithms and Their Testing.
Available from STL Prague, CZ / NTK - National Technical Library (SIGLE, Czech Republic)
High Level Diagnostics.
Available from STL Prague, CZ / NTK - National Technical Library (SIGLE, Czech Republic)
Policy derivation methods for critic-only reinforcement learning in continuous spaces
This paper addresses the problem of deriving a policy from the value function in critic-only reinforcement learning (RL) with continuous state and action spaces. With continuous-valued states, RL algorithms have to rely on a numerical approximator to represent the value function. By its nature, numerical approximation virtually always exhibits artifacts that degrade the overall performance of the controlled system. In addition, when continuous-valued actions are used, the most common approach is to discretize the action space and exhaustively search for the action that maximizes the right-hand side of the Bellman equation. Such a policy derivation procedure is computationally demanding and, owing to the lack of continuity, results in steady-state error. In this work, we propose policy derivation methods that alleviate the above problems by means of action-space refinement, continuous approximation, and post-processing of the V-function using symbolic regression. The proposed methods are tested on nonlinear control problems: 1-DOF and 2-DOF pendulum swing-up, and magnetic manipulation. The results show significantly improved performance in terms of cumulative return and computational complexity.
Accepted Author Manuscript. Learning & Autonomous Control
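As a rough illustration of the baseline the abstract criticizes, the sketch below shows discretized policy derivation with one step of action-space refinement, in Python. It is a minimal sketch, not the authors' code: the transition model f, reward rho, value function V, discount GAMMA, and grid sizes are all placeholder assumptions.

```python
import numpy as np

GAMMA = 0.95

def f(x, u):
    # Placeholder one-step state-transition model (assumption).
    return x + 0.01 * np.array([x[1], u - 0.1 * x[1]])

def rho(x, u):
    # Placeholder reward: penalize state deviation and control effort.
    return -(x @ x) - 0.01 * u ** 2

def V(x):
    # Placeholder value-function approximator; in the paper this is a
    # smooth analytic expression found by symbolic regression.
    return -(x @ x)

def greedy_action(x, actions):
    """Exhaustively search a discrete action set for the action that
    maximizes the right-hand side of the Bellman equation."""
    returns = [rho(x, u) + GAMMA * V(f(x, u)) for u in actions]
    return actions[int(np.argmax(returns))]

x0 = np.array([1.0, 0.0])
coarse = np.linspace(-2.0, 2.0, 21)      # coarse action grid
u = greedy_action(x0, coarse)
# Action-space refinement: re-search a finer grid around the current best
# action to reduce the steady-state error of the coarse discretization.
fine = np.linspace(u - 0.2, u + 0.2, 21)
u_refined = greedy_action(x0, fine)
```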
Selecting Informative Data Samples for Model Learning Through Symbolic Regression
Continual model learning for nonlinear dynamic systems, such as autonomous robots, presents several challenges. First, it tends to be computationally expensive, as the amount of data collected by the robot grows quickly over time. Second, the model accuracy suffers when data from repetitive motions dominate the training set and outweigh scarcer samples that capture interesting properties of the system. It is not known in advance which samples will be useful for model learning. Effective methods are therefore needed to select informative training samples from the continuous data stream collected by the robot. The existing literature gives no guidelines as to which of the available sample-selection methods are suitable for this task. In this paper, we compare five sample-selection methods, including a novel method based on the model prediction error. We integrate these methods into a model-learning framework based on symbolic regression, which allows accurate models to be learned in the form of analytic equations. Unlike the currently popular data-hungry deep learning methods, symbolic regression can build models even from very small training data sets. We demonstrate the approach on two real robots: the TurtleBot mobile robot and the Parrot Bebop drone. The results show that an accurate model can be constructed even from training sets as small as 24 samples. Informed sample-selection techniques based on prediction error and model variance clearly outperform uninformed methods, such as sequential or random selection.
Learning & Autonomous Control
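The prediction-error criterion mentioned in the abstract can be sketched in a few lines of Python. This is a hedged illustration only: the LinearModel stand-in, the threshold value, and the budget of 24 samples (borrowed from the abstract's smallest training-set size) are assumptions, not the paper's implementation, which embeds the selection step in a symbolic-regression learning loop.

```python
import numpy as np

class LinearModel:
    # Stand-in for the learned model; in the paper it is an analytic
    # equation found by symbolic regression. Anything with .predict works.
    def __init__(self, w):
        self.w = np.asarray(w)

    def predict(self, x):
        return self.w @ np.asarray(x)

def select_by_prediction_error(model, stream, budget=24, threshold=0.05):
    """Keep a streamed (x, y) sample only if the current model predicts
    it poorly, up to `budget` samples."""
    selected = []
    for x, y in stream:
        err = np.linalg.norm(np.atleast_1d(y - model.predict(x)))
        if err > threshold:        # poorly explained -> informative
            selected.append((x, y))
            if len(selected) == budget:
                break
    return selected
```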
Improving the Functional Properties of Genetic Algorithms (Zlepšení funkčních vlastností genetických algoritmů).
Available from STL Prague, CZ / NTK - National Technical Library (SIGLE, Czech Republic)
Symbolic Regression Methods for Reinforcement Learning
Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering little insight into the mappings learned, and they require extensive trial-and-error tuning of their hyper-parameters. In this paper, we propose a new approach to constructing smooth value functions in the form of analytic expressions by using symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions yield well-performing policies and are compact, mathematically tractable, and easy to plug into other algorithms. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with an alternative approach using neural networks shows that our method outperforms the neural-network-based one.
Learning & Autonomous Control
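The value-iteration backbone behind the first method can be sketched as follows. This is a plain tabular stand-in under assumed names (f, rho, GAMMA, a 1-D state grid); the paper instead fits a smooth analytic expression to the backed-up targets with symbolic regression at each sweep.

```python
import numpy as np

GAMMA = 0.95
states = np.linspace(-1.0, 1.0, 101)    # 1-D state grid (assumption)
actions = np.linspace(-2.0, 2.0, 21)    # discretized inputs (assumption)

def f(x, u):
    # Toy state-transition model standing in for the paper's benchmarks.
    return np.clip(x + 0.05 * u, -1.0, 1.0)

def rho(x, u):
    # Toy reward penalizing state deviation and control effort.
    return -x ** 2 - 0.01 * u ** 2

V = np.zeros_like(states)
for sweep in range(500):
    # Bellman backup: targets(x) = max_u [ rho(x, u) + GAMMA * V(f(x, u)) ].
    targets = np.array([
        max(rho(x, u) + GAMMA * np.interp(f(x, u), states, V)
            for u in actions)
        for x in states
    ])
    if np.max(np.abs(targets - V)) < 1e-6:
        break
    # Symbolic value iteration would fit `targets` with an analytic
    # expression at this point; the table here simply stores them verbatim.
    V = targets
```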