
    Hypervolume-based multi-objective reinforcement learning

    Indicator-based evolutionary algorithms are amongst the best-performing methods for solving multi-objective optimization (MOO) problems. In reinforcement learning (RL), introducing a quality indicator into an algorithm's decision logic had not been attempted before. In this paper, we propose a novel on-line multi-objective reinforcement learning (MORL) algorithm that uses the hypervolume indicator as an action selection strategy. We call this algorithm the hypervolume-based MORL algorithm, or HB-MORL, and conduct an empirical study of its performance using multiple quality assessment metrics from multi-objective optimization. We compare the hypervolume-based learning algorithm on different environments to two multi-objective algorithms that rely on scalarization techniques, namely the linear scalarization and the weighted Chebyshev function. We conclude that HB-MORL significantly outperforms the linear scalarization method and performs similarly to the Chebyshev algorithm without requiring any user-specified emphasis on particular objectives.
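
    A minimal sketch of hypervolume-based action selection for a two-objective maximization problem, assuming the agent keeps an archive of non-dominated multi-objective Q-vectors: each candidate action's Q-vector is temporarily added to the archive, and the action that yields the largest hypervolume is selected. The names (hypervolume_2d, hb_action_selection, q_vectors, archive, ref) are illustrative and do not reproduce the exact archive maintenance or exploration scheme of HB-MORL.

        import numpy as np

        def hypervolume_2d(points, ref):
            # Area dominated by the given points with respect to the reference
            # point ref, for two objectives that are both maximized.
            pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]),
                         key=lambda p: p[0], reverse=True)
            hv, covered_y = 0.0, ref[1]
            for x, y in pts:
                if y > covered_y:
                    hv += (x - ref[0]) * (y - covered_y)
                    covered_y = y
            return hv

        def hb_action_selection(q_vectors, archive, ref):
            # Pick the action whose Q-vector adds the most hypervolume to the
            # current archive of non-dominated value vectors.
            scores = [hypervolume_2d(archive + [q], ref) for q in q_vectors]
            return int(np.argmax(scores))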

    Scalarized multi-objective reinforcement learning : novel design techniques

    In multi-objective problems, it is key to find compromise solutions that balance different objectives. The linear scalarization function is often utilized to translate the multi-objective nature of a problem into a standard, single-objective problem. Generally, it is noted that such a linear combination can only find solutions in convex areas of the Pareto front, making the method inapplicable in situations where the shape of the front is not known beforehand, as is often the case. We propose a non-linear scalarization function, called the Chebyshev scalarization function, as a basis for action selection strategies in multi-objective reinforcement learning. The Chebyshev scalarization method overcomes the flaws of the linear scalarization function as it (i) can discover Pareto optimal solutions regardless of the shape of the front, i.e. convex as well as non-convex, (ii) obtains a better spread amongst the set of Pareto optimal solutions and (iii) is not particularly dependent on the actual weights used.
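
    The weighted Chebyshev scalarization is commonly formulated as SQ(s, a) = max_o w_o * |Q_o(s, a) - z*_o|, where z* is a utopian reference point that is typically updated online, and the action with the smallest scalarized value is preferred. A minimal sketch under that assumption, with illustrative parameter names:

        import numpy as np

        def chebyshev_value(q_vec, weights, utopia):
            # Weighted Chebyshev distance of a multi-objective Q-vector to the
            # utopian reference point (smaller is better).
            diffs = np.abs(np.asarray(q_vec) - np.asarray(utopia))
            return float(np.max(np.asarray(weights) * diffs))

        def chebyshev_greedy_action(q_vectors, weights, utopia):
            # Greedy selection: choose the action closest to the utopian point.
            scores = [chebyshev_value(q, weights, utopia) for q in q_vectors]
            return int(np.argmin(scores))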

    Scalarized multi-objective reinforcement learning: Novel design techniques (abstract)

    In multi-objective problems, it is key to find compromise solutions that balance different objectives. The linear scalarization function is often utilized to translate the multi-objective nature of a problem into a standard, single-objective problem. Generally, it is noted that such a linear combination can only find solutions in convex areas of the Pareto front, making the method inapplicable in situations where the shape of the front is not known beforehand. We propose a non-linear scalarization function, called the Chebyshev scalarization function, for multi-objective reinforcement learning. We show that the Chebyshev scalarization method overcomes the flaws of the linear scalarization function and is able to discover all Pareto optimal solutions in non-convex environments.

    Almost Gibbsian versus weakly Gibbsian measures

    We consider two possible extensions of the standard definition of Gibbs measures for lattice spin systems. When a random field has conditional distributions which are almost surely continuous (almost Gibbsian field), then there is a potential for that field which is almost surely summable (weakly Gibbsian field). This generalizes the standard Kozlov theorems. The converse is not true in general, as is illustrated by counterexamples. Keywords: Gibbs formalism, non-Gibbsian states.

    Reinforcement learning of pareto-optimal multiobjective policies using steering

    There has been little research into multiobjective reinforcement learning (MORL) algorithms using stochastic or non-stationary policies, even though such policies may Pareto-dominate deterministic stationary policies. One approach is steering, which forms a non-stationary combination of deterministic stationary base policies. This paper presents two new steering algorithms designed for the task of learning Pareto-optimal policies. The first algorithm (w-steering) is a direct adaptation of previous approaches to steering, and therefore requires prior knowledge of recurrent states which are guaranteed to be revisited. The second algorithm (Q-steering) eliminates this requirement. Empirical results show that both algorithms perform well when given knowledge of recurrent states, but that Q-steering provides substantial performance improvements over w-steering when this knowledge is not available.
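
    The abstract does not give the update rules of w-steering or Q-steering, so the following is only a generic sketch, under stated assumptions, of the underlying steering idea: non-stationarily switching between deterministic base policies so that the running average reward vector moves toward a chosen target point in objective space. All names (next_base_policy, base_returns, target) are illustrative.

        import numpy as np

        def next_base_policy(avg_reward, steps, base_returns, target):
            # Choose the base policy whose estimated average reward vector pulls
            # the running average closest to the target for the next segment.
            best_i, best_dist = 0, float("inf")
            for i, r in enumerate(base_returns):
                projected = (np.asarray(avg_reward) * steps + np.asarray(r)) / (steps + 1)
                dist = np.linalg.norm(projected - np.asarray(target))
                if dist < best_dist:
                    best_i, best_dist = i, dist
            return best_i

    A target chosen between the average return vectors of two base policies corresponds to a trade-off that no single deterministic stationary policy may attain, which is why such non-stationary mixtures can Pareto-dominate them.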

    Skin and psyche--from the surface to the depth of the inner world

    About 30% of dermatology patients have signs or symptoms of psychological problems. Dermatologists should be familiar with the basics needed to identify, advise and treat these patients. Because of the complex interaction between skin and psyche, it is difficult to distinguish whether the primary problem lies with the skin or the psyche; sometimes the clinical picture is a consequence of interactions between them and other factors. The interactions between skin and psyche are well known from history, art and literature, and are perhaps better known today because of the marked emphasis on such images in our modern multimedia society. Aging is increasingly perceived as an illness rather than as a physiological process. Through globalization, many different cultural approaches to the skin have entered our daily lives and influence our communication. This article considers the most important dermatoses that often show a primary or secondary interaction with the psyche.