
    Reinforcement learning based local search for grouping problems: A case study on graph coloring

    Grouping problems aim to partition a set of items into multiple mutually disjoint subsets according to some specific criterion and constraints. Grouping problems cover a large class of important combinatorial optimization problems that are generally computationally difficult. In this paper, we propose a general solution approach for grouping problems, i.e., reinforcement learning based local search (RLS), which combines reinforcement learning techniques with descent-based local search. The viability of the proposed approach is verified on a well-known representative grouping problem (graph coloring) where a very simple descent-based coloring algorithm is applied. Experimental studies on popular DIMACS and COLOR02 benchmark graphs indicate that RLS achieves competitive performance compared with a number of well-known coloring algorithms.
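As a rough illustration of the idea in the abstract above, the following minimal sketch pairs a descent-based coloring step with a learned vertex-to-group probability matrix. It is not the authors' RLS algorithm: the function names, the learning rate `alpha`, and the simple "move toward the last local optimum" update are all assumptions made for illustration, and the sketch ignores the color-label symmetry that a full method would have to handle.

```python
import random

def count_conflicts(edges, coloring):
    """Number of edges whose two endpoints share a color."""
    return sum(1 for u, v in edges if coloring[u] == coloring[v])

def descend(coloring, adj, k):
    """Descent: move a vertex to a less conflicting color until no move helps."""
    improved = True
    while improved:
        improved = False
        for v in range(len(coloring)):
            counts = [0] * k
            for w in adj[v]:
                counts[coloring[w]] += 1
            best_c = min(range(k), key=counts.__getitem__)
            if counts[best_c] < counts[coloring[v]]:
                coloring[v] = best_c  # strictly reduces total conflicts
                improved = True

def rl_local_search(n, edges, k, rounds=200, alpha=0.1, seed=0):
    """Hypothetical sketch: sample a start from learned probabilities,
    run descent, then reinforce the groups chosen by the local optimum."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # prob[v][c]: current preference for putting vertex v into group c
    prob = [[1.0 / k] * k for _ in range(n)]
    best, best_conf = None, None
    for _ in range(rounds):
        coloring = [rng.choices(range(k), weights=prob[v])[0] for v in range(n)]
        descend(coloring, adj, k)
        conf = count_conflicts(edges, coloring)
        if best is None or conf < best_conf:
            best, best_conf = list(coloring), conf
        if best_conf == 0:
            break
        # Assumed update rule: shift each row toward the group just used.
        for v in range(n):
            for c in range(k):
                target = 1.0 if c == coloring[v] else 0.0
                prob[v][c] += alpha * (target - prob[v][c])
    return best, best_conf

# Usage: 3-color a 5-cycle.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
coloring, n_conf = rl_local_search(5, cycle_edges, 3)
```

The update keeps every row of `prob` summing to one, so the matrix can be sampled from directly in the next round.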

    Constrained Model-Free Reinforcement Learning for Process Optimization

    Reinforcement learning (RL) is a control approach that can handle nonlinear stochastic optimal control problems. However, despite the promise exhibited, RL has yet to see marked translation to industrial practice, primarily due to its inability to satisfy state constraints. In this work we aim to address this challenge. We propose an 'oracle'-assisted constrained Q-learning algorithm that guarantees the satisfaction of joint chance constraints with high probability, which is crucial for safety-critical tasks. To achieve this, constraint tightenings (backoffs) are introduced and adjusted using Broyden's method, hence making them self-tuned. This results in a general methodology that can be imbued into approximate dynamic programming-based algorithms to ensure constraint satisfaction with high probability. Finally, we present case studies that analyze the performance of the proposed approach and compare this algorithm with model predictive control (MPC). The favorable performance of this algorithm signifies a step toward the incorporation of RL into real-world optimization and control of engineering systems, where constraints are essential in ensuring safety.

    Learning and Designing Stochastic Processes from Logical Constraints

    Stochastic processes offer a flexible mathematical formalism to model and reason about systems. Most analysis tools, however, start from the premise that models are fully specified, so that any parameters controlling the system's dynamics must be known exactly. As this is seldom the case, many methods have been devised over the last decade to infer (learn) such parameters from observations of the state of the system. In this paper, we depart from this approach by assuming that our observations are qualitative properties encoded as satisfaction of linear temporal logic formulae, as opposed to quantitative observations of the state of the system. An important feature of this approach is that it naturally unifies the system identification and the system design problems, where the properties, instead of observations, represent requirements to be satisfied. We develop a principled statistical estimation procedure based on maximising the likelihood of the system's parameters, using recent ideas from statistical machine learning. We demonstrate the efficacy and broad applicability of our method on a range of simple but non-trivial examples, including rumour spreading in social networks and hybrid models of gene regulation.

    Predictive safety filter using system level synthesis

    Safety filters provide modular techniques to augment potentially unsafe control inputs (e.g. from learning-based controllers or humans) with safety guarantees in the form of constraint satisfaction. In this paper, we present an improved model predictive safety filter (MPSF) formulation, which incorporates system level synthesis techniques in the design. The resulting SL-MPSF scheme ensures safety for linear systems subject to bounded disturbances in an enlarged safe set. To certify safety, it requires less severe and less frequent modifications of potentially unsafe control inputs than existing MPSF formulations. In addition, we propose an explicit variant of the SL-MPSF formulation, which maintains scalability and reduces the required online computational effort, the main drawback of the MPSF. The benefits of the proposed system level safety filter formulations compared to state-of-the-art MPSF formulations are demonstrated using a numerical example. Comment: https://gitlab.ethz.ch/ics/SLS_safety_filter

    Optimisation and Decision Support during the Conceptual Stage of Building Design

    Merged with duplicate record 10026.1/726 on 28.02.2017 by CS (TIS). Modern building design is complex and involves many different disciplines operating in a fragmented manner. Appropriate computer-based decision support (DS) tools are sought that can raise the level of integration of different activities at the conceptual stage, in order to help create better design solutions. This project investigates opportunities that exist for using techniques based upon the Genetic Algorithm (GA) to support critical activities of conceptual building design (CBD). Collective independent studies have shown that the GA is a powerful optimisation and exploratory search technique with widespread application. The GA is essentially very simple yet it offers robustness and domain independence. The GA efficiently searches a domain to exploit highly suitable information. It maintains multiple solutions to problems simultaneously and is well suited to non-linear problems and those of a discontinuous nature found in engineering design. The literature search first examines traditional approaches to supporting conceptual design. Existing GA techniques and applications are discussed, which include pioneering studies in the field of detailed structural design. Broader GA studies are also reported which have demonstrated possibilities for investigating geometrical, topological and member size variation. The tasks and goals of conceptual design are studied. A rationale is introduced, aimed at enabling the GA to be applied in a manner that provides the most effective support to the designer. Numerical experiments with floor planning are presented. These studies provide a basic foundation for a subsequent design support system (DSS) capable of generating structural design concepts. A hierarchical Structured GA (SGA) created by Dasgupta et al. [1] is investigated to support the generation of diverse structural design concepts. 
The SGA supports variation in the size, shape and structural configuration of a building and in the choice of structural frame type and floor system. The benefits and limitations of the SGA approach are discussed. The creation of a prototype DSS, arbitrarily called Designer-Pro (DPRO), is described. A detailed building design model is introduced which is required for design development and appraisal. Simplifications, design rationale and generic component modelling are mentioned. A cost-based single-criterion optimisation problem (SCOP) is created in which other constraints are represented as design parameters. The thesis describes the importance of the object-oriented programming (OOP) paradigm for creating a versatile design model and the need for complementary graphical user interface (GUI) tools to provide human-computer interaction (HCI) capabilities for control and intelligent design manipulation. Techniques that increase flexibility in the generation and appraisal of concepts are presented. Tools presented include a convergence plot of design solutions that supports cursor-interrogation to reveal the details of individual concepts. The graph permits study of design progression, or evolution of optimum design solutions. A visualisation tool is also presented. The DPRO system supports multiple operating modes, including single-design appraisal and enumerative search (ES). Case study examples are provided which demonstrate the applicability of the DPRO system to a range of different design scenarios. The DPRO system performs well in all tests. A parametric study demonstrates the potential of the system for DS. Limitations of the current approach and opportunities to broaden the study form part of the scope for further work. Some suggestions for further study are made, based upon newly-emerging techniques.

    Stochastic arrays and learning networks

    This thesis presents a study of stochastic arrays and learning networks. These arrays will be shown to consist of simple elements utilising probabilistic coding techniques which may interact with a random and noisy environment to produce useful results. Such networks have generated considerable interest since it is possible to design large parallel self-organising arrays of these elements which are trained by example rather than explicit instruction. Once the learning process has been completed, they then have the potential ability to form generalisations, perform global optimisation of traditionally difficult problems such as routing and incorporate an associative memory capability which can enable such tasks as image recognition and reconstruction to be performed, even when given a partial or noisy view of the target. Since the method of operation of such elements is thought to emulate the basic properties of the neurons of the brain, these arrays have been termed neural networks. The research demonstrates the use of stochastic elements for digital signal processing by presenting a novel systolic array, utilising a simple, replicated cell structure, which is shown to perform the operations of Cyclic Correlation and the Discrete Fourier Transform on inherently random and noisy probabilistic single bit inputs. This work is then extended into the field of stochastic learning automata and to neural networks by examining the Associative Reward-Punish (A_R-P) pattern recognising learning automaton. The thesis concludes that all the networks described may potentially be generalised to simple variations of one standard probabilistic element utilising stochastic coding, whose properties resemble those of biological neurons. A novel study is presented which describes how a powerful deterministic algorithm, previously considered to be biologically unviable due to its nature, may be represented in this way. 
It is expected that combinations of these methods may lead to a series of useful hybrid techniques for training networks. The nature of the element generalisation is particularly important as it reveals the potential for encoding successful algorithms in cheap, simple hardware with single bit interconnections. No claim is made that the particular algorithms described are those actually utilised by the brain; the aim is only to demonstrate that the properties observed in biological neurons are capable of endowing collective computational ability, and that actual biological algorithms may perhaps then become apparent when viewed in this light.
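The "probabilistic single bit" coding mentioned in the abstract above can be illustrated with a standard textbook example from stochastic computing. The sketch below is not the thesis's systolic array: it only assumes that a value in [0, 1] is carried as a random bitstream whose fraction of ones equals that value, in which case a bitwise AND of two independent streams approximates their product. All function names are hypothetical.

```python
import random

def encode(p, n_bits, rng):
    """Encode a probability p as a random bitstream with mean close to p."""
    return [1 if rng.random() < p else 0 for _ in range(n_bits)]

def decode(bits):
    """Recover the encoded value as the fraction of ones."""
    return sum(bits) / len(bits)

def stochastic_multiply(a_bits, b_bits):
    """For independent streams, P(a AND b) = P(a) * P(b)."""
    return [a & b for a, b in zip(a_bits, b_bits)]

rng = random.Random(0)
n = 100_000
a = encode(0.5, n, rng)
b = encode(0.4, n, rng)
estimate = decode(stochastic_multiply(a, b))  # close to 0.5 * 0.4 = 0.2
```

The result is only approximate, with an error that shrinks as the streams get longer, which is the usual trade of precision for very simple single-wire hardware.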