
    Pulsar Algorithms: A Class of Coarse-Grain Parallel Nonlinear Optimization Algorithms

    Parallel architectures of modern computers, formed of processors with high computing power, motivate the search for new approaches to basic computational algorithms. Another motivating force for the parallelization of algorithms has been the need to solve very large-scale or complex problems. However, the complexity of a mathematical programming problem is not necessarily due to its scale or dimension; thus, we should also search for new parallel computation approaches to problems that might have a moderate size but are difficult for other reasons. One such approach might be coarse-grained parallelization based on a parametric embedding of an algorithm and on the allocation of the resulting algorithmic phases and variants to many processors, with suitable coordination of the data obtained that way. Each processor then performs a phase of the algorithm -- a substantial computational task, which mitigates the problems related to data transmission and coordination. The paper presents a class of such coarse-grained parallel algorithms for unconstrained nonlinear optimization, called pulsar algorithms since the approximations of an optimal solution alternately increase and reduce their spread in subsequent iterations. The main algorithmic phase of an algorithm of this class might be either a directional search or a restricted-step determination in a trust-region method. This class is exemplified by a modified, parallel Newton-type algorithm and a parallel rank-one variable-metric algorithm. In the latter case, a consistent approximation of the inverse of the Hessian matrix, based on data produced in parallel, is available at each iteration, while the known deficiencies of the rank-one variable-metric update are suppressed by the parallel implementation. Additionally, pulsar algorithms might use a parametric embedding into a family of regularized problems in order to counteract possible effects of ill-conditioning.
Such parallel algorithms result not only in an increased speed of solving a problem but also in increased robustness with respect to various sources of complexity of the problem. The necessary theoretical foundations, outlines of various variants of the parallel algorithms, and the results of preliminary tests are presented.
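The combination of a rank-one (SR1) variable-metric update with a coarse-grained parallel step determination can be sketched as follows. The SR1 update formula is standard; the function names, the fixed step-length grid, the thread-pool parallelism, and the quadratic test problem are illustrative assumptions, not the paper's actual algorithm.

```python
# A minimal sketch: several trial step lengths along the quasi-Newton
# direction are evaluated independently (coarse-grained parallelism) and
# the best one is kept; the inverse-Hessian approximation is refined with
# the standard rank-one (SR1) update.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def sr1_update(H, s, y, tol=1e-8):
    """Rank-one (SR1) update of the inverse-Hessian approximation H."""
    v = s - H @ y
    denom = v @ y
    # Skip the update when the denominator is negligibly small (standard SR1 safeguard).
    if abs(denom) <= tol * max(1.0, np.linalg.norm(v) * np.linalg.norm(y)):
        return H
    return H + np.outer(v, v) / denom

def pulsar_step(f, grad, x, H, alphas=(0.25, 0.5, 1.0, 2.0)):
    """Evaluate several step lengths in parallel; keep the best trial point."""
    d = -H @ grad(x)
    with ThreadPoolExecutor() as pool:
        values = list(pool.map(lambda a: f(x + a * d), alphas))
    best = alphas[int(np.argmin(values))]
    return x + best * d

# Illustrative strictly convex quadratic: f(x) = 0.5 x^T A x
A = np.array([[3.0, 1.0], [1.0, 2.0]])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x, H = np.array([4.0, -3.0]), np.eye(2)
for _ in range(20):
    g_old, x_old = grad(x), x.copy()
    x = pulsar_step(f, grad, x, H)
    H = sr1_update(H, x - x_old, grad(x) - g_old)
```

On a quadratic, two independent SR1 updates already reproduce the exact inverse Hessian, after which the unit step lands on the minimizer; in general the step-length grid would be chosen adaptively.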

    Structure and Problem Hardness: Goal Asymmetry and DPLL Proofs in SAT-Based Planning

    In Verification and in (optimal) AI Planning, a successful method is to formulate the application as Boolean satisfiability (SAT) and solve it with state-of-the-art DPLL-based procedures. There is a lack of understanding of why this works so well. Focusing on the Planning context, we identify a form of problem structure concerned with the symmetrical or asymmetrical nature of the cost of achieving the individual planning goals. We quantify this sort of structure with a simple numeric parameter called AsymRatio, ranging between 0 and 1. We run experiments in 10 benchmark domains from the International Planning Competitions since 2000; we show that AsymRatio is a good indicator of SAT solver performance in 8 of these domains. We then examine carefully crafted synthetic planning domains that allow control of the amount of structure and that are clean enough for a rigorous analysis of the combinatorial search space. The domains are parameterized by size and by the amount of structure. The CNFs we examine are unsatisfiable, encoding one planning step fewer than the length of the optimal plan. We prove upper and lower bounds on the size of the best possible DPLL refutations, under different settings of the amount of structure, as a function of size. We also identify the best possible sets of branching variables (backdoors). With minimum AsymRatio, we prove exponential lower bounds and identify minimal backdoors of size linear in the number of variables. With maximum AsymRatio, we identify logarithmic DPLL refutations (and backdoors), showing a doubly exponential gap between the two structural extremes. The reasons for this behavior -- the proof arguments -- illuminate the prototypical patterns of structure causing the empirical behavior observed in the competition benchmarks.
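The abstract does not spell out how AsymRatio is computed. A plausible formalization, consistent with a parameter between 0 and 1 that is small for symmetric goal costs and large when one goal dominates, is the largest single goal cost divided by the summed goal costs; this is an assumption for illustration, not the paper's definition.

```python
# Hypothetical formalization of a goal-cost asymmetry ratio: equal goal
# costs give the minimum value 1/n, while one dominant goal pushes the
# ratio toward 1.
def asym_ratio(goal_costs):
    total = sum(goal_costs)
    return max(goal_costs) / total if total else 0.0

print(asym_ratio([1, 1, 1, 1]))   # four symmetric goals -> 0.25
print(asym_ratio([97, 1, 1, 1]))  # one dominant goal   -> 0.97
```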

    Aspiration Based Decision Analysis and Support Part I: Theoretical and Methodological Backgrounds

    In the interdisciplinary and intercultural systems analysis that constitutes the main theme of research at IIASA, a basic question is how to analyze and support decisions with the help of mathematical models and logical procedures. This question -- particularly in its multi-criteria and multi-cultural dimensions -- has been investigated in the System and Decision Sciences Program (SDS) since the beginning of IIASA. Researchers working both at IIASA and in a large international network of cooperating institutions have contributed to a deeper understanding of this question. Around 1980, the concept of reference point multiobjective optimization was developed in SDS. This concept set an international trend of research pursued in many countries cooperating with IIASA as well as in many research programs at IIASA -- such as energy, agricultural, and environmental research. Since that time, SDS has organized numerous international workshops, summer schools, seminar days, and cooperative research agreements in the field of decision analysis and support. Through this international and interdisciplinary cooperation, the concept of reference point multiobjective optimization has matured and been generalized into a framework of aspiration-based decision analysis and support that can be understood as a synthesis of several known, antithetical approaches to this subject -- such as the utility maximization approach, the satisficing approach, and goal- and program-oriented planning. Jointly, they can also be called the quasisatisficing approach, since the concept of aspirations comes from the satisficing approach.
Both authors of the Working Paper contributed actively to this research: Andrzej Wierzbicki originated the concept of reference point multiobjective optimization and the quasisatisficing approach, while Andrzej Lewandowski, who has worked from the beginning on the numerous applications and extensions of this concept, made the main contribution to its generalization into the framework of aspiration-based decision analysis and support systems. This paper constitutes a draft of the first part of a book being prepared by these two authors. Part I, devoted to theoretical foundations and methodological background, written mostly by Andrzej Wierzbicki, will be followed by Part II, devoted to computer implementations and applications of decision support systems based on mathematical programming models, written mostly by Andrzej Lewandowski. Part III, devoted to decision support systems for the case of subjective evaluations of discrete decision alternatives, will be written by both authors.
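The core of the reference point approach can be sketched with an achievement scalarizing function: for objectives to be maximized and an aspiration (reference) point, one maximizes the worst weighted over-achievement plus a small regularizing sum, which steers the search toward efficient outcomes closest to the aspirations. The weights, the regularization parameter, and the candidate outcomes below are illustrative assumptions.

```python
# Minimal sketch of an achievement scalarizing function over a finite set
# of candidate outcomes (in practice it is maximized over a feasible set
# defined by a mathematical programming model).
def achievement(f, r, weights, rho=1e-4):
    """Order-consistent achievement of outcome f relative to aspiration r."""
    terms = [w * (fi - ri) for fi, ri, w in zip(f, r, weights)]
    # Worst-case weighted over-achievement, plus a small sum term that
    # regularizes ties and enforces proper efficiency.
    return min(terms) + rho * sum(terms)

# Pick, among candidate outcomes, the one best matching the aspiration levels.
candidates = [(4.0, 9.0), (6.0, 6.0), (9.0, 3.0)]
aspiration = (7.0, 5.0)
best = max(candidates, key=lambda f: achievement(f, aspiration, (1.0, 1.0)))
print(best)  # -> (6.0, 6.0), the outcome closest to the aspirations
```

Moving the aspiration point and re-maximizing is exactly the interactive loop a decision support system built on this framework would offer the user.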

    Assembly and Disassembly Planning by using Fuzzy Logic & Genetic Algorithms

    The authors propose the implementation of a hybrid Fuzzy Logic-Genetic Algorithm (FL-GA) methodology to plan the automatic assembly and disassembly sequence of products. The GA-Fuzzy Logic approach is implemented on two levels. The first level of hybridization consists of the development of a Fuzzy controller for the parameters of an assembly or disassembly planner based on GAs. This controller acts on the mutation probability and crossover rate in order to adapt their values dynamically while the algorithm runs. The second level consists of the identification of the optimal assembly or disassembly sequence by a Fuzzy function, in order to obtain closer control of the technological knowledge of the assembly/disassembly process. Two case studies were analyzed in order to test the efficiency of the Fuzzy-GA methodologies.
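The first hybridization level can be sketched as a small fuzzy controller that maps a population statistic to the GA parameters while the algorithm runs. The choice of population diversity as the input, the membership breakpoints, and the output values are illustrative assumptions, not the authors' tuning.

```python
# Sketch of a fuzzy-style controller for GA parameters: low diversity
# (a stagnating population) raises the mutation probability, high
# diversity favors recombination instead.
def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_ga_params(diversity):
    """diversity in [0, 1] -> (mutation probability, crossover rate)."""
    low  = tri(diversity, -0.5, 0.0, 0.5)
    med  = tri(diversity,  0.0, 0.5, 1.0)
    high = tri(diversity,  0.5, 1.0, 1.5)
    total = low + med + high
    # Rule base: LOW diversity -> mutate more; HIGH diversity -> recombine more.
    p_mut   = (low * 0.20 + med * 0.05 + high * 0.01) / total
    p_cross = (low * 0.60 + med * 0.80 + high * 0.95) / total
    return p_mut, p_cross

print(fuzzy_ga_params(0.1))  # stagnating population: mutation pushed up
```

In the planner itself, `fuzzy_ga_params` would be called once per generation, feeding the adapted rates back into the GA's variation operators.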

    Adapt-to-learn policy transfer in reinforcement learning and deep model reference adaptive control

    Adaptation and learning from exploration are key to biological learning: humans and animals do not learn every task in isolation; rather, they are able to quickly adapt learned behaviors between similar tasks and to learn new skills when presented with new situations. Inspired by this, adaptation has been an important direction of research in control, in the form of adaptive controllers. However, adaptive controllers such as the Model Reference Adaptive Controller are mainly model-based: they do not rely on exploration but instead make informed decisions by exploiting the model's structure. Such controllers are therefore characterized by high sample efficiency and stability guarantees, making them suitable for safety-critical systems. On the other hand, we have learning-based optimal control algorithms such as Reinforcement Learning. Reinforcement learning is a trial-and-error method in which an agent explores the environment by taking random actions and maximizing the likelihood of those particular actions that result in a higher return. However, such exploration is expected to fail many times before an optimal policy is found. These methods are therefore highly sample-expensive, lack stability guarantees, and hence are not suitable for safety-critical systems. This thesis presents control algorithms for robotics in which the best of both worlds -- ``Adaptation'' and ``Learning from exploration'' -- are brought together to propose new algorithms that can perform better than their conventional counterparts. In this effort, we first present an Adapt-to-Learn policy transfer algorithm, in which we use control-theoretic ideas of adaptation to transfer a policy between two related but different tasks using the policy gradient method of reinforcement learning. Efficient and robust policy transfer remains a key challenge in reinforcement learning.
Policy transfer through warm initialization, imitation, or interaction over a large set of agents with randomized instances has commonly been applied to solve a variety of Reinforcement Learning (RL) tasks. However, this is far from how behavior transfer happens in the biological world. Here, we seek to answer the question: will learning to combine an adaptation reward with the environmental reward lead to a more efficient transfer of policies between domains? We introduce a principled mechanism that can ``Adapt-to-Learn'', that is, adapt the source policy to learn to solve a target task with significant transition differences and uncertainties. Through theory and experiments, we show that our method leads to a significantly reduced sample complexity of transferring policies between tasks. In the second part of this thesis, information-enabled, learning-based adaptive controllers such as the ``Gaussian Process adaptive controller using Model Reference Generative Network'' (GP-MRGeN) and the ``Deep Model Reference Adaptive Controller'' (DMRAC) are presented. Model reference adaptive control (MRAC) is a widely studied adaptive control methodology that aims to ensure that a nonlinear plant with significant model uncertainty behaves like a chosen reference model. MRAC methods adapt the system to changes by representing the system uncertainties as weighted combinations of known nonlinear functions and using a weight-update law that moves the network weights in the direction that minimizes the instantaneous tracking error. However, most MRAC adaptive controllers use a shallow network and only instantaneous data for adaptation, restricting their representation capability and limiting their performance under fast-changing uncertainties and faults in the system. In this thesis, we propose a Gaussian-process-based adaptive controller called GP-MRGeN.
We present a new approach to the online supervised training of GP models using a new architecture termed the Model Reference Generative Network (MRGeN). Our architecture is loosely inspired by the recent success of generative neural network models; nevertheless, our contributions ensure that the inclusion of such a model in closed-loop control does not affect the stability properties. By using a generative network, the GP-MRGeN controller is capable of achieving higher adaptation rates without losing the robustness properties of the controller, and is hence suitable for mitigating faults in fast-evolving systems. Further, in this thesis, we present a new neuroadaptive architecture: Deep Neural Network-based Model Reference Adaptive Control. This architecture utilizes deep neural network representations for modeling significant nonlinearities while marrying them with the boundedness guarantees that characterize MRAC-based controllers. We demonstrate through simulations and analysis that DMRAC subsumes previously studied learning-based MRAC methods, such as concurrent learning and GP-MRAC. This makes DMRAC a powerful architecture for high-performance control of nonlinear systems with long-term learning properties. Theoretical proofs of the controller's generalization capability over unseen data points and of the boundedness of the tracking error are also presented. Experiments with a quadrotor vehicle demonstrate the controller's performance in achieving reference-model tracking in the presence of significant matched uncertainties. To achieve these results, a software and communication architecture is designed to ensure online, real-time inference of the deep network on a high-bandwidth, computation-limited platform. These results demonstrate the efficacy of deep networks for high-bandwidth closed-loop attitude control of unstable and nonlinear robots operating in adverse situations.
We expect that this work will benefit other closed-loop deep-learning control architectures for robotics.
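The MRAC ingredients described above -- a reference model, uncertainty represented over a known basis, and a weight-update law driven by the instantaneous tracking error -- can be sketched for a scalar plant x' = a_m*x + u + w*phi(x) with an unknown weight w. The gains, the basis function, the reference command, and the simulation horizon are illustrative assumptions, not the thesis's controllers.

```python
# Minimal scalar MRAC sketch: the controller cancels its current estimate
# of the uncertainty, and a gradient-style law moves the estimate in the
# direction that shrinks the tracking error e = x - x_m (the standard
# Lyapunov argument gives dV/dt = a_m * e^2 <= 0 for this update).
import math

def simulate(T=20.0, dt=0.001, gamma=10.0):
    a_m, w_true = -2.0, 3.0           # stable reference dynamics, true weight
    phi = lambda x: math.tanh(x)      # known basis, unknown coefficient
    x = x_m = 0.0
    w_hat = 0.0
    for k in range(int(T / dt)):
        r = math.sin(0.5 * k * dt)    # reference command
        e = x - x_m
        phi_x = phi(x)
        u = r - w_hat * phi_x         # cancel the estimated uncertainty
        w_hat += dt * gamma * e * phi_x                  # adaptive law
        x   += dt * (a_m * x + u + w_true * phi_x)       # plant (Euler step)
        x_m += dt * (a_m * x_m + r)                      # reference model
    return abs(e), abs(w_true - w_hat)

final_error, weight_error = simulate()
```

A deep-network version (as in DMRAC) replaces the fixed basis `phi` with learned features, which is where the boundedness guarantees discussed above become the delicate part.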