
    Controlled Sequential Monte Carlo

    Sequential Monte Carlo methods, also known as particle methods, are a popular set of techniques for approximating high-dimensional probability distributions and their normalizing constants. These methods have found numerous applications in statistics and related fields, e.g. for inference in non-linear, non-Gaussian state space models and in complex static models. Like many Monte Carlo sampling schemes, they rely on proposal distributions, which crucially impact their performance. We introduce here a class of controlled sequential Monte Carlo algorithms, in which the proposal distributions are determined by approximating the solution to an associated optimal control problem using an iterative scheme. This method builds upon a number of existing algorithms in econometrics, physics, and statistics for inference in state space models, and generalizes these methods so as to accommodate complex static models. We provide a theoretical analysis concerning the fluctuation and stability of this methodology that also provides insight into the properties of related algorithms. We demonstrate significant gains over state-of-the-art methods at a fixed computational complexity on a variety of applications.
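
    As background for the method above, here is a minimal sketch of the baseline bootstrap particle filter that controlled SMC builds on by learning better ("twisted") proposals; the model interface and the linear-Gaussian example are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def bootstrap_particle_filter(y, n_particles, init_fn, transition_fn, log_likelihood_fn, rng=None):
    """Plain bootstrap SMC: propagate particles with the transition prior, weight them
    by the observation likelihood, resample, and accumulate the normalizing constant.
    Controlled SMC replaces the prior proposal with an iteratively refined one."""
    rng = np.random.default_rng() if rng is None else rng
    x = init_fn(n_particles, rng)                # initial particles, shape (N,) or (N, d)
    log_Z = 0.0                                  # running log normalizing-constant estimate
    for t, y_t in enumerate(y):
        if t > 0:
            x = transition_fn(x, rng)            # propose from the model dynamics
        log_w = log_likelihood_fn(y_t, x)        # unnormalized log weights
        m = log_w.max()
        w = np.exp(log_w - m)
        log_Z += m + np.log(w.mean())            # incremental evidence contribution
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x = x[idx]                               # multinomial resampling
    return log_Z

# Illustrative linear-Gaussian random-walk example (assumed for demonstration only).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, sigma_x, sigma_y = 50, 1.0, 0.5
    x_true = np.cumsum(rng.normal(0.0, sigma_x, T))
    y = x_true + rng.normal(0.0, sigma_y, T)
    log_Z = bootstrap_particle_filter(
        y, 500,
        init_fn=lambda n, r: r.normal(0.0, sigma_x, n),
        transition_fn=lambda x, r: x + r.normal(0.0, sigma_x, x.shape),
        log_likelihood_fn=lambda y_t, x: -0.5 * ((y_t - x) / sigma_y) ** 2
                                         - 0.5 * np.log(2 * np.pi * sigma_y ** 2),
    )
    print("log-evidence estimate:", log_Z)
```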

    Control of a Buck DC/DC Converter Using Approximate Dynamic Programming and Artificial Neural Networks

    This paper proposes a novel artificial neural network (ANN) based control method for a dc/dc buck converter. The ANN is trained to implement optimal control based on approximate dynamic programming (ADP). Special characteristics of the proposed ANN control include: 1) the inputs to the ANN contain error signals and integrals of the error signals, giving the ANN PI control ability; 2) the ANN receives voltage feedback signals from the dc/dc converter, making the combined system equivalent to a recurrent neural network; 3) the ANN is trained to minimize a cost function over a long time horizon, giving it stronger predictive control ability than a conventional predictive controller; 4) the ANN is trained offline, avoiding the instability that the weight adjustments of an online training algorithm can cause. The ANN performance is evaluated through simulation and hardware experiments and compared with conventional control methods; the results show that the ANN controller has a strong ability to track rapidly changing reference commands, maintain a stable output voltage under variable load, and properly handle maximum duty-ratio and current constraints.
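
    To make the controller structure described above concrete, here is a minimal sketch of an ANN control law driven by the output-voltage error and its integral, with the duty ratio clipped to a maximum. The single tanh hidden layer, the placeholder (untrained) weights, and the averaged converter model are illustrative assumptions, not the authors' trained design.

```python
import numpy as np

def ann_duty_ratio(v_err, v_err_int, W1, b1, W2, b2, d_max=0.9):
    """Offline-trained ANN control law (weights assumed given): PI-like inputs
    (error and its integral), one tanh hidden layer, duty ratio clipped to its limit."""
    z = np.tanh(W1 @ np.array([v_err, v_err_int]) + b1)
    d = float(W2 @ z + b2)
    return float(np.clip(d, 0.0, d_max))             # enforce the maximum duty-ratio constraint

# Toy closed-loop run on an averaged buck converter model with placeholder (untrained) weights.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W1, b1 = rng.normal(scale=0.5, size=(8, 2)), np.zeros(8)
    W2, b2 = rng.normal(scale=0.5, size=8), 0.5      # placeholders; real weights come from ADP training
    L, C, R, V_in, dt = 1e-3, 470e-6, 10.0, 24.0, 1e-5
    i_L = v_out = v_err_int = 0.0
    v_ref = 12.0
    for _ in range(20000):
        v_err = v_ref - v_out
        v_err_int += v_err * dt
        d = ann_duty_ratio(v_err, v_err_int, W1, b1, W2, b2)
        i_L += dt / L * (d * V_in - v_out)           # averaged inductor dynamics
        v_out += dt / C * (i_L - v_out / R)          # averaged capacitor dynamics
    print(f"output after 0.2 s with placeholder weights: {v_out:.2f} V")
```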

    Reinforcement learning control of a flexible two-link manipulator: an experimental investigation

    This article discusses the control design and experimental validation of a flexible two-link manipulator (FTLM) system represented by ordinary differential equations (ODEs). A reinforcement learning (RL) control strategy based on an actor-critic structure is developed to enable vibration suppression while retaining trajectory tracking. The closed-loop system under the proposed RL control algorithm is then proved to be semi-globally uniformly ultimately bounded (SGUUB) by Lyapunov's direct method. In simulations, the presented control approach is tested on the discretized ODE dynamic model, and the analytical claims are verified in the presence of uncertainty. Finally, a series of experiments on a Quanser laboratory platform demonstrates the effectiveness of the presented control, and its performance is compared with PD control.
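
    The following sketch shows one generic form of the actor-critic update such RL controllers build on: a linear critic fitted by temporal-difference learning, and an actor pulled toward explored actions that turned out cheaper than the critic predicted. The basis functions, learning rates, and toy plant are assumptions for illustration; the article's actual update laws and Lyapunov-based tuning are not reproduced here.

```python
import numpy as np

def actor_critic_update(x, u_explore, x_next, cost, Wc, Wa, phi,
                        lr_c=0.05, lr_a=0.02, gamma=0.95):
    """One generic actor-critic step with linear approximators V(x) ~ Wc.phi(x) and
    u(x) ~ Wa @ phi(x). The critic is fitted by TD(0) on the cost-to-go; the actor is
    moved toward explored actions whose outcome was cheaper than predicted."""
    td_err = cost + gamma * (Wc @ phi(x_next)) - (Wc @ phi(x))
    Wc = Wc + lr_c * td_err * phi(x)                         # critic update
    if td_err < 0:                                           # cheaper than predicted
        u_policy = Wa @ phi(x)
        Wa = Wa + lr_a * np.outer(u_explore - u_policy, phi(x))  # pull policy toward the explored action
    return Wc, Wa

# Toy usage on a scalar plant x_{k+1} = x_k + 0.1*u_k with quadratic cost (illustrative only).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi = lambda x: np.array([1.0, x, x * x])
    Wc, Wa = np.zeros(3), np.zeros((1, 3))
    x = 1.0
    for _ in range(2000):
        u = Wa @ phi(x) + rng.normal(scale=0.3, size=1)      # exploratory action
        x_next = x + 0.1 * float(u[0])
        cost = x * x + 0.01 * float(u[0]) ** 2
        Wc, Wa = actor_critic_update(x, u, x_next, cost, Wc, Wa, phi)
        x = x_next if abs(x_next) < 5.0 else 1.0             # crude reset keeps the toy bounded
    print("learned actor weights:", Wa)
```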

    Stable Adaptive Control Using New Critic Designs

    Classical adaptive control proves total-system stability for control of linear plants, but only for plants meeting very restrictive assumptions. Approximate Dynamic Programming (ADP) has the potential, in principle, to ensure stability without such tight restrictions. It also offers nonlinear and neural extensions for optimal control, with empirically supported links to what is seen in the brain. However, the relevant ADP methods in use today -- TD, HDP, DHP, GDHP -- and the Galerkin-based versions of these all have serious limitations when used here as parallel distributed real-time learning systems; either they do not possess quadratic unconditional stability (to be defined) or they lead to incorrect results in the stochastic case. (ADAC or Q-learning designs do not help.) After explaining these conclusions, this paper describes new ADP designs which overcome these limitations. It also addresses the Generalized Moving Target problem, a common family of static optimization problems, and describes a way to stabilize large-scale economic equilibrium models, such as the old long-term energy model of DOE. Comment: includes general reviews of alternative control technologies and reinforcement learning; 4 figures, >70 pp., >200 equations; implementation details and stability analysis; included in a 9/24/98 patent disclosure; PDF version uploaded in 2012, based on direct conversion of the original Word/HTML file, because of format compatibility issues.
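
    For readers unfamiliar with the acronyms, the sketch below contrasts the critic targets used by HDP (a critic approximating the cost-to-go J) and DHP (a critic approximating its gradient, the costate). These are the generic textbook forms under simplifying assumptions, not the stabilized designs proposed in the paper.

```python
def hdp_critic_target(x, x_next, utility, J_hat, gamma=0.95):
    """HDP: the critic approximates the cost-to-go J(x) and is trained toward the
    Bellman target U(x) + gamma * J_hat(x_next)."""
    return utility(x) + gamma * J_hat(x_next)

def dhp_critic_target(x, x_next, utility_grad, dxnext_dx, lam_hat, gamma=0.95):
    """DHP: the critic instead approximates the costate dJ/dx and is trained toward the
    derivative of the Bellman equation, back-propagated through the model Jacobian
    dxnext_dx. (The path through the control action/policy is omitted here for brevity.)"""
    return utility_grad(x) + gamma * dxnext_dx.T @ lam_hat(x_next)
```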

    Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems

    This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, one-step temporal-difference (TD(0)) and n-step TD (TD(λ)) learning, together with their gradients, are utilized as learning algorithms to train and online-adapt the ADP families. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced-order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several numerical examples of reduction techniques are taken from the literature for comparison with our work. In the second paper, HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot, using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. In the fourth and fifth papers, a stability analysis for a model-free action-dependent HDP(λ) is demonstrated with batch and online implementations of learning, respectively. The sixth paper combines two different gradient prediction levels of critic networks and provides convergence proofs. The seventh paper develops two hybrid recurrent fuzzy neural network structures for the critic and actor networks. They use a novel n-step gradient temporal-difference method (the gradient of TD(λ)) within an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
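
    As background for the TD(λ) machinery used throughout the dissertation, here is a minimal sketch of TD(λ) with accumulating eligibility traces on a linear critic, followed by a toy usage example. It is the generic algorithm only; the HDP(λ), DHP(λ), and VGL(λ) variants developed in the papers are not reproduced, and the chain example is an assumption for illustration.

```python
import numpy as np

def td_lambda_update(phi_x, phi_x_next, cost, w, e, alpha=0.05, gamma=0.95, lam=0.7):
    """One TD(lambda) step with accumulating eligibility traces on a linear critic
    V(x) ~ w.phi(x): decay and accumulate the trace, compute the one-step TD error,
    then credit all recently visited features through the trace."""
    e = gamma * lam * e + phi_x
    delta = cost + gamma * (w @ phi_x_next) - (w @ phi_x)
    w = w + alpha * delta * e
    return w, e

# Toy usage: evaluate a fixed policy on a 5-state chain with unit cost per step (illustrative).
if __name__ == "__main__":
    n = 5
    phi = lambda s: np.eye(n)[s]                     # one-hot features
    w, e = np.zeros(n), np.zeros(n)
    rng = np.random.default_rng(0)
    s = 0
    for _ in range(5000):
        s_next = min(s + 1, n - 1) if rng.random() < 0.8 else max(s - 1, 0)
        cost = 0.0 if s_next == n - 1 else 1.0
        w, e = td_lambda_update(phi(s), phi(s_next), cost, w, e)
        if s_next == n - 1:
            s, e = 0, np.zeros(n)                    # episode ends: reset state and trace
        else:
            s = s_next
    print("estimated cost-to-go per state:", np.round(w, 2))
```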

    Air-Fuel Ratio Control of Spark Ignition Engines With Unknown System Dynamics Estimator: Theory and Experiments

    This brief addresses the emission reduction of spark ignition engines by proposing a new control scheme to regulate the air-fuel ratio (AFR) around its ideal value. After revisiting the engine dynamics, AFR regulation is represented as tracking control of the injected fuel amount. This makes it possible to take the fuel film dynamics into consideration and simplifies the control design. The lumped unknown engine dynamics in the new formulation are estimated online by a new, effective unknown system dynamics estimator. The estimated variable can be superimposed on a commercially configured, well-calibrated, gain-scheduling-like proportional-integral-derivative (PID) control to achieve a better AFR response. The salient feature of the proposed control scheme lies in its simplicity and the small number of required measurements: only the air mass flow rate, the pressure and temperature in the intake manifold, and the measured AFR value are used. Practical experiments on a Tata Motors Limited two-cylinder gasoline engine are carried out under a realistic driving cycle. The comparative results show that the proposed control can achieve an improved AFR response and reduced emissions.
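
    A toy scalar illustration of the compensation idea described above: a first-order-filter-based estimator recovers the lumped unknown term, and the estimate is superimposed on a PI law. The plant, the specific filter-based estimator form, and all gains are assumptions for illustration and are unrelated to the engine-specific design in the brief.

```python
import numpy as np

def simulate_usde_pi(T=5.0, dt=1e-3, k=0.02, kp=2.0, ki=1.0, x_ref=1.0):
    """Toy scalar plant xdot = u + d with an unknown lumped term d. A first-order-filter
    based estimator recovers d_hat from filtered input/state signals, and the estimate is
    superimposed on a PI law. Illustrative only; not the brief's engine model or gains."""
    x = x_f = u_f = e_int = 0.0
    d = d_hat = 0.0
    for i in range(int(T / dt)):
        t = i * dt
        d = 0.5 * np.sin(2.0 * np.pi * t)            # unknown lumped dynamics (simulation only)
        d_hat = (x - x_f) / k - u_f                  # estimate from low-pass-filtered signals
        e = x_ref - x
        e_int += e * dt
        u = kp * e + ki * e_int - d_hat              # PI control plus estimated-dynamics compensation
        x += (u + d) * dt                            # plant (forward Euler)
        x_f += (x - x_f) / k * dt                    # state filter: k*x_f' + x_f = x
        u_f += (u - u_f) / k * dt                    # input filter: k*u_f' + u_f = u
    return x, d, d_hat

if __name__ == "__main__":
    x_final, d_true, d_est = simulate_usde_pi()
    print(f"final output: {x_final:.3f} (reference 1.0), estimation error: {abs(d_true - d_est):.3f}")
```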

    Application of Optimal Switching Using Adaptive Dynamic Programming in Power Electronics

    In this dissertation, optimal switching in switched systems using adaptive dynamic programming (ADP) is presented. Two applications in power electronics, namely single-phase inverter control and permanent magnet synchronous motor (PMSM) control, are studied using ADP. In both applications, the objective is to design an optimal switching controller that is also relatively robust to parameter uncertainties and disturbances in the system. An inverter converts a direct current (DC) voltage to an alternating current (AC) voltage. The control scheme of the single-phase inverter uses a single function approximator, called the critic, to evaluate the optimal cost and determine the optimal switching. After offline training of the critic, which is a function of the system states and elapsed time, the resulting optimal weights are used in online control to obtain a smooth AC output voltage in feedback form. Simulations show the desirable performance of this controller with linear and nonlinear loads and its relative robustness to parameter uncertainty and disturbances. Furthermore, the proposed controller is upgraded so that the inverter is suitable for single-phase variable frequency drives. Finally, as one of the few ADP studies to do so, this work implements the proposed controllers on a physical prototype to show their performance in practice. Torque control of PMSMs has recently attracted considerable interest. A new approach based on ADP is proposed to control the torque, and consequently the speed, of a PMSM when an unknown load torque is applied to it. The proposed controller achieves a fast transient response, low ripple, and small steady-state error. The control algorithm uses two neural networks, called the critic and the actor. The former is utilized to evaluate the cost and the latter is used to generate control signals. The training is done once offline, and the calculated optimal weights of the actor network are used in online control to achieve fast and accurate torque control of PMSMs. This algorithm is compared with field-oriented control (FOC) and direct torque control based on space vector modulation (DTC-SVM). Simulations and experimental results show that the proposed algorithm provides desirable results under both accurately modeled and uncertain dynamics.
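
    The sketch below shows the generic ADP switching rule implied by this setup: at each decision instant, evaluate the trained critic at the one-step successor state of every admissible switch position and apply the position with the smallest estimated cost. The interfaces and the commented usage are hypothetical placeholders, not the dissertation's implementation.

```python
import numpy as np

def optimal_switch(x, t, modes, running_cost, critic, dt):
    """Generic ADP switching rule: simulate each admissible switch position one step
    ahead and apply the one whose running cost plus critic-estimated cost-to-go at the
    successor state is smallest."""
    best_mode, best_val = None, np.inf
    for m, step in enumerate(modes):                 # each mode maps (x, dt) -> x_next
        x_next = step(x, dt)
        val = running_cost(x, m) * dt + critic(x_next, t + dt)
        if val < best_val:
            best_mode, best_val = m, val
    return best_mode

# Hypothetical usage for a single-phase inverter with two switch states (names assumed):
#   modes = [lambda x, dt: x + dt * f(x, +V_dc), lambda x, dt: x + dt * f(x, -V_dc)]
#   m = optimal_switch(x, t, modes,
#                      running_cost=lambda x, m: (x[0] - v_ref(t)) ** 2,
#                      critic=lambda x, t: trained_critic(x, t), dt=1e-5)
```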

    Review of advanced guidance and control algorithms for space/aerospace vehicles

    The design of advanced guidance and control (G&C) systems for space/aerospace vehicles has received a large amount of attention worldwide during the last few decades and will continue to be a main focus of the aerospace industry. Not surprisingly, due to the existence of various model uncertainties and environmental disturbances, robust and stochastic control-based methods have played a key role in G&C system design, and numerous effective algorithms have been successfully constructed to guide and steer the motion of space/aerospace vehicles. Apart from these stability theory-oriented techniques, in recent years we have witnessed a growing trend of designing optimisation theory-based and artificial intelligence (AI)-based controllers for space/aerospace vehicles to meet the growing demand for better system performance. Related studies have shown that these newly developed strategies can bring many benefits from an application point of view, and they may be considered to drive the onboard decision-making system. In this paper, we provide a systematic survey of state-of-the-art algorithms that are capable of generating reliable guidance and control commands for space/aerospace vehicles. The paper first provides a brief overview of space/aerospace vehicle guidance and control problems. Following that, a broad collection of academic works concerning stability theory-based G&C methods is discussed, and some potential issues and challenges inherent in these methods are reviewed. Then, an overview is given of various recently developed optimisation theory-based methods that can produce optimal guidance and control commands, including dynamic programming-based methods, model predictive control-based methods, and other enhanced versions. The key aspects of applying these approaches, such as their main advantages and inherent challenges, are also discussed. Subsequently, particular attention is given to recent attempts to explore the possible uses of AI techniques in connection with the optimal control of vehicle systems. The highlights of the discussion illustrate how space/aerospace vehicle control problems may benefit from these AI models. Finally, some practical implementation considerations, together with a number of future research topics, are summarised.