285 research outputs found

    Software packages for food engineering needs

    The graphical user interface (GUI) software packages “ANNEKs” and “OPT-PROx” were developed to meet food engineering needs. “OPT-PROx” (OPTimal PROfile) carries out thermal food processing optimization based on variable retort temperature processing and a global optimization technique. “ANNEKs” (Artificial Neural Network Enzyme Kinetics) determines the kinetics of enzyme hydrolysis of protein at different initial reaction parameters, based on artificial neural networks and an adaptive random search algorithm. The “OPT-PROx” software was successfully tested on real thermal food processing problems and demonstrated a significant advantage over the traditionally used constant-temperature processes in terms of total processing time and final product quality. The “ANNEKs” software can be used effectively for process design of enzyme hydrolysis of protein. The user-friendly dialogue software proposed in this research can be useful for food scientists and engineers.

    Study and development of quantum algorithms for solving and estimating parameters of partial differential equations

    Partial differential equations (PDEs) describe a wide variety of physical phenomena. In many situations, one has access to observations of a physical system and some initial idea of the qualitative aspects of its dynamics. This prior knowledge is enough to determine the overall structure of the PDE, but not its specific coefficients. The parameters of PDE models usually encode relevant scientific interpretations, so it is of great interest to determine their values. These coefficients are estimated from the available measurements of the system, which are typically noisy. This project presents a hybrid quantum-classical approach to infer the parameters of a PDE given a data set of empirical observations. Parameter estimation requires access to a PDE solver, so this thesis proposes a quantum algorithm to solve PDEs based on a parameterized quantum circuit. The circuit encodes the input variables in a Chebyshev quantum feature map, which offers a powerful basis set of fitting polynomials and possesses rich expressivity. The surrogate of the real solution is then computed by measuring expectation values. The spatial and temporal derivatives of the surrogate are computed in the differentiable quantum circuit (DQC) through automatic differentiation (via the so-called parameter shift rule) in analytical form, thus avoiding the inaccuracies of finite-difference procedures for calculating gradients. The DQC is then trained to satisfy the given PDE and the specified boundary conditions. As a case study, the algorithm is illustrated with several simulations in order to determine the DQC that solves the heat equation with the best expressivity and accuracy. The parameters of the heat equation are then estimated with this particular setting.
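
The parameter shift rule mentioned in the abstract above can be illustrated on a single simulated qubit. This is only a minimal sketch, not the thesis's circuit: the one-qubit RY rotation, the Z observable, and the function names `ry`, `expectation` and `parameter_shift_grad` are assumptions chosen for illustration. For a gate generated by a Pauli operator, the derivative of an expectation value is obtained exactly from two shifted evaluations, with no finite-difference step size.

```python
import numpy as np

def ry(theta):
    """Single-qubit rotation about the Y axis (real-valued matrix)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# Pauli-Z observable.
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def expectation(theta):
    """Expectation of Z after applying RY(theta) to |0>; equals cos(theta)."""
    state = ry(theta) @ np.array([1.0, 0.0])
    return float(state @ Z @ state)

def parameter_shift_grad(f, theta):
    """Exact derivative of f for a Pauli-generated gate, from two shifted evaluations."""
    return 0.5 * (f(theta + np.pi / 2) - f(theta - np.pi / 2))

theta = 0.7  # illustrative parameter value
grad = parameter_shift_grad(expectation, theta)
print(grad, -np.sin(theta))  # the two numbers should agree
```

On hardware the two shifted expectation values would be estimated from measurements, so the gradient would carry sampling noise but no discretization error.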

    Leveraging deep reinforcement learning in the smart grid environment

    Modern statistical learning is achieving impressive results: computers are starting to match or exceed human baselines in some applications such as computer vision, and even to beat professional human players at strategy games without any prior knowledge. Yet reliable deployed applications are still in their infancy compared with the opportunities they could offer. From this perspective, with a keen focus on sequential decision theory and recent statistical learning research, we demonstrate the efficient application of such methods to instances involving the energy grid and the optimization of its actors, from energy storage and electric cars to smart buildings and thermal controls. We conclude by introducing a new hybrid approach that combines the modern performance of deep learning and reinforcement learning with the proven application framework of classical operations research, with the aim of facilitating the integration of new statistical learning methodologies in concrete applications.

    Analysis and prediction of the energy consumption of a commercial building using a multiple linear regression model, support vector machines and neural networks

    Given the constant growth of interest in energy efficiency in the building sector, it is necessary to apply and improve existing methods, and to develop new ones, for the prediction and analysis of building energy consumption. In this paper, the cooling consumption of a model of a typical commercial building in Belgrade is analyzed. A detailed hourly energy simulation is carried out using the software HAP (Hourly Analysis Program). The influence of various building characteristics is investigated and, to create the building consumption database, the three variables that most strongly affect cooling consumption are chosen: specific lighting power, window area and window shade coefficient. These three parameters are varied, and 245 simulations in total are used for creating and testing the prediction models. A multiple linear model is created, and the resulting equation is used to evaluate cooling consumption with these three building parameters as inputs. Artificial neural network and support vector machine (SVM) models are also developed for prediction, and their results are compared with those of the linear regression model. It is shown that statistical methods such as neural networks and support vector machines can achieve much higher prediction accuracy than the linear regression model, reaching an almost perfect match with the simulated values (mean absolute percentage error of 0.26% when testing the SVM model).
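
The multiple linear model and the mean absolute percentage error (MAPE) described in the abstract above can be sketched as follows. The data here are synthetic and every numeric range and coefficient is a made-up assumption; only the sample count (245) and the three input variables come from the abstract. The sketch covers the linear baseline only, not the SVM or neural network models.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 245  # number of simulations reported in the abstract

# Synthetic inputs (hypothetical ranges and units, for illustration only).
X = np.column_stack([
    rng.uniform(5.0, 25.0, n),  # specific lighting power
    rng.uniform(0.2, 0.8, n),   # window area, expressed as a fraction
    rng.uniform(0.2, 0.9, n),   # window shade coefficient
])

# Synthetic cooling consumption: a linear law plus noise (assumed, not from the paper).
true_coef = np.array([2.0, 40.0, 30.0])
y = 50.0 + X @ true_coef + rng.normal(0.0, 1.0, n)

# Fit on the first 200 samples and test on the remaining 45.
train, test = slice(0, 200), slice(200, n)
A = np.column_stack([np.ones(n), X])  # intercept column plus the three inputs
coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)

pred = A[test] @ coef
mape = 100.0 * np.mean(np.abs((y[test] - pred) / y[test]))
print(f"intercept and slopes: {coef}")
print(f"test MAPE: {mape:.2f}%")
```

Because the synthetic target is linear by construction, the fit is good here; the abstract's point is that on the real simulation data the nonlinear models were more accurate than this baseline.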

    Thermodynamic Assessment and Optimisation of Supercritical and Transcritical Power Cycles Operating on CO2 Mixtures by Means of Artificial Neural Networks

    Feb 21, 2022 to Feb 24, 2022, San Antonio, TX, United States
    Closed supercritical and transcritical power cycles operating on carbon dioxide have proven to be a promising technology for power generation and are being researched by numerous international projects. Although these cycles enable very high efficiencies in intermediate-temperature applications, the major shortcoming of the technology is a strong dependence on ambient temperature: performing compression near the CO2 critical point (31 °C) requires low ambient temperatures. This is particularly challenging in Concentrated Solar Power applications, which are typically found in hot, semi-arid locations. To overcome this limitation, the SCARABEUS project explores blending raw carbon dioxide with small amounts of certain dopants in order to shift the critical temperature of the resulting working fluid to higher values, enabling gaseous compression near the critical point, or even liquid compression, regardless of a high ambient temperature. Different dopants have been studied within the project so far (C6F6, TiCl4 and SO2), but the final selection will have to account for trade-offs between thermodynamic performance, economic metrics and system reliability. With this in mind, the present paper deals with the development of a non-physics-based model using Artificial Neural Networks (ANN), built with Matlab’s Deep Learning Toolbox, to enable SCARABEUS system optimisation without running the detailed, and extremely time-consuming, thermal models developed with Thermoflex and Matlab. In the first part of the paper, the candidate dopants and cycle layouts are presented and discussed, and the ANN training methodology is described thoroughly, along with the main assumptions and hypotheses made. In the second part, the results confirm that the ANN is a reliable tool capable of reproducing the detailed Thermoflex model, estimating the cycle thermal efficiency with a Root Mean Square Error lower than 0.2 percentage points. The main advantage of the proposed ANN is the reduction in computational time, up to 99% lower than that of the detailed model. Finally, the flexibility and versatility of the ANN are shown by applying it in different scenarios and estimating the cycle thermal efficiency for a wide variety of boundary conditions.
    European Union H2020-81498

    Monitoring and Fault Diagnosis for Chylla-Haase Polymerization Reactor

    The main objective of this research is to develop fault detection and isolation (FDI) methodologies for the Chylla-Haase polymerization reactor and to apply them to a nonlinear simulation model of the reactor in order to evaluate their effectiveness. The first part of the research aims to understand the nonlinear dynamic behaviour of the Chylla-Haase polymerization reactor. In this part, the mathematical model of the reactor is described and a Simulink model is set up in Simulink/MATLAB, based on a set of ordinary differential equations that describe the reactor's dynamic behaviour. Independent radial basis function neural networks (RBFNNs) are developed and employed for on-line diagnosis of actuator and sensor faults. A robust FDI scheme is developed for the open-loop exothermic semi-batch polymerization reactor described by Chylla and Haase. The independent RBFNN is employed when the system is subject to uncertainties and disturbances. Two different ways of employing RBF neural networks are investigated. Firstly, an independent neural network is used to model the reactor dynamics and generate residuals. Secondly, an additional RBF neural network is developed as a classifier to isolate faults from the generated residuals. In the third part of this research, a robust FDI scheme is developed to monitor the Chylla-Haase polymerization reactor under cascade PI control. This is a challenging task because the controller output cannot be designed when the reactor is under closed-loop control, and the control action corrects small changes of the states caused by faults. The proposed FDI strategy employs an RBFNN in independent mode to model the process dynamics and uses the weighted sum-squared prediction error as the residual. The Recursive Orthogonal Least Squares (ROLS) algorithm is employed to train the model, overcoming the training difficulty of the independent mode of the network. Then, another RBFNN is used as a fault classifier to isolate faults from the different features contained in the residual vector. In addition, an independent MLP neural network is implemented to generate residuals for the detection task, and another RBF network is applied as a classifier for the isolation task. The fault diagnosis scheme is developed for a Chylla-Haase reactor under open-loop and closed-loop control. The comparison between the two neural network architectures (MLP and RBF) shows that the RBF configuration trained by the RLS algorithm has several advantages. The first is greater efficiency in finding optimal weights for field strength prediction in complex dynamic systems. The RBF configuration is also a less complex network, which results in faster convergence. The training algorithms (RLS and ROLS) used to train the RBFNN in Chapters 4 and 5 have proven to be efficient, resulting in significantly shorter computation time than back-propagation. Another fault diagnosis (FD) scheme is developed in this research for an exothermic semi-batch polymerization reactor. The scheme has two parts: the first generates residuals using an extended Kalman filter (EKF), and the second is the decision making that reports a fault using a standardized statistical hypothesis test. FD simulation results are presented to demonstrate the effectiveness of the proposed method. In the last section of this research, a robust fault diagnosis scheme for abrupt and incipient faults in nonlinear dynamic systems is presented. A general framework is developed for model-based fault detection and diagnosis using on-line approximators and adaptation/learning schemes. In this framework, neural network models constitute an important class of on-line approximators. The changes in the system dynamics due to a fault are modelled as nonlinear functions of the state, while the time profile of the fault is assumed to develop exponentially. The changes in the system dynamics are monitored by an on-line approximation model, which is used for detecting failures. A systematic procedure for constructing the nonlinear estimation algorithm is developed, and a stable learning scheme is derived using Lyapunov theory. Simulation studies are used to illustrate the results and to show the effectiveness of the fault diagnosis methodology. Finally, the success of the proposed fault diagnosis methods illustrates the potential of applying an independent RBFNN, an independent MLP, an extended Kalman filter and an adaptive nonlinear observer-based FD to chemical reactors.
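
The residual-based detection step described in the abstract above (a model run in independent mode, a weighted sum-squared prediction error as the residual, and a decision rule) can be sketched as follows. This is a toy illustration under stated assumptions: the sinusoidal "model" output stands in for a trained RBFNN, and the noise level, bias size, weights `W` and the threshold rule are all hypothetical choices, not values from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, fault_start = 120, 60
t = np.arange(n_steps) * 0.1

# Stand-in for the process model run in independent mode (hypothetical signals).
y_model = np.column_stack([np.sin(t), np.cos(t)])

# Simulated measurements: model output plus sensor noise,
# with a constant bias fault on the first sensor from fault_start onwards.
y_meas = y_model + rng.normal(0.0, 0.05, y_model.shape)
y_meas[fault_start:, 0] += 1.0

# Weighted sum-squared prediction error, one scalar residual per time step.
W = np.diag([1.0, 1.0])  # assumed weights
error = y_meas - y_model
residual = np.einsum("ki,ij,kj->k", error, W, error)

# Threshold calibrated on an initial window assumed to be fault-free.
threshold = 5.0 * residual[:50].max()
alarms = np.flatnonzero(residual > threshold)
first_alarm = int(alarms[0]) if alarms.size else None
print("first alarm at step:", first_alarm)
```

Isolation, which the thesis performs with a second RBF network acting as a classifier on the residual vector, is outside the scope of this sketch.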

    Model-based reinforcement learning for process control and optimization

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, School of Chemical and Biological Engineering, 2020. 2. 이종민.
    Sequential decision making is a crucial technology for plant-wide process optimization. The dominant numerical method is forward-in-time direct optimization, but it is limited to open-loop solutions and has difficulty accounting for uncertainty. Dynamic programming overcomes these limitations, but the associated functional optimization, posed in an infinite-dimensional function space, suffers from the curse of dimensionality. The sample-based approach to approximating dynamic programming, referred to as reinforcement learning (RL), can resolve this issue and is investigated throughout this thesis. Methods that account for the system model explicitly are of particular interest. Model-based RL is used to solve three representative sequential decision making problems: scheduling, supervisory optimization, and regulatory control. The problems are formulated as a partially observable Markov decision process, a control-affine state space model, and a general state space model, and the associated model-based RL algorithms are point based value iteration (PBVI), globalized dual heuristic programming (GDHP), and differential dynamic programming (DDP), respectively. The contributions for each problem are as follows. First, for the scheduling problem, we developed a closed-loop feedback scheme, which highlights the strength of RL compared with the direct optimization method. Second, the regulatory control problem is tackled with a function approximation method that relaxes the functional optimization to a finite-dimensional vector space optimization. Deep neural networks (DNNs) are used as the approximator, and their advantages as well as a convergence analysis are presented in the thesis. Finally, for the supervisory optimization problem, we developed a novel constrained RL framework that uses the primal-dual DDP method. Various illustrative examples are presented to validate the developed model-based RL algorithms and to support the thesis statement that dynamic programming can be considered a complementary method to direct optimization.
    1. Introduction
        1.1 Motivation and previous work
        1.2 Statement of contributions
        1.3 Outline of the thesis
    2. Background and preliminaries
        2.1 Optimization problem formulation and the principle of optimality
            2.1.1 Markov decision process
            2.1.2 State space model
        2.2 Overview of the developed RL algorithms
            2.2.1 Point based value iteration
            2.2.2 Globalized dual heuristic programming
            2.2.3 Differential dynamic programming
    3. A POMDP framework for integrated scheduling of infrastructure maintenance and inspection
        3.1 Introduction
        3.2 POMDP solution algorithm
            3.2.1 General point based value iteration
            3.2.2 GapMin algorithm
            3.2.3 Receding horizon POMDP
        3.3 Problem formulation for infrastructure scheduling
            3.3.1 State
            3.3.2 Maintenance and inspection actions
            3.3.3 State transition function
            3.3.4 Cost function
            3.3.5 Observation set and observation function
            3.3.6 State augmentation
        3.4 Illustrative example and simulation result
            3.4.1 Structural point for the analysis of a high dimensional belief space
            3.4.2 Infinite horizon policy under the natural deterioration process
            3.4.3 Receding horizon POMDP
            3.4.4 Validation of POMDP policy via Monte Carlo simulation
    4. A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system
        4.1 Introduction
        4.2 Function approximation and learning with deep neural networks
            4.2.1 GDHP with a function approximator
            4.2.2 Stable learning of DNNs
            4.2.3 Overall algorithm
        4.3 Results and discussions
            4.3.1 Example 1: Semi-batch reactor
            4.3.2 Example 2: Diffusion-Convection-Reaction (DCR) process
    5. Convergence analysis of the model-based deep reinforcement learning for optimal control of nonlinear control-affine system
        5.1 Introduction
        5.2 Convergence proof of globalized dual heuristic programming (GDHP)
        5.3 Function approximation with deep neural networks
            5.3.1 Function approximation and gradient descent learning
            5.3.2 Forward and backward propagations of DNNs
        5.4 Convergence analysis in the deep neural networks space
            5.4.1 Lyapunov analysis of the neural network parameter errors
            5.4.2 Lyapunov analysis of the closed-loop stability
            5.4.3 Overall Lyapunov function
        5.5 Simulation results and discussions
            5.5.1 System description
            5.5.2 Algorithmic settings
            5.5.3 Control result
    6. Primal-dual differential dynamic programming for constrained dynamic optimization of continuous system
        6.1 Introduction
        6.2 Primal-dual differential dynamic programming for constrained dynamic optimization
            6.2.1 Augmented Lagrangian method
            6.2.2 Primal-dual differential dynamic programming algorithm
            6.2.3 Overall algorithm
        6.3 Results and discussions
    7. Concluding remarks
        7.1 Summary of the contributions
        7.2 Future works
    Bibliography
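
The contrast drawn in the abstract above, between open-loop direct optimization and the closed-loop solution that dynamic programming provides under uncertainty, can be illustrated on a toy problem. The sketch below is plain tabular backward dynamic programming on a hypothetical five-state system. It is not any of the thesis's algorithms (PBVI, GDHP or DDP), and the states, actions, costs and disturbance probabilities are all assumptions made for illustration.

```python
import numpy as np

states = np.arange(5)              # discrete states 0..4 (hypothetical)
actions = np.array([-1, 0, 1])     # decrease, hold, increase
noise = {-1: 0.1, 0: 0.8, 1: 0.1}  # disturbance value -> probability
target, horizon = 2, 5

def stage_cost(s, a):
    """Quadratic tracking cost plus a small penalty on control effort."""
    return (s - target) ** 2 + 0.1 * abs(a)

# V[t, s] is the optimal expected cost-to-go; policy[t, s] is the optimal action.
V = np.zeros((horizon + 1, len(states)))
policy = np.zeros((horizon, len(states)), dtype=int)
V[horizon] = (states - target) ** 2  # terminal cost

# Backward recursion based on the principle of optimality.
for t in range(horizon - 1, -1, -1):
    for s in states:
        q = []
        for a in actions:
            expected = sum(
                p * V[t + 1, np.clip(s + a + w, 0, 4)]
                for w, p in noise.items()
            )
            q.append(stage_cost(s, a) + expected)
        best = int(np.argmin(q))
        V[t, s] = q[best]
        policy[t, s] = actions[best]

print(policy)  # one row per time step, one column per state
```

The result is a feedback table: the action depends on the state actually reached, so disturbances are handled without re-solving the problem. The table grows with the size of the state space, which is the curse of dimensionality that motivates the function approximators discussed in the abstract.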