427 research outputs found

    Hybrid spiral-dynamic bacteria-chemotaxis algorithm with application to control two-wheeled machines

    This paper presents the implementation of the hybrid spiral-dynamic bacteria-chemotaxis (HSDBC) approach to control two different configurations of a two-wheeled vehicle. HSDBC combines the bacterial chemotaxis used in the bacterial foraging algorithm (BFA) with the spiral-dynamic algorithm (SDA). BFA provides a good exploration strategy thanks to its chemotaxis approach. However, with a large step size it suffers from oscillation near the end of the search process; with a small step size, it affords better exploitation and accuracy at the cost of slower convergence. SDA provides better stability when approaching an optimum point and converges faster, but this may cause the search agents to become trapped in local optima, resulting in low-accuracy solutions. HSDBC exploits the chemotactic strategy of BFA and the fitness accuracy and convergence speed of SDA to overcome the problems that SDA and BFA each exhibit alone. The resulting HSDBC is evaluated in optimizing the performance and energy consumption of two highly nonlinear platforms, namely single and double inverted pendulum-like vehicles with an extended rod. Comparative results with BFA and SDA show that the proposed algorithm yields better performance on these highly nonlinear systems.
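    The two ingredients named in this abstract can be illustrated with a minimal sketch. It is not the HSDBC algorithm from the paper, whose details the abstract does not give. It only pairs a standard 2-D spiral-dynamics move (rotate about the best point and contract) with a greedy chemotaxis-style tumble. The fitness function `sphere`, the parameters `r`, `theta`, `step`, and all function names are assumptions made for illustration.

```python
import math
import random

def sphere(p):
    """Toy fitness function (assumption): squared distance from the origin."""
    return p[0] ** 2 + p[1] ** 2

def spiral_step(p, best, r=0.95, theta=math.pi / 4):
    """Spiral-dynamics move: rotate p about `best` by theta and contract the radius by r."""
    dx, dy = p[0] - best[0], p[1] - best[1]
    c, s = math.cos(theta), math.sin(theta)
    return (best[0] + r * (c * dx - s * dy),
            best[1] + r * (s * dx + c * dy))

def tumble_step(p, fitness, step, rng):
    """Chemotaxis-style move: try one random direction, keep it only if fitness improves."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    q = (p[0] + step * math.cos(angle), p[1] + step * math.sin(angle))
    return q if fitness(q) < fitness(p) else p

def hybrid_search(fitness, n_agents=20, n_iter=200, step=0.05, seed=0):
    """Illustrative hybrid loop: spiral move toward the best agent, then a greedy tumble."""
    rng = random.Random(seed)
    agents = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(n_agents)]
    best = min(agents, key=fitness)
    for _ in range(n_iter):
        agents = [tumble_step(spiral_step(p, best), fitness, step, rng) for p in agents]
        candidate = min(agents, key=fitness)
        if fitness(candidate) < fitness(best):
            best = candidate  # the best point only ever improves
    return best

if __name__ == "__main__":
    print(hybrid_search(sphere))
```

    In this sketch the step size plays the role the abstract describes: a large `step` makes the tumble overshoot near the optimum, while a small one refines more slowly.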

    Human inspired humanoid robots control architecture

    This PhD thesis presents a different point of view on the development of control architectures for humanoid robots. Specifically, it studies the human postural control system and uses this knowledge to develop a novel architecture for postural control in humanoid robots. The research carried out in this thesis shows that postural control has two types of components: a reactive one and a predictive, or anticipatory, one. This work focuses on the second component, implementing a predictive system that complements the reactive one. The anticipatory control system has been analysed in the human case and extrapolated to the control architecture of the humanoid robot TEO. Its components have thus been developed based on how humans work, while taking into account the tasks the robot was designed for. The control system is based on the composition of sensory perceptions, the evaluation of stimuli using the psychophysical theory of surprise, and the creation of events that can activate reaction strategies (synergies). Like a human being, the control system developed in this thesis processes information from different sensory sources and composes the so-called perceptions, which depend on the type of task the postural control acts upon. The value of those perceptions is obtained using bio-inspired sensory-inference evaluation techniques. Once the sensory input has been obtained, it must be processed in order to foresee disturbances that could cause a task to be performed incorrectly. The system evaluates the sensory information, previously transformed into perceptions, using the “Surprise Theory” and generates events called “surprises” that are used to predict the evolution of a task.
Finally, the anticipatory postural control system can compose, if necessary, the proper reactions using predefined movement patterns called synergies. Those reactions can complement or completely replace the normal performance of a task. The performance of the anticipatory postural control system, and of each of its components, has been tested through simulations and by applying the results to the humanoid robot TEO of the RoboticsLab research group in the Systems Engineering and Automation Department of the Carlos III University of Madrid.

    Modelling and control of a novel structure two-wheeled robot with an extendable intermediate body


    Intelligent model-based control of complex multi-link mechanisms

    A complex underactuated multi-link mechanism is a system whose number of control inputs is smaller than the dimension of its configuration space. The ability to control such a system by manipulating its natural dynamics would allow the design of more energy-efficient machines capable of smooth motions similar to those found in the natural world. This research aims to understand the complex nature of the Robogymnast, a triple-link underactuated pendulum built at Cardiff University to study the behaviour of nonlinear systems and the challenges in developing its control system. A mathematical model of the robot was derived from the Euler-Lagrange equations. The control system design was based on the discrete-time model linearised around the downward position, with a sampling time of 2.5 milliseconds. Firstly, Invasive Weed Optimization (IWO) was used to optimize the swing-up motion of the robot by determining the optimum values of the parameters that control the input signals of the Robogymnast’s two motors. The values obtained from IWO were then applied in both simulation and experiment. The results showed that the swing-up motion of the Robogymnast from the stable downward position to the inverted configuration was successfully achieved. Secondly, because of the complex nature and nonlinearity of the Robogymnast, a novel approach to modelling it using a multi-layered Elman neural network (ENN) was proposed. The ENN model was then tested with various inputs and its outputs were analysed. The results showed that the ENN model provides a better representation of the actual system than the mathematical model. Thirdly, IWO was used to find the optimal Q values required by the Linear Quadratic Regulator (LQR) for inverted balance control, that is, for maintaining the Robogymnast in an upright configuration.
Two fitness criteria were investigated: cost function J and settling time T. A controller was developed using the values obtained from each fitness criterion. The results showed that LQR_T performed faster but LQR_J was capable of stabilizing the Robogymnast from larger deflection angles. Finally, fitness criteria J and T were used simultaneously to obtain the optimal Q values for the LQR. For this purpose, two multi-objective optimization methods based on IWO were developed, namely the Weighted Criteria Method IWO (WCMIWO) and the Fuzzy Logic IWO Hybrid (FLIWOH). Two LQR controllers were first developed using the parameters obtained from the two optimization methods. The same process was then repeated with a disturbance applied to the Robogymnast states to develop another two LQR controllers. The response of the controllers was then tested in different simulated scenarios and their performance was evaluated. The results showed that all four controllers were able to balance the Robogymnast, with the fastest settling time achieved by WCMIWO with disturbance, followed in ascending order by FLIWOH with disturbance, FLIWOH, and WCMIWO.
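    To make the two fitness criteria concrete, here is a minimal sketch of how a choice of Q affects the cost J and the settling time T of a discrete-time LQR. It uses a scalar toy plant rather than the Robogymnast model, and the plant values `a`, `b`, the 2% settling band, and all function names are assumptions for illustration only.

```python
def dlqr_scalar(a, b, q, r, iters=10_000, tol=1e-12):
    """Iterate the scalar discrete-time Riccati equation; return (gain K, cost-to-go P)."""
    p = q
    for _ in range(iters):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = a * b * p / (r + b * b * p)
    return k, p

def evaluate(a, b, q, r, x0=1.0, band=0.02, horizon=200):
    """Simulate x[k+1] = a*x[k] + b*u[k] with u = -K*x; return (cost J, settling step T)."""
    k_gain, _ = dlqr_scalar(a, b, q, r)
    x, cost, settle = x0, 0.0, None
    for step in range(horizon):
        if settle is None and abs(x) <= band * abs(x0):
            settle = step  # first step at which |x| is inside the band
        u = -k_gain * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost, settle

if __name__ == "__main__":
    for q in (1.0, 10.0):
        print(q, evaluate(a=1.1, b=1.0, q=q, r=1.0))
```

    A search method such as IWO would treat `q` as the decision variable and use either returned value, or a combination of both, as the fitness to minimise.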

    Soft-computing based intelligent adaptive control design of complex dynamic systems


    Value Function Estimation in Optimal Control via Takagi-Sugeno Models and Linear Programming

    The present thesis employs dynamic programming and reinforcement learning techniques to obtain optimal policies for controlling nonlinear systems with discrete and continuous states and actions. Initially, the basic concepts of dynamic programming and reinforcement learning are reviewed for systems with a finite number of states. The extension of these techniques to systems with a large number of states, or to continuous-state systems, is then analysed using approximation functions. The contributions of the thesis are:
    - A combined identification/Q-function fitting methodology, which involves identification of a Takagi-Sugeno model, computation of (sub)optimal controllers from Linear Matrix Inequalities, and the subsequent data-based fitting of the Q-function via monotonic optimisation.
    - A methodology for learning controllers using approximate dynamic programming via linear programming (ADP-LP). The methodology enables the ADP-LP approach to work in practical control applications with continuous state and input spaces. It estimates a lower bound and an upper bound of the optimal value function through functional approximators. Guidelines are provided for data and regressor regularisation in order to obtain satisfactory results while avoiding unbounded or ill-conditioned solutions.
    - A methodology of approximate dynamic programming via linear programming that obtains a better approximation of the optimal value function in a specific region of the state space. The methodology proposes to learn a policy gradually using data available only in the exploration region. The exploration progressively enlarges the learning region until a converged policy is obtained.
    This work was supported by the National Department of Higher Education, Science, Technology and Innovation of Ecuador (SENESCYT), and by the Spanish Ministry of Economy and the European Union, grant DPI2016-81002-R (AEI/FEDER, UE). The author also received a grant for a predoctoral stay, Programa de Becas Iberoamérica-Santander Investigación 2018, from Santander Bank.
    Díaz Iza, HP. (2020). Value Function Estimation in Optimal Control via Takagi-Sugeno Models and Linear Programming [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/139135

    Intelligent Learning Control System Design Based on Adaptive Dynamic Programming

    The adaptive dynamic programming (ADP) controller is a powerful neural-network-based control technique that has been investigated, designed, and tested in a wide range of applications for solving optimal control problems in complex systems. An ADP controller usually needs long training periods because its data efficiency is low: samples are discarded once used. Experience replay is a powerful technique with the potential to accelerate the training process of learning and control. However, its existing design cannot be used directly for model-free ADP design, because it focuses on the forward temporal-difference (TD) information (e.g., state-action pair) between the current time step and the future time step, and needs a model network to predict future information. In addition, the uniform random sampling used for experience replay is not an efficient way to learn. Prioritized experience replay (PER) presents important transitions more frequently and has proven to be efficient in the learning process. To address the long training periods of the ADP controller, the first goal of this thesis is to avoid the use of a model network or system identifier. Specifically, the experience tuple is designed with one-step-backward state-action information, so the TD can be computed from the previous time step and the current time step. The proposed approach is tested in two case studies: cart-pole and triple-link pendulum balancing tasks. It improved the required average number of trials to succeed by 26.5% for cart-pole and 43% for triple-link. The second goal of this thesis is to integrate the efficient learning capability of PER into ADP. A detailed theoretical analysis is presented to verify the stability of the proposed control technique. Compared with the traditional ADP controller, the proposed approach improved the required average number of trials to succeed by 60.56% for cart-pole and 56.89% for triple-link balancing tasks.
The final goal of this thesis is to validate the ADP controller in a smart grid setting: to improve the current control performance of a virtual synchronous machine (VSM) under sudden load changes and a single line-to-ground fault, and to reduce harmonics in shunt active filters (SAF) under different loading conditions. The ADP controller produced the fastest response time, low overshoot and, in general, the best performance in comparison with the traditional current controller. In SAF, the ADP controller reduced the total harmonic distortion (THD) of the source current by an average of 18.41% compared with a traditional current controller alone.
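    The prioritized sampling idea mentioned in this abstract can be sketched as follows. This is the generic proportional variant of prioritized experience replay, not the thesis's ADP-specific design. The class name, the parameters `alpha`, `beta`, `eps`, and the string transitions are assumptions for illustration. Each stored transition is sampled with probability proportional to its priority, and an importance weight compensates for the non-uniform sampling.

```python
import random

class PrioritizedReplay:
    """Proportional prioritized replay: P(i) = p_i / sum_k p_k, with p_i = (|TD| + eps)**alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3, seed=0):
        self.capacity = capacity
        self.alpha = alpha
        self.eps = eps            # keeps zero-error transitions sampleable
        self.data = []
        self.priorities = []
        self.next_slot = 0
        self.rng = random.Random(seed)

    def add(self, transition, td_error):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:                     # buffer full: overwrite the oldest entry
            self.data[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def probabilities(self):
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, batch_size, beta=0.4):
        """Return (transition, importance weight) pairs; weights are scaled so the max is 1."""
        probs = self.probabilities()
        n = len(self.data)
        idx = self.rng.choices(range(n), weights=probs, k=batch_size)
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        w_max = max(weights)
        return [(self.data[i], w / w_max) for i, w in zip(idx, weights)]

if __name__ == "__main__":
    buf = PrioritizedReplay(capacity=3)
    for name, err in [("a", 0.1), ("b", 2.0), ("c", 0.5)]:
        buf.add(name, err)
    print(buf.probabilities())
    print(buf.sample(4))
```

    Setting `alpha=0` recovers the uniform sampling that the abstract describes as inefficient, which makes the sketch a convenient way to compare the two schemes.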

    Active fault-tolerant control of nonlinear systems with wind turbine application

    The thesis concerns the theoretical development of Active Fault-Tolerant Control (AFTC) methods for nonlinear systems via the Takagi-Sugeno (T-S) multiple-modelling approach. The thesis adopts the estimation and compensation approach to AFTC within a tracking control framework. In this framework, it considers several approaches to robust T-S fuzzy control and T-S fuzzy estimation: T-S fuzzy proportional multiple integral observer (PMIO); T-S fuzzy proportional-proportional integral observer (PPIO); T-S fuzzy virtual sensor (VS) based AFTC; T-S fuzzy Dynamic Output Feedback Control (TSDOFC); T-S observer-based feedback control; and Sliding Mode Control (SMC). The theoretical concepts have been applied to an offshore wind turbine (OWT) application study. The key developments presented in this thesis are:
    • The development of three active Fault Tolerant Tracking Control (FTTC) strategies for nonlinear systems described via T-S fuzzy inference modelling. The proposals combine Linear Reference Model Fuzzy Control (LRMFC) with either the estimation and compensation concept or the control reconfiguration concept.
    • The development of a T-S fuzzy observer-based state-estimate fuzzy control strategy for nonlinear systems. The strategy can tolerate simultaneous actuator and sensor faults within a tracking and regulating control framework. Additionally, a proposal to recover the Separation Principle has been developed via the use of TSDOFC within the FTTC framework.
    • The proposal of two FTTC strategies based on the estimation and compensation concept for sustainable OWT control. The proposals contribute to the literature on sustainable OWT control by (1) obviating the need for a Fault Detection and Diagnosis (FDD) unit, and (2) providing useful information for evaluating fault severity via the fault estimation signals.
    • The development of an FTTC architecture for OWTs that combines TSDOFC with a form of cascaded observers (cascaded analytical redundancy). This architecture is proposed to ensure the robustness of both the TSDOFC and the EWS estimator against generator and rotor speed sensor faults.
    • A sliding mode baseline controller proposed within three FTTC strategies for sustainable OWT control. The proposals use the inherent robustness of SMC to tolerate some matched faults without the need for analytical redundancy. Following this, the combination of SMC with the estimation and compensation framework is proposed to ensure closed-loop system robustness to various faults.
    • Within the framework of the developed T-S fuzzy based FTTC strategies, a new perspective on reducing the conservatism of T-S fuzzy control design, via the use of different control techniques that demand fewer design constraints. Moreover, within the SMC-based FTTC, an investigation demonstrates that SMC robustness against a wider-than-usual set of faults is enhanced by designing the sliding surface with the minimum dimension of feedback signals.