
    Connections Between Adaptive Control and Optimization in Machine Learning

    This paper demonstrates many immediate connections between adaptive control and optimization methods commonly employed in machine learning. Starting from common output error formulations, similarities in update law modifications are examined. Concepts in stability, performance, and learning, common to both fields, are then discussed. Building on the similarities in update laws and common concepts, new intersections and opportunities for improved algorithm analysis are provided. In particular, a specific problem related to higher order learning is solved through insights obtained from these intersections. Comment: 18 pages

    Algorithm for rigorous integration of Delay Differential Equations and the computer-assisted proof of periodic orbits in the Mackey-Glass equation

    We present an algorithm for the rigorous integration of Delay Differential Equations (DDEs) of the form $x'(t) = f(x(t-\tau), x(t))$. As an application, we give a computer-assisted proof of the existence of two attracting periodic orbits (before and after the first period-doubling bifurcation) in the Mackey-Glass equation.
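
To make the equation form concrete, here is a minimal, non-rigorous sketch that integrates the Mackey-Glass equation in its standard form, x'(t) = beta * x(t - tau) / (1 + x(t - tau)^n) - gamma * x(t), with a plain explicit Euler scheme. This is not the rigorous algorithm of the paper, which encloses all numerical errors; the function name, parameter values, step size and constant initial history are illustrative assumptions.

```python
from collections import deque


def mackey_glass_euler(beta=2.0, gamma=1.0, n=6.0, tau=2.0,
                       dt=0.01, t_end=50.0, x0=1.1):
    """Non-rigorous explicit Euler integration of the Mackey-Glass DDE.

    All default parameter values are illustrative assumptions. The initial
    history is taken to be the constant x0 on [-tau, 0].
    """
    steps_per_delay = round(tau / dt)
    # Sliding window holding x on [t - tau, t]; the oldest sample is x(t - tau).
    history = deque([x0] * (steps_per_delay + 1), maxlen=steps_per_delay + 1)
    xs = [x0]
    for _ in range(round(t_end / dt)):
        x_delayed = history[0]   # x(t - tau)
        x_now = history[-1]      # x(t)
        dx = beta * x_delayed / (1.0 + x_delayed ** n) - gamma * x_now
        x_next = x_now + dt * dx
        history.append(x_next)   # maxlen drops the oldest sample automatically
        xs.append(x_next)
    return xs


trajectory = mackey_glass_euler()
```

A floating-point trajectory like this can suggest where a periodic orbit lies, but it proves nothing about its existence; that gap is what a rigorous integrator is meant to close.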

    A review of convex approaches for control, observation and safety of linear parameter varying and Takagi-Sugeno systems

    This paper provides a review of the concept of convex systems based on Takagi-Sugeno, linear parameter varying (LPV) and quasi-LPV modeling. These paradigms are capable of hiding the nonlinearities by means of an equivalent description which uses a set of linear models interpolated by appropriately defined weighting functions. Convex systems have become very popular since they allow applying extended linear techniques based on linear matrix inequalities (LMIs) to complex nonlinear systems. This survey aims at providing the reader with a significant overview of the existing LMI-based techniques for convex systems in the fields of control, observation and safety. Firstly, a detailed review of stability, feedback, tracking and model predictive control (MPC) convex controllers is considered. Secondly, the problem of state estimation is addressed through the design of proportional, proportional-integral, unknown input and descriptor observers. Finally, safety of convex systems is discussed by describing popular techniques for fault diagnosis and fault tolerant control (FTC). Peer Reviewed. Postprint (published version)

    Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint

    The classic objective in a reinforcement learning (RL) problem is to find a policy that minimizes, in expectation, a long-run objective such as the infinite-horizon discounted or long-run average cost. In many practical applications, optimizing the expected value alone is not sufficient, and it may be necessary to include a risk measure in the optimization process, either as the objective or as a constraint. Various risk measures have been proposed in the literature, e.g., mean-variance tradeoff, exponential utility, the percentile performance, value at risk, conditional value at risk, prospect theory and its later enhancement, cumulative prospect theory. In this article, we focus on the combination of risk criteria and reinforcement learning in a constrained optimization framework, i.e., a setting where the goal is to find a policy that optimizes the usual objective of infinite-horizon discounted/average cost, while ensuring that an explicit risk constraint is satisfied. We introduce the risk-constrained RL framework, cover popular risk measures based on variance, conditional value-at-risk and cumulative prospect theory, and present a template for a risk-sensitive RL algorithm. We survey some of our recent work on this topic, covering problems encompassing discounted cost, average cost, and stochastic shortest path settings, together with the aforementioned risk measures in a constrained framework. This non-exhaustive survey is aimed at giving a flavor of the challenges involved in solving a risk-sensitive RL problem, and outlining some potential future research directions.
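
As a small illustration of one of the risk measures named above, the sketch below estimates value at risk (VaR) and conditional value at risk (CVaR) from a sample of costs and checks a risk constraint. The quantile convention, the function name, the sample values and the threshold are assumptions made for illustration; they are not taken from the article.

```python
import math


def var_cvar(costs, alpha):
    """Empirical VaR and CVaR of a cost sample at level alpha.

    Assumed convention: VaR is the empirical alpha-quantile, and CVaR is the
    mean of the sampled costs at or above that quantile.
    """
    ordered = sorted(costs)
    # Index of the empirical alpha-quantile, clamped to a valid position.
    k = min(len(ordered) - 1, max(0, math.ceil(alpha * len(ordered)) - 1))
    var = ordered[k]
    tail = ordered[k:]
    cvar = sum(tail) / len(tail)
    return var, cvar


costs = [1.0, 2.0, 3.0, 10.0]        # hypothetical sampled costs of one policy
mean_cost = sum(costs) / len(costs)  # the usual risk-neutral objective
var, cvar = var_cvar(costs, alpha=0.75)
risk_threshold = 7.0                 # hypothetical bound on the tail risk
constraint_satisfied = cvar <= risk_threshold
```

Here the mean cost is 4.0 while the CVaR at level 0.75 is 6.5, which shows how a policy can look acceptable in expectation while carrying a noticeably heavier tail.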

    Scheduling a multi class queue with many exponential servers: asymptotic optimality in heavy traffic

    We consider the problem of scheduling a queueing system in which many statistically identical servers cater to several classes of impatient customers. Service times and impatience clocks are exponential while arrival processes are renewal. Our cost is an expected cumulative discounted function, linear or nonlinear, of appropriately normalized performance measures. As a special case, the cost per unit time can be a function of the number of customers waiting to be served in each class, the number actually being served, the abandonment rate, the delay experienced by customers, the number of idling servers, as well as certain combinations thereof. We study the system in an asymptotic heavy-traffic regime where the number of servers $n$ and the offered load $r$ are simultaneously scaled up and carefully balanced: $n \approx r + \beta\sqrt{r}$ for some scalar $\beta$. This yields an operation that enjoys the benefits of both heavy traffic (high server utilization) and light traffic (high service levels).
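
The balance between servers and offered load, n ≈ r + β√r, can be worked through numerically. The helper below is a minimal sketch; its name, the choice to round up to an integer number of servers, and the sample values are assumptions for illustration only.

```python
import math


def square_root_staffing(offered_load, beta):
    """Number of servers n ≈ r + beta * sqrt(r).

    Rounding up to a whole number of servers is an assumption of this sketch.
    """
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


# With offered load r = 100 and beta = 0.5: 100 + 0.5 * 10 = 105 servers.
n_small = square_root_staffing(100.0, 0.5)
# Scaling r by 100 scales the safety margin beta * sqrt(r) by only 10.
n_large = square_root_staffing(10000.0, 0.5)
```

With these numbers the margin above the offered load shrinks from 5% of r to 0.5% of r, which is the sense in which utilization stays high as the system grows.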

    Event-triggered control for rational and Lur’e type nonlinear systems

    In the present work, the design of event-triggered controllers for two classes of nonlinear systems is addressed: rational systems and Lur’e type systems. Lyapunov theory techniques are used in both cases to derive asymptotic stability conditions in the form of linear matrix inequalities, which are then used in convex optimization problems to compute the control system parameters with the aim of reducing the number of events generated. In the context of rational systems, state-feedback control is considered and differential-algebraic representations are used to obtain tractable stability conditions. An event-triggering strategy that uses weighting matrices to strive for fewer events is proposed, and it is proven that this strategy does not lead to Zeno behavior. In the case of Lur’e systems, observer-based state feedback is addressed with event generators that have access only to the system output and observed state; this setting requires a dwell time, i.e. a time interval after each event during which the trigger condition is not evaluated, to cope with Zeno behavior. Two distinct approaches, exact time-discretization and looped-functional techniques, are considered to ensure asymptotic stability in the presence of the dwell time. For both system classes, emulation design and co-design are addressed. In the emulation design context, the control law (and the observer gains, when appropriate) are given and the task is to compute the event generator parameters. In the co-design context, the event generator and the control law or the observer can be designed simultaneously. Numerical examples are presented to illustrate the application of the proposed methods.
    This work addresses the design of event-based controllers for two classes of nonlinear systems: rational systems and Lur’e type systems. Lyapunov theory techniques are used in both cases to derive asymptotic stability conditions in the form of linear matrix inequalities. These conditions are then used in convex optimization problems as a means of computing the control system parameters, aiming at a reduction in the number of events generated. In the context of rational systems, state feedback is considered and differential-algebraic representations are used to obtain computationally tractable stability conditions. An event-triggering strategy that uses an error measure weighted by positive definite matrices is proposed, and it is shown that this strategy does not produce Zeno behavior. In the case of Lur’e type systems, controllers with information constraints are considered, namely controllers with access only to the system outputs. A state observer is then used to recover the missing information. In this context, a dwell time must be introduced to guarantee the absence of Zeno behavior. However, introducing the dwell time poses an additional challenge in guaranteeing stability, which is handled in this work through two possible techniques: exact discretization of the system and the use of looped-functionals. For both system classes, the emulation design and co-design problems are treated. In emulation design, the control law (and the observer gains, when appropriate) are given a priori and the task is to design the event generator parameters. In co-design, the event generator and the control law or the observer are designed simultaneously. Numerical examples are used to illustrate the application of the proposed methods.