A Behavioral Approach to Robust Machine Learning

Abstract

Machine learning is revolutionizing almost all fields of science and technology and has been proposed as a pathway to solving many previously intractable problems such as autonomous driving and other complex robotics tasks. While the field has demonstrated impressive results on certain problems, many of these results have not translated to applications in physical systems, partly due to the cost of system fail- ure and partly due to the difficulty of ensuring reliable and robust model behavior. Deep neural networks, for instance, have simultaneously demonstrated both incredible performance in game playing and image processing, and remarkable fragility. This combination of high average performance and a catastrophically bad worst case performance presents a serious danger as deep neural networks are currently being used in safety critical tasks such as assisted driving. In this thesis, we propose a new approach to training models that have built in robustness guarantees. Our approach to ensuring stability and robustness of the models trained is distinct from prior methods; where prior methods learn a model and then attempt to verify robustness/stability, we directly optimize over sets of models where the necessary properties are known to hold. Specifically, we apply methods from robust and nonlinear control to the analysis and synthesis of recurrent neural networks, equilibrium neural networks, and recurrent equilibrium neural networks. The techniques developed allow us to enforce properties such as incremental stability, incremental passivity, and incremental l2 gain bounds / Lipschitz bounds. A central consideration in the development of our model sets is the difficulty of fitting models. All models can be placed in the image of a convex set, or even R^N , allowing useful properties to be easily imposed during the training procedure via simple interior point methods, penalty methods, or unconstrained optimization. In the final chapter, we study the problem of learning networks of interacting models with guarantees that the resulting networked system is stable and/or monotone, i.e., the order relations between states are preserved. While our approach to learning in this chapter is similar to the previous chapters, the model set that we propose has a separable structure that allows for the scalable and distributed identification of large-scale systems via the alternating directions method of multipliers (ADMM)

    Similar works