
    Boosting Combinatorial Problem Modeling with Machine Learning

    In the past few years, the area of Machine Learning (ML) has witnessed tremendous advancements, becoming a pervasive technology in a wide range of applications. One area that can significantly benefit from the use of ML is Combinatorial Optimization. The three pillars of constraint satisfaction and optimization problem solving, i.e., modeling, search, and optimization, can exploit ML techniques to boost their accuracy, efficiency, and effectiveness. In this survey we focus on the modeling component, whose effectiveness is crucial for solving the problem. The modeling activity has traditionally been shaped by optimization and domain experts, interacting to produce realistic results. Machine Learning techniques can greatly ease this process and exploit the available data to either create models or refine expert-designed ones. In this survey we cover approaches that have been recently proposed to enhance the modeling process by learning either single constraints, objective functions, or the whole model. We highlight common themes across multiple approaches and draw connections with related fields of research.

    Automated Reachability Analysis of Neural Network-Controlled Systems via Adaptive Polytopes

    Over-approximating the reachable sets of dynamical systems is a fundamental problem in safety verification and robust control synthesis. The representation of these sets is a key factor that affects the computational complexity and the approximation error. In this paper, we develop a new approach for over-approximating the reachable sets of neural network dynamical systems using adaptive template polytopes. We use the singular value decomposition of linear layers along with the shape of the activation functions to adapt the geometry of the polytopes at each time step to the geometry of the true reachable sets. We then propose a branch-and-bound method to compute accurate over-approximations of the reachable sets using the inferred templates. We illustrate the utility of the proposed approach in the reachability analysis of linear systems driven by neural network controllers.
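
    As an illustration of the template-polytope idea (though not the paper's adaptive branch-and-bound algorithm), the sketch below over-approximates the image of an input box under a small one-hidden-layer ReLU network, using crude interval arithmetic and template directions taken from the SVD of the output layer; the weights and the input box are arbitrary placeholders.

```python
# Minimal sketch: template-polytope over-approximation of the image of a box
# under a one-hidden-layer ReLU network, with template directions taken from
# the SVD of the output layer. Interval arithmetic stands in for the paper's
# adaptive branch-and-bound procedure, so the result is sound but loose.
import numpy as np

def interval_affine(W, b, lo, hi):
    """Bounds of W x + b over the axis-aligned box [lo, hi]."""
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b

def template_overapprox(W1, b1, W2, b2, lo, hi):
    # Interval bounds after the hidden ReLU layer, then after the output layer.
    h_lo, h_hi = interval_affine(W1, b1, lo, hi)
    h_lo, h_hi = np.maximum(h_lo, 0.0), np.maximum(h_hi, 0.0)
    y_lo, y_hi = interval_affine(W2, b2, h_lo, h_hi)
    # Template normals from the left singular vectors of the output layer.
    U, _, _ = np.linalg.svd(W2)
    D = np.vstack([U.T, -U.T])
    Dp, Dn = np.maximum(D, 0.0), np.minimum(D, 0.0)
    offsets = Dp @ y_hi + Dn @ y_lo          # support function of the output box
    return D, offsets                        # {y : D y <= offsets} contains the image

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 2)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((2, 8)), rng.standard_normal(2)
D, c = template_overapprox(W1, b1, W2, b2, np.array([-1.0, -1.0]), np.array([1.0, 1.0]))
print(D, c)
```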

    Combining learning and optimization for transprecision computing

    The growing demands of the worldwide IT infrastructure stress the need for reduced power consumption, which is addressed in so-called transprecision computing by improving energy efficiency at the expense of precision. For example, reducing the number of bits for some floating-point operations leads to higher efficiency, but also to a non-linear decrease in computation accuracy. Depending on the application, small errors can be tolerated, thus allowing the precision of the computation to be fine-tuned. Finding the optimal precision for all variables with respect to an error bound is a complex task, which is tackled in the literature via heuristics. In this paper, we report on a first attempt to address the problem by combining a Mathematical Programming (MP) model and a Machine Learning (ML) model, following the Empirical Model Learning methodology. The ML model learns the relation between the precision of the variables and the output error; this information is then embedded in the MP model, which focuses on minimizing the number of bits. An additional refinement phase is then added to improve the quality of the solution. The experimental results demonstrate an average speedup of 6.5% and a 3% increase in solution quality compared to the state of the art. In addition, experiments on a hardware platform capable of mixed-precision arithmetic (PULPissimo) show the benefits of the proposed approach, with energy savings of around 40% compared to fixed-precision arithmetic.
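
    A minimal sketch of the Empirical Model Learning idea described above, assuming a linear surrogate fitted to synthetic profiling data (the paper's learned model, benchmarks, and refinement phase are not reproduced): the surrogate relating per-variable bit-widths to output error becomes a single linear constraint in a MIP that minimises the total number of bits.

```python
# Minimal sketch of Empirical Model Learning for precision tuning:
# (1) fit a surrogate mapping per-variable bit-widths to output error on
# synthetic, made-up profiling data; (2) embed the surrogate in a MIP that
# minimises the total number of bits subject to an error bound.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rng = np.random.default_rng(0)
n_vars, n_samples = 4, 200

# Toy profiling data: the output error shrinks as the bit-widths grow.
bits = rng.integers(4, 53, size=(n_samples, n_vars)).astype(float)
error = (2.0 ** -bits).sum(axis=1) + 1e-6 * rng.standard_normal(n_samples)

# Linear surrogate  error ~ w . bits + c  fitted via least squares.
X = np.hstack([bits, np.ones((n_samples, 1))])
coef, *_ = np.linalg.lstsq(X, error, rcond=None)
w, c = coef[:-1], coef[-1]

# MIP: minimise the sum of bit-widths s.t. the surrogate error stays below the bound.
error_bound = 1e-3
res = milp(np.ones(n_vars),
           constraints=[LinearConstraint(w[None, :], -np.inf, error_bound - c)],
           integrality=np.ones(n_vars),
           bounds=Bounds(4, 53))
print("chosen bit-widths:", res.x)
```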

    ReachLipBnB: A branch-and-bound method for reachability analysis of neural autonomous systems using Lipschitz bounds

    We propose a novel branch-and-bound method for reachability analysis of neural networks in both open-loop and closed-loop settings. Our idea is to first compute accurate bounds on the Lipschitz constant of the neural network in certain directions of interest offline, using a convex program. We then use these bounds to obtain an instantaneous but conservative polyhedral approximation of the reachable set via Lipschitz continuity arguments. To reduce conservatism, we incorporate our bounding algorithm within a branching strategy that decreases the over-approximation error to within an arbitrary accuracy. We then extend our method to reachability analysis of control systems with neural network controllers. Finally, to capture the shape of the reachable sets as accurately as possible, we use sample trajectories to inform the directions of the reachable set over-approximations using Principal Component Analysis (PCA). We evaluate the performance of the proposed method in several open-loop and closed-loop settings.
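
    The sketch below illustrates the underlying Lipschitz argument: for a unit direction c, c.f(x) <= c.f(x0) + L r over a ball of radius r around x0. Here the Lipschitz constant is bounded crudely by the product of weight spectral norms rather than by the paper's directional convex program, and the branching and PCA steps are omitted; the network and template directions are placeholders.

```python
# Minimal sketch of a Lipschitz-based polyhedral over-approximation of the
# image of a ball under a ReLU network. The global Lipschitz bound used here
# (product of spectral norms) is looser than the paper's convex-programming bound.
import numpy as np

def relu_net(x, weights, biases):
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(W @ x + b, 0.0)
    return weights[-1] @ x + biases[-1]

def lipschitz_polytope(weights, biases, x0, radius, directions):
    """Return (C, d) such that f(B(x0, radius)) is contained in {y : C y <= d}."""
    L = np.prod([np.linalg.norm(W, 2) for W in weights])   # global Lipschitz bound
    y0 = relu_net(x0, weights, biases)
    C = np.asarray(directions)              # unit template directions, one per row
    d = C @ y0 + L * radius                 # c.f(x) <= c.f(x0) + L r for unit c
    return C, d

rng = np.random.default_rng(1)
weights = [rng.standard_normal((8, 2)), rng.standard_normal((2, 8))]
biases = [rng.standard_normal(8), rng.standard_normal(2)]
dirs = np.vstack([np.eye(2), -np.eye(2)])   # axis-aligned template directions
C, d = lipschitz_polytope(weights, biases, np.zeros(2), 0.5, dirs)
print(C, d)
```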

    Data-driven deep-learning methods for the accelerated simulation of Eulerian fluid dynamics

    Deep-learning (DL) methods for the fast inference of the temporal evolution of fluid-dynamics systems, based on the prior recognition of features underlying large sets of fluid-dynamics data, have been studied. Specifically, models based on convolutional neural networks (CNNs) and graph neural networks (GNNs) were proposed and discussed. A U-Net, a popular fully-convolutional architecture, was trained to infer wave dynamics on liquid surfaces surrounded by walls, given as input the system state at previous time points. A term penalising the error of the spatial derivatives was added to the loss function, which resulted in a suppression of spurious oscillations and a more accurate location and length of the predicted wavefronts. This model proved to generalise accurately to complex wall geometries not seen during training. As opposed to the image data structures processed by CNNs, graphs offer greater freedom in how data is organised and processed. This motivated the use of graphs to represent the state of fluid-dynamics systems discretised by unstructured sets of nodes, and of GNNs to process such graphs. Graphs enabled more accurate representations of curvilinear geometries and the placement of higher resolution exclusively in areas where the physics is more challenging to resolve. Two novel GNN architectures were designed for fluid-dynamics inference: the MuS-GNN, a multi-scale GNN, and the REMuS-GNN, a rotation-equivariant multi-scale GNN. Both architectures work by repeatedly passing messages from each node to its nearest nodes in the graph. Additionally, lower-resolution graphs, with a reduced number of nodes, are defined from the original graph, and messages are also passed from finer to coarser graphs and vice versa. The low-resolution graphs allowed physics spanning a range of lengthscales to be captured efficiently. Advection and fluid flow, modelled by the incompressible Navier-Stokes equations, were the two types of problems used to assess the proposed GNNs. Whereas a single-scale GNN was sufficient to achieve high generalisation accuracy in advection simulations, flow simulation benefited greatly from an increasing number of low-resolution graphs. The generalisation and long-term accuracy of these simulations were further improved by the REMuS-GNN architecture, which processes the system state independently of the orientation of the coordinate system thanks to a rotation-invariant representation and carefully designed components. To the best of the author's knowledge, the REMuS-GNN architecture was the first rotation-equivariant and multi-scale GNN. The simulations were accelerated by between one (on a CPU) and three (on a GPU) orders of magnitude with respect to a CPU-based numerical solver. Additionally, the parallelisation of multi-scale GNNs resulted in a close-to-linear speedup with the number of CPU cores or GPUs.
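
    As a concrete illustration of the derivative-penalising term described above, the sketch below (PyTorch assumed) combines the field error with a penalty on the error of finite-difference spatial derivatives; the weighting factor and the first-order differences are illustrative choices, not the thesis' exact loss.

```python
# Minimal sketch of a loss with a spatial-derivative penalty for field prediction.
import torch
import torch.nn.functional as F

def gradient_loss(pred, target, lam=0.1):
    """pred, target: tensors of shape (batch, channels, H, W)."""
    mse = F.mse_loss(pred, target)
    # Finite-difference approximations of the spatial derivatives along W and H.
    dpdx = pred[..., :, 1:] - pred[..., :, :-1]
    dtdx = target[..., :, 1:] - target[..., :, :-1]
    dpdy = pred[..., 1:, :] - pred[..., :-1, :]
    dtdy = target[..., 1:, :] - target[..., :-1, :]
    grad_mse = F.mse_loss(dpdx, dtdx) + F.mse_loss(dpdy, dtdy)
    return mse + lam * grad_mse

# Random tensors stand in for the U-Net prediction and the ground-truth field.
pred = torch.randn(2, 1, 64, 64, requires_grad=True)
target = torch.randn(2, 1, 64, 64)
gradient_loss(pred, target).backward()
```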

    Analysis of robust neural networks for control

    The prevalence of neural networks in many application areas is expanding at an increasing rate, with the potential to provide huge benefits across numerous sectors. However, one of the greatest shortcomings of a trained neural network is its sensitivity to adversarial attacks. It is becoming clear that providing robust guarantees on systems that use neural networks is very important, especially in safety-critical applications. However, quantifying their safety and robustness properties has proven challenging due to the non-linearities of the activation functions inside the neural network. This thesis addresses this problem from several perspectives. Firstly, we investigate the sparsity that arises in a recently proposed semidefinite programming framework for verifying a fully connected feed-forward neural network. We reformulate and exploit the sparsity in the optimisation problem, showing a significant speed-up in computation. In addition, we approach the problem using polynomial optimisation and show that, by using the Positivstellensatz, bounds on the robustness guarantees can be tightened significantly over other popular methods. We then reformulate this approach to simultaneously exploit the sparsity in the problem whilst improving the accuracy. Neural networks have also seen increased recent use in feedback control systems, primarily because their ability to act as general function approximators gives them the potential to improve performance compared to traditional controllers. However, since feedback systems are usually subject to external perturbations and neural networks are sensitive to small changes, providing robustness guarantees has proven challenging. In this thesis, we analyse non-linear systems that contain neural network controllers. We first address this problem by computing outer-approximations of the reachable sets using sparse polynomial optimisation. We then use a Sum of Squares programming framework to compute the stability of these systems. Both of these approaches provide better robustness guarantees than existing methods. Finally, we extend these approaches to neural network controllers with rational activation functions. We then propose a method to recover a stabilising controller from a Sum of Squares program and apply it to a modified rational neural network controller.
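
    To make the Sum of Squares machinery mentioned above concrete, the sketch below decides whether a polynomial is SOS by searching for a positive semidefinite Gram matrix with a small semidefinite program (cvxpy assumed); the polynomial is a standard textbook example, not one taken from the thesis, and the sparsity-exploiting reformulations are not reproduced.

```python
# Minimal sketch: certify that p(x, y) = 2x^4 + 2x^3 y - x^2 y^2 + 5y^4 is a
# sum of squares by finding Q >= 0 (PSD) with p = z^T Q z for z = [x^2, xy, y^2].
import cvxpy as cp

Q = cp.Variable((3, 3), symmetric=True)
constraints = [
    Q >> 0,                       # Gram matrix must be positive semidefinite
    Q[0, 0] == 2,                 # coefficient of x^4
    2 * Q[0, 1] == 2,             # coefficient of x^3 y
    2 * Q[0, 2] + Q[1, 1] == -1,  # coefficient of x^2 y^2
    2 * Q[1, 2] == 0,             # coefficient of x y^3
    Q[2, 2] == 5,                 # coefficient of y^4
]
prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve()
print("status:", prob.status)     # "optimal" means an SOS certificate was found
```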

    CBR and MBR techniques: review for an application in the emergencies domain

    The purpose of this document is to provide an in-depth analysis of current reasoning-engine practice and of the strategies for integrating Case-Based Reasoning (CBR) and Model-Based Reasoning (MBR) that will be used in the design and development of the RIMSAT system. RIMSAT (Remote Intelligent Management Support and Training) is a European Commission funded project designed to: (a) provide an innovative, 'intelligent', knowledge-based solution aimed at improving the quality of critical decisions, and (b) enhance the competencies and responsiveness of individuals and organisations involved in highly complex, safety-critical incidents, irrespective of their location. In other words, RIMSAT aims to design and implement a decision support system that uses Case-Based Reasoning as well as Model-Based Reasoning technology and applies it to the management of emergency situations. This document is part of a deliverable for the RIMSAT project and, although it has been written in close contact with the requirements of the project, it provides an overview broad enough to serve as a state of the art of the integration strategies between CBR and MBR technologies.

    A sufficient condition for the improvement of Restricted Boltzmann Machines

    This thesis explores Restricted Boltzmann Machines (RBMs) and their training, focusing on the minimization of the Kullback-Leibler (KL) divergence. Neural networks and the importance of the KL divergence are introduced and motivated. Examples of KL divergence calculations are demonstrated for various model and target distributions. It is demonstrated that the ability to improve a model by introducing a new parameter, without re-training the existing ones, is not universal. The Ising model is explored as a source of training data, and the work of G. Cossu et al., 'Machine learning determination of dynamical parameters: The Ising model case,' Phys. Rev. B 100, 064304 (2019), in training a set of RBMs on the one-dimensional Ising model is successfully reproduced. Connections between the mathematics of RBMs and lattice Quantum Field Theory (QFT) are explored, and insights from QFT are used to inform the RBM design choices considered. Leveraging these insights, a linearisation procedure is employed to produce a sufficient condition for the possibility of improving an RBM with bilinear inter-layer mixing and a Gaussian hidden layer through the introduction of new parameters, without the need to re-train the already-existing parameters. This condition is tested, and potential issues with the linearisation procedure are highlighted.
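
    For reference, a minimal sketch of the discrete KL divergence used as the training objective above, evaluated for a toy target and model distribution (neither taken from the thesis):

```python
# D_KL(p || q) = sum_s p(s) * log(p(s) / q(s)) over all states s.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

target = np.array([0.5, 0.25, 0.125, 0.125])   # "data" distribution over 4 states
model = np.array([0.25, 0.25, 0.25, 0.25])     # uniform model distribution
print(kl_divergence(target, model))            # positive; zero only if model == target
```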