
34 research outputs found


Efficient Neural Network Verification Using Branch and Bound

Author: Wang Shiqi
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2022

Neural networks have demonstrated great success in modern machine learning systems. However, they remain susceptible to incorrect corner-case behaviors, often behaving unpredictably and producing surprisingly wrong results. It is therefore desirable to formally guarantee their trustworthiness with respect to certain robustness properties when they are applied to safety- and security-sensitive systems such as autonomous vehicles and aircraft. Unfortunately, the task is extremely challenging due to the complexity of neural networks, and traditional formal methods were not efficient enough to verify practical properties. Recently, the Branch and Bound (BaB) framework has been extended to neural network verification and has shown great success in accelerating it. This dissertation focuses on state-of-the-art neural network verifiers using BaB. We first introduce two efficient neural network verifiers, ReluVal and Neurify, which use basic BaB approaches involving two main steps: (1) they recursively split the original verification problem into easier independent subproblems by splitting input or hidden neurons; (2) for each subproblem, we propose an efficient and tight bound propagation method called symbolic interval analysis, which produces sound estimated bounds for outputs using convex linear relaxations. Both ReluVal and Neurify are three orders of magnitude faster than the previous state-of-the-art formal analysis systems on standard verification benchmarks. However, basic BaB approaches like Neurify have to formulate each subproblem as a Linear Programming (LP) problem and solve it using expensive LP solvers, which significantly limits overall efficiency. This is because each step of BaB introduces neuron split constraints (e.g., a ReLU neuron larger or smaller than 0), which are hard for existing efficient bound propagation methods to handle.
We propose novel designs of bound propagation methods, α-CROWN and its improved variant β-CROWN, which solve the verification problem by optimizing Lagrangian multipliers with gradient ascent, without calling any expensive LP solvers. They build on previous work, CROWN, a generalized efficient bound propagation method using linear relaxation. BaB verification using α-CROWN and β-CROWN not only provides tighter output estimations than most bound propagation methods but can also fully leverage GPU acceleration through massive parallelization. Combining our methods with BaB powers the state-of-the-art verifier α,β-CROWN (alpha-beta-CROWN), the winning tool in the second International Verification of Neural Networks Competition (VNN-COMP 2021) with the highest total score. Our α,β-CROWN can be three orders of magnitude faster than LP-solver-based BaB verifiers and is notably faster than all existing approaches on GPUs. Recently, we further generalized β-CROWN and proposed an efficient iterative approach that can tighten all intermediate layer bounds under neuron split constraints and strengthen bound tightness without LP solvers. This new approach in BaB can greatly improve the efficiency of α,β-CROWN, especially on several challenging benchmarks. Lastly, we study verifiable training, which incorporates verification properties into training procedures to enhance the verifiable robustness of trained models and to scale verification to larger models and datasets. We propose two general verifiable training frameworks: (1) MixTrain, which can significantly improve the efficiency and scalability of verifiable training, and (2) adaptive verifiable training, which can improve verifiable robustness by accounting for label similarity. The combination of verifiable training and BaB-based verifiers opens promising directions for more efficient and scalable neural network verification.
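The two BaB steps described in the abstract (split the problem, then bound each subproblem) can be illustrated with a minimal sketch. It is not ReluVal, Neurify, or any CROWN variant: it uses plain interval arithmetic instead of symbolic intervals or linear relaxation, splits only input dimensions, and checks the hypothetical property "the scalar output is positive on the whole input box". The names `affine_bounds`, `output_bounds`, `forward`, and `verify_positive` are illustrative assumptions.

```python
def affine_bounds(W, b, lo, hi):
    """Sound interval bounds for W @ x + b when lo <= x <= hi elementwise."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        out_lo.append(bias + sum(w * (lo[i] if w >= 0 else hi[i]) for i, w in enumerate(row)))
        out_hi.append(bias + sum(w * (hi[i] if w >= 0 else lo[i]) for i, w in enumerate(row)))
    return out_lo, out_hi


def output_bounds(net, lo, hi):
    """Interval bounds on the scalar output of a one-hidden-layer ReLU network."""
    (W1, b1), (W2, b2) = net
    l, u = affine_bounds(W1, b1, lo, hi)
    l = [max(0.0, v) for v in l]  # ReLU is monotone, so it maps bounds to bounds
    u = [max(0.0, v) for v in u]
    l, u = affine_bounds(W2, b2, l, u)
    return l[0], u[0]


def forward(net, x):
    """Concrete evaluation of the same network at a single point."""
    (W1, b1), (W2, b2) = net
    hidden = [max(0.0, bias + sum(w * xi for w, xi in zip(row, x)))
              for row, bias in zip(W1, b1)]
    return b2[0] + sum(w * h for w, h in zip(W2[0], hidden))


def verify_positive(net, lo, hi, max_boxes=10000):
    """Branch and bound: try to prove output > 0 for every input in [lo, hi]."""
    stack = [(list(lo), list(hi))]
    visited = 0
    while stack:
        box_lo, box_hi = stack.pop()
        visited += 1
        if visited > max_boxes:
            return "unknown", None  # budget exhausted, no conclusion
        lower, _ = output_bounds(net, box_lo, box_hi)
        if lower > 0:
            continue  # bound step: this subproblem is proven
        mid = [(a + b) / 2 for a, b in zip(box_lo, box_hi)]
        if forward(net, mid) <= 0:
            return "falsified", mid  # concrete counterexample
        # Branch step: split the widest input dimension at its midpoint.
        d = max(range(len(box_lo)), key=lambda i: box_hi[i] - box_lo[i])
        left_hi = list(box_hi)
        left_hi[d] = mid[d]
        right_lo = list(box_lo)
        right_lo[d] = mid[d]
        stack.append((box_lo, left_hi))
        stack.append((right_lo, box_hi))
    return "verified", None


# Toy network: f(x) = relu(x + 2) - relu(x) + bias, checked on x in [-1, 1].
net_ok = (([[1.0], [1.0]], [2.0, 0.0]), ([[1.0, -1.0]], [0.0]))
net_bad = (([[1.0], [1.0]], [2.0, 0.0]), ([[1.0, -1.0]], [-3.0]))
print(verify_positive(net_ok, [-1.0], [1.0]))
print(verify_positive(net_bad, [-1.0], [1.0]))
```

On the first toy network the root box cannot be proven because interval arithmetic loses the correlation between the two hidden neurons, but a single split resolves it. That looseness is the motivation for the tighter bound propagation methods the abstract describes.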

Columbia University Academic Commons

Optimization-based design of fault-tolerant avionics

Author: Khamvilai Thanakorn
Publication venue: Georgia Institute of Technology
Publication date: 14/01/2022

This dissertation considers the problem of improving the self-consciousness of avionic systems using numerical optimization techniques, with an emphasis on UAV applications. Here, self-consciousness means a system's awareness of itself that allows it to make reliable decisions on certain crucial aspects. In the avionics and aerospace industry, those aspects are SWaP-C as well as safety and reliability. The decision-making processes that optimize these aspects, which are the main contributions of this work, are presented. In addition, implementations in various applications related to avionics and UAVs are provided. The first half of this thesis lays out the background of avionics development, ranging from the mechanical gyroscope to current state-of-the-art electronic systems. The relevant mathematics of convex optimization and its algorithms, which are used to formulate the self-consciousness problem, is also provided. The latter half presents two problem formulations, for redundancy design automation and for reconfigurable middleware. The first formulation focuses on minimizing SWaP-C while satisfying safety and reliability requirements. The second aims to maximize system safety and reliability by introducing a fault-tolerant capability via the task scheduler of the middleware or RTOS. The use of these two formulations is demonstrated in four aerospace applications: reconfigurable multicore avionics, a SITL simulation of a UAV GNC system, a modular drone, and a HITL simulation of a fault-tolerant distributed engine control architecture.
Ph.D.

Scholarly Materials And Research @ Georgia Tech

A Polyhedral Study of Mixed 0-1 Set

Author: Agra Agostinho
Doostmohammadi Mahdi
Publication venue: ALIO-EURO 2011
Publication date: 01/01/2011

We consider a variant of the well-known single node fixed charge network flow set with constant capacities. This set arises from the relaxation of more general mixed integer sets, such as lot-sizing problems with multiple suppliers. We provide a complete polyhedral characterization of the convex hull of the given set.

University of Strathclyde Institutional Repository


Deep Energy-Based Models for Structured Prediction

Author: Belanger David
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/11/2017

We introduce structured prediction energy networks (SPENs), a flexible framework for structured prediction. A deep architecture is used to define an energy function over candidate outputs, and predictions are produced by gradient-based energy minimization. This deep energy captures dependencies between labels that would lead to intractable graphical models, and allows us to automatically discover discriminative features of the structured output. Furthermore, practitioners can explore a wide variety of energy function architectures without having to hand-design prediction and learning methods for each model. This is because all of our prediction and learning methods interact with the energy only via the standard interface for deep networks: forward and back-propagation. In a variety of applications, we find that we can obtain better accuracy using approximate minimization of non-convex deep energy functions than baseline models that employ simple energy functions for which exact minimization is tractable.
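Prediction by gradient-based energy minimization, as described above, can be sketched on a toy problem. The energy here is a hand-written convex quadratic over relaxed labels in [0, 1] with a hand-derived gradient, not a deep network differentiated by back-propagation as in SPENs. The names `energy`, `energy_grad`, and `predict`, and the `coupling` parameter, are assumptions made for illustration.

```python
def energy(y, scores, coupling):
    """Toy energy: stay close to per-label scores, and keep neighbours similar."""
    local = sum((yi - si) ** 2 for yi, si in zip(y, scores))
    pairwise = sum((y[i] - y[i + 1]) ** 2 for i in range(len(y) - 1))
    return local + coupling * pairwise


def energy_grad(y, scores, coupling):
    """Analytic gradient of the toy energy with respect to the outputs y."""
    grad = [2 * (yi - si) for yi, si in zip(y, scores)]
    for i in range(len(y) - 1):
        d = 2 * coupling * (y[i] - y[i + 1])
        grad[i] += d
        grad[i + 1] -= d
    return grad


def predict(scores, coupling=1.0, step=0.1, iters=200):
    """Projected gradient descent over relaxed labels y in [0, 1]^n."""
    y = [0.5] * len(scores)
    for _ in range(iters):
        grad = energy_grad(y, scores, coupling)
        y = [min(1.0, max(0.0, yi - step * gi)) for yi, gi in zip(y, grad)]
    return y


scores = [0.9, 0.1, 0.9]
y_hat = predict(scores)
print([round(v, 3) for v in y_hat])  # the middle label is pulled toward its neighbours
```

The pairwise term plays the role of the label dependencies the abstract mentions: the middle output moves away from its own score because its neighbours disagree with it.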

ScholarWorks@UMass Amherst

Applications of biased-randomized algorithms and simheuristics in integrated logistics

Author: Do Carmo Martins Leandro
Publication venue: Fundacio per la Universitat Oberta de Catalunya
Publication date: 13/09/2021

Transportation and logistics (T&L) activities play a vital role in the development of many businesses across different industries. With the growing number of people living in urban areas and the expansion of the on-demand economy and e-commerce, the number of transportation and delivery services has increased considerably. Consequently, several urban problems, such as traffic congestion and pollution, have been exacerbated. Many related problems can be formulated as combinatorial optimization problems (COPs). Since most of them are NP-hard, finding optimal solutions with exact methods is often impractical in a reasonable amount of time. In realistic settings, the increasing need for 'instant' decision-making further limits the use of exact methods. Under these circumstances, this thesis aims at: (i) identifying realistic COPs from different industries; (ii) developing different classes of approximate solution approaches to solve the identified T&L problems; (iii) conducting a series of computational experiments to validate and measure the performance of the developed approaches. The novel concept of 'agile optimization' is introduced, which refers to the combination of biased-randomized heuristics with parallel computing to deal with real-time decision-making.
Information and network technologies
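As a rough illustration of what a biased-randomized heuristic can look like, the sketch below randomizes a nearest-neighbour tour construction: candidates are ranked greedily, and a skewed (geometric-like) draw favours the top of the ranking without always picking it. The problem (a small travelling-salesman instance), the geometric skew, the parameter `beta`, and all function names are assumptions chosen for illustration and are not taken from the thesis. The independent starts run sequentially here; since they do not depend on each other, they are the natural unit to parallelize.

```python
import math
import random


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def biased_index(n, beta, rng):
    """Draw an index in [0, n) with a geometric-like skew toward 0."""
    u = 1.0 - rng.random()  # in (0, 1], so log(u) is defined
    return int(math.log(u) / math.log(1.0 - beta)) % n


def biased_nearest_neighbor(pts, beta, rng):
    """Build one tour; each step picks among the closest unvisited points."""
    tour = [0]
    unvisited = set(range(1, len(pts)))
    while unvisited:
        here = pts[tour[-1]]
        ranked = sorted(unvisited, key=lambda j: math.dist(here, pts[j]))
        nxt = ranked[biased_index(len(ranked), beta, rng)]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def multi_start(pts, beta=0.3, starts=200, seed=0):
    """Run many independent randomized constructions and keep the best tour."""
    rng = random.Random(seed)
    candidates = (biased_nearest_neighbor(pts, beta, rng) for _ in range(starts))
    return min(candidates, key=lambda t: tour_length(t, pts))


points = [(0, 0), (3, 0), (3, 2), (0, 2), (1, 1), (2, 1)]
best = multi_start(points)
print(best, round(tour_length(best, points), 3))
```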

Tesis Doctorals en Xarxa

Quayside Operations Planning Under Uncertainty

Author: Iris Cagatay
Jin Jian Gang
Lee Der-Hong
Publication date: 01/01/2015

Online Research Database In Technology

A novel dynamic and social perspective of multiple criteria decision making

Author: Corrente Salvatore
Di Stefano Alessandro
Giacchi Evelina
Greco Salvatore
La Corte Aurelio
Scatá Marialisa
Publication date: 01/01/2015

Teesside University's Research Repository

Novel neural architectures & algorithms for efficient inference

Author: Kag Anil
Publication date: 30/08/2023

In the last decade, the machine learning universe embraced deep neural networks (DNNs) wholeheartedly with the advent of neural architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), transformers, etc. These models have empowered many applications, such as ChatGPT, Imagen, etc., and have achieved state-of-the-art (SOTA) performance on many vision, speech, and language modeling tasks. However, SOTA performance comes with various issues, such as large model size, compute-intensive training, increased inference latency, higher working memory, etc. This thesis aims at improving the resource efficiency of neural architectures, i.e., significantly reducing the computational, storage, and energy consumption of a DNN without any significant loss in performance. Towards this goal, we explore novel neural architectures as well as training algorithms that allow low-capacity models to achieve near SOTA performance. We divide this thesis into two dimensions: Efficient Low Complexity Models, and Input Hardness Adaptive Models. Along the first dimension, i.e., Efficient Low Complexity Models, we improve DNN performance by addressing instabilities in the existing architectures and training methods. We propose novel neural architectures inspired by ordinary differential equations (ODEs) to reinforce input signals and attend to salient feature regions. In addition, we show that carefully designed training schemes improve the performance of existing neural networks. We divide this exploration into two parts: (a) Efficient Low Complexity RNNs. We improve RNN resource efficiency by addressing poor gradients, noise amplifications, and BPTT training issues. First, we improve RNNs by solving ODEs that eliminate vanishing and exploding gradients during the training. To do so, we present Incremental Recurrent Neural Networks (iRNNs) that keep track of increments in the equilibrium surface.
Next, we propose Time Adaptive RNNs that mitigate the noise propagation issue in RNNs by modulating the time constants in the ODE-based transition function. We empirically demonstrate the superiority of ODE-based neural architectures over existing RNNs. Finally, we propose the Forward Propagation Through Time (FPTT) algorithm for training RNNs. We show that FPTT yields significant gains compared to the more conventional Backward Propagation Through Time (BPTT) scheme. (b) Efficient Low Complexity CNNs. Next, we improve CNN architectures by reducing their resource usage. CNNs require greater depth to generate high-level features, resulting in computationally expensive models. We design a novel residual block, the Global layer, that constrains the input and output features by approximately solving partial differential equations (PDEs). It yields better receptive fields than traditional convolutional blocks and thus results in shallower networks. Further, we reduce the model footprint by enforcing a novel inductive bias that formulates the output of a residual block as a spatial interpolation between high-compute anchor pixels and low-compute cheaper pixels. This results in spatially interpolated convolutional blocks (SI-CNNs) that have better compute and performance trade-offs. Finally, we propose an algorithm that enforces various distributional constraints during training in order to achieve better generalization. We refer to this scheme as distributionally constrained learning (DCL). In the second dimension, i.e., Input Hardness Adaptive Models, we introduce the notion of the hardness of any input relative to any architecture. In the first dimension, a neural network allocates the same resources, such as compute, storage, and working memory, for all the inputs. It inherently assumes that all examples are equally hard for a model.
In this dimension, we challenge this assumption, arguing on the basis of input hardness that some inputs are relatively easy for a network to predict compared to others. Input hardness enables us to create selective classifiers wherein a low-capacity network handles simple inputs while abstaining from a prediction on the complex inputs. Next, we create hybrid models that route the hard inputs from the low-capacity abstaining network to a high-capacity expert model. We design various architectures that adhere to this hybrid inference style. Further, input hardness enables us to selectively distill the knowledge of a high-capacity model into a low-capacity model by discarding hard inputs during the distillation procedure. Finally, we conclude this thesis by sketching out various future research directions that emerge as extensions of the ideas explored in this work.
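The hybrid inference style described above (a small model that abstains on hard inputs and routes them to an expert) can be sketched as follows. The abstract does not say how input hardness is measured, so this sketch assumes the simplest stand-in, the small model's maximum softmax probability compared against a fixed threshold. The two toy models, the threshold value, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_predict(x, small_model, expert_model, threshold=0.8):
    """Answer with the small model when it is confident, else route to the expert."""
    probs = softmax(small_model(x))
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), "small"
    # The small model abstains: treat the input as hard and pay for the expert.
    expert_probs = softmax(expert_model(x))
    return expert_probs.index(max(expert_probs)), "expert"


# Hypothetical stand-ins for a low-capacity and a high-capacity classifier.
def small_model(x):
    return [4.0 * x, 0.0]


def expert_model(x):
    return [0.0, 3.0]


print(hybrid_predict(2.0, small_model, expert_model))  # confident, stays on the small model
print(hybrid_predict(0.1, small_model, expert_model))  # uncertain, routed to the expert
```

The threshold trades accuracy against cost: raising it sends more inputs to the expert model, lowering it keeps more of them on the cheap path.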

Boston University Institutional Repository (OpenBU)

A Unified Framework for Gradient-based Hyperparameter Optimization and Meta-learning

Author: Franceschi Luca
Publication venue: UCL (University College London)
Publication date: 28/06/2021

Machine learning algorithms and systems are progressively becoming part of our societies, leading to a growing need to build a vast multitude of accurate, reliable and interpretable models, which should, where possible, exploit similarities among tasks. Automating segments of machine learning itself seems a natural step towards delivering increasingly capable systems that perform well in both the big-data and the few-shot learning regimes. Hyperparameter optimization (HPO) and meta-learning (MTL) constitute two building blocks of this growing effort. We explore these two topics under a unifying perspective, presenting a mathematical framework linked to bilevel programming that captures existing similarities and translates into procedures of practical interest rooted in algorithmic differentiation. We discuss the derivation, applicability and computational complexity of these methods and establish several approximation properties for a class of objective functions of the underlying bilevel programs. In HPO, these algorithms generalize and extend previous work on gradient-based methods. In MTL, the resulting framework subsumes classic and emerging strategies and provides a starting basis from which to build and analyze novel techniques. A series of examples and numerical simulations offer insight and highlight some limitations of these approaches. Experiments on larger-scale problems show the potential gains of the proposed methods in real-world applications. Finally, we develop two extensions of the basic algorithms, one to optimize a class of discrete hyperparameters (graph edges) in an application to relational learning, and one to tune online learning rate schedules for training neural network models, an old but crucially important issue in machine learning.
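The bilevel view of hyperparameter optimization can be made concrete with a one-dimensional sketch. The inner problem fits a weight w by gradient descent on a regularized training loss, the outer problem scores w on a validation loss, and the gradient of the outer loss with respect to the regularization strength is obtained by differentiating through the inner iterations in forward mode. The specific losses, the constants `a`, `b`, `eta`, and the function names are assumptions chosen so the result can be checked by hand, and are not taken from the thesis.

```python
def inner_gd_with_tangent(lam, a=2.0, eta=0.1, steps=500):
    """Minimize 0.5*(w - a)**2 + 0.5*lam*w**2 by gradient descent.

    Alongside w, propagate z = dw/dlam through every update (forward mode).
    """
    w, z = 0.0, 0.0
    for _ in range(steps):
        grad_w = (w - a) + lam * w
        # Differentiate the update w - eta*grad_w with respect to lam,
        # using the values of w and z from before the update.
        z = z - eta * (z + lam * z + w)
        w = w - eta * grad_w
    return w, z


def outer_loss(w, b=1.0):
    """Validation loss evaluated at the trained weight."""
    return 0.5 * (w - b) ** 2


def hypergradient(lam, b=1.0):
    """d(outer_loss)/d(lam) by the chain rule: (w - b) * dw/dlam."""
    w, z = inner_gd_with_tangent(lam)
    return (w - b) * z


lam = 0.5
print(hypergradient(lam))  # close to -8/27 for these constants
```

For these constants the inner minimizer is w = a / (1 + lam), so at lam = 0.5 the exact hypergradient is (4/3 - 1) * (-8/9) = -8/27, which the iteration approaches once the inner loop has converged. A negative value means that increasing the regularization strength would lower the validation loss.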

UCL Discovery