34 research outputs found

    Optimization-based design of fault-tolerant avionics

    This dissertation considers the problem of improving the self-consciousness of avionic systems using numerical optimization techniques, with an emphasis on UAV applications. Here, self-consciousness means a system's awareness of its own state, which lets it make reliable decisions on crucial aspects. In the avionics and aerospace industry, those aspects are size, weight, power, and cost (SWaP-C), together with safety and reliability. The main contributions of this work are the decision-making processes that optimize these aspects. Implementations on several avionics and UAV applications are also provided. The first half of the thesis lays out the background of avionics development, from the mechanical gyroscope to current state-of-the-art electronic systems, and presents the relevant mathematics of convex optimization and its algorithms, which are used to formulate the self-consciousness problem. The second half presents two problem formulations, one for redundancy design automation and one for reconfigurable middleware. The first formulation minimizes SWaP-C while satisfying safety and reliability requirements. The second aims to maximize system safety and reliability by introducing a fault-tolerant capability via the task scheduler of the middleware or real-time operating system (RTOS). The use of these two formulations is shown in four aerospace applications: reconfigurable multicore avionics, a software-in-the-loop (SITL) simulation of a UAV guidance, navigation, and control (GNC) system, a modular drone, and a hardware-in-the-loop (HITL) simulation of a fault-tolerant distributed engine control architecture.
    Degree: Ph.D.

    A Polyhedral Study of Mixed 0-1 Set

    We consider a variant of the well-known single-node fixed-charge network flow set with constant capacities. This set arises as a relaxation of more general mixed-integer sets, such as those in lot-sizing problems with multiple suppliers. We provide a complete polyhedral characterization of the convex hull of the given set.

    Applications of biased-randomized algorithms and simheuristics in integrated logistics

    Transportation and logistics (T&L) activities play a vital role in the development of many businesses across different industries. With the growing number of people living in urban areas and the expansion of the on-demand economy and e-commerce, the number of transportation and delivery services has increased considerably. Consequently, several urban problems, such as traffic congestion and pollution, have been exacerbated. Many of the related problems can be formulated as combinatorial optimization problems (COPs). Since most of them are NP-hard, finding optimal solutions with exact methods is often impractical within a reasonable amount of time. In realistic settings, the increasing need for 'instant' decision-making further limits the use of exact methods. Under these circumstances, this thesis aims at: (i) identifying realistic COPs from different industries; (ii) developing different classes of approximate solution approaches to solve the identified T&L problems; (iii) conducting a series of computational experiments to validate and measure the performance of the developed approaches. The novel concept of 'agile optimization' is introduced, which refers to the combination of biased-randomized heuristics with parallel computing to deal with real-time decision-making.
    Doctoral programme: Information and Network Technologies
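As a rough illustration of what a biased-randomized heuristic can look like, the sketch below randomizes a nearest-neighbour tour construction: candidates are sorted by the greedy criterion and one is drawn with a quasi-geometric bias toward the best. The routing problem, the function names, and the `beta` parameter are assumptions chosen for illustration and are not taken from the thesis. The repeated runs are independent, which is what would make them easy to distribute across parallel workers.

```python
import math
import random


def biased_index(n, beta, rng):
    """Draw a position in [0, n) from a quasi-geometric distribution biased toward 0."""
    u = 1.0 - rng.random()  # u lies in (0, 1], so log(u) is defined
    return int(math.log(u) / math.log(1.0 - beta)) % n


def tour_length(points, tour):
    """Length of the closed tour that visits the points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def biased_randomized_nn(points, beta, rng):
    """Nearest-neighbour construction with a biased-random choice at each step."""
    unvisited = list(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        # Greedy ordering: closest candidate first.
        unvisited.sort(key=lambda j: math.dist(last, points[j]))
        # Biased-random pick instead of always taking the closest one.
        tour.append(unvisited.pop(biased_index(len(unvisited), beta, rng)))
    return tour


def multi_start(points, beta=0.3, runs=200, seed=0):
    """Run the randomized construction several times and keep the shortest tour."""
    rng = random.Random(seed)
    best = min(
        (biased_randomized_nn(points, beta, rng) for _ in range(runs)),
        key=lambda t: tour_length(points, t),
    )
    return best, tour_length(points, best)


if __name__ == "__main__":
    demo_points = [(0, 0), (2, 1), (1, 3), (4, 0), (3, 3), (0, 2)]
    print(multi_start(demo_points))
```

Values of `beta` close to 1 make the construction almost purely greedy, while values close to 0 make it almost uniformly random.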

    Novel neural architectures & algorithms for efficient inference

    In the last decade, the machine learning universe embraced deep neural networks (DNNs) wholeheartedly with the advent of neural architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), transformers, etc. These models have empowered many applications, such as ChatGPT, Imagen, etc., and have achieved state-of-the-art (SOTA) performance on many vision, speech, and language modeling tasks. However, SOTA performance comes with various issues, such as large model size, compute-intensive training, increased inference latency, higher working memory, etc. This thesis aims at improving the resource efficiency of neural architectures, i.e., significantly reducing the computational, storage, and energy consumption of a DNN without any significant loss in performance. Towards this goal, we explore novel neural architectures as well as training algorithms that allow low-capacity models to achieve near SOTA performance. We divide this thesis into two dimensions: Efficient Low Complexity Models, and Input Hardness Adaptive Models. Along the first dimension, i.e., Efficient Low Complexity Models, we improve DNN performance by addressing instabilities in the existing architectures and training methods. We propose novel neural architectures inspired by ordinary differential equations (ODEs) to reinforce input signals and attend to salient feature regions. In addition, we show that carefully designed training schemes improve the performance of existing neural networks. We divide this exploration into two parts: (a) Efficient Low Complexity RNNs. We improve RNN resource efficiency by addressing poor gradients, noise amplifications, and BPTT training issues. First, we improve RNNs by solving ODEs that eliminate vanishing and exploding gradients during the training. To do so, we present Incremental Recurrent Neural Networks (iRNNs) that keep track of increments in the equilibrium surface. 
Next, we propose Time Adaptive RNNs that mitigate the noise propagation issue in RNNs by modulating the time constants in the ODE-based transition function. We empirically demonstrate the superiority of ODE-based neural architectures over existing RNNs. Finally, we propose the Forward Propagation Through Time (FPTT) algorithm for training RNNs. We show that FPTT yields significant gains compared to the more conventional Backward Propagation Through Time (BPTT) scheme. (b) Efficient Low Complexity CNNs. Next, we improve CNN architectures by reducing their resource usage. CNNs require greater depth to generate high-level features, resulting in computationally expensive models. We design a novel residual block, the Global layer, that constrains the input and output features by approximately solving partial differential equations (PDEs). It yields better receptive fields than traditional convolutional blocks and thus results in shallower networks. Further, we reduce the model footprint by enforcing a novel inductive bias that formulates the output of a residual block as a spatial interpolation between high-compute anchor pixels and low-compute cheaper pixels. This results in spatially interpolated convolutional blocks (SI-CNNs) that have better compute and performance trade-offs. Finally, we propose an algorithm that enforces various distributional constraints during training in order to achieve better generalization. We refer to this scheme as distributionally constrained learning (DCL). In the second dimension, i.e., Input Hardness Adaptive Models, we introduce the notion of the hardness of any input relative to any architecture. In the first dimension, a neural network allocates the same resources, such as compute, storage, and working memory, for all the inputs. It inherently assumes that all examples are equally hard for a model. 
In this dimension, we challenge this assumption, reasoning from input hardness that some inputs are relatively easy for a network to predict compared to others. Input hardness enables us to create selective classifiers wherein a low-capacity network handles simple inputs while abstaining from a prediction on the complex inputs. Next, we create hybrid models that route the hard inputs from the low-capacity abstaining network to a high-capacity expert model. We design various architectures that adhere to this hybrid inference style. Further, input hardness enables us to selectively distill the knowledge of a high-capacity model into a low-capacity model by discarding hard inputs during the distillation procedure. Finally, we conclude this thesis by sketching out various future research directions that emerge as extensions of the ideas explored in this work.
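The hybrid inference style described above can be pictured with a minimal sketch. It assumes that input hardness is approximated by the small model's maximum softmax probability and that a fixed threshold decides when to abstain; the thesis may well use a different hardness measure or a learned routing function. The toy models and the names `hybrid_predict`, `small_model`, and `expert_model` are hypothetical stand-ins.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_predict(x, small_model, expert_model, threshold=0.9):
    """Return (predicted_class, model_used) for input x.

    The small model answers when it is confident enough; otherwise it
    abstains and the input is routed to the expert model.
    """
    probs = softmax(small_model(x))
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), "small"
    expert_probs = softmax(expert_model(x))
    return expert_probs.index(max(expert_probs)), "expert"


# Toy stand-ins (assumptions): each model maps a scalar input to two logits.
def small_model(x):
    return [x, -x]


def expert_model(x):
    return [10.0 * x, -10.0 * x]


if __name__ == "__main__":
    for x in (2.0, 0.1, -0.1, -3.0):
        print(x, hybrid_predict(x, small_model, expert_model))
```

Raising the threshold sends more inputs to the expert, trading compute for accuracy; lowering it does the opposite.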

    A Unified Framework for Gradient-based Hyperparameter Optimization and Meta-learning

    Machine learning algorithms and systems are progressively becoming part of our societies, leading to a growing need to build a vast multitude of accurate, reliable and interpretable models, which should, where possible, exploit similarities among tasks. Automating segments of machine learning itself seems a natural step toward delivering increasingly capable systems that perform well in both the big-data and the few-shot learning regimes. Hyperparameter optimization (HPO) and meta-learning (MTL) constitute two building blocks of this growing effort. We explore these two topics under a unifying perspective, presenting a mathematical framework linked to bilevel programming that captures existing similarities and translates into procedures of practical interest rooted in algorithmic differentiation. We discuss the derivation, applicability and computational complexity of these methods and establish several approximation properties for a class of objective functions of the underlying bilevel programs. In HPO, these algorithms generalize and extend previous work on gradient-based methods. In MTL, the resulting framework subsumes classic and emerging strategies and provides a starting basis from which to build and analyze novel techniques. A series of examples and numerical simulations offer insight and highlight some limitations of these approaches. Experiments on larger-scale problems show the potential gains of the proposed methods in real-world applications. Finally, we develop two extensions of the basic algorithms: one optimizes a class of discrete hyperparameters (graph edges) in an application to relational learning, and the other tunes online learning rate schedules for training neural network models, an old but crucially important issue in machine learning.
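To make the bilevel view of HPO concrete, here is a minimal sketch of one common way to obtain a hypergradient: differentiate the validation loss through an unrolled inner gradient descent in forward mode. The setting is an assumption chosen for simplicity (one-dimensional ridge regression with the regularization strength `lam` as the only hyperparameter), and the function name and data are hypothetical; it is not presented as the thesis's own algorithm.

```python
def hypergradient(lam, train, val, lr=0.05, steps=100):
    """Return (validation_loss, d validation_loss / d lam).

    Inner problem: minimize 0.5 * sum((w*x - y)^2) + 0.5 * lam * w^2 over w
    with `steps` iterations of gradient descent starting from w = 0.
    Outer objective: 0.5 * sum((w*x - y)^2) on the validation set.
    """
    a = sum(x * x for x, _ in train)
    b = sum(x * y for x, y in train)
    w = 0.0
    z = 0.0  # z tracks dw/dlam along the unrolled iterations
    for _ in range(steps):
        grad_w = (a + lam) * w - b
        # Derivative of the update w - lr * grad_w with respect to lam,
        # evaluated at the current (pre-update) w.
        z = z - lr * ((a + lam) * z + w)
        w = w - lr * grad_w
    val_loss = 0.5 * sum((w * x - y) ** 2 for x, y in val)
    dval_dw = sum((w * x - y) * x for x, y in val)
    return val_loss, dval_dw * z  # chain rule


if __name__ == "__main__":
    train = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2)]
    val = [(1.5, 1.4), (2.5, 2.6)]
    lam = 1.0
    for _ in range(20):  # plain gradient descent on the hyperparameter
        loss, g = hypergradient(lam, train, val)
        lam = max(0.0, lam - 5.0 * g)
    print(lam, loss)
```

Forward mode keeps only one extra scalar per hyperparameter, which suits problems with few hyperparameters; reverse mode is the usual choice when there are many.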