137 research outputs found

    Novel neural architectures & algorithms for efficient inference

    In the last decade, the machine learning universe embraced deep neural networks (DNNs) wholeheartedly with the advent of neural architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), transformers, etc. These models have empowered many applications, such as ChatGPT, Imagen, etc., and have achieved state-of-the-art (SOTA) performance on many vision, speech, and language modeling tasks. However, SOTA performance comes with various issues, such as large model size, compute-intensive training, increased inference latency, higher working memory, etc. This thesis aims at improving the resource efficiency of neural architectures, i.e., significantly reducing the computational, storage, and energy consumption of a DNN without any significant loss in performance. Towards this goal, we explore novel neural architectures as well as training algorithms that allow low-capacity models to achieve near SOTA performance. We divide this thesis into two dimensions: Efficient Low Complexity Models and Input Hardness Adaptive Models. Along the first dimension, i.e., Efficient Low Complexity Models, we improve DNN performance by addressing instabilities in the existing architectures and training methods. We propose novel neural architectures inspired by ordinary differential equations (ODEs) to reinforce input signals and attend to salient feature regions. In addition, we show that carefully designed training schemes improve the performance of existing neural networks. We divide this exploration into two parts: (a) Efficient Low Complexity RNNs. We improve RNN resource efficiency by addressing poor gradients, noise amplification, and BPTT training issues. First, we improve RNNs by solving ODEs that eliminate vanishing and exploding gradients during training. To do so, we present Incremental Recurrent Neural Networks (iRNNs) that keep track of increments in the equilibrium surface. 
Next, we propose Time Adaptive RNNs that mitigate the noise propagation issue in RNNs by modulating the time constants in the ODE-based transition function. We empirically demonstrate the superiority of ODE-based neural architectures over existing RNNs. Finally, we propose the Forward Propagation Through Time (FPTT) algorithm for training RNNs. We show that FPTT yields significant gains compared to the more conventional Backward Propagation Through Time (BPTT) scheme. (b) Efficient Low Complexity CNNs. Next, we improve CNN architectures by reducing their resource usage. CNNs require greater depth to generate high-level features, resulting in computationally expensive models. We design a novel residual block, the Global layer, that constrains the input and output features by approximately solving partial differential equations (PDEs). It yields better receptive fields than traditional convolutional blocks and thus results in shallower networks. Further, we reduce the model footprint by enforcing a novel inductive bias that formulates the output of a residual block as a spatial interpolation between high-compute anchor pixels and low-compute cheaper pixels. This results in spatially interpolated convolutional blocks (SI-CNNs) that have better compute and performance trade-offs. Finally, we propose an algorithm that enforces various distributional constraints during training in order to achieve better generalization. We refer to this scheme as distributionally constrained learning (DCL). In the second dimension, i.e., Input Hardness Adaptive Models, we introduce the notion of the hardness of any input relative to any architecture. In the first dimension, a neural network allocates the same resources, such as compute, storage, and working memory, for all the inputs. It inherently assumes that all examples are equally hard for a model. 
In this dimension, we challenge this assumption, reasoning from input hardness that some inputs are easier for a network to predict than others. Input hardness enables us to create selective classifiers wherein a low-capacity network handles simple inputs while abstaining from a prediction on the complex inputs. Next, we create hybrid models that route the hard inputs from the low-capacity abstaining network to a high-capacity expert model. We design various architectures that adhere to this hybrid inference style. Further, input hardness enables us to selectively distill the knowledge of a high-capacity model into a low-capacity model by cleverly discarding hard inputs during the distillation procedure. Finally, we conclude this thesis by sketching out various interesting future research directions that emerge as an extension of different ideas explored in this work.
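
The hybrid inference style described in this abstract can be illustrated with a short sketch. It is not the thesis's method: it assumes, purely for illustration, that input hardness is approximated by the low-capacity model's maximum softmax probability. The names `hybrid_predict`, `toy_small`, `toy_expert`, and the `threshold` value are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], Sequence[float]]  # maps an input to class logits


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_predict(
    x: float, small_model: Model, expert_model: Model, threshold: float = 0.9
) -> Tuple[int, str]:
    """Return (predicted class, which model answered).

    Assumption: an input counts as "easy" when the small model's top
    softmax probability reaches `threshold`; otherwise the small model
    abstains and the input is routed to the expert model.
    """
    probs = softmax(small_model(x))
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), "small"
    expert_probs = softmax(expert_model(x))
    return expert_probs.index(max(expert_probs)), "expert"


# Toy stand-ins for a low-capacity and a high-capacity network (hypothetical).
def toy_small(x: float) -> List[float]:
    return [4.0 * x, 0.0]


def toy_expert(x: float) -> List[float]:
    return [-1.0, 1.0] if x < 0.5 else [1.0, -1.0]


if __name__ == "__main__":
    print(hybrid_predict(2.0, toy_small, toy_expert))  # confident: small model answers
    print(hybrid_predict(0.1, toy_small, toy_expert))  # uncertain: routed to the expert
```

The threshold trades compute for accuracy in this sketch: a higher value sends more inputs to the expert model, a lower value lets the small model answer more often.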

    Machine Learning and Its Application to Reacting Flows

    This open access book introduces and explains machine learning (ML) algorithms and techniques developed for statistical inference on a complex process or system and their applications to simulations of chemically reacting turbulent flows. These two fields, ML and turbulent combustion, each have a large body of work and knowledge of their own, and this book brings them together and explains the complexities and challenges involved in applying ML techniques to simulate and study reacting flows. This matters for the world’s total primary energy supply (TPES), since more than 90% of this supply comes from combustion technologies, and because combustion has non-negligible effects on the environment. Although alternative technologies based on renewable energies are emerging, their share of the TPES is currently less than 5%, and a complete paradigm shift is needed to replace combustion sources. Whether this is practical or not is an entirely different question, and the answer depends on the respondent. However, a pragmatic analysis suggests that the combustion share of TPES is likely to be more than 70% even by 2070. Hence, it is prudent to take advantage of ML techniques to improve combustion sciences and technologies so that efficient and “greener” combustion systems that are friendlier to the environment can be designed. The book covers the current state of the art in these two topics and outlines the challenges involved and the merits and drawbacks of using ML for turbulent combustion simulations, including avenues that can be explored to overcome the challenges. The required mathematical equations and background are discussed with ample references for readers to find further detail if they wish. This book is unique in that no other book offers similar coverage of topics, ranging from big data analysis and machine learning algorithms to their applications in combustion science and system design for energy generation.

    Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

    The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcoming the limitations of purely data-driven approaches, and eventually to increasing the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.

    International Conference on Mathematical Analysis and Applications in Science and Engineering – Book of Extended Abstracts

    The present volume on Mathematical Analysis and Applications in Science and Engineering - Book of Extended Abstracts of the ICMASC’2022 collects the extended abstracts of the talks presented at the International Conference on Mathematical Analysis and Applications in Science and Engineering – ICMA2SC'22 that took place in the beautiful city of Porto, Portugal, from June 27th to June 29th, 2022 (3 days). Its aim was to bring together researchers in every discipline of applied mathematics, science, engineering, industry, and technology, to discuss the development of new mathematical models, theories, and applications that contribute to the advancement of scientific knowledge and practice. Authors proposed research in topics including partial and ordinary differential equations, integer and fractional order equations, linear algebra, numerical analysis, operations research, discrete mathematics, optimization, control, probability, computational mathematics, amongst others. The conference was designed to maximize the involvement of all participants and to present state-of-the-art research and the latest achievements.

    Physics-guided machine learning for turbulence closure and reduced-order modeling

    Recent advances in scientific machine learning have started to show promising results in fluid mechanics. Despite their early success, the application of data-driven methods to turbulent flow simulation is non-trivial due to underlying highly nonlinear multiscale interactions. Here we present novel physics-guided machine learning (PGML) approaches for turbulence closure model discovery and model order reduction of complex multiscale systems. Our turbulence closure model discovery approach is based on exploiting big data without relying on underlying turbulence physics and learning from physical constraints. Specifically, we propose a frame invariant neural network model that can incorporate physical symmetries as inductive biases, and we illustrate its stable performance in coarse-grid simulation without any kind of post-processing of the predicted subgrid-scale (SGS) closure model. The frame invariant SGS model guarantees desired physical constraints without the need for any regularization terms and ultimately generalizes to different initial conditions and Reynolds numbers. To achieve data-efficient training and improved generalization, we propose a concatenated neural network with an uncertainty quantification mechanism that leverages information from hierarchies of models. The concatenated neural network is based on embedding information from cheap-to-evaluate low-fidelity approximations into a certain hidden layer of the neural network both during training and deployment. This framework is demonstrated for a range of problems, including turbulent boundary layer reconstruction and reduced-order modeling of the vortex merging process. Furthermore, we investigate the seamless integration of sparse and noisy observations into non-intrusive reduced-order models, and hybrid models where the dynamical core of the system is modeled using the known governing equations and the subgrid-scale processes are modeled using a deep learning model. 
To summarize, this work builds a bridge between extensive physics-based theories and data-driven modeling paradigms and paves the way for using hybrid physics-informed learning algorithms to generate predictive technologies for turbulent fluid flows.

    Effects of inhaled therapies on pulmonary hypertension and right ventricular function in cardiac surgery

    In Canada, an estimated 30,000 cardiac surgeries are performed each year (1). Right ventricular failure (RVF) remains a common complication in patients undergoing cardiac surgery. The incidence of severe acute perioperative RVF can range from 0.1% after cardiotomy to 20-30% after left ventricular assist device implantation (2). The occurrence of RVF is even more frequent in the presence of pulmonary hypertension (PH). Consequences of RVF in cardiac surgery include perioperative deterioration and adverse outcomes such as difficult separation from cardiopulmonary bypass (CPB), increased use of intravenous (IV) vasoactive agents and an increased risk of mortality. Therefore, the diagnosis and treatment of PH and right ventricular (RV) dysfunction are essential in the perioperative period to circumvent complications. Continuous and simultaneous monitoring of both pulmonary artery pressure (Ppa) and RV pressure (Prv) waveforms using pulmonary artery catheterization is an important monitoring tool in cardiac surgery patients for early detection of RV dysfunction and for evaluating response to treatment. Therapeutic strategies in this context should focus on reducing RV afterload and improving RV function while avoiding systemic hypotension. 
The hypotheses of this thesis are the following: 1) inhaled aerosolized vasodilators are superior to IV administered agents for the treatment and management of PH in cardiac surgery, 2) the combination of inhaled epoprostenol and inhaled milrinone (iE&iM) is an effective strategy to facilitate separation from CPB and reduce the requirements for IV inotropes, 3) not all patients have a positive vasodilator response to iE&iM, 4) response to iE&iM is associated with changes in RV and PA pressure waveforms, and 5) RV outflow tract (RVOT) gradient and RV maximal rate of pressure rise during early systole (dP/dt) have the potential to be pharmacodynamic markers of response to treatment. The work in this thesis consists of 3 studies. The first is a systematic review and meta-analysis of randomized controlled trials showing that administration of inhaled vasodilators for the treatment of PH during cardiac surgery is associated with improved RV performance compared to IV administered agents. The second study is a retrospective cohort analysis of 128 patients receiving iE&iM before CPB. This study showed that 77% of patients have a vasodilator response to iE&iM treatment. A favorable vasodilator response was associated with more frequent easy separation from CPB and lower use of IV inotropes post-CPB. In addition, more severe PH at baseline is shown to be predictive of a positive pulmonary vasodilator response, while a high European System for Cardiac Operative Risk Evaluation score (EuroSCORE) II is a predictor of non-response to treatment. The last study of this thesis is a prospective cohort study including 26 patients receiving iE&iM with continuous monitoring of the Prv waveform, demonstrating the safety and efficacy of this treatment approach in improving RV function.
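
As a rough illustration of the dP/dt marker mentioned in this abstract, the maximal rate of pressure rise can be estimated from a uniformly sampled pressure waveform with finite differences. This is a simplified sketch, not the thesis's measurement procedure: it takes the maximum over all supplied samples rather than isolating early systole, applies no filtering, and the function name `max_dp_dt`, the units, and the sample values are hypothetical.

```python
from typing import Sequence


def max_dp_dt(pressure_mmhg: Sequence[float], sample_rate_hz: float) -> float:
    """Largest forward-difference slope of the waveform, in mmHg/s.

    Assumptions: samples are uniformly spaced at `sample_rate_hz` and the
    caller has already selected the portion of the beat of interest.
    """
    if len(pressure_mmhg) < 2:
        raise ValueError("need at least two pressure samples")
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    # (P[i+1] - P[i]) / dt, with dt = 1 / sample_rate_hz
    return max(
        (later - earlier) * sample_rate_hz
        for earlier, later in zip(pressure_mmhg, pressure_mmhg[1:])
    )


if __name__ == "__main__":
    # Made-up samples at 100 Hz, for illustration only.
    samples = [5.0, 5.0, 10.0, 20.0, 25.0, 24.0]
    print(max_dp_dt(samples, 100.0))
```

Real waveforms are noisy, so a practical pipeline would typically smooth the signal before differentiating; that step is omitted here.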

    Three-dimensional stochastic cubic nonlinear wave equation with almost space-time white noise

    We study the stochastic cubic nonlinear wave equation (SNLW) with an additive noise on the three-dimensional torus $\mathbb{T}^3$. In particular, we prove local well-posedness of the (renormalized) SNLW when the noise is almost a space-time white noise. In recent years, the paracontrolled calculus has played a crucial role in the well-posedness study of singular SNLW on $\mathbb{T}^3$ by Gubinelli, Koch, and the first author (2018), Okamoto, Tolomeo, and the first author (2020), and Bringmann (2020). Our approach, however, does not rely on the paracontrolled calculus. We instead proceed with the second order expansion and study the resulting equation for the residual term, using multilinear dispersive smoothing. Comment: 55 pages. Expanded Remark 1.10. Published in Stoch. Partial Differ. Equ. Anal. Comput. (2022). Special issue dedicated to Professor István Gyöngy on the occasion of his seventieth birthday.