3 research outputs found

    Uncertainty Aware ML-based surrogate models for particle accelerators: A Study at the Fermilab Booster Accelerator Complex

    Standard deep learning methods, such as ensemble models, Bayesian neural networks, and quantile regression models, provide estimates of prediction uncertainty for data-driven deep learning models. However, their applications can be limited by heavy memory and inference costs and by a limited ability to properly capture out-of-distribution uncertainty. Additionally, some of these models require post-training calibration, which limits their use in continuous learning applications. In this paper, we present a new approach that provides predictions with calibrated uncertainties, including out-of-distribution contributions, and compare it to standard methods on the Fermi National Accelerator Laboratory (FNAL) Booster accelerator complex.
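The abstract's ensemble baseline can be illustrated with a minimal sketch: several independently trained surrogates each predict the same target, and the disagreement among their predictions serves as an epistemic uncertainty estimate. The function name and the numeric predictions below are hypothetical, not from the paper.

```python
import statistics

def ensemble_prediction(predictions):
    """Combine per-model predictions into a mean prediction and an
    epistemic uncertainty estimate (the spread across ensemble members)."""
    mean = statistics.fmean(predictions)
    std = statistics.stdev(predictions)  # disagreement ~ uncertainty
    return mean, std

# Hypothetical outputs of five independently trained surrogate models
preds = [1.02, 0.98, 1.05, 0.99, 1.01]
mu, sigma = ensemble_prediction(preds)
```

This is exactly the memory and inference cost the abstract criticizes: every query requires a forward pass through each ensemble member, and the spread says little about inputs far outside the training distribution.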

    Extreme Risk Mitigation in Reinforcement Learning using Extreme Value Theory

    Risk-sensitive reinforcement learning (RL) has garnered significant attention in recent years due to the growing interest in deploying RL agents in real-world scenarios. A critical aspect of risk awareness involves modeling highly rare risk events (rewards) that could potentially lead to catastrophic outcomes. These infrequent occurrences present a formidable challenge for data-driven methods aiming to capture such risky events accurately. While risk-aware RL techniques do exist, their level of risk aversion relies heavily on the precision of the state-action value function estimate when modeling these rare occurrences. Our work proposes to enhance the resilience of RL agents when faced with very rare and risky events by refining the extreme-value predictions of the state-action value function distribution. To achieve this, we formulate the extreme values of the state-action value function distribution as parameterized distributions, drawing inspiration from the principles of extreme value theory (EVT). This approach effectively addresses the issue of infrequent occurrence by leveraging EVT-based parameterization. Importantly, we theoretically demonstrate the advantages of employing these parameterized distributions in contrast to other risk-averse algorithms. Our evaluations show that the proposed method outperforms other risk-averse RL algorithms on a diverse range of benchmark tasks, each encompassing distinct risk scenarios.
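The EVT machinery the abstract refers to can be sketched with the classic peaks-over-threshold recipe: exceedances over a high threshold are approximately Generalized Pareto distributed, so fitting a GPD to the few observed tail samples gives a parametric handle on events rarer than anything in the data. The method-of-moments fit and the sample values below are a generic illustration of this idea, not the paper's actual estimator.

```python
import statistics

def fit_gpd_mom(exceedances):
    """Method-of-moments fit of a Generalized Pareto Distribution (GPD)
    to threshold exceedances, as in peaks-over-threshold EVT."""
    m = statistics.fmean(exceedances)
    v = statistics.variance(exceedances)
    xi = 0.5 * (1.0 - m * m / v)         # shape: tail heaviness
    sigma = 0.5 * m * (m * m / v + 1.0)  # scale
    return xi, sigma

# Hypothetical worst-case cost samples that exceeded a chosen threshold u
u = 10.0
tail = [10.4, 11.2, 10.1, 12.5, 10.8, 11.9, 10.3, 13.4]
xi, sigma = fit_gpd_mom([x - u for x in tail])
```

The payoff is that a handful of tail samples yields a smooth parametric tail, from which quantiles beyond the observed maximum can be extrapolated instead of estimated empirically from almost no data.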

    AutoNF: Automated Architecture Optimization of Normalizing Flows with Unconstrained Continuous Relaxation Admitting Optimal Discrete Solution

    Normalizing flows (NF) build upon invertible neural networks and have wide applications in probabilistic modeling. Currently, building a powerful yet computationally efficient flow model relies on empirical fine-tuning over a large design space. While introducing neural architecture search (NAS) to NF is desirable, the invertibility constraint of NF brings new challenges to existing NAS methods, whose application is limited to unstructured neural networks. Developing efficient NAS methods specifically for NF remains an open problem. We present AutoNF, the first automated NF architectural optimization framework. First, we present a new mixture distribution formulation that allows efficient differentiable architecture search of flow models without violating the invertibility constraint. Second, under the new formulation, we convert the original NP-hard combinatorial NF architectural optimization problem into an unconstrained continuous relaxation admitting the discrete optimal architectural solution, circumventing the loss of optimality due to binarization in architectural optimization. We evaluate AutoNF on various density estimation datasets and show its superior performance-cost trade-offs over a set of existing hand-crafted baselines.
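The core trick the abstract describes, searching over architectures differentiably by replacing a hard discrete choice with a convex mixture of candidate densities, can be sketched in a toy form. Here two Gaussian densities of different scale stand in for candidate flow models, and a softmax over continuous architecture parameters produces the mixture weights; AutoNF's actual formulation and objective differ, so treat this purely as an illustration of the relaxation idea.

```python
import math

def softmax(logits):
    """Map continuous architecture parameters to mixture weights."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def gauss_logpdf(x, scale):
    """Toy stand-in for a candidate flow's log-density."""
    return -0.5 * (x / scale) ** 2 - math.log(scale * math.sqrt(2.0 * math.pi))

def mixture_logpdf(x, arch_logits, scales):
    """Relaxed objective: log-density of a convex mixture of candidate
    models, differentiable in the architecture logits."""
    w = softmax(arch_logits)
    p = sum(wi * math.exp(gauss_logpdf(x, s)) for wi, s in zip(w, scales))
    return math.log(p)
```

Because each candidate density is individually a valid (invertible) model and the mixture only reweights them, the relaxation never produces an non-invertible intermediate architecture; as the weights concentrate on one candidate, the mixture collapses back to a discrete choice.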