864 research outputs found

    Pseudo Label Selection is a Decision Problem

    Pseudo-labeling is a simple and effective approach to semi-supervised learning. It requires criteria that guide the selection of pseudo-labeled data, and these criteria have been shown to crucially affect pseudo-labeling's generalization performance. Several such criteria exist and have been shown to work reasonably well in practice. However, their performance often depends on the initial model fit on labeled data. Early overfitting can be propagated to the final model by choosing instances with overconfident but wrong predictions, a phenomenon often called confirmation bias. In two recent works, we demonstrate that pseudo-label selection (PLS) can be naturally embedded into decision theory. This paves the way for BPLS, a Bayesian framework for PLS that mitigates the issue of confirmation bias. At its heart is a novel selection criterion: an analytical approximation of the posterior predictive of pseudo-samples and labeled data. We derive this selection criterion by proving Bayes-optimality of this "pseudo posterior predictive". We empirically assess BPLS for generalized linear models, non-parametric generalized additive models and Bayesian neural networks on simulated and real-world data. When faced with data prone to overfitting and thus a high chance of confirmation bias, BPLS outperforms traditional PLS methods. The decision-theoretic embedding further allows us to render PLS more robust towards the involved modeling assumptions. To achieve this goal, we introduce a multi-objective utility function. We demonstrate that the latter can be constructed to account for different sources of uncertainty and explore three examples: model selection, accumulation of errors and covariate shift.
    Comment: Accepted for presentation at the 46th German Conference on Artificial Intelligence
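For orientation, a conventional confidence-based selection criterion of the kind the abstract contrasts BPLS with might look like the sketch below. It is not the BPLS criterion. The function name `select_pseudo_labels` and the toy probabilities are assumptions made purely for illustration.

```python
def select_pseudo_labels(probs, k):
    """Pick the k unlabeled instances with the most confident predictions.

    probs: list of per-class probability lists, one per unlabeled instance
           (hypothetical model output, not taken from the paper).
    Returns a list of (instance_index, pseudo_label) pairs.
    """
    scored = []
    for i, p in enumerate(probs):
        confidence = max(p)
        label = p.index(confidence)  # predicted class becomes the pseudo-label
        scored.append((confidence, i, label))
    # Highest confidence first; ties broken by instance index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(i, label) for _, i, label in scored[:k]]


# Toy predictions for three unlabeled instances and two classes.
probs = [[0.55, 0.45], [0.10, 0.90], [0.98, 0.02]]
selected = select_pseudo_labels(probs, k=2)
print(selected)
```

Because the criterion trusts the model's own confidence, an overfitted initial model that is confidently wrong would have its errors selected first, which is the confirmation bias the abstract describes.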

    Pseudo-Label Selection: Insights From Decision Theory


    An Empirical Study of Prior-Data Conflicts in Bayesian Neural Networks

    Imprecise Probabilities (IP) allow for the representation of incomplete information. In the context of Bayesian statistics, this is achieved by generalized Bayesian inference, where a set of priors is used instead of a single prior [1, Chapter 7.4]. The latter has been shown to be particularly useful in the case of prior-data conflict, where evidence from data (likelihood) contradicts prior information. In these practically highly relevant scenarios, classical (precise) probability models typically fail to adequately represent the uncertainty arising from this conflict. Generalized Bayesian inference by IP, however, was proven to handle these prior-data conflicts well when inference in canonical exponential families is considered [3]. Our study [2] aims at assessing the extent to which these problems of precise probability models are also present in Bayesian neural networks (BNNs). Unlike traditional neural networks, BNNs utilize stochastic weights that can be learned by updating the prior belief with the likelihood for each individual weight using Bayes’ rule. In light of this, we investigate the impact of prior selection on the posterior of BNNs in the context of prior-data conflict. While the literature often advocates for the use of normal priors centered around 0, the consequences of this choice remain unknown when the data suggests high values for the individual weights. For this purpose, we designed synthetic datasets which were generated using neural networks (NNs) with fixed high-weight values. This approach enables us to measure the effect of prior-data conflict, as well as reduce the model uncertainty by knowing the exact weights and functional relationship.
    We utilized BNNs that use the Mean-Field Variational Inference (MFVI) approach, which has not only seen an increasing interest due to its scalability but also allows analytical computation of the posterior distributions, as opposed to simulation-based methods like Markov Chain Monte Carlo (MCMC). In MFVI, the posterior distribution is approximated by a tractable distribution with a factorized form. In our work [2, Chapter 4.2], we provide evidence that exact priors centered around the exact weights, which are known from the generating NN, outperform their inexact counterparts centered around zero in terms of predictive accuracy, data efficiency and reasonable uncertainty estimations. These results directly imply that selecting a prior centered around 0 may be unintentionally informative, as previously noted by [4], resulting in significant losses in prediction accuracy and data efficiency and rendering uncertainty estimation impractical. BNNs learned under prior-data conflict resulted in posterior means that were a weighted average of the prior mean and the highest-probability values of the likelihood. They therefore differed significantly from the correct weights while also exhibiting an unreasonably low posterior variance, indicating a high degree of certainty in their estimates. Varying the prior variance yielded similar observations, with models using priors in conflict with the data exhibiting overconfidence in their posterior estimates compared to those using exact priors. To investigate the potential of IP methods, we are currently studying the effect of expectation-valued interval parameters on generating reasonable uncertainty predictions. Overall, our preliminary results show that classical BNNs produce overly confident but erroneous predictions in the presence of prior-data conflict. These findings motivate using IP methods in Deep Learning.
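The weighted-average behaviour described above can be illustrated with a conjugate normal model for a single weight. This is a toy analogue, not the MFVI BNN setup of the study. The function `normal_posterior`, the noise variance and the data values are illustrative assumptions.

```python
def normal_posterior(prior_mean, prior_var, data, noise_var):
    """Posterior of a normal mean with known noise variance (conjugate update).

    The posterior mean is a precision-weighted average of the prior mean and
    the sample mean; the posterior variance depends only on the precisions,
    not on how far the data lie from the prior.
    """
    n = len(data)
    sample_mean = sum(data) / n
    prior_prec = 1.0 / prior_var   # precision contributed by the prior
    data_prec = n / noise_var      # precision contributed by the data
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * sample_mean)
    return post_mean, post_var


# Hypothetical observations suggesting a "high" weight of 5.
data = [5.0, 5.0, 5.0, 5.0]

# Prior centered at 0 (in conflict with the data) vs. prior centered at 5.
conflict_mean, conflict_var = normal_posterior(0.0, 1.0, data, noise_var=4.0)
exact_mean, exact_var = normal_posterior(5.0, 1.0, data, noise_var=4.0)

print(conflict_mean, conflict_var)  # 2.5 0.5
print(exact_mean, exact_var)        # 5.0 0.5
```

In this toy model the conflicting prior pulls the posterior mean halfway towards 0, yet both posteriors report the same variance, so the conflict is not reflected in the stated uncertainty.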

    Regression-Based Model Error Compensation for Hierarchical MPC Building Energy Management System

    One of the major challenges in the development of energy management systems (EMSs) for complex buildings is accurate modeling. To address this, we propose an EMS that combines a Model Predictive Control (MPC) approach with data-driven model error compensation. The hierarchical MPC approach consists of two layers: an aggregator controls the overall energy flows of the building from an aggregated perspective, while a distributor distributes heating and cooling powers to individual temperature zones. The controllers of both layers employ regression-based error estimation to predict and incorporate the model error. The proposed approach is evaluated in a software-in-the-loop simulation using a physics-based digital twin model. Simulation results show the efficacy and robustness of the proposed approach.
    Comment: 8 pages, 4 figures. To be published in 2023 IEEE Conference on Control Technology and Applications (CCTA) proceedings

    Interpreting Generalized Bayesian Inference by Generalized Bayesian Inference

    The concept of safe Bayesian inference [4] with learning rates [5] has recently sparked a lot of research, e.g. in the context of generalized linear models [2]. It is occasionally also referred to as generalized Bayesian inference, e.g. in [2, page 1], a fact that should let IP advocates sit up straight and take notice, as this term is commonly used to describe Bayesian updating of credal sets. On this poster, we demonstrate that this reminiscence extends beyond terminology.

    The Triad of Idiopathic Normal-Pressure Hydrocephalus: A Clinical Practice Case Report

    An 89-year-old white male presented with memory impairment, slowed responsiveness, and frequent falls over a two-year duration. Six months earlier, the patient was believed to have had a “dementia with parkinsonian features,” but showed no response to increasing doses of both donepezil and carbidopa-levodopa. Urinary urgency was believed to have been due to prostate hypertrophy. A head CT with contrast revealed moderate ventriculomegaly in the setting of mild diffuse cortical atrophy. A diagnosis of idiopathic normal-pressure hydrocephalus (INPH) was made.

    Implicit Incorporation of Heuristics in MPC-Based Control of a Hydrogen Plant

    The replacement of fossil fuels in combination with an increasing share of renewable energy sources leads to an increased focus on decentralized microgrids. One option is the local production of green hydrogen in combination with fuel cell vehicles (FCVs). In this paper, we develop a control strategy based on Model Predictive Control (MPC) for an energy management system (EMS) of a hydrogen plant, which is currently under installation in Offenbach, Germany. The plant includes an electrolyzer, a compressor, a low-pressure storage tank, and six medium-pressure storage tanks with complex heuristic physical coupling during the filling and extraction of hydrogen. Since these heuristics are too complex to be incorporated into the optimal control problem (OCP) explicitly, we propose a novel approach to do so implicitly. First, the MPC is executed without considering them. Then, the so-called allocator uses a heuristic model (of arbitrary complexity) to verify whether the MPC's plan is valid. If not, it introduces additional constraints to the MPC's OCP to implicitly respect the tanks' pressure levels. The MPC is executed again and the new plan is applied to the plant. Simulation results with real-world measurement data of the facility's energy management and realistic fueling scenarios show its advantages over rule-based control.
    Comment: 8 pages, 3 figures. To be published in IEEE 3rd International Conference on Power Electronics, Smart Grid, and Renewable Energy (PESGRE 2023) proceedings
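The plan, verify, constrain, re-plan loop described in the abstract can be sketched as follows. This is a heavily simplified illustration, not the paper's controller: `solve_mpc` is a toy stand-in for the OCP, `allocator_check` is a hypothetical one-tank heuristic, and all quantities are made-up numbers.

```python
def solve_mpc(targets, max_rate):
    """Toy stand-in for the OCP: follow the targets, capped by per-step limits."""
    return [min(t, cap) for t, cap in zip(targets, max_rate)]


def allocator_check(plan, tank_level, tank_capacity):
    """Hypothetical heuristic model: simulate filling a single tank.

    Returns {step: allowed_flow} for every step where the plan would
    overfill the tank; an empty dict means the plan is valid.
    """
    caps = {}
    level = tank_level
    for k, flow in enumerate(plan):
        free = tank_capacity - level
        if flow > free:
            caps[k] = free
            flow = free
        level += flow
    return caps


def plan_with_allocator(targets, max_rate, tank_level, tank_capacity):
    # 1) Run the MPC without the heuristics.
    plan = solve_mpc(targets, max_rate)
    # 2) Let the allocator verify the plan against its heuristic model.
    extra = allocator_check(plan, tank_level, tank_capacity)
    if extra:
        # 3) Add the extra constraints and run the MPC once more.
        tightened = [min(cap, extra.get(k, cap)) for k, cap in enumerate(max_rate)]
        plan = solve_mpc(targets, tightened)
    return plan


plan = plan_with_allocator([3, 3, 3], [4, 4, 4], tank_level=2, tank_capacity=8)
print(plan)  # [3, 3, 0]
```

The point of the structure is that the heuristic model only has to be evaluated on a candidate plan, so it can be arbitrarily complex without entering the optimization problem itself.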

    Incorporating Human Preferences in Decision Making for Dynamic Multi-Objective Optimization in Model Predictive Control

    We present a new two-step approach for automated a posteriori decision making in multi-objective optimization problems, i.e., selecting a solution from the Pareto front. In the first step, a knee region is determined based on the normalized Euclidean distance from a hyperplane defined by the furthest Pareto solution and the negative unit vector. The size of the knee region depends on the Pareto front’s shape and a design parameter. In the second step, preferences for all objectives formulated by the decision maker, e.g., 50–20–30 for a 3D problem, are translated into a hyperplane which is then used to choose a final solution from the knee region. This way, the decision maker’s preference can be incorporated, with its influence depending on the Pareto front’s shape and a design parameter, while knee points are favored if they exist. The proposed approach is applied in simulation for the multi-objective model predictive control (MPC) of the two-dimensional rocket car example and the energy management system of a building.

    Evolutionary Many-objective Optimization of Hybrid Electric Vehicle Control: From General Optimization to Preference Articulation

    Many real-world optimization problems have more than three objectives, which has triggered increasing research interest in developing efficient and effective evolutionary algorithms for solving many-objective optimization problems. However, most many-objective evolutionary algorithms have only been evaluated on benchmark test functions, and few have been applied to real-world optimization problems. To move a step forward, this paper presents a case study of solving a many-objective hybrid electric vehicle controller design problem using three state-of-the-art algorithms, namely, a decomposition-based evolutionary algorithm (MOEA/D), a non-dominated-sorting-based genetic algorithm (NSGA-III), and a reference-vector-guided evolutionary algorithm (RVEA). We start with a typical setting aiming at approximating the Pareto front without introducing any user preferences. Based on the analyses of the approximated Pareto front, we introduce a preference articulation method and embed it in the three evolutionary algorithms for identifying solutions that the decision-maker prefers. Our experimental results demonstrate that by incorporating user preferences into many-objective evolutionary algorithms, we are not only able to gain deep insight into the trade-off relationships between the objectives, but also to achieve high-quality solutions reflecting the decision-maker’s preferences. In addition, our experimental results indicate that each of the three algorithms examined in this work has its unique advantages that can be exploited when applied to the optimization of real-world problems.