
    Impact of continuous veno-venous hemofiltration on acid-base balance

    Background: Continuous veno-venous hemofiltration (CVVH) appears to have a significant and variable impact on acid-base balance. However, the pathogenesis of these acid-base effects remains poorly understood. The aim of this study was to understand the nature of acid-base changes in critically ill patients with acute renal failure (ARF) during CVVH by applying quantitative methods of biophysical analysis (Stewart-Figge methodology). Methods: We studied forty patients with ARF receiving CVVH in the intensive care unit. We retrieved the biochemical data from computerized records and conducted quantitative biophysical analysis. We measured serum Na+, K+, Mg2+, Cl-, HCO3-, phosphate, ionized Ca2+, albumin, lactate and arterial blood gases, and calculated the following Stewart-Figge variables: apparent strong ion difference (SIDa), effective strong ion difference (SIDe) and strong ion gap (SIG). Results: Before treatment, patients had mild acidemia (pH 7.31) secondary to metabolic acidosis (bicarbonate: 19.8 mmol/L; base excess: -5.9 mEq/L). This acidosis was due to increased unmeasured anions (SIG: 12.3 mEq/L), hyperphosphatemia (1.86 mmol/L) and hyperlactatemia (2.08 mmol/L), and was attenuated by the alkalinizing effect of hypoalbuminemia (22.5 g/L). After commencing CVVH, the acidemia was corrected within 24 hours (pH 7.31 vs. 7.41, p < 0.0001). This correction was associated with decreases in the strong ion gap (12.3 vs. 8.8 mEq/L, p < 0.0001), phosphate concentration (1.86 vs. 1.49 mmol/L, p < 0.0001) and serum chloride concentration (102 vs. 98.5 mmol/L, p < 0.0001). After 3 days of CVVH, however, patients developed alkalemia (pH 7.46) secondary to metabolic alkalosis (bicarbonate: 29.8 mmol/L; base excess: 6.7 mEq/L). This alkalemia appeared secondary to a further decrease in SIG to 6.7 mEq/L (p < 0.0001) and a further decrease in serum phosphate to 0.77 mmol/L (p < 0.0001) in the setting of persistent hypoalbuminemia (21.0 g/L; p = 0.56). Conclusions: CVVH corrects the metabolic acidosis of acute renal failure through its effect on unmeasured anions, phosphate and chloride. This correction, coupled with the effect of hypoalbuminemia, results in the development of metabolic alkalosis after 72 hours of treatment.
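    The Stewart-Figge variables above can be computed directly from routine biochemistry. A minimal sketch using the standard Figge equations, where the Na+, K+, ionized Ca2+ and Mg2+ inputs are assumed typical figures (not data reported in the study), while Cl-, lactate, pH, bicarbonate, albumin and phosphate are taken from the pre-treatment values in the abstract:

    ```python
    def sid_apparent(na, k, ca_ion, mg, cl, lactate):
        # SIDa (mEq/L): strong cations minus strong anions.
        # Ionized Ca2+ and Mg2+ are divalent, hence the factor of 2.
        return na + k + 2 * ca_ion + 2 * mg - cl - lactate

    def sid_effective(ph, hco3, albumin_g_l, phosphate_mmol_l):
        # SIDe (mEq/L): bicarbonate plus the pH-dependent negative charge
        # carried by albumin and inorganic phosphate (Figge's formulas).
        alb_charge = albumin_g_l * (0.123 * ph - 0.631)
        pi_charge = phosphate_mmol_l * (0.309 * ph - 0.469)
        return hco3 + alb_charge + pi_charge

    # Pre-CVVH values from the abstract; remaining electrolytes assumed.
    sida = sid_apparent(na=140, k=4.5, ca_ion=1.1, mg=0.8, cl=102, lactate=2.08)
    side = sid_effective(ph=7.31, hco3=19.8, albumin_g_l=22.5, phosphate_mmol_l=1.86)
    sig = sida - side  # SIG > 0 indicates unmeasured anions
    print(f"SIDa={sida:.1f}  SIDe={side:.1f}  SIG={sig:.1f} mEq/L")
    ```

    With these illustrative cation values the computed SIG lands in the same elevated range the study reports before treatment; the exact figure depends on the assumed electrolytes.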

    Stable opponent shaping in differentiable games

    A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel – from GANs and intrinsic curiosity to multi-agent RL. Opponent shaping is a powerful approach to improve learning dynamics in these games, accounting for player influence on others’ updates. Learning with Opponent-Learning Awareness (LOLA) is a recent algorithm that exploits this response and leads to cooperation in settings like the Iterated Prisoner’s Dilemma. Although experimentally successful, we show that LOLA agents can exhibit ‘arrogant’ behaviour directly at odds with convergence. In fact, remarkably few algorithms have theoretical guarantees applying across all (n-player, non-convex) games. In this paper we present Stable Opponent Shaping (SOS), a new method that interpolates between LOLA and a stable variant named LookAhead. We prove that LookAhead converges locally to equilibria and avoids strict saddles in all differentiable games. SOS inherits these essential guarantees, while also shaping the learning of opponents and consistently either matching or outperforming LOLA experimentally.
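    The dynamics at stake can be seen on a toy two-player bilinear game (an illustration under assumptions, not an experiment from the paper): with losses L1 = x·y and L2 = −x·y, naive simultaneous gradient descent spirals away from the equilibrium at the origin, while a LookAhead-style update, in which each player differentiates its own loss against the opponent's anticipated next step (without differentiating through that step), contracts toward it.

    ```python
    import math

    def simultaneous_gd(x, y, lr, steps):
        # Naive simultaneous gradient descent on L1 = x*y, L2 = -x*y:
        # each player descends its own loss, holding the other fixed.
        for _ in range(steps):
            x, y = x - lr * y, y + lr * x  # per-step norm grows by sqrt(1 + lr^2)
        return x, y

    def lookahead(x, y, lr, steps):
        # LookAhead-style update: each player evaluates its gradient at the
        # opponent's anticipated next position, treated as a constant.
        for _ in range(steps):
            y_pred = y + lr * x  # opponent 2's anticipated plain gradient step
            x_pred = x - lr * y  # opponent 1's anticipated plain gradient step
            x, y = x - lr * y_pred, y + lr * x_pred
        return x, y

    naive = simultaneous_gd(1.0, 1.0, lr=0.1, steps=200)
    la = lookahead(1.0, 1.0, lr=0.1, steps=200)
    # Distance from the equilibrium at the origin after 200 steps:
    print("naive:", math.hypot(*naive), " lookahead:", math.hypot(*la))
    ```

    The naive iterates diverge (the squared norm is multiplied by 1 + lr² each step), while the LookAhead iterates shrink (multiplier 1 − lr² + lr⁴ < 1), matching the local convergence behaviour the paper proves in far greater generality.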

    A baseline for any order gradient estimation in stochastic computation graphs

    By enabling correct differentiation in Stochastic Computation Graphs (SCGs), the infinitely differentiable Monte-Carlo estimator (DiCE) can generate correct estimates for the higher order gradients that arise in, e.g., multi-agent reinforcement learning and meta-learning. However, the baseline term in DiCE that serves as a control variate for reducing variance applies only to first order gradient estimation, limiting the utility of higher-order gradient estimates. To improve the sample efficiency of DiCE, we propose a new baseline term for higher order gradient estimation. This term may be easily included in the objective, and produces unbiased variance-reduced estimators under (automatic) differentiation, without affecting the estimate of the objective itself or of the first order gradient estimate. It reuses the same baseline function (e.g., the state-value function in reinforcement learning) already used for the first order baseline. We provide theoretical analysis and numerical evaluations of this new baseline, which demonstrate that it can dramatically reduce the variance of DiCE’s second order gradient estimators and also show empirically that it reduces the variance of third and fourth order gradients. This computational tool can be easily used to estimate higher order gradients with unprecedented efficiency and simplicity wherever automatic differentiation is utilised, and it has the potential to unlock applications of higher order gradients in reinforcement learning and meta-learning.
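    The construct behind DiCE is the MagicBox operator, exp(τ − ⊥(τ)) with ⊥ denoting stop-gradient: it evaluates to exactly 1, but each pass of differentiation reintroduces the score-function term, so gradients are correct at every order. A minimal sketch in JAX (the operator and the first-order baseline follow the published DiCE formulation; the scalar inputs are illustrative):

    ```python
    import jax

    def magic_box(tau):
        # MagicBox: exp(tau - stop_gradient(tau)). Forward value is 1;
        # under differentiation it contributes d(tau)/d(theta) at any order.
        return jax.numpy.exp(tau - jax.lax.stop_gradient(tau))

    def first_order_baseline(log_prob, b):
        # The original DiCE baseline: adds 0 to the objective's value,
        # but contributes -b * grad(log_prob) under first-order
        # differentiation, which is what reduces estimator variance.
        return (1.0 - magic_box(log_prob)) * b

    lp = 0.7  # illustrative log-probability of a sampled action
    f = lambda t: magic_box(t)
    g1 = jax.grad(f)(lp)               # first derivative of magic_box -> 1.0
    g2 = jax.grad(jax.grad(f))(lp)     # second derivative also -> 1.0
    gb = jax.grad(first_order_baseline)(lp, 3.0)  # baseline gradient -> -3.0
    print(float(f(lp)), float(g1), float(g2), float(gb))
    ```

    Because this baseline multiplies (1 − MagicBox), its variance reduction acts only on the first-order estimator; the baseline proposed in the paper extends the same idea so that the reduction also survives repeated differentiation.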