64 research outputs found

    Reinforcement Learning from LLM Feedback to Counteract Goal Misgeneralization

    We introduce a method to address goal misgeneralization in reinforcement learning (RL), leveraging Large Language Model (LLM) feedback during training. Goal misgeneralization, a type of robustness failure in RL, occurs when an agent retains its capabilities out-of-distribution yet pursues a proxy goal rather than the intended one. Our approach uses LLMs to analyze an RL agent's policies during training and identify potential failure scenarios. The RL agent is then deployed in these scenarios, and a reward model is learnt from the LLM's preferences and feedback. This LLM-informed reward model is used to further train the RL agent on the original dataset. We apply our method to a maze navigation task and show marked improvements in goal generalization, especially in cases where true and proxy goals are somewhat distinguishable and behavioral biases are pronounced. This study demonstrates how the LLM, despite its lack of task proficiency, can efficiently supervise RL agents, providing scalable oversight and valuable insights for enhancing goal-directed learning in RL.

    Utilitarian Beliefs in Social Networks: Explaining the Emergence of Hatred

    We study the dynamics of opinions in a setting where a leader has a payoff that depends on agents' beliefs and where agents derive psychological utility from their beliefs. Agents sample a signal that maximises their utility and then communicate with each other through a network formed by disjoint social groups. The leader can choose to target a finite set of social groups with a specific signal to influence their beliefs and maximise his returns. Heterogeneity in agents' preferences allows us to analyse the evolution of opinions as a dynamical system with asymmetric forces. We apply our model to explain the emergence of hatred and the spread of racism in a society. We show that when information is restricted, the equilibrium level of hatred is determined solely by the belief of the most extremist agent in the group, regardless of the inherent structure of the network. By contrast, when information is dense, the space is completely polarised in equilibrium, with multiple "local truths" that oscillate in periodic cycles. We find that when preferences are uniformly distributed, the equilibrium level of hatred depends solely on the value of the practical punishment associated with holding a hate belief. Our findings suggest that an optimal policy to reduce hatred should focus on increasing the cost associated with holding a racist belief.

    Thermodynamics of Mixing Water with Dimethyl Sulfoxide, as Seen from Computer Simulations

    The Helmholtz free energy, energy, and entropy of mixing of eight different models of dimethyl sulfoxide (DMSO) with four widely used water models are calculated at 298 K over the entire composition range by means of thermodynamic integration along a suitably chosen thermodynamic path, and compared with experimental data. All 32 model combinations considered reproduce the experimental values rather well, within RT (free energy and energy) and R (entropy) at any composition. Quite often the deviation from the experimental data is even smaller, on the order of the uncertainty of the calculated values: 0.1 kJ/mol for the free energy and energy, and 0.1 J/(mol K) for the entropy. On the other hand, none of the model combinations considered can accurately reproduce all three experimental functions simultaneously. Furthermore, the fact that the entropy of mixing changes sign with increasing DMSO mole fraction is only reproduced by a handful of model pairs. We identify the model combinations that (i) give the best reproduction of the experimental free energy while still reproducing the experimental energy and entropy of mixing reasonably well, and (ii) give the best reproduction of the experimental energy and entropy while still reproducing the experimental free energy of mixing reasonably well.

    Local structure of dilute aqueous DMSO solutions, as seen from molecular dynamics simulations

    Information about the structure of dimethyl sulfoxide (DMSO)-water mixtures at relatively low DMSO mole fractions is an important step toward understanding their cryoprotective properties as well as the solvation process of proteins and amino acids. Classical MD simulations, using the potential model combination that best reproduces the free energy of mixing of these compounds, are used to analyze the local structure of DMSO-water mixtures at DMSO mole fractions below 0.2. Significant changes in the local structure of DMSO are observed around the DMSO mole fraction of 0.1. The array of evidence, based on the cluster and the metric and topological parameters of the Voronoi polyhedra distributions, indicates that these changes are associated with the simultaneous increase in the number of DMSO-water hydrogen bonds and decrease in the number of water-water hydrogen bonds with increasing DMSO concentration. The inversion between the dominance of these two types of H-bonds occurs around X-DMSO = 0.1, above which the DMSO-DMSO interactions also start playing an important role. In other words, below the DMSO mole fraction of 0.1, DMSO molecules are mainly solvated by water molecules, while above it, their solvation shell consists of a mixture of water and DMSO. The trigonal, tetrahedral, and trigonal bipyramidal distributions of water shift to lower corresponding order parameter values, indicating the loosening of these orientations. Adding DMSO does not affect the hydrogen bonding between a reference water molecule and its first-neighbor hydrogen-bonded water molecules, while it increases the bent hydrogen bond geometry involving the second ones. The close-packed local structure of the third, fourth, and fifth water neighbors is also reinforced. In accordance with previous theoretical and experimental data, the hydrogen bonding between water and the first, the second, and the third DMSO neighbors is stronger than that with its corresponding water neighbors. At a given DMSO mole fraction, the behavior of the intensity of the high orientational order parameter values indicates that water molecules are more ordered in the vicinity of the hydrophilic group, while their structure is close-packed near the hydrophobic group of DMSO. Published by AIP Publishing.

    Quantitative measurements of the CH radical in sooting diffusion flames at atmospheric pressure

    The potential of laser-induced fluorescence detection of the CH radical using C–X (0–0) excitation is investigated in a sooting methane/air diffusion flame at atmospheric pressure. Fluorescence is detected using the very narrow (<0.4 nm) Q-branch of the C–X (0–0) band, which enables the measurement of CH in sooting flames without interference from PAH fluorescence and soot emissions. Absolute concentrations are obtained using cavity ring-down spectroscopy. 1D CH profiles in the sooting zone are recorded using a CCD camera with an excellent signal-to-noise ratio. The C–X (0–0) excitation associated with Q-branch detection is shown to be three times more efficient than the B–X scheme.

    Investigation of the ability of the corrosion protection of Zn-Mg coatings

    Zn-Mg coatings are currently being developed as a contribution to the next generation of galvanized steel. However, the underlying corrosion mechanism is still under debate. In this paper we show that Raman spectroscopy, alongside electrochemical methods (linear sweep voltammetry and electrochemical impedance spectroscopy), can successfully be used to help identify the corrosion products of Zn-Mg coated galvanized steel (ZMS) when dipped into 3 wt. % NaCl aqueous solution at ambient temperature (25 °C). The results are compared with those for galvanized steel. This study reveals that in the case of ZMS, Mg is mainly anodically dissolved, forming a compact layer of Mg(OH)2/MgO and MgCO3. It is believed that the formation of this compact layer and the insulating properties of MgO, due to its large band gap, are responsible for the increased corrosion resistance of the alloy.