31 research outputs found

    Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

    Foundation models could eventually introduce several pathways for undermining state security: accidents, inadvertent escalation, unintentional conflict, the proliferation of weapons, and interference with human diplomacy are just a few on a long list. The Confidence-Building Measures for Artificial Intelligence workshop, hosted by the Geopolitics Team at OpenAI and the Berkeley Risk and Security Lab at the University of California, brought together a multistakeholder group to think through the tools and strategies to mitigate the potential risks introduced by foundation models to international security. Originating in the Cold War, confidence-building measures (CBMs) are actions that reduce hostility, prevent conflict escalation, and improve trust between parties. The flexibility of CBMs makes them a key instrument for navigating the rapid changes in the foundation model landscape. Participants identified the following CBMs that directly apply to foundation models and which are further explained in these conference proceedings: (1) crisis hotlines; (2) incident sharing; (3) model, transparency, and system cards; (4) content provenance and watermarks; (5) collaborative red teaming and table-top exercises; and (6) dataset and evaluation sharing. Because most foundation model developers are non-government entities, many CBMs will need to involve a wider stakeholder community. These measures can be implemented either by AI labs or by relevant government actors.

    Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

    Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.

    The ALICE experiment at the CERN LHC

    ALICE (A Large Ion Collider Experiment) is a general-purpose, heavy-ion detector at the CERN LHC which focuses on QCD, the strong-interaction sector of the Standard Model. It is designed to address the physics of strongly interacting matter and the quark-gluon plasma at extreme values of energy density and temperature in nucleus-nucleus collisions. Besides running with Pb ions, the physics programme includes collisions with lighter ions, lower energy running and dedicated proton-nucleus runs. ALICE will also take data with proton beams at the top LHC energy to collect reference data for the heavy-ion programme and to address several QCD topics for which ALICE is complementary to the other LHC detectors. The ALICE detector has been built by a collaboration including currently over 1000 physicists and engineers from 105 institutes in 30 countries. Its overall dimensions are 16 × 16 × 26 m³ with a total weight of approximately 10 000 t. The experiment consists of 18 different detector systems, each with its own specific technology choice and design constraints, driven both by the physics requirements and the experimental conditions expected at LHC. The most stringent design constraint is to cope with the extreme particle multiplicity anticipated in central Pb-Pb collisions. The different subsystems were optimized to provide high-momentum resolution as well as excellent Particle Identification (PID) over a broad range in momentum, up to the highest multiplicities predicted for LHC. This will allow for comprehensive studies of hadrons, electrons, muons, and photons produced in the collision of heavy nuclei. Most detector systems are scheduled to be installed and ready for data taking by mid-2008 when the LHC is scheduled to start operation, with the exception of parts of the Photon Spectrometer (PHOS), Transition Radiation Detector (TRD) and Electro Magnetic Calorimeter (EMCal). These detectors will be completed for the high-luminosity ion run expected in 2010. This paper describes in detail the detector components as installed for the first data taking in the summer of 2008.

    Investigating chemical and dynamical processes in the Asian Monsoon UTLS using in-situ and satellite observations of carbon monoxide (CO) and carbonyl sulfide (OCS)

    The UTLS is characterized by significant gradients in trace gas mixing ratios that arise from (i) mixing of different fractions of tropospheric and stratospheric air and (ii) photochemical processing as air rises from the troposphere to the stratosphere (particularly in the tropics). We use satellite and in-situ measurements of two different tracers to investigate these processes in the region of the Asian Monsoon Anticyclone (AMA): carbon monoxide (CO) and carbonyl sulfide (OCS). CO is a short-lived tracer with a photochemical lifetime of ∼1–4 months. CO mixing ratios are sensitive to both photochemical depletion and in-mixing of stratospheric air masses. OCS, on the other hand, can be regarded as photochemically inert in the UTLS (significant photochemical destruction of OCS takes place only in the tropical pipe above ∼22 km altitude). Therefore, OCS is sensitive only to stratospheric in-mixing. Based on observed vertical profiles of the two gases in different positions relative to the core of the AMA, we propose two hypotheses: (1) in and directly above the AMA core, the composition is dominated by photochemical processing; (2) further away from the AMA core, mixing processes become more important. This implies significant active net transport (ascent) of upper tropospheric air into the stratosphere close to the AMA core and a more bi-directional transport regime elsewhere.

    The AI Index 2022 Annual Report

    Welcome to the fifth edition of the AI Index Report! The latest edition includes data from a broad set of academic, private, and nonprofit organizations as well as more self-collected data and original analysis than any previous edition, including an expanded technical performance chapter, a new survey of robotics researchers around the world, data on global AI legislation records in 25 countries, and a new chapter with an in-depth analysis of technical AI ethics metrics. The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI. The report aims to be the world's most credible and authoritative source for data and insights about AI.