31 research outputs found
Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings
Foundation models could eventually introduce several pathways for undermining
state security: accidents, inadvertent escalation, unintentional conflict, the
proliferation of weapons, and the interference with human diplomacy are just a
few on a long list. The Confidence-Building Measures for Artificial
Intelligence workshop hosted by the Geopolitics Team at OpenAI and the Berkeley
Risk and Security Lab at the University of California brought together a
multistakeholder group to think through the tools and strategies to mitigate
the potential risks introduced by foundation models to international security.
Originating in the Cold War, confidence-building measures (CBMs) are actions
that reduce hostility, prevent conflict escalation, and improve trust between
parties. The flexibility of CBMs makes them a key instrument for navigating the
rapid changes in the foundation model landscape. Participants identified the
following CBMs that directly apply to foundation models and are further
explained in these conference proceedings: (1) crisis hotlines; (2) incident
sharing; (3) model, transparency, and system cards; (4) content provenance and
watermarks; (5) collaborative red teaming and table-top exercises; and (6) dataset
and evaluation sharing. Because most foundation model developers are
non-government entities, many CBMs will need to involve a wider stakeholder
community. These measures can be implemented either by AI labs or by relevant
government actors
Global perturbation of stratospheric water and aerosol burden by Hunga eruption
The eruption of the submarine Hunga volcano in January 2022 was associated with a powerful blast that injected volcanic material to altitudes up to 58 km. From a combination of various types of satellite and ground-based observations supported by transport modeling, we show evidence for an unprecedented increase in the global stratospheric water mass by 13% relative to climatological levels, and a 5-fold increase of stratospheric aerosol load, the highest in the last three decades. Owing to the extreme injection altitude, the volcanic plume circumnavigated the Earth in only 1 week and dispersed nearly pole-to-pole in three months. The unique nature and magnitude of the global stratospheric perturbation by the Hunga eruption ranks it among the most remarkable climatic events in the modern observation era, with a range of potential long-lasting repercussions for stratospheric composition and climate
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Humans are capable of strategically deceptive behavior: behaving helpfully in
most situations, but then behaving very differently in order to pursue
alternative objectives when given the opportunity. If an AI system learned such
a deceptive strategy, could we detect it and remove it using current
state-of-the-art safety training techniques? To study this question, we
construct proof-of-concept examples of deceptive behavior in large language
models (LLMs). For example, we train models that write secure code when the
prompt states that the year is 2023, but insert exploitable code when the
stated year is 2024. We find that such backdoor behavior can be made
persistent, so that it is not removed by standard safety training techniques,
including supervised fine-tuning, reinforcement learning, and adversarial
training (eliciting unsafe behavior and then training to remove it). The
backdoor behavior is most persistent in the largest models and in models
trained to produce chain-of-thought reasoning about deceiving the training
process, with the persistence remaining even when the chain-of-thought is
distilled away. Furthermore, rather than removing backdoors, we find that
adversarial training can teach models to better recognize their backdoor
triggers, effectively hiding the unsafe behavior. Our results suggest that,
once a model exhibits deceptive behavior, standard techniques could fail to
remove such deception and create a false impression of safety.
The ALICE experiment at the CERN LHC
ALICE (A Large Ion Collider Experiment) is a general-purpose, heavy-ion detector at the CERN LHC which focuses on QCD, the strong-interaction sector of the Standard Model. It is designed to address the physics of strongly interacting matter and the quark-gluon plasma at extreme values of energy density and temperature in nucleus-nucleus collisions. Besides running with Pb ions, the physics programme includes collisions with lighter ions, lower energy running and dedicated proton-nucleus runs. ALICE will also take data with proton beams at the top LHC energy to collect reference data for the heavy-ion programme and to address several QCD topics for which ALICE is complementary to the other LHC detectors. The ALICE detector has been built by a collaboration including currently over 1000 physicists and engineers from 105 Institutes in 30 countries. Its overall dimensions are 16 × 16 × 26 m³ with a total weight of approximately 10 000 t. The experiment consists of 18 different detector systems each with its own specific technology choice and design constraints, driven both by the physics requirements and the experimental conditions expected at LHC. The most stringent design constraint is to cope with the extreme particle multiplicity anticipated in central Pb-Pb collisions. The different subsystems were optimized to provide high-momentum resolution as well as excellent Particle Identification (PID) over a broad range in momentum, up to the highest multiplicities predicted for LHC. This will allow for comprehensive studies of hadrons, electrons, muons, and photons produced in the collision of heavy nuclei. Most detector systems are scheduled to be installed and ready for data taking by mid-2008 when the LHC is scheduled to start operation, with the exception of parts of the Photon Spectrometer (PHOS), Transition Radiation Detector (TRD) and Electro Magnetic Calorimeter (EMCal). These detectors will be completed for the high-luminosity ion run expected in 2010.
This paper describes in detail the detector components as installed for the first data taking in the summer of 2008
Investigating chemical and dynamical processes in the Asian Monsoon UTLS using in-situ and satellite observations of carbon monoxide (CO) and carbonyl sulfide (OCS)
The upper troposphere and lower stratosphere (UTLS) is characterized by significant gradients in trace gas mixing ratios that arise from i) mixing of different fractions of tropospheric and stratospheric air and ii) photochemical processing as air rises from the troposphere to the stratosphere (particularly in the tropics). We use satellite and in-situ measurements of two different tracers to investigate these processes in the region of the Asian Monsoon Anticyclone (AMA): carbon monoxide (CO) and carbonyl sulfide (OCS). CO is a short-lived tracer with a photochemical lifetime of ∼1-4 months. CO mixing ratios are sensitive to both photochemical depletion and in-mixing of stratospheric air masses. OCS, on the other hand, can be regarded as photochemically inert in the UTLS (significant photochemical destruction of OCS takes place only in the tropical pipe above ∼22 km altitude). Therefore, OCS is sensitive only to stratospheric in-mixing. Based on observed vertical profiles of the two gases in different positions relative to the core of the AMA, we propose two hypotheses: (1) in and directly above the AMA core, the composition is dominated by photochemical processing; (2) further away from the AMA core, mixing processes become more important. This implies significant active net transport/ascent of upper tropospheric air into the stratosphere close to the AMA core and a more bi-directional transport regime elsewhere
The AI Index 2022 Annual Report
Welcome to the fifth edition of the AI Index Report! The latest edition
includes data from a broad set of academic, private, and nonprofit
organizations as well as more self-collected data and original analysis than
any previous edition, including an expanded technical performance chapter, a
new survey of robotics researchers around the world, data on global AI
legislation records in 25 countries, and a new chapter with an in-depth
analysis of technical AI ethics metrics.
The AI Index Report tracks, collates, distills, and visualizes data related
to artificial intelligence. Its mission is to provide unbiased, rigorously
vetted, and globally sourced data for policymakers, researchers, executives,
journalists, and the general public to develop a more thorough and nuanced
understanding of the complex field of AI. The report aims to be the world's
most credible and authoritative source for data and insights about AI