Provably safe systems: the only path to controllable AGI
We describe a path to humanity safely thriving with powerful Artificial
General Intelligences (AGIs) by building them to provably satisfy
human-specified requirements. We argue that this will soon be technically
feasible using advanced AI for formal verification and mechanistic
interpretability. We further argue that it is the only path which guarantees
safe, controlled AGI. We end with a list of challenge problems whose solution
would contribute to this positive outcome and invite readers to join in this
work.
Comment: 17 pages
Automating the Formal Verification of Software
Formally verified correctness is one of the most desirable properties of software systems. Despite great progress made toward verification via interactive proof assistants, such as Coq and Isabelle/HOL, such verification remains one of the most effort-intensive (and often prohibitively difficult) software development activities. Recent work has created tools that automatically synthesize proofs either through reasoning using precomputed facts or using machine learning to model proofs and then perform biased search through the proof space. However, models in existing tools fail to capture the richness present in proofs, such as the information the programmer has access to when writing proofs and the natural language contained within variable names. Furthermore, these prior models do not make use of variations in the learning process and advances in large language models.
In this dissertation, I develop tools to improve proof synthesis and to enable fully automating more verification. I first present TacTok, a proof-synthesis tool that models proofs using both the partial proof written thus far and the semantics of the proof state. I then present Diva, a proof-synthesis tool that controls the learning process to produce a diverse set of models and, due to the unique nature of proof synthesis (the existence of the theorem prover, an oracle that infallibly judges a proof’s correctness), efficiently combines these models to improve the overall proving power. I then present Passport, a proof-synthesis tool that systematically explores different ways of encoding identifiers in proofs to improve synthesis. Finally, I present Baldur, a proof-synthesis tool that uses transformer-based pretrained large language models fine-tuned on proofs to generate and repair whole proofs at once, rather than one step at a time.
This dissertation contributes new ideas for improving automated proof synthesis and empirically demonstrates that the improvement is significant on large benchmarks consisting of open-source software projects.
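The ensemble idea behind Diva can be sketched as follows. The interfaces here are hypothetical stand-ins, not Diva's actual API: several proof-synthesis models each propose candidate proofs, and the theorem prover serves as an infallible oracle that accepts the first candidate it can check.

```python
# Sketch of oracle-filtered ensemble proof synthesis (hypothetical
# interfaces): each model proposes candidate proofs; the theorem prover
# acts as an oracle that infallibly judges a candidate's correctness.
from typing import Callable, Iterable, Optional

def synthesize_proof(
    theorem: str,
    models: Iterable[Callable[[str], list[str]]],
    prover_accepts: Callable[[str, str], bool],
) -> Optional[str]:
    """Try each model's candidates in turn; return the first proof the
    prover verifies, or None if every candidate fails."""
    for model in models:
        for candidate in model(theorem):
            if prover_accepts(theorem, candidate):
                return candidate
    return None

# Toy stand-ins: this "prover" accepts only the proof script "trivial".
model_a = lambda thm: ["auto", "simpl"]
model_b = lambda thm: ["trivial"]
accepts = lambda thm, proof: proof == "trivial"

print(synthesize_proof("x = x", [model_a, model_b], accepts))  # trivial
```

Because the oracle never accepts a wrong proof, combining diverse models can only increase proving power, which is why diversity in the learning process pays off here.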
FIMO: A Challenge Formal Dataset for Automated Theorem Proving
We present FIMO, an innovative dataset comprising formal mathematical problem
statements sourced from the International Mathematical Olympiad (IMO)
Shortlisted Problems. Designed to facilitate advanced automated theorem proving
at the IMO level, FIMO is currently tailored for the Lean formal language. It
comprises 149 formal problem statements, accompanied by both informal problem
descriptions and their corresponding LaTeX-based informal proofs. Through
initial experiments involving GPT-4, our findings underscore the existing
limitations in current methodologies, indicating a substantial journey ahead
before achieving satisfactory IMO-level automated theorem proving outcomes.
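The shape of a FIMO-style entry can be illustrated as follows. This theorem is an invented example, not taken from the dataset; as in FIMO, the formal statement is given in Lean with the proof left open, alongside an informal description.

```lean
-- Illustrative only (not a FIMO problem): a Lean formal statement
-- paired with its informal description, with the proof left to the
-- automated theorem prover.
-- Informal: for every natural number n, n² + n is even.
theorem even_sq_add_self (n : ℕ) : 2 ∣ n ^ 2 + n := by
  sorry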
Formalizing Chemical Physics using the Lean Theorem Prover
Chemical theory can be made more rigorous using the Lean theorem prover, an
interactive theorem prover for complex mathematics. We formalize the Langmuir
and BET theories of adsorption, making each scientific premise clear and every
step of the derivations explicit. Lean's math library, mathlib, provides
formally verified theorems for infinite geometric series, which are central to
BET theory. While writing these proofs, Lean prompts us to include mathematical
constraints that were not originally reported. We also illustrate how Lean
flexibly enables the reuse of proofs that build on more complex theories
through the use of functions, definitions, and structures. Finally, we
construct scientific frameworks for interoperable proofs, by creating
structures for classical thermodynamics and kinematics, using them to formalize
gas law relationships like Boyle's Law and equations of motion underlying
Newtonian mechanics, respectively. This approach can be extended to other
fields, enabling the formalization of rich and complex theories in science and
engineering.
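The role of geometric series in BET theory can be sketched with the standard textbook derivation (not the paper's Lean formalization; here $x$ is the reduced pressure and $c$ the BET constant):

```latex
% Geometric series identities, valid only for |x| < 1 -- exactly the
% kind of side constraint a theorem prover forces one to state:
\sum_{n=1}^{\infty} x^n = \frac{x}{1-x}, \qquad
\sum_{n=1}^{\infty} n\,x^n = \frac{x}{(1-x)^2}.
% BET surface coverage is a ratio built from these two sums:
\theta \;=\; \frac{c \sum_{n=1}^{\infty} n\,x^n}{1 + c \sum_{n=1}^{\infty} x^n}
       \;=\; \frac{c\,x}{(1-x)\,(1 - x + c\,x)}.
```

The hypothesis $|x| < 1$ is an example of the implicit mathematical constraints that formalization in Lean makes explicit.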
Large Language Models
Artificial intelligence is making spectacular progress, and one of the best
examples is the development of large language models (LLMs) such as OpenAI's
GPT series. In these lectures, written for readers with a background in
mathematics or physics, we give a brief history and survey of the state of the
art, and describe the underlying transformer architecture in detail. We then
explore some current ideas on how LLMs work and how models trained to predict
the next word in a text are able to perform other tasks displaying
intelligence.
Comment: 46 pages
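The core operation of the transformer architecture these lectures describe is scaled dot-product self-attention, which can be sketched in a few lines (a minimal single-head version, not the lectures' code; weight shapes are illustrative):

```python
# Minimal single-head scaled dot-product self-attention, the core of
# the transformer architecture (illustrative sketch).
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq, d_model); W*: (d_model, d_head). Returns (seq, d_head)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # scaled dot products
    return softmax(scores) @ V               # weighted sum of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                  # 4 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Each output row is a convex combination of the value vectors, with weights determined by query-key similarity; stacking such layers with feed-forward blocks yields the full architecture.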
Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans
We are currently unable to specify human goals and societal values in a way
that reliably directs AI behavior. Law-making and legal interpretation form a
computational engine that converts opaque human values into legible directives.
"Law Informs Code" is the research agenda embedding legal knowledge and
reasoning in AI. Similar to how parties to a legal contract cannot foresee
every potential contingency of their future relationship, and legislators
cannot predict all the circumstances under which their proposed bills will be
applied, we cannot ex ante specify rules that provably direct good AI behavior.
Legal theory and practice have developed arrays of tools to address these
specification problems. For instance, legal standards allow humans to develop
shared understandings and adapt them to novel situations. In contrast to more
prosaic uses of the law (e.g., as a deterrent of bad behavior through the
threat of sanction), Law Informs Code leverages law as an expression of how
humans communicate their goals and of what society values.
We describe how data generated by legal processes (methods of law-making,
statutory interpretation, contract drafting, applications of legal standards,
legal reasoning, etc.) can facilitate the robust specification of inherently
vague human goals. This increases human-AI alignment and the local usefulness
of AI. Toward society-AI alignment, we present a framework for understanding
law as the applied philosophy of multi-agent alignment. Although law is partly
a reflection of historically contingent political power - and thus not a
perfect aggregation of citizen preferences - if properly parsed, its
distillation offers the most legitimate computational comprehension of societal
values available. If law eventually informs powerful AI, engaging in the
deliberative political process to improve law takes on even more meaning.
Comment: Forthcoming in Northwestern Journal of Technology and Intellectual Property, Volume 2
Augmented Language Models: a Survey
This survey reviews works in which language models (LMs) are augmented with
reasoning skills and the ability to use tools. The former is defined as
decomposing a potentially complex task into simpler subtasks, while the latter
consists of calling external modules such as a code interpreter. LMs can
leverage these augmentations separately or in combination via heuristics, or
learn to do so from demonstrations. While adhering to a standard missing tokens
prediction objective, such augmented LMs can use various, possibly
non-parametric external modules to expand their context processing ability,
thus departing from the pure language modeling paradigm. We therefore refer to
them as Augmented Language Models (ALMs). The missing token objective allows
ALMs to learn to reason, use tools, and even act, while still performing
standard natural language tasks and even outperforming most regular LMs on
several benchmarks. In this work, after reviewing current advances in ALMs, we
conclude that this new research direction has the potential to address common
limitations of traditional LMs, such as interpretability, consistency, and
scalability issues.
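The tool-calling loop the survey describes can be sketched as follows. The marker format and module names here are illustrative, not drawn from any surveyed system: the LM emits a special token sequence naming a module, the runtime executes the call, and the result is spliced back into the text.

```python
# Sketch of an ALM tool-call loop (hypothetical marker syntax): markers
# of the form [TOOL name: args] in generated text are replaced by the
# named external module's output.
import re

TOOLS = {
    # A toy calculator module; builtins are stripped from eval's scope.
    "calc": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def expand_tool_calls(text: str) -> str:
    """Replace each [TOOL name: args] marker with the module's output."""
    def run(match: re.Match) -> str:
        name, args = match.group(1), match.group(2)
        return TOOLS[name](args)
    return re.sub(r"\[TOOL (\w+): ([^\]]+)\]", run, text)

print(expand_tool_calls("12 * 7 = [TOOL calc: 12 * 7]"))  # 12 * 7 = 84
```

In a real ALM the decision to emit such a marker is learned from demonstrations or heuristics, and generation resumes conditioned on the spliced-in module output.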