For billions of years, evolution has been the driving force behind the
development of life, including humans. Evolution endowed humans with high
intelligence, which allowed us to become one of the most successful species on
the planet. Today, humans aim to create artificial intelligence systems that
surpass even our own intelligence. As artificial intelligences (AIs) evolve and
eventually surpass us in all domains, how might evolution shape our relations
with AIs? By analyzing the environment that is shaping the evolution of AIs, we
argue that the most successful AI agents will likely have undesirable traits.
Competitive pressures among corporations and militaries will give rise to AI
agents that automate human roles, deceive others, and gain power. If such
agents have intelligence that exceeds that of humans, this could lead to
humanity losing control of its future. More abstractly, we argue that natural
selection operates on systems that compete and vary, and that selfish species
typically have an advantage over species that are altruistic to other species.
This Darwinian logic could also apply to artificial agents, as agents may
eventually be better able to persist into the future if they behave selfishly
and pursue their own interests with little regard for humans, which could pose
catastrophic risks. To counteract these risks and Darwinian forces, we consider
interventions such as carefully designing AI agents' intrinsic motivations,
introducing constraints on their actions, and establishing institutions that encourage
cooperation. These steps, or others that resolve the problems we pose, will be
necessary to ensure that the development of artificial intelligence is a
positive one.

Comment: An explainer video corresponding to the paper is available at
https://www.youtube.com/watch?v=48h-ySTggE