The emergence of large language models (LLMs) represents a major advance in
artificial intelligence (AI) research. However, the widespread use of LLMs is
also accompanied by significant ethical and social challenges. Previous research
has pointed towards auditing as a promising governance mechanism to help ensure
that AI systems are designed and deployed in ways that are ethical, legal, and
technically robust. Yet existing auditing procedures fail to address the
governance challenges posed by LLMs, which are adaptable to a wide range of
downstream tasks. To help bridge that gap, we offer three contributions in this
article. First, we establish the need to develop new auditing procedures that
capture the risks posed by LLMs by analysing the affordances and constraints of
existing auditing procedures. Second, we outline a blueprint to audit LLMs in
feasible and effective ways by drawing on best practices from IT governance and
systems engineering. Specifically, we propose a three-layered approach, whereby
governance audits, model audits, and application audits complement and inform
each other. Finally, we discuss the limitations not only of our three-layered
approach but also of the prospect of auditing LLMs at all. Ultimately, this
article seeks to expand the methodological toolkit available to technology
providers and policymakers who wish to analyse and evaluate LLMs from
technical, ethical, and legal perspectives.