Backdoors pose a serious threat to machine learning, as they can compromise
the integrity of security-critical systems, such as self-driving cars. While
different defenses have been proposed to address this threat, they all rely on
the assumption that the hardware on which the learning models are executed
during inference is trusted. In this paper, we challenge this assumption and
introduce a backdoor attack that completely resides within a common hardware
accelerator for machine learning. Outside of the accelerator, neither the
learning model nor the software is manipulated, so that current defenses fail.
To make this attack practical, we overcome two challenges: First, as memory on
a hardware accelerator is severely limited, we introduce the concept of a
minimal backdoor that deviates as little as possible from the original model
and is activated by replacing a few model parameters only. Second, we develop a
configurable hardware trojan that can be provisioned with the backdoor and
performs a replacement only when the specific target model is processed. We
demonstrate the practical feasibility of our attack by implanting our hardware
trojan into the Xilinx Vitis AI DPU, a commercial machine-learning accelerator.
We configure the trojan with a minimal backdoor for a traffic-sign recognition
system. The backdoor replaces only 30 (0.069%) model parameters, yet it
reliably manipulates the recognition once the input contains a backdoor
trigger. Our attack expands the hardware circuit of the accelerator by 0.24%
and induces no run-time overhead, rendering a detection hardly possible. Given
the complex and highly distributed manufacturing process of current hardware,
our work points to a new threat in machine learning that is inaccessible to
current security mechanisms and calls for hardware to be manufactured only in
fully trusted environments