The success of modern deep learning hinges on the ability to train neural
networks at scale. Through clever reuse of intermediate information,
backpropagation enables training by computing gradients at a total
cost roughly proportional to that of running the function, rather than
incurring an additional factor proportional to the number of parameters,
which can now be
in the trillions. Naively, one expects that quantum measurement collapse
entirely rules out the reuse of quantum information as in backpropagation. But
recent developments in shadow tomography, which assumes access to multiple
copies of a quantum state, have challenged that notion. Here, we investigate
whether parameterized quantum models can train as efficiently as classical
neural networks. We show that achieving backpropagation scaling is impossible
without access to multiple copies of a state. With this added ability, we
introduce an algorithm with foundations in shadow tomography that matches
backpropagation scaling in quantum resources while reducing classical auxiliary
computational costs to open problems in shadow tomography. These results
highlight the nuance of reusing quantum information for practical purposes and
clarify the unique difficulties in training large quantum models, which could
alter the course of quantum machine learning.
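Since the abstract's central contrast is between backpropagation's cost scaling and the per-parameter cost of estimating gradients of quantum models, a small illustration may help. The following is a minimal sketch, not from the paper; the toy model_loss and all names are hypothetical. It shows how reverse-mode autodiff recovers every partial derivative at a cost comparable to a constant number of function evaluations, whereas estimating each component separately, as with the parameter-shift rule for parameterized quantum circuits, costs evaluations proportional to the parameter count.

```python
# Minimal JAX sketch (illustrative, not the paper's algorithm) contrasting
# backpropagation's cost scaling with naive per-parameter gradient
# estimation, the analogue of measuring each gradient component of a
# quantum model separately.
import jax
import jax.numpy as jnp

def model_loss(theta, x):
    # Toy differentiable "model": a chain of parameterized nonlinearities.
    # Stands in for any function whose run cost counts as one evaluation.
    h = x
    for t in theta:
        h = jnp.tanh(t * h)
    return jnp.sum(h ** 2)

theta = jnp.linspace(0.1, 1.0, 8)   # 8 parameters; could be millions
x = jnp.ones(4)

# Backpropagation: all 8 partial derivatives from one forward and one
# backward pass, i.e. O(1) model evaluations in total.
grad_bp = jax.grad(model_loss)(theta, x)

# Naive scaling: two extra evaluations per parameter, as in finite
# differences or the quantum parameter-shift rule, i.e. O(#parameters)
# model evaluations in total.
eps = 1e-4
grad_naive = jnp.array([
    (model_loss(theta.at[i].add(eps), x)
     - model_loss(theta.at[i].add(-eps), x)) / (2 * eps)
    for i in range(theta.size)
])

print(jnp.allclose(grad_bp, grad_naive, atol=1e-3))  # True
```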