Classical simulators play a major role in the development and benchmarking of
quantum algorithms, and practically every software framework for quantum
computation provides the option of running algorithms on simulators.
However, the development of quantum simulators has remained largely separate from
the rest of these software frameworks, which instead focus on usability and
compilation. Here, we demonstrate the advantage of co-developing and
integrating simulators and compilers by proposing a specialized compiler pass
to reduce the simulation time for arbitrary circuits. While the concept is
broadly applicable, we present a concrete implementation based on the Intel
Quantum Simulator, a high-performance distributed simulator. As part of this
work, we extend its implementation with additional functionalities related to
the representation of quantum states. The communication overhead is reduced by
changing the order in which state amplitudes are stored in the distributed
memory, a concept analogous to the distinction between local and global qubits
for distributed Schroedinger-type simulators. We then implement a compiler pass
that exploits the new functionality by introducing special instructions
governing data movement as part of the quantum circuit. These instructions
target unique capabilities of simulators and have no analogue in actual quantum
devices. To quantify the advantage, we compare the time required to simulate
random circuits with and without our optimization. The simulation time is
typically halved.
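
The distinction between local and global qubits can be illustrated with a minimal sketch. It is not the Intel Quantum Simulator API: the function names, the `order` list, and the `n_local` parameter are hypothetical and chosen only for illustration. The sketch assumes 2^n amplitudes split evenly across ranks, with the low `n_local` bits of the storage index giving the offset within a rank and the remaining high bits selecting the rank.

```python
def locate(basis_state: int, order: list[int], n_local: int) -> tuple[int, int]:
    """Return (rank, local_offset) of the amplitude of `basis_state`.

    order[q] is the bit position that program qubit q occupies in the
    storage index (an assumed convention). Positions below n_local are
    local; higher positions select the rank.
    """
    index = 0
    for q, pos in enumerate(order):
        bit = (basis_state >> q) & 1
        index |= bit << pos
    return index >> n_local, index & ((1 << n_local) - 1)


def needs_communication(qubit: int, order: list[int], n_local: int) -> bool:
    """A one-qubit gate pairs amplitudes that differ only in that qubit's bit.

    If the bit is global, the two amplitudes of each pair sit on different ranks.
    """
    return order[qubit] >= n_local


# Three qubits on two ranks (n_local = 2). In the default order, qubit 2 is global.
default_order = [0, 1, 2]
# Exchanging the storage positions of qubits 0 and 2 makes qubit 2 local.
swapped_order = [2, 1, 0]

pair = (0b000, 0b100)  # basis states coupled by a gate on qubit 2
ranks_default = [locate(s, default_order, 2)[0] for s in pair]
ranks_swapped = [locate(s, swapped_order, 2)[0] for s in pair]
```

With the default order the two coupled amplitudes live on different ranks, so a gate on qubit 2 requires communication. After the reordering they share a rank and the gate can be applied locally, at the one-time cost of the data movement that performs the reordering.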