We introduce MNISQ, the first large-scale dataset for both the quantum and
the classical machine learning communities in the Noisy Intermediate-Scale
Quantum (NISQ) era. MNISQ consists of 4,950,000 data points organized in 9
subdatasets. Building our dataset from the quantum encoding of classical
information (e.g., the MNIST dataset), we deliver a dataset in a dual form: in
quantum form, as circuits, and in classical form, as quantum circuit
descriptions (in the quantum programming language QASM). Machine learning
research related to quantum computers likewise faces a dual challenge:
enhancing machine learning by exploiting the power of quantum computers, while
also leveraging state-of-the-art classical machine learning methodologies to
advance quantum computing. Therefore, we perform circuit
classification on our dataset, tackling the task with both quantum and
classical models. On the quantum side, we test our circuit dataset with
quantum kernel methods and obtain up to 97% accuracy. On
the classical side, the quantum mechanical structure underlying the
quantum circuit data is not trivial. Nevertheless, we test our dataset on
three classical models: the Structured State Space sequence model (S4), the
Transformer, and the LSTM. In particular, the S4 model applied to the tokenized
QASM sequences reaches 77% accuracy. These findings illustrate that quantum
circuit-related datasets are likely to offer a quantum advantage, but also that
state-of-the-art machine learning methodologies can competently classify and
recognize quantum circuits. Finally, we entrust the quantum and classical
machine learning communities with the fundamental challenge of building more
quantum-classical datasets like ours and of building future benchmarks from our
experiments. The dataset is accessible on GitHub, and its circuits can easily be
run in Qulacs or Qiskit.

Comment: Preprint. Under review.
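
To make the "tokenized QASM sequences" mentioned above concrete, the following is a minimal sketch of how a QASM circuit description might be turned into an integer sequence for a classical sequence model. The abstract does not specify the tokenization scheme, so the regular expression, the helper names (`tokenize_qasm`, `build_vocab`, `encode`), and the example circuit are all illustrative assumptions, not the authors' pipeline.

```python
import re

# Assumed token pattern (not from the paper): identifiers, decimal numbers,
# integers, and single punctuation characters of OpenQASM 2.0 text.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*|\d+\.\d+|\d+|[^\sA-Za-z_0-9]")


def tokenize_qasm(source):
    """Split QASM source text into a flat list of string tokens."""
    return TOKEN_RE.findall(source)


def build_vocab(token_lists):
    """Assign an integer id to every distinct token, reserving 0 and 1."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Map tokens to ids, falling back to <unk> for unseen tokens."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]


# Hypothetical two-qubit circuit used only for illustration.
example = (
    "OPENQASM 2.0;\n"
    'include "qelib1.inc";\n'
    "qreg q[2];\n"
    "h q[0];\n"
    "cx q[0],q[1];\n"
)

tokens = tokenize_qasm(example)
vocab = build_vocab([tokens])
ids = encode(tokens, vocab)
```

The resulting `ids` list is the kind of input a sequence model such as S4, a Transformer, or an LSTM could consume after padding. A real pipeline would likely also need to treat signed or scientific-notation gate parameters as single tokens, which this sketch does not do.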