Explainable AI (XAI) is a rapidly evolving field that aims to improve the
transparency and trustworthiness of AI systems for humans. One of the unsolved
challenges in XAI is estimating the performance of explanation methods for
neural networks, which has resulted in numerous competing metrics with little
to no indication of which one is to be preferred. In this paper, to
identify the most reliable evaluation method in a given explainability context,
we propose MetaQuantus -- a simple yet powerful framework that meta-evaluates
two complementary performance characteristics of an evaluation method: its
resilience to noise and reactivity to randomness. We demonstrate the
effectiveness of our framework through a series of experiments, targeting
various open questions in XAI, such as the selection of explanation methods and
optimisation of hyperparameters of a given metric. We release our work under an
open-source license to serve as a development tool for XAI researchers and
Machine Learning (ML) practitioners to verify and benchmark newly constructed
metrics (i.e., ``estimators'' of explanation quality). With this work, we
provide clear and theoretically grounded guidance for building reliable
evaluation methods, thus facilitating standardisation and reproducibility in
the field of XAI.
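
To make the two meta-evaluated characteristics concrete, the following is a minimal, hypothetical Python sketch of the underlying idea; it is not the MetaQuantus API, and the names `meta_evaluate` and the assumed signature `estimator(model, x, attribution)` are illustrative only. The intuition: a good quality estimator should return stable scores when the input is slightly perturbed (resilience to noise) and clearly different scores when the explanation is replaced by an uninformative random one (reactivity to randomness).

import numpy as np

def meta_evaluate(estimator, model, x, attribution,
                  noise_std=0.01, n_trials=10, seed=0):
    """Illustrative sketch: score an explanation-quality estimator on
    resilience to noise and reactivity to randomness.

    Assumes `estimator(model, x, attribution) -> float`; all names here are
    hypothetical, not the MetaQuantus API.
    """
    rng = np.random.default_rng(seed)
    base_score = estimator(model, x, attribution)

    # Resilience: scores should stay stable under minor input perturbations.
    noisy_scores = [
        estimator(model, x + rng.normal(0.0, noise_std, size=x.shape), attribution)
        for _ in range(n_trials)
    ]
    resilience = 1.0 / (1.0 + float(np.std(noisy_scores)))  # higher = more stable

    # Reactivity: scores should shift noticeably when the explanation is
    # replaced by a random, uninformative attribution.
    random_scores = [
        estimator(model, x, rng.normal(0.0, 1.0, size=attribution.shape))
        for _ in range(n_trials)
    ]
    reactivity = abs(base_score - float(np.mean(random_scores)))  # larger gap = more reactive

    return {"resilience": resilience, "reactivity": reactivity}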