We demonstrate the effectiveness of a Bayesian evidence-based analysis for
diagnosing and disentangling the sky-averaged 21-cm signal from instrumental
systematic effects. As a case study, we consider a simulated REACH pipeline
with an injected systematic. We show that signal recovery is very poor or
erroneous if the systematic remains unmodelled; for example, the posterior
estimates of the sky-averaged 21-cm signal can resemble a very deep or wide
signal. However, when parameterised models of the systematic are included,
signal recovery improves dramatically. Most
importantly, Bayesian evidence-based model comparison can determine whether
such a systematic model is needed, since the true underlying generative model
of an experimental dataset is in principle unknown.
We therefore advocate a pipeline capable of testing for a variety of potential
systematic errors, with the Bayesian evidence acting as the mechanism for
detecting their presence.