We believe that a wide range of physical processes conspire to shape the
observed galaxy population, but we remain unsure of their detailed interactions.
The semi-analytic model (SAM) of galaxy formation uses multi-dimensional
parameterisations of the physical processes of galaxy formation and provides a
tool to constrain these underlying physical interactions. Because of the high
dimensionality, the parametric problem of galaxy formation may be profitably
tackled with an approach based on Bayesian inference, which allows one to constrain
theory with data in a statistically rigorous way. In this paper we develop a
SAM in the framework of Bayesian inference. We show that, with a parallel
implementation of an advanced Markov chain Monte Carlo algorithm, it is now
possible to rigorously sample the posterior distribution of the
high-dimensional parameter space of typical SAMs. As an example, we
characterise galaxy formation in the current ΛCDM cosmology using the
stellar mass function of galaxies as an observational constraint. We find that
the posterior probability distribution is both topologically complex and
degenerate in some important model parameters, suggesting that thorough
explorations of the parameter space are needed to understand the models. We
also demonstrate that because of the model degeneracy, adopting a narrow prior
strongly restricts the model. Therefore, the inferences based on SAMs are
conditional on the model adopted. Using synthetic data to mimic systematic
errors in the stellar mass function, we demonstrate that an accurate
observational error model is essential to meaningful inference.

Comment: revised version to match the article published in MNRA
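
To make the idea of sampling a degenerate posterior concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a plain random-walk Metropolis sampler (not the advanced, parallel algorithm referred to above) and a hypothetical two-parameter toy model in which the data constrain only the product of the parameters, so that each parameter is individually degenerate. All names (`log_posterior`, `metropolis`, `run_demo`), the flat prior bounds, and the synthetic data are illustrative assumptions.

```python
import math
import random
import statistics


def log_posterior(theta, data, sigma, lo=0.1, hi=10.0):
    """Log posterior for a toy model y = a * b with Gaussian errors.

    A flat prior on [lo, hi] is assumed for both parameters; the data
    depend only on the product a * b, so a and b are degenerate.
    """
    a, b = theta
    if not (lo <= a <= hi and lo <= b <= hi):
        return -math.inf  # outside the prior support
    model = a * b
    return -0.5 * sum(((y - model) / sigma) ** 2 for y in data)


def metropolis(log_post, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler with an isotropic Gaussian proposal."""
    current = list(start)
    current_lp = log_post(current)
    chain, accepted = [], 0
    for _ in range(n_steps):
        proposal = [x + rng.gauss(0.0, step_size) for x in current]
        proposal_lp = log_post(proposal)
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
        if math.log(u) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain.append(tuple(current))
    return chain, accepted / n_steps


def run_demo(seed=42, n_steps=20000, burn_in=2000):
    """Sample the toy posterior and summarise the degeneracy."""
    rng = random.Random(seed)
    sigma = 0.2
    # Synthetic data: true product a * b = 2 plus Gaussian noise.
    data = [2.0 + rng.gauss(0.0, sigma) for _ in range(20)]
    chain, rate = metropolis(
        lambda t: log_posterior(t, data, sigma), (1.0, 1.0), n_steps, 0.1, rng
    )
    kept = chain[burn_in:]  # discard the initial transient
    products = [a * b for a, b in kept]
    return {
        "mean_product": statistics.mean(products),
        "spread_product": statistics.pstdev(products),
        "spread_a": statistics.pstdev([a for a, _ in kept]),
        "acceptance_rate": rate,
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the product of the two parameters is tightly constrained while each parameter alone wanders along the degeneracy, which is why narrowing the prior on one parameter would strongly restrict the other.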