Understanding protein structure is of crucial importance in science, medicine
and biotechnology. For about two decades, knowledge-based potentials derived from
pairwise distances -- so-called "potentials of mean force" (PMFs) -- have been
center stage in the prediction and design of protein structure and the
simulation of protein folding. However, the validity, scope and limitations of
these potentials are still vigorously debated, and the optimal
choice of the reference state -- a necessary component of these potentials --
is an unsolved problem. PMFs are loosely justified by analogy to the reversible
work theorem in statistical physics, or by a statistical argument based on a
likelihood function. Both justifications are insightful but leave many
questions unanswered. Here, we show for the first time that PMFs can be seen as
approximations to quantities that do have a rigorous probabilistic
justification: they naturally arise when probability distributions over
different features of proteins need to be combined. We call these quantities
reference ratio distributions, since they derive from the application of the
reference ratio method. This new view is not only of theoretical relevance but also leads to
many insights that are of direct practical use: the reference state is uniquely
defined and does not require external physical insights; the approach can be
generalized beyond pairwise distances to arbitrary features of protein
structure; and it becomes clear for which purposes the use of these quantities
is justified. We illustrate these insights with two applications, involving the
radius of gyration and hydrogen bonding. In the latter case, we also show how
the reference ratio method can be iteratively applied to sculpt an energy
funnel. Our results considerably increase the understanding and scope of energy
functions derived from known biomolecular structures.
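
The idea of combining probability distributions over different features can be illustrated with a toy sketch. The text above does not state the formula, so the form used here is an assumption: a distribution q(x) over structures is combined with a target distribution p(y) over a feature y(x) by multiplying q(x) by the ratio p(y(x)) / q(y(x)), where q(y) is the feature distribution implied by q(x) and plays the role of the reference state. The four-state system, the feature function and all names (`reference_ratio`, `marginal`, `q_x`, `p_y`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def marginal(p_x, feature):
    """Distribution of feature(x) implied by a distribution over x."""
    out = defaultdict(float)
    for x, prob in p_x.items():
        out[feature(x)] += prob
    return dict(out)


def reference_ratio(q_x, feature, p_y):
    """Combine q(x) with a target feature distribution p(y).

    Assumed form: p(x) = q(x) * p(y(x)) / q(y(x)), where q(y) is the
    marginal of the feature under q(x). Every feature value reachable
    under q(x) must have q(y) > 0 and an entry in p_y.
    """
    q_y = marginal(q_x, feature)  # the reference distribution
    return {
        x: prob * p_y[feature(x)] / q_y[feature(x)]
        for x, prob in q_x.items()
    }


# Toy system: four "structures" and a binary feature (hypothetical).
q_x = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}


def feature(x):
    return x % 2


p_y = {0: 0.7, 1: 0.3}  # desired distribution over the feature

p_x = reference_ratio(q_x, feature, p_y)
p_x_marginal = marginal(p_x, feature)
```

In this sketch the combined distribution remains normalized and its feature marginal equals the target `p_y`, while the relative probabilities of structures sharing the same feature value are left as in `q_x`. The reference distribution is computed from `q_x` itself rather than supplied from outside, which is the sense in which the reference state is uniquely defined.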