Central to several objective approaches to Bayesian model selection is the
use of training samples (subsets of the data), which make it possible to employ
improper objective priors. The most common prescription for choosing training
samples is to choose them to be as small as possible, subject to yielding
proper posteriors; these are called minimal training samples.
When data can vary widely in terms of either information content or impact on
the improper priors, use of minimal training samples can be inadequate.
Important examples include certain cases of discrete data, the presence of
censored observations, and certain situations involving linear models and
explanatory variables. Such situations require more sophisticated methods of
choosing training samples. A variety of such methods are developed in this
paper and successfully applied in challenging situations.
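As a toy illustration of the idea of a minimal training sample (this example is not from the paper): for a single normal mean with known variance under the improper flat prior, one observation already yields a proper posterior, so a training sample of size one is minimal. The sketch below verifies numerically that the posterior normalizing constant is finite; the function names and the trapezoidal check are illustrative choices, not part of the paper's methodology.

```python
import math

# Toy check (assumption for illustration): model x ~ N(theta, 1) with the
# improper prior pi(theta) proportional to 1. A single observation x1 gives an
# unnormalized posterior exp(-(x1 - theta)^2 / 2), whose integral over theta
# is sqrt(2*pi) -- finite -- so {x1} is a (minimal) training sample.

def unnormalized_posterior(theta, x1):
    """Likelihood of one N(theta, 1) observation times the flat prior (= 1)."""
    return math.exp(-0.5 * (x1 - theta) ** 2)

def posterior_mass(x1, lo=-50.0, hi=50.0, n=200_000):
    """Trapezoidal approximation of the posterior normalizing constant."""
    h = (hi - lo) / n
    total = 0.5 * (unnormalized_posterior(lo, x1) + unnormalized_posterior(hi, x1))
    total += sum(unnormalized_posterior(lo + i * h, x1) for i in range(1, n))
    return total * h

# The computed mass should match sqrt(2*pi), confirming the posterior is proper.
mass = posterior_mass(x1=1.3)
print(abs(mass - math.sqrt(2 * math.pi)) < 1e-6)
```

With zero observations the flat prior integrates to infinity, so the "posterior" would be improper; this is exactly the situation a training sample is meant to repair before model comparison.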