We describe here a framework for a certain class of multiscale likelihood
factorizations wherein, in analogy to a wavelet decomposition of an L^2
function, a given likelihood function has an alternative representation as a
product of conditional densities reflecting information in both the data and
the parameter vector localized in position and scale. The framework is
developed as a set of sufficient conditions for the existence of such
factorizations, formulated in analogy to those underlying a standard
multiresolution analysis for wavelets, and hence can be viewed as a
multiresolution analysis for likelihoods. We then consider the use of these
factorizations in the task of nonparametric, complexity penalized likelihood
estimation. We study the risk properties of certain thresholding and
partitioning estimators, and demonstrate their adaptivity and near-optimality,
in a minimax sense over a broad range of function spaces, based on squared
Hellinger distance as a loss function. In particular, our results provide an
illustration of how properties of classical wavelet-based estimators can be
obtained in a single, unified framework that includes models for continuous,
count and categorical data types