The problem of learning forest-structured discrete graphical models from
i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu
tree through adaptive thresholding is proposed. It is shown that this algorithm
is both structurally consistent and risk consistent and the error probability
of structure learning decays faster than any polynomial in the number of
samples under fixed model size. For the high-dimensional scenario where the
size of the model d and the number of edges k scale with the number of samples
n, sufficient conditions on (n,d,k) are given for the algorithm to satisfy
structural and risk consistencies. In addition, the extremal structures for
learning are identified; we prove that the independent (resp. tree) model is
the hardest (resp. easiest) to learn using the proposed algorithm in terms of
error rates for structure learning.Comment: Accepted to the Journal of Machine Learning Research (Feb 2011