On the closure of relational models
Relational models for contingency tables are generalizations of log-linear
models, allowing effects associated with arbitrary subsets of cells in a
possibly incomplete table, and not necessarily containing the overall effect.
In this generality, the MLEs under Poisson and multinomial sampling are not
always identical. This paper deals with the theory of maximum likelihood
estimation in the case when there are observed zeros in the data. A unique MLE
to such data is shown to always exist in the set of pointwise limits of
sequences of distributions in the original model. This set is equal to the
closure of the original model with respect to the Bregman information
divergence. The same variant of iterative scaling may be used to compute the
MLE in the original model and in its closure.
Faithfulness and learning hypergraphs from discrete distributions
The concepts of faithfulness and strong-faithfulness are important for
statistical learning of graphical models. Graphs are not sufficient for
describing the association structure of a discrete distribution. Hypergraphs
representing hierarchical log-linear models are considered instead, and the
concept of parametric (strong-) faithfulness with respect to a hypergraph is
introduced. Strong-faithfulness ensures the existence of uniformly consistent
parameter estimators and enables building uniformly consistent procedures for a
hypergraph search. The strength of association in a discrete distribution can
be quantified with various measures, leading to different concepts of
strong-faithfulness. Lower and upper bounds for the proportions of
distributions that do not satisfy strong-faithfulness are computed for
different parameterizations and measures of association.
Comment: 23 pages, 6 figures
Relational models for contingency tables
The paper considers general multiplicative models for complete and incomplete
contingency tables that generalize log-linear and several other models and are
entirely coordinate free. Sufficient conditions for the existence of maximum
likelihood estimates under these models are given, and it is shown that the
usual equivalence between multinomial and Poisson likelihoods holds if and only
if an overall effect is present in the model. If such an effect is not assumed,
the model becomes a curved exponential family and a related mixed
parameterization is given that relies on non-homogeneous odds ratios. Several
examples are presented to illustrate the properties and use of such models.
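
To make the role of the overall effect concrete, the following is a minimal sketch on a hypothetical three-cell table, not a reproduction of the paper's examples. It assumes a toy model with Poisson intensities lambda01 = a, lambda10 = b, lambda11 = a*b and no common multiplicative constant; the function name and the counts are illustrative. Setting the two score equations to zero gives a(1 + b) = y01 + y11 and b(1 + a) = y10 + y11, which the code solves in closed form.

```python
import math


def poisson_mle_no_overall_effect(y01, y10, y11):
    """Poisson MLE for the assumed toy model
    lambda01 = a, lambda10 = b, lambda11 = a * b (no overall effect)."""
    s1 = y01 + y11  # sufficient statistic attached to a
    s2 = y10 + y11  # sufficient statistic attached to b
    d = s1 - s2     # subtracting the score equations gives a - b = d
    # Substituting b = a - d into a * (1 + b) = s1 yields
    # a^2 + (1 - d) * a - s1 = 0; keep the positive root.
    a = (-(1 - d) + math.sqrt((1 - d) ** 2 + 4 * s1)) / 2
    b = a - d
    return a, b


# Hypothetical observed counts.
y01, y10, y11 = 2, 3, 5
a, b = poisson_mle_no_overall_effect(y01, y10, y11)

fitted_total = a + b + a * b
observed_total = y01 + y10 + y11

print("fitted intensities:", a, b, a * b)
print("fitted total:", fitted_total, "observed total:", observed_total)
```

With these counts the fitted total is roughly 9.83 rather than the observed 10, so the Poisson fit does not reproduce the sample size. A multinomial fit is constrained to the observed total, which is one way to see why the two likelihoods need not give the same estimates once the overall effect is dropped.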
Entropy and Hausdorff Dimension in Random Growing Trees
We investigate the limiting behavior of random tree growth in preferential
attachment models. The tree stems from a root, and we add vertices to the
system one-by-one at random, according to a rule which depends on the degree
distribution of the already existing tree. The so-called weight function, in
terms of which the rule of attachment is formulated, is such that each vertex
in the tree can have at most K children. We define the concept of a certain
random measure mu on the leaves of the limiting tree, which captures a global
property of the tree growth in a natural way. We prove that the Hausdorff and
the packing dimension of this limiting measure are equal and constant with
probability one. Moreover, the local dimension of mu equals the Hausdorff
dimension at mu-almost every point. We give an explicit formula for the
dimension, given the rule of attachment.
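
The growth rule described above can be simulated in a few lines. This is a rough sketch under stated assumptions, not the paper's construction: the weight is taken to depend on a vertex's current number of children, the list `weights` (hypothetical values) ends in zero so that no vertex exceeds K = len(weights) - 1 children, and the function name `grow_tree` is illustrative.

```python
import random


def grow_tree(n_vertices, weights, seed=0):
    """Grow a random tree one vertex at a time.

    weights[k] is the attachment weight of a vertex that currently has
    k children. weights[0] must be positive and the last entry must be
    zero, which caps the number of children at K = len(weights) - 1.
    """
    rng = random.Random(seed)
    parent = [None]   # parent[v] is the parent of vertex v; vertex 0 is the root
    children = [0]    # children[v] is the current number of children of v
    for _ in range(1, n_vertices):
        # Each existing vertex is chosen with probability proportional
        # to the weight of its current number of children.
        current = [weights[c] for c in children]
        chosen = rng.choices(range(len(children)), weights=current, k=1)[0]
        parent.append(chosen)
        children[chosen] += 1
        children.append(0)  # the new vertex starts as a leaf
    return parent, children


# Hypothetical weight function with K = 3.
weights = [3.0, 2.0, 1.0, 0.0]
parent, children = grow_tree(200, weights, seed=1)
print("vertices:", len(parent), "max children:", max(children))
```

Because a new leaf always has positive weight, the total weight never vanishes and the growth can continue indefinitely. The random measure and the dimension results in the abstract concern the infinite limit, which this finite simulation does not attempt to capture.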
Iterative Scaling in Curved Exponential Families
The paper describes a generalized iterative proportional fitting procedure that can be used for maximum likelihood estimation in a special class of the general log-linear model. The models in this class, called relational, apply to multivariate discrete sample spaces that do not necessarily have a Cartesian product structure and may not contain an overall effect. When applied to the cell probabilities, the models without the overall effect are curved exponential families and the values of the sufficient statistics are reproduced by the MLE only up to a constant of proportionality. The paper shows that Iterative Proportional Fitting, Generalized Iterative Scaling, and Improved Iterative Scaling fail to work for such models. The algorithm proposed here is based on iterated Bregman projections. As a by-product, estimates of the multiplicative parameters are also obtained. An implementation of the algorithm is available as an R package.
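
For orientation, here is a minimal sketch of classical Iterative Proportional Fitting for the independence model in a complete two-way table. This is the standard procedure that the abstract says does not carry over to models without an overall effect; it is not the generalized algorithm of the paper and not the interface of its R package. The function name `ipf` and the counts are illustrative assumptions.

```python
def ipf(seed_table, row_targets, col_targets, n_iter=100, tol=1e-12):
    """Classical IPF: rescale rows and columns in turn until the
    table matches the target row and column totals."""
    table = [row[:] for row in seed_table]
    for _ in range(n_iter):
        # Row step: match each row total.
        for i, row in enumerate(table):
            s = sum(row)
            table[i] = [x * row_targets[i] / s for x in row]
        # Column step: match each column total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / s
        # After the column step only the row totals can be off.
        err = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if err < tol:
            break
    return table


# Hypothetical observed 2x2 table of counts.
observed = [[10, 20], [30, 40]]
row_targets = [sum(row) for row in observed]
col_targets = [sum(col) for col in zip(*observed)]

# Starting from a table of ones gives the independence fit.
fit = ipf([[1.0, 1.0], [1.0, 1.0]], row_targets, col_targets)
print(fit)
```

In this complete table with an overall effect, the fitted values reproduce the observed margins exactly. According to the abstract, models without the overall effect reproduce the sufficient statistics only up to a constant of proportionality, which is why a different scheme based on iterated Bregman projections is proposed there.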