Algebraic statistical models
Many statistical models are algebraic in that they are defined in terms of
polynomial constraints, or in terms of polynomial or rational parametrizations.
The parameter spaces of such models are typically semi-algebraic subsets of the
parameter space of a reference model with nice properties, such as
a regular exponential family. This observation leads to the definition of an
`algebraic exponential family'. This new definition provides a unified
framework for the study of statistical models with algebraic structure. In this
paper we review the ingredients of this definition and illustrate with examples
how computational algebraic geometry can be used to solve problems arising in
statistical inference in algebraic models.
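One of the standard computations in this area can be sketched in a few lines of computer algebra. The toy example below (not taken from the paper) uses SymPy to implicitize the independence model of a 2x2 contingency table: starting from the parametrization p_ij = a_i * b_j, a lexicographic Groebner basis eliminates the parameters and recovers the determinantal constraint that defines the model as an algebraic variety.

    import sympy as sp

    # Parameters a_i, b_j and cell probabilities p_ij of a 2x2 table
    a1, a2, b1, b2 = sp.symbols('a1 a2 b1 b2')
    p11, p12, p21, p22 = sp.symbols('p11 p12 p21 p22')

    # Ideal encoding the parametrization p_ij = a_i * b_j of the independence model
    ideal = [p11 - a1*b1, p12 - a1*b2, p21 - a2*b1, p22 - a2*b2]

    # Lex order with the parameters listed first eliminates a_i and b_j
    G = sp.groebner(ideal, a1, a2, b1, b2, p11, p12, p21, p22, order='lex')

    # Generators involving only the p_ij are the implicit model equations
    params = {a1, a2, b1, b2}
    implicit = [g for g in G if g.free_symbols.isdisjoint(params)]
    print(implicit)   # expected: the determinant p11*p22 - p12*p21 (up to sign)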
Equations of States in Statistical Learning for a Nonparametrizable and Regular Case
Many learning machines that have hierarchical structure or hidden variables
are now being used in information science, artificial intelligence, and
bioinformatics. However, several learning machines used in such fields are not
regular but singular statistical models; hence their generalization performance
is still unknown. To overcome this problem, in previous papers we
proved new equations in statistical learning, by which we can estimate the
Bayes generalization loss from the Bayes training loss and the functional
variance, on the condition that the true distribution is a singularity
contained in a learning machine. In this paper, we prove that the same
equations hold even if a true distribution is not contained in a parametric
model. We also prove that the proposed equations in a regular case are
asymptotically equivalent to the Takeuchi information criterion. Therefore, the
proposed equations are always applicable without any condition on the unknown
true distribution.
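In computable form, the relation described above (estimating the Bayes generalization loss from the Bayes training loss plus the functional variance divided by n) is essentially what is implemented as WAIC. A minimal sketch, assuming posterior draws are already available as a hypothetical array loglik of shape (draws, n) holding the pointwise log-likelihoods log p(x_i | w_s):

    import numpy as np
    from scipy.special import logsumexp

    def waic_estimate(loglik):
        """Estimate the Bayes generalization loss from posterior samples.

        loglik : array of shape (S, n) with loglik[s, i] = log p(x_i | w_s)
                 for posterior draws w_1, ..., w_S (hypothetical input).
        """
        S, n = loglik.shape
        # Bayes training loss: -(1/n) * sum_i log( (1/S) * sum_s p(x_i | w_s) )
        training_loss = -np.mean(logsumexp(loglik, axis=0) - np.log(S))
        # Functional variance: sum over data of posterior variances of log p(x_i | w)
        functional_variance = np.sum(np.var(loglik, axis=0))
        # Equation of states: generalization loss ~ training loss + V / n
        return training_loss + functional_variance / n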
A Widely Applicable Bayesian Information Criterion
A statistical model or a learning machine is called regular if the map taking
a parameter to a probability distribution is one-to-one and if its Fisher
information matrix is always positive definite. Otherwise, it is called
singular. In regular statistical models, the Bayes free energy, which is
defined as the minus logarithm of the Bayes marginal likelihood, can be
asymptotically approximated by the Schwarz Bayes information criterion (BIC),
whereas in singular models such an approximation does not hold.
Recently, it was proved that the Bayes free energy of a singular model is
asymptotically given by a generalized formula using a birational invariant, the
real log canonical threshold (RLCT), instead of half the number of parameters
in BIC. Theoretical values of RLCTs in several statistical models are now being
discovered based on algebraic geometrical methodology. However, it has been
difficult to estimate the Bayes free energy using only training samples,
because an RLCT depends on an unknown true distribution.
In the present paper, we define a widely applicable Bayesian information
criterion (WBIC) by the average log likelihood function over the posterior
distribution with the inverse temperature 1/log(n), where n is the number
of training samples. We mathematically prove that WBIC has the same asymptotic
expansion as the Bayes free energy, even if a statistical model is singular for
and unrealizable by a true distribution. Since WBIC can be numerically
calculated without any information about a true distribution, it is a
generalized version of BIC for singular statistical models.
Comment: 30 pages
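Numerically, the definition amounts to averaging the total log-likelihood over draws from the tempered posterior. A minimal sketch, assuming such draws have already been obtained (for example by scaling the log-likelihood by 1/log(n) inside an MCMC sampler) and stored in a hypothetical array:

    import numpy as np

    def wbic_estimate(loglik_tempered):
        """Estimate the Bayes free energy via WBIC.

        loglik_tempered : array of shape (S, n) with entry [s, i] = log p(x_i | w_s),
            where w_1, ..., w_S are draws from the posterior at inverse
            temperature beta = 1 / log(n)  (hypothetical input).
        """
        total_loglik = loglik_tempered.sum(axis=1)  # total log-likelihood per draw
        return -np.mean(total_loglik)               # posterior mean of minus log-likelihood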
Computing all roots of the likelihood equations of seemingly unrelated regressions
Seemingly unrelated regressions are statistical regression models based on
the Gaussian distribution. They are popular in econometrics but also arise in
graphical modeling of multivariate dependencies. In maximum likelihood
estimation, the parameters of the model are estimated by maximizing the
likelihood function, which maps the parameters to the likelihood of observing
the given data. By transforming this optimization problem into a polynomial
optimization problem, it was recently shown that the likelihood function of a
simple bivariate seemingly unrelated regressions model may have several
stationary points. Thus local maxima may complicate maximum likelihood
estimation. In this paper, we study several more complicated seemingly
unrelated regression models, and show how all stationary points of the
likelihood function can be computed using algebraic geometry.
Comment: To appear in the Journal of Symbolic Computation, special issue on
Computational Algebraic Statistics. 11 pages
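The general strategy, clearing denominators in the score equations and then enumerating every root of the resulting polynomial system, can be illustrated on a much smaller problem. The sketch below is not a seemingly unrelated regressions model; it uses a one-parameter Cauchy location likelihood, whose stationary points are exactly the roots of a single polynomial, so that any spurious local maxima are found along with the global one.

    import numpy as np
    import sympy as sp

    x_data = [-5.0, 0.0, 6.0]   # hypothetical, widely spread observations
    t = sp.Symbol('t')

    # Score of the Cauchy location log-likelihood: sum 2*(x_i - t)/(1 + (x_i - t)^2)
    score = sum(2*(xi - t) / (1 + (xi - t)**2) for xi in x_data)

    # Clearing denominators turns the score equation into one polynomial
    # whose roots contain every stationary point of the likelihood.
    numer = sp.together(score).as_numer_denom()[0]
    coeffs = [float(c) for c in sp.Poly(numer, t).all_coeffs()]
    roots = np.roots(coeffs)
    real_roots = sorted(r.real for r in roots if abs(r.imag) < 1e-9)

    # Evaluate the log-likelihood at each stationary point to separate the maxima
    loglik = lambda th: -sum(np.log(1 + (xi - th)**2) for xi in x_data)
    for r in real_roots:
        print(f"stationary point {r: .4f}   log-likelihood {loglik(r): .4f}")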
Structured penalties for functional linear models---partially empirical eigenvectors for regression
One of the challenges with functional data is incorporating spatial
structure, or local correlation, into the analysis. This structure is inherent
in the output from an increasing number of biomedical technologies, and a
functional linear model is often used to estimate the relationship between the
predictor functions and scalar responses. Common approaches to the ill-posed
problem of estimating a coefficient function typically involve two stages:
regularization and estimation. Regularization is usually done via dimension
reduction, projecting onto a predefined span of basis functions or a reduced
set of eigenvectors (principal components). In contrast, we present a unified
approach that directly incorporates spatial structure into the estimation
process by exploiting the joint eigenproperties of the predictors and a linear
penalty operator. In this sense, the components in the regression are
`partially empirical' and the framework is provided by the generalized singular
value decomposition (GSVD). The GSVD clarifies the penalized estimation process
and informs the choice of penalty by making explicit the joint influence of the
penalty and predictors on the bias, variance, and performance of the estimated
coefficient function. Laboratory spectroscopy data and simulations are used to
illustrate the concepts.
Comment: 29 pages, 3 figures, 5 tables; typo/notational errors edited and
intro revised per journal review process
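As a concrete, if oversimplified, instance of the estimation problem that the GSVD analysis addresses: on a finite grid, the functional linear model with a linear penalty operator reduces to solving penalized normal equations. The sketch below uses a second-difference operator as the penalty and made-up data; it solves the penalized problem directly rather than computing the GSVD, so it only illustrates the estimator, not the decomposition itself.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 60, 200                        # n curves observed on a p-point grid
    grid = np.linspace(0.0, 1.0, p)

    X = rng.normal(size=(n, p))           # hypothetical predictor curves
    true_b = np.sin(2 * np.pi * grid)     # smooth "true" coefficient function
    y = X @ true_b / p + rng.normal(scale=0.1, size=n)

    # Second-difference penalty operator L (shape (p-2, p)) encoding smoothness
    L = np.diff(np.eye(p), n=2, axis=0)

    # Penalized estimate: argmin ||y - X b||^2 + lam * ||L b||^2
    lam = 1e-2
    b_hat = np.linalg.solve(X.T @ X + lam * (L.T @ L), X.T @ y)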
Computational algebraic methods in efficient estimation
A strong link between information geometry and algebraic statistics is made
by investigating statistical manifolds which are algebraic varieties. In
particular, it is shown how first- and second-order efficient estimators can be
constructed, such as bias corrected Maximum Likelihood and more general
estimators, and for which the estimating equations are purely algebraic. In
addition it is shown how Gr\"obner basis technology, which is at the heart of
algebraic statistics, can be used to reduce the degrees of the terms in the
estimating equations. This points the way to the feasible use of special
methods for solving polynomial equations, such as homotopy continuation, to
find the estimators. Simple examples are given showing both equations and
computations. *** The proof of Theorem 2 was corrected in the latest
version. Some minor errors were also corrected.
Comment: 21 pages, 5 figures
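A very small illustration of the Groebner basis step described above, rewriting a system of polynomial estimating equations into a triangular form from which every solution can be recovered; the two equations are hypothetical stand-ins, not the bias-corrected maximum likelihood equations studied in the paper.

    import sympy as sp

    x, y = sp.symbols('x y')
    # Hypothetical polynomial "estimating equations"
    eqs = [x**2 + y**2 - 5, x*y - 2]

    # A lexicographic Groebner basis eliminates y, leaving a univariate
    # polynomial in x; back-substitution then recovers every solution,
    # the kind of structure that polynomial and homotopy solvers exploit.
    G = sp.groebner(eqs, y, x, order='lex')
    print(list(G))
    print(sp.solve(eqs, [x, y], dict=True))   # all four solutions of the system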