A categorical characterization of relative entropy on standard Borel spaces
We give a categorical treatment, in the spirit of Baez and Fritz, of relative
entropy for probability distributions defined on standard Borel spaces. We
define a category suitable for reasoning about statistical inference on
standard Borel spaces. We define relative entropy as a functor into Lawvere's
category and we show convexity, lower semicontinuity and uniqueness.
Comment: 16 page
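As a rough illustration of the quantity this abstract treats, the sketch below computes relative entropy for finite probability vectors only, which is a very special case of the standard Borel setting. The function names `relative_entropy` and `mix` are hypothetical, and the code says nothing about the categorical construction itself. Values lie in [0, ∞], with infinity returned when the first distribution puts mass where the second has none.

```python
import math

def relative_entropy(p, q):
    """Relative entropy D(p || q) in nats for finite probability vectors.

    Assumption for this sketch: p and q are lists of the same length,
    each summing to 1. Returns math.inf when p has mass where q has none.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # convention: 0 * log(0 / q_i) = 0
        if q_i == 0.0:
            return math.inf  # p is not absolutely continuous w.r.t. q
        total += p_i * math.log(p_i / q_i)
    return total

def mix(a, b, t):
    """Convex combination t * a + (1 - t) * b of two probability vectors."""
    return [t * x + (1.0 - t) * y for x, y in zip(a, b)]

# Numerical check of joint convexity on one pair of examples.
p1, q1 = [0.9, 0.1], [0.5, 0.5]
p2, q2 = [0.2, 0.8], [0.6, 0.4]
t = 0.3
lhs = relative_entropy(mix(p1, p2, t), mix(q1, q2, t))
rhs = t * relative_entropy(p1, q1) + (1.0 - t) * relative_entropy(p2, q2)
print(lhs <= rhs)
```

The check at the end only spot-tests the convexity property mentioned in the abstract on one example; it is not a proof.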
Local proper scoring rules of order two
Scoring rules assess the quality of probabilistic forecasts, by assigning a
numerical score based on the predictive distribution and on the event or value
that materializes. A scoring rule is proper if it encourages truthful
reporting. It is local of order $k$ if the score depends on the predictive
density only through its value and the values of its derivatives of order up to
$k$ at the realizing event. Complementing fundamental recent work by Parry,
Dawid and Lauritzen, we characterize the local proper scoring rules of order 2
relative to a broad class of Lebesgue densities on the real line, using a
different approach. In a data example, we use local and nonlocal proper scoring
rules to assess statistically postprocessed ensemble weather forecasts.
Comment: Published at http://dx.doi.org/10.1214/12-AOS973 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
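To make the notion of locality concrete, the sketch below contrasts two well-known scores for a Gaussian predictive density: the logarithmic score, which uses only the density value at the outcome (order 0), and the Hyvärinen score, which uses the first and second derivatives of the log density (order 2). Both are written in negative orientation, so lower is better. This is an illustration under those assumptions, not the characterization derived in the paper, and the function names are hypothetical.

```python
import math

def log_score(x, mu, sigma):
    """Negative log density of N(mu, sigma^2) at x (local of order 0)."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (x - mu) ** 2 / (2.0 * sigma ** 2)

def hyvarinen_score(x, mu, sigma):
    """Hyvarinen score 2 * (log p)''(x) + ((log p)'(x))^2 for N(mu, sigma^2).

    For a Gaussian, (log p)'(x) = -(x - mu) / sigma^2 and
    (log p)''(x) = -1 / sigma^2, so no normalizing constant is needed.
    """
    first = -(x - mu) / sigma ** 2
    second = -1.0 / sigma ** 2
    return 2.0 * second + first ** 2

def mean_score(score, xs, mu, sigma):
    """Average score of the forecast N(mu, sigma^2) over observations xs."""
    return sum(score(x, mu, sigma) for x in xs) / len(xs)

# Toy observations with mean 0 and mean square 1.
xs = [-1.0, 1.0]
for mu, sigma in [(0.0, 1.0), (0.5, 1.0), (0.0, 2.0)]:
    print(mu, sigma, mean_score(hyvarinen_score, xs, mu, sigma))
```

On this toy sample the forecast N(0, 1), whose moments match the data, receives the lowest average Hyvärinen score of the three candidates, which is the behavior one expects from a proper score.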
Linear Estimating Equations for Exponential Families with Application to Gaussian Linear Concentration Models
In many families of distributions, maximum likelihood estimation is
intractable because the normalization constant for the density which enters
into the likelihood function is not easily available. The score matching
estimator of Hyvärinen (2005) provides an alternative where this
normalization constant is not required. The corresponding estimating equations
become linear for an exponential family. The score matching estimator is shown
to be consistent and asymptotically normally distributed for such models,
although not necessarily efficient. Gaussian linear concentration models are
examples of such families. For linear concentration models that are also linear
in the covariance we show that the score matching estimator is identical to the
maximum likelihood estimator, hence in such cases it is also efficient.
Gaussian graphical models and graphical models with symmetries form
particularly interesting subclasses of linear concentration models and we
investigate the potential use of the score matching estimator for this case.
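The linearity claim can be seen in the simplest case, a univariate Gaussian written as an exponential family with log density θ1 x + θ2 x² minus a normalizer. The empirical score matching objective, the sample mean of 2 θ2 + (θ1 + 2 θ2 x)² / 2, is quadratic in (θ1, θ2), so setting its gradient to zero gives a 2×2 linear system in the sample moments. The sketch below solves that system; the function name and parametrization are assumptions made for illustration, not code from the paper.

```python
def score_matching_gaussian(xs):
    """Score matching estimate of (mu, sigma^2) for a univariate Gaussian.

    Natural parameters: theta1 = mu / sigma^2, theta2 = -1 / (2 sigma^2).
    Stationarity of the empirical objective gives the linear equations
        theta1 + 2 * m1 * theta2 = 0
        2 * m1 * theta1 + 4 * m2 * theta2 = -2
    where m1 and m2 are the first and second sample moments.
    """
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n

    # Solve the 2x2 linear system with Cramer's rule.
    a11, a12 = 1.0, 2.0 * m1
    a21, a22 = 2.0 * m1, 4.0 * m2
    b1, b2 = 0.0, -2.0
    det = a11 * a22 - a12 * a21
    theta1 = (b1 * a22 - a12 * b2) / det
    theta2 = (a11 * b2 - a21 * b1) / det

    # Map natural parameters back to mean and variance.
    sigma2 = -1.0 / (2.0 * theta2)
    mu = theta1 * sigma2
    return mu, sigma2

print(score_matching_gaussian([1.0, 2.0, 3.0, 4.0]))
```

In this case the solution is the sample mean and the (biased) sample variance, so the score matching estimate agrees with the maximum likelihood estimate, in line with the abstract's statement about models that are linear in both concentration and covariance.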
New multicategory boosting algorithms based on multicategory Fisher-consistent losses
Fisher-consistent loss functions play a fundamental role in the construction
of successful binary margin-based classifiers. In this paper we establish the
Fisher-consistency condition for multicategory classification problems. Our
approach uses the margin vector concept which can be regarded as a
multicategory generalization of the binary margin. We characterize a wide class
of smooth convex loss functions that are Fisher-consistent for multicategory
classification. We then consider using the margin-vector-based loss functions
to derive multicategory boosting algorithms. In particular, we derive two new
multicategory boosting algorithms by using the exponential and logistic
regression losses.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS198 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
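A small numerical sketch may help with the Fisher-consistency idea for margin vectors. It assumes the common convention that a margin vector f sums to zero across classes and uses the exponential loss exp(-f_y). Under those assumptions the population risk, the sum over classes of p_j exp(-f_j), is minimized at f_j = log p_j minus the average of the log p_k, so the largest margin component points at the most probable class. The names below are hypothetical, and this is not the boosting algorithm from the paper.

```python
import math

def exp_risk(f, p):
    """Population exponential risk sum_j p_j * exp(-f_j) for margin vector f."""
    return sum(p_j * math.exp(-f_j) for f_j, p_j in zip(f, p))

def population_minimizer(p):
    """Minimizer of exp_risk over margin vectors with sum(f) == 0.

    Assumes every class probability in p is strictly positive.
    """
    logs = [math.log(p_j) for p_j in p]
    mean_log = sum(logs) / len(logs)
    return [log_p - mean_log for log_p in logs]

p = [0.2, 0.5, 0.3]              # class probabilities at a fixed input
f_star = population_minimizer(p)
predicted = f_star.index(max(f_star))
print(predicted == p.index(max(p)))
```

The final comparison checks that classifying by the largest component of the minimizing margin vector recovers the Bayes rule for this choice of p.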