2,287 research outputs found
The Minimum Description Length Principle and Model Selection in Spectropolarimetry
It is shown that the two-part Minimum Description Length Principle can be
used to discriminate among different models that can explain a given observed
dataset. The description length is chosen to be the sum of the lengths of the
message needed to encode the model plus the message needed to encode the data
when the model is applied to the dataset. It is verified that the proposed
principle can efficiently distinguish the model that correctly fits the
observations while avoiding over-fitting. The capabilities of this criterion
are shown in two simple problems for the analysis of observed
spectropolarimetric signals. The first is the de-noising of observations with
the aid of the PCA technique. The second is the selection of the optimal number
of parameters in LTE inversions. We propose this criterion as a quantitative
approach for distinguishing the most plausible model among a set of proposed
models. This quantity is very easy to implement as an additional output of
existing inversion codes. Comment: Accepted for publication in the Astrophysical Journa
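The two-part description length described in this abstract (bits for the model plus bits for the data given the model) can be illustrated with a minimal sketch. This is not the paper's inversion code: the polynomial model family, the cost of (1/2) log2(n) bits per parameter, the Gaussian residual code, and the names `description_length` and `select_degree` are all assumptions chosen for illustration.

```python
import numpy as np

def description_length(x, y, degree):
    """Approximate two-part code length, in bits, of a polynomial model."""
    n = len(y)
    k = degree + 1  # number of free coefficients
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
    # L(model): assumed cost of (1/2) log2(n) bits per coefficient.
    model_bits = 0.5 * k * np.log2(n)
    # L(data | model): Gaussian code length of the residuals using the
    # maximum-likelihood variance, dropping constants shared by all degrees.
    data_bits = 0.5 * n * np.log2(rss / n)
    return model_bits + data_bits

def select_degree(x, y, max_degree):
    """Return the degree with the shortest description, plus all lengths."""
    lengths = [description_length(x, y, d) for d in range(max_degree + 1)]
    return int(np.argmin(lengths)), lengths

# Synthetic example (assumed data): a noisy quadratic.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)
best, lengths = select_degree(x, y, max_degree=8)
print("selected degree:", best)
```

Because the model term grows with the number of coefficients while the data term shrinks only as long as the extra coefficients explain real structure, the minimum penalizes both under-fitting and over-fitting. The sketch assumes noisy data; a perfect fit would make the residual term undefined.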
Determining Principal Component Cardinality through the Principle of Minimum Description Length
PCA (Principal Component Analysis) and its variants are ubiquitous techniques
for matrix dimension reduction and reduced-dimension latent-factor extraction.
One significant challenge in using PCA is the choice of the number of principal
components. The information-theoretic MDL (Minimum Description Length) principle
gives objective compression-based criteria for model selection, but it is
difficult to analytically apply its modern definition - NML (Normalized Maximum
Likelihood) - to the problem of PCA. This work shows a general reduction of NML
problems to lower-dimension problems. Applying this reduction, it bounds the
NML of PCA in terms of the NML of linear regression, which are known. Comment: LOD 201
The Loss Rank Principle for Model Selection
We introduce a new principle for model selection in regression and
classification. Many regression models are controlled by some smoothness or
flexibility or complexity parameter c, e.g. the number of neighbors to be
averaged over in k nearest neighbor (kNN) regression or the polynomial degree
in regression with polynomials. Let f_D^c be the (best) regressor of complexity
c on data D. A more flexible regressor can fit more data D' well than a more
rigid one. If something (here small loss) is easy to achieve it's typically
worth less. We define the loss rank of f_D^c as the number of other
(fictitious) data D' that are fitted better by f_D'^c than D is fitted by
f_D^c. We suggest selecting the model complexity c that has minimal loss rank
(LoRP). Unlike most penalized maximum likelihood variants (AIC,BIC,MDL), LoRP
only depends on the regression function and loss function. It works without a
stochastic noise model, and is directly applicable to any non-parametric
regressor, like kNN. In this paper we formalize, discuss, and motivate LoRP,
study it for specific regression problems, in particular linear ones, and
compare it to other model selection schemes. Comment: 16 page
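The loss rank defined in this abstract can be computed by brute force in a toy setting. The sketch below is an illustration under stated assumptions, not the paper's procedure: targets are restricted to a small finite set of levels so that every fictitious dataset D' can be enumerated, the regressor is a least-squares polynomial, the loss is the squared residual, and ties are counted as "fitted at least as well". The name `loss_rank` and the example data are hypothetical.

```python
import itertools
import numpy as np

def loss_rank(x, y, degree, levels):
    """Count fictitious target vectors fitted at least as well as y."""
    n = len(x)
    X = np.vander(x, degree + 1)  # polynomial design matrix
    # Maps any target vector to its least-squares residual vector.
    residual_maker = np.eye(n) - X @ np.linalg.pinv(X)
    observed = float(np.sum((residual_maker @ y) ** 2))
    # Enumerate every possible target vector over the finite level set.
    candidates = np.array(list(itertools.product(levels, repeat=n)), dtype=float)
    losses = np.sum((candidates @ residual_maker.T) ** 2, axis=1)
    # Small tolerance so that floating-point ties are counted.
    return int(np.sum(losses <= observed + 1e-9))

# Toy data (assumed): six points that are roughly linear.
x = np.linspace(-1.0, 1.0, 6)
y = np.array([0.0, 1.0, 1.0, 2.0, 3.0, 3.0])
levels = (0, 1, 2, 3)
ranks = {d: loss_rank(x, y, d, levels) for d in range(6)}
print(ranks)  # choose the degree with the smallest rank
```

A degree-5 polynomial interpolates any six points, so every fictitious dataset is fitted equally well and its rank is maximal. This is how the rank penalizes flexibility without a noise model.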
MDL Convergence Speed for Bernoulli Sequences
The Minimum Description Length principle for online sequence
estimation/prediction in a proper learning setup is studied. If the underlying
model class is discrete, then the total expected square loss is a particularly
interesting performance measure: (a) this quantity is finitely bounded,
implying convergence with probability one, and (b) it additionally specifies
the convergence speed. For MDL, in general one can only have loss bounds which
are finite but exponentially larger than those for Bayes mixtures. We show that
this is even the case if the model class contains only Bernoulli distributions.
We derive a new upper bound on the prediction error for countable Bernoulli
classes. This implies a small bound (comparable to the one for Bayes mixtures)
for certain important model classes. We discuss the application to Machine
Learning tasks such as classification and hypothesis testing, and
generalization to countable classes of i.i.d. models. Comment: 28 page
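The contrast between MDL prediction and Bayes mixtures over a Bernoulli class can be sketched as follows. This is a minimal illustration, assuming a small finite class of Bernoulli parameters with prior weights. The MDL predictor uses the single model that maximizes weight times likelihood, while the Bayes predictor averages over the posterior. The function name, the parameter grid, and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np

def mdl_and_bayes_predictions(bits, thetas, weights):
    """Predict P(next bit = 1) before each bit, by MDL and by Bayes mixture."""
    thetas = np.asarray(thetas, dtype=float)  # must lie strictly in (0, 1)
    log_post = np.log(np.asarray(weights, dtype=float))  # log prior weights
    mdl_preds, bayes_preds = [], []
    for b in bits:
        # MDL: the single model with the shortest two-part code so far.
        mdl_preds.append(float(thetas[np.argmax(log_post)]))
        # Bayes: posterior-weighted average over the whole class.
        post = np.exp(log_post - log_post.max())
        post /= post.sum()
        bayes_preds.append(float(post @ thetas))
        # Update the unnormalized log posterior with the observed bit.
        log_post = log_post + np.log(thetas if b == 1 else 1.0 - thetas)
    return np.array(mdl_preds), np.array(bayes_preds)

# Simulated example (assumed): the true parameter is in the class.
rng = np.random.default_rng(1)
true_theta = 0.75
bits = (rng.random(200) < true_theta).astype(int)
mdl, bayes = mdl_and_bayes_predictions(bits, [0.25, 0.5, 0.75], [1 / 3] * 3)
print("cumulative square loss, MDL:  ", float(np.sum((mdl - true_theta) ** 2)))
print("cumulative square loss, Bayes:", float(np.sum((bayes - true_theta) ** 2)))
```

The cumulative square loss printed at the end is the performance measure named in the abstract, here evaluated on one sample sequence rather than in expectation.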
A new determination of the orbit and masses of the Be binary system delta Scorpii
The binary star delta Sco (HD143275) underwent remarkable brightening in the
visible in 2000, and continues to be irregularly variable. The system was
observed with the Sydney University Stellar Interferometer (SUSI) in 1999,
2000, 2001, 2006 and 2007. The 1999 observations were consistent with
predictions based on the previously published orbital elements. The subsequent
observations can only be explained by assuming that an optically bright
emission region with an angular size of > 2 +/- 1 mas formed around the primary
in 2000. By 2006/2007 the size of this region grew to an estimated > 4 mas.
We have determined a consistent set of orbital elements by simultaneously
fitting all the published interferometric and spectroscopic data as well as the
SUSI data reported here. The resulting elements and the brightness ratio for
the system measured prior to the outburst in 2000 have been used to estimate
the masses of the components. We find Ma = 15 +/- 7 Msun and Mb = 8.0 +/- 3.6
Msun. The dynamical parallax is estimated to be 7.03 +/- 0.15 mas, which is in
good agreement with the revised HIPPARCOS parallax. Comment: 8 pages, 4 figs. Accepted for publication in MNRA
Isomeric states close to doubly magic Sn studied with JYFLTRAP
The double Penning trap mass spectrometer JYFLTRAP has been employed to
measure masses and excitation energies for isomers in Cd,
Cd, Cd and Te, for isomers in In and
In, and for isomers in Sn and Sb. These first
direct mass measurements of the Cd and In isomers reveal deviations from the
excitation energies based on results from beta-decay experiments and yield new
information on neutron- and proton-hole states close to Sn. A new
excitation energy of 144(4) keV has been determined for Cd. A good
agreement with the precisely known excitation energies of Cd,
Sn, and Sb has been found. Comment: 10 pages, 6 figures, submitted to Phys. Rev.
Q_EC values of the Superallowed beta-Emitters 10-C, 34-Ar, 38-Ca and 46-V
The Q_EC values of the superallowed beta+ emitters 10-C, 34-Ar, 38-Ca and
46-V have been measured with a Penning-trap mass spectrometer to be 3648.12(8),
6061.83(8), 6612.12(7) and 7052.44(10) keV, respectively. All four values are
substantially improved in precision over previous results. Comment: 9 pages, 7 figures, 5 table
Mutual Information of Population Codes and Distance Measures in Probability Space
We studied the mutual information between a stimulus and a large system
consisting of stochastic, statistically independent elements that respond to a
stimulus. The Mutual Information (MI) of the system saturates exponentially
with system size. A theory of the rate of saturation of the MI is developed. We
show that this rate is controlled by a distance function between the response
probabilities induced by different stimuli. This function, which we term the
Confusion Distance between two probabilities, is related to the Renyi
-Information. Comment: 11 pages, 3 figures, accepted to PR
PAC-Bayesian Bounds for Randomized Empirical Risk Minimizers
The aim of this paper is to generalize the PAC-Bayesian theorems proved by
Catoni in the classification setting to more general problems of statistical
inference. We show how to control the deviations of the risk of randomized
estimators. Particular attention is paid to randomized estimators drawn in a
small neighborhood of classical estimators, whose study leads to control of the
risk of the latter. These results allow one to bound the risk of very general
estimation procedures, as well as to perform model selection