Determining Principal Component Cardinality through the Principle of Minimum Description Length
PCA (Principal Component Analysis) and its variants are ubiquitous techniques
for matrix dimension reduction and reduced-dimension latent-factor extraction.
One significant challenge in using PCA is the choice of the number of principal
components. The information-theoretic MDL (Minimum Description Length) principle
gives objective compression-based criteria for model selection, but it is
difficult to analytically apply its modern definition, NML (Normalized Maximum
Likelihood), to the problem of PCA. This work shows a general reduction of NML
problems to lower-dimension problems. Applying this reduction, it bounds the
NML of PCA by terms of the NML of linear regression, which are known. Comment: LOD 201
The Minimum Description Length Principle and Model Selection in Spectropolarimetry
It is shown that the two-part Minimum Description Length Principle can be
used to discriminate among different models that can explain a given observed
dataset. The description length is chosen to be the sum of the lengths of the
message needed to encode the model plus the message needed to encode the data
when the model is applied to the dataset. It is verified that the proposed
principle can efficiently distinguish the model that correctly fits the
observations while avoiding over-fitting. The capabilities of this criterion
are shown in two simple problems for the analysis of observed
spectropolarimetric signals. The first is the de-noising of observations with
the aid of the PCA technique. The second is the selection of the optimal number
of parameters in LTE inversions. We propose this criterion as a quantitative
approach for distinguishing the most plausible model among a set of proposed
models. This quantity is very easy to implement as an additional output on the
existing inversion codes. Comment: Accepted for publication in the Astrophysical Journa
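As a rough illustration of the two-part criterion described in this abstract, the sketch below selects a polynomial degree by minimizing the sum of a model cost and a data cost. It is not the paper's implementation: the function name `two_part_mdl`, the synthetic data, the Gaussian-noise assumption, and the common asymptotic code lengths (k/2 log2 n bits for k parameters, n/2 log2(RSS/n) bits for the residuals, up to an additive constant) are all assumptions made for this example.

```python
import numpy as np


def two_part_mdl(x, y, max_degree):
    """Return (best_degree, lengths) under a simple two-part code length.

    Assumed (hypothetical) code lengths, in bits, up to an additive constant:
      model cost: (k / 2) * log2(n) for k fitted coefficients
      data cost:  (n / 2) * log2(RSS / n) for Gaussian residuals
    """
    n = len(x)
    lengths = []
    for degree in range(max_degree + 1):
        k = degree + 1  # number of polynomial coefficients
        coeffs = np.polyfit(x, y, degree)
        rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
        model_bits = 0.5 * k * np.log2(n)
        data_bits = 0.5 * n * np.log2(rss / n)
        lengths.append(model_bits + data_bits)
    return int(np.argmin(lengths)), lengths


# Synthetic example: a noisy quadratic (illustrative data only).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)

best_degree, lengths = two_part_mdl(x, y, max_degree=8)
print("selected degree:", best_degree)
```

Under-fitted models pay a large data cost and over-fitted models pay a growing model cost, so the minimum tends to fall near the true complexity. The same pattern would apply to choosing a number of PCA components or inversion parameters, given suitable code lengths for those models.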
Indefinitely Oscillating Martingales
We construct a class of nonnegative martingale processes that oscillate
indefinitely with high probability. For these processes, we state a uniform
rate of the number of oscillations and show that this rate is asymptotically
close to the theoretical upper bound. These bounds on probability and
expectation of the number of upcrossings are compared to classical bounds from
the martingale literature. We discuss two applications. First, our results
imply that the limit of the minimum description length operator may not exist.
Second, we give bounds on how often one can change one's belief in a given
hypothesis when observing a stream of data. Comment: ALT 2014, extended technical repor
MDL Convergence Speed for Bernoulli Sequences
The Minimum Description Length principle for online sequence
estimation/prediction in a proper learning setup is studied. If the underlying
model class is discrete, then the total expected square loss is a particularly
interesting performance measure: (a) this quantity is finitely bounded,
implying convergence with probability one, and (b) it additionally specifies
the convergence speed. For MDL, in general one can only have loss bounds which
are finite but exponentially larger than those for Bayes mixtures. We show that
this is even the case if the model class contains only Bernoulli distributions.
We derive a new upper bound on the prediction error for countable Bernoulli
classes. This implies a small bound (comparable to the one for Bayes mixtures)
for certain important model classes. We discuss the application to Machine
Learning tasks such as classification and hypothesis testing, and
generalization to countable classes of i.i.d. models. Comment: 28 page
A new determination of the orbit and masses of the Be binary system delta Scorpii
The binary star delta Sco (HD143275) underwent remarkable brightening in the
visible in 2000, and continues to be irregularly variable. The system was
observed with the Sydney University Stellar Interferometer (SUSI) in 1999,
2000, 2001, 2006 and 2007. The 1999 observations were consistent with
predictions based on the previously published orbital elements. The subsequent
observations can only be explained by assuming that an optically bright
emission region with an angular size of > 2 +/- 1 mas formed around the primary
in 2000. By 2006/2007 the size of this region grew to an estimated > 4 mas.
We have determined a consistent set of orbital elements by simultaneously
fitting all the published interferometric and spectroscopic data as well as the
SUSI data reported here. The resulting elements and the brightness ratio for
the system measured prior to the outburst in 2000 have been used to estimate
the masses of the components. We find Ma = 15 +/- 7 Msun and Mb = 8.0 +/- 3.6
Msun. The dynamical parallax is estimated to be 7.03 +/- 0.15 mas, which is in
good agreement with the revised HIPPARCOS parallax. Comment: 8 pages, 4 figs. Accepted for publication in MNRA
Q_EC values of the Superallowed beta-Emitters 10-C, 34-Ar, 38-Ca and 46-V
The Q_EC values of the superallowed beta+ emitters 10-C, 34-Ar, 38-Ca and
46-V have been measured with a Penning-trap mass spectrometer to be 3648.12(8),
6061.83(8), 6612.12(7) and 7052.44(10) keV, respectively. All four values are
substantially improved in precision over previous results. Comment: 9 pages, 7 figures, 5 table
Isomeric states close to doubly magic Sn studied with JYFLTRAP
The double Penning trap mass spectrometer JYFLTRAP has been employed to
measure masses and excitation energies for isomers in Cd,
Cd, Cd and Te, for isomers in In and
In, and for isomers in Sn and Sb. These first
direct mass measurements of the Cd and In isomers reveal deviations from the
excitation energies based on results from beta-decay experiments and yield new
information on neutron- and proton-hole states close to Sn. A new
excitation energy of 144(4) keV has been determined for Cd. A good
agreement with the precisely known excitation energies of Cd,
Sn, and Sb has been found. Comment: 10 pages, 6 figures, submitted to Phys. Rev.