
    The Minimum Description Length Principle and Model Selection in Spectropolarimetry

    It is shown that the two-part Minimum Description Length Principle can be used to discriminate among different models that can explain a given observed dataset. The description length is chosen to be the sum of the length of the message needed to encode the model and the length of the message needed to encode the data when the model is applied to the dataset. It is verified that the proposed principle can efficiently distinguish the model that correctly fits the observations while avoiding over-fitting. The capabilities of this criterion are shown in two simple problems for the analysis of observed spectropolarimetric signals. The first is the de-noising of observations with the aid of the PCA technique. The second is the selection of the optimal number of parameters in LTE inversions. We propose this criterion as a quantitative approach for distinguishing the most plausible model among a set of proposed models. This quantity is very easy to implement as an additional output in existing inversion codes. Comment: Accepted for publication in the Astrophysical Journal
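
    The two-part sum described in this abstract can be illustrated with a minimal sketch. It is not the paper's encoding: it assumes the common asymptotic approximation in which k fitted parameters cost (k/2) log2 n bits and Gaussian residuals cost (n/2) log2(RSS/n) bits up to an additive constant, and it uses polynomial fitting as a stand-in for an inversion code. The names `description_length` and `select_degree` are hypothetical.

```python
import numpy as np

def description_length(x, y, degree):
    """Approximate two-part code length (bits) of a polynomial fit.

    Assumption: asymptotic costs, not the encoding used in the paper.
    The value is only meaningful for comparing models on the same data.
    """
    n = len(x)
    k = degree + 1  # number of fitted coefficients
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
    model_bits = 0.5 * k * np.log2(n)       # message encoding the model
    data_bits = 0.5 * n * np.log2(rss / n)  # message encoding the residuals
    return model_bits + data_bits

def select_degree(x, y, max_degree):
    """Return the polynomial degree with the shortest total description."""
    lengths = [description_length(x, y, d) for d in range(max_degree + 1)]
    return int(np.argmin(lengths))

# Illustrative data: a quadratic signal with Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(0.0, 0.1, x.size)
print(select_degree(x, y, max_degree=6))
```

    Higher degrees always lower the residual term, but each extra coefficient adds about half of log2 n bits to the model term, which is what discourages over-fitting in this sketch.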

    Determining Principal Component Cardinality through the Principle of Minimum Description Length

    PCA (Principal Component Analysis) and its variants are ubiquitous techniques for matrix dimension reduction and reduced-dimension latent-factor extraction. One significant challenge in using PCA is the choice of the number of principal components. The information-theoretic MDL (Minimum Description Length) principle gives objective compression-based criteria for model selection, but it is difficult to analytically apply its modern definition, NML (Normalized Maximum Likelihood), to the problem of PCA. This work shows a general reduction of NML problems to lower-dimension problems. Applying this reduction, it bounds the NML of PCA by terms of the NML of linear regression, which are known. Comment: LOD 201

    The Loss Rank Principle for Model Selection

    We introduce a new principle for model selection in regression and classification. Many regression models are controlled by some smoothness or flexibility or complexity parameter c, e.g. the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. Let f_D^c be the (best) regressor of complexity c on data D. A more flexible regressor can fit more data D' well than a more rigid one. If something (here small loss) is easy to achieve, it is typically worth less. We define the loss rank of f_D^c as the number of other (fictitious) data D' that are fitted better by f_{D'}^c than D is fitted by f_D^c. We suggest selecting the model complexity c that has minimal loss rank (LoRP). Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP only depends on the regression function and loss function. It works without a stochastic noise model, and is directly applicable to any non-parametric regressor, like kNN. In this paper we formalize, discuss, and motivate LoRP, study it for specific regression problems, in particular linear ones, and compare it to other model selection schemes. Comment: 16 pages

    MDL Convergence Speed for Bernoulli Sequences

    The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is finitely bounded, implying convergence with probability one, and (b) it additionally specifies the convergence speed. For MDL, in general one can only have loss bounds which are finite but exponentially larger than those for Bayes mixtures. We show that this is even the case if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. We discuss the application to Machine Learning tasks such as classification and hypothesis testing, and generalization to countable classes of i.i.d. models. Comment: 28 pages

    A new determination of the orbit and masses of the Be binary system delta Scorpii

    The binary star delta Sco (HD143275) underwent remarkable brightening in the visible in 2000, and continues to be irregularly variable. The system was observed with the Sydney University Stellar Interferometer (SUSI) in 1999, 2000, 2001, 2006 and 2007. The 1999 observations were consistent with predictions based on the previously published orbital elements. The subsequent observations can only be explained by assuming that an optically bright emission region with an angular size of > 2 +/- 1 mas formed around the primary in 2000. By 2006/2007 the size of this region grew to an estimated > 4 mas. We have determined a consistent set of orbital elements by simultaneously fitting all the published interferometric and spectroscopic data as well as the SUSI data reported here. The resulting elements and the brightness ratio for the system measured prior to the outburst in 2000 have been used to estimate the masses of the components. We find Ma = 15 +/- 7 Msun and Mb = 8.0 +/- 3.6 Msun. The dynamical parallax is estimated to be 7.03 +/- 0.15 mas, which is in good agreement with the revised HIPPARCOS parallax. Comment: 8 pages, 4 figs. Accepted for publication in MNRAS

    Isomeric states close to doubly magic $^{132}$Sn studied with JYFLTRAP

    The double Penning trap mass spectrometer JYFLTRAP has been employed to measure masses and excitation energies for $11/2^-$ isomers in $^{121}$Cd, $^{123}$Cd, $^{125}$Cd and $^{133}$Te, for $1/2^-$ isomers in $^{129}$In and $^{131}$In, and for $7^-$ isomers in $^{130}$Sn and $^{134}$Sb. These first direct mass measurements of the Cd and In isomers reveal deviations from the excitation energies based on results from beta-decay experiments and yield new information on neutron- and proton-hole states close to $^{132}$Sn. A new excitation energy of 144(4) keV has been determined for $^{123}$Cd$^m$. Good agreement with the precisely known excitation energies of $^{121}$Cd$^m$, $^{130}$Sn$^m$, and $^{134}$Sb$^m$ has been found. Comment: 10 pages, 6 figures, submitted to Phys. Rev.

    Q_EC values of the Superallowed beta-Emitters 10-C, 34-Ar, 38-Ca and 46-V

    The Q_EC values of the superallowed beta+ emitters 10-C, 34-Ar, 38-Ca and 46-V have been measured with a Penning-trap mass spectrometer to be 3648.12(8), 6061.83(8), 6612.12(7) and 7052.44(10) keV, respectively. All four values are substantially improved in precision over previous results. Comment: 9 pages, 7 figures, 5 tables

    Mutual Information of Population Codes and Distance Measures in Probability Space

    We studied the mutual information between a stimulus and a large system consisting of stochastic, statistically independent elements that respond to a stimulus. The Mutual Information (MI) of the system saturates exponentially with system size. A theory of the rate of saturation of the MI is developed. We show that this rate is controlled by a distance function between the response probabilities induced by different stimuli. This function, which we term the Confusion Distance between two probabilities, is related to the Rényi $\alpha$-Information. Comment: 11 pages, 3 figures, accepted to PR

    PAC-Bayesian Bounds for Randomized Empirical Risk Minimizers

    The aim of this paper is to generalize the PAC-Bayesian theorems proved by Catoni in the classification setting to more general problems of statistical inference. We show how to control the deviations of the risk of randomized estimators. Particular attention is paid to randomized estimators drawn in a small neighborhood of classical estimators, whose study leads to control of the risk of the latter. These results make it possible to bound the risk of very general estimation procedures, as well as to perform model selection.