153,774 research outputs found

    Optimal predictive model selection

    Often the goal of model selection is to choose a model for future prediction, and it is natural to measure the accuracy of a future prediction by squared error loss. Under the Bayesian approach, it is commonly perceived that the optimal predictive model is the model with highest posterior probability, but this is not necessarily the case. In this paper we show that, for selection among normal linear models, the optimal predictive model is often the median probability model, which is defined as the model consisting of those variables which have overall posterior probability greater than or equal to 1/2 of being in a model. The median probability model often differs from the highest probability model.
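    A minimal sketch of how the median probability model could be assembled once a posterior over candidate models is in hand. The model encoding, probabilities, and variable count below are hypothetical, chosen so the two selection rules visibly disagree:

```python
import numpy as np

# Hypothetical posterior probabilities over candidate models,
# each model encoded as a tuple of included-variable indices.
posterior = {
    (0,):      0.10,
    (0, 1):    0.35,
    (0, 2):    0.30,
    (0, 1, 2): 0.25,
}
n_vars = 3

# Overall posterior inclusion probability of each variable:
# the total probability of all models containing it.
inclusion = np.zeros(n_vars)
for model, prob in posterior.items():
    for var in model:
        inclusion[var] += prob

# Median probability model: variables with inclusion probability >= 1/2.
median_model = tuple(v for v in range(n_vars) if inclusion[v] >= 0.5)

# Highest posterior probability model, for comparison.
map_model = max(posterior, key=posterior.get)

print("inclusion probabilities:", inclusion)      # roughly [1.0, 0.6, 0.55]
print("median probability model:", median_model)  # (0, 1, 2)
print("highest probability model:", map_model)    # (0, 1)
```

    On these toy numbers the two criteria disagree, illustrating the abstract's point: the single most probable model is (0, 1), while every variable clears the 1/2 inclusion threshold.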

    Model selection in cosmology

    Model selection aims to determine which theoretical models are most plausible given some data, without necessarily considering preferred values of model parameters. A common model selection question is to ask when new data require introduction of an additional parameter, describing a newly discovered physical effect. We review model selection statistics, then focus on the Bayesian evidence, which implements Bayesian analysis at the level of models rather than parameters. We describe our CosmoNest code, the first computationally efficient implementation of Bayesian model selection in a cosmological context. We apply it to recent WMAP satellite data, examining the need for a perturbation spectral index differing from the scale-invariant (Harrison–Zel'dovich) case.
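    CosmoNest itself uses nested sampling to make the evidence tractable in many dimensions; for a one-parameter toy problem the same quantity can be computed by direct quadrature. A minimal sketch of the evidence idea, comparing a model with a parameter fixed (the analogue of the scale-invariant case) against one where it is free. The data, prior range, and ln B convention are illustrative assumptions, not from the paper:

```python
import numpy as np
from scipy import integrate, stats

# Toy data, assumed drawn from N(mu, 1) with unknown mean.
rng = np.random.default_rng(0)
data = rng.normal(0.1, 1.0, size=50)

def log_like(mu):
    return np.sum(stats.norm.logpdf(data, loc=mu, scale=1.0))

# Model 0: mu fixed at 0 (no extra parameter).
# Its evidence is just its likelihood.
log_Z0 = log_like(0.0)

# Model 1: mu free, uniform prior on [-1, 1].
# Evidence = integral of likelihood * prior over the parameter;
# the maximum log-likelihood is factored out for numerical stability.
log_like_hat = log_like(data.mean())
Z1_scaled, _ = integrate.quad(
    lambda mu: np.exp(log_like(mu) - log_like_hat) / 2.0,  # prior width 2
    -1.0, 1.0,
)
log_Z1 = np.log(Z1_scaled) + log_like_hat

# ln(Bayes factor); positive values favour adding the parameter.
print("ln B10 =", log_Z1 - log_Z0)
```

    The integral over the prior builds in an automatic Occam penalty: the free-parameter model is charged for prior volume that the data do not support, which is exactly what makes the evidence a model-level rather than parameter-level criterion.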

    The Model Selection Curse

    A "statistician" takes an action on behalf of an agent, based on the agent's self-reported personal data and a sample involving other people. The action that he takes is an estimated function of the agent's report. The estimation procedure involves model selection. We ask the following question: Is truth-telling optimal for the agent given the statistician's procedure? We analyze this question in the context of a simple example that highlights the role of model selection. We suggest that our simple exercise may have implications for the broader issue of human interaction with "machine learning" algorithms

    Model selection and local geometry

    We consider problems in model selection caused by the geometry of models close to their points of intersection. In some cases, including common classes of causal or graphical models as well as time series models, distinct models may nevertheless have identical tangent spaces. This has two immediate consequences: first, in order to obtain constant power to reject one model in favour of another we need local alternative hypotheses that decrease to the null at a slower rate than the usual parametric n^{-1/2} (typically we will require n^{-1/4} or slower); in other words, to distinguish between the models we need large effect sizes or very large sample sizes. Second, we show that under even weaker conditions on their tangent cones, models in these classes cannot be made simultaneously convex by a reparameterization. This shows that Bayesian network models, amongst others, cannot be learned directly with a convex method similar to the graphical lasso. However, we are able to use our results to suggest methods for model selection that learn the tangent space directly, rather than the model itself. In particular, we give a generic algorithm for learning Bayesian network models.
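    To make the rate statement concrete in the usual local-alternative notation (a sketch of the claim, not the paper's derivation):

```latex
% Regular case: non-trivial asymptotic power against alternatives
% shrinking at the parametric rate
\theta_n = \theta_0 + h\, n^{-1/2}
% Identical tangent spaces: alternatives must shrink more slowly,
% typically
\theta_n = \theta_0 + h\, n^{-1/4}
% so detecting a fixed effect of size \epsilon requires a sample
% of order \epsilon^{-4} rather than \epsilon^{-2}: the required
% sample size is roughly squared.
```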

    Bootstrap for neural model selection

    Bootstrap techniques (also called resampling computation techniques) have introduced new advances in modeling and model evaluation. Using resampling methods to construct a series of new samples based on the original data set makes it possible to estimate the stability of the parameters. Properties such as convergence and asymptotic normality can be checked for any particular observed data set. In most cases, the statistics computed on the generated data sets give a good idea of the confidence regions of the estimates. In this paper, we discuss the contribution of such methods to model selection, in the case of feedforward neural networks. The method is described and compared with the leave-one-out resampling method. The effectiveness of the bootstrap method, versus the leave-one-out method, is checked through a number of examples.
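    A minimal sketch of the bootstrap idea for comparing feedforward network architectures, using scikit-learn's MLPRegressor as a stand-in. The dataset, layer sizes, and bootstrap count are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.utils import resample

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=200)

def bootstrap_error(hidden_units, n_boot=30):
    """Distribution of out-of-bag errors for one architecture."""
    errors = []
    idx = np.arange(len(X))
    for b in range(n_boot):
        boot = resample(idx, random_state=b)  # sample with replacement
        oob = np.setdiff1d(idx, boot)         # points left out of the sample
        net = MLPRegressor(hidden_layer_sizes=(hidden_units,),
                           max_iter=2000, random_state=b)
        net.fit(X[boot], y[boot])
        errors.append(np.mean((net.predict(X[oob]) - y[oob]) ** 2))
    return np.array(errors)

# Compare candidate architectures by the whole bootstrap error
# distribution, not just a single point estimate.
for h in (2, 8, 32):
    err = bootstrap_error(h)
    print(f"{h:>2} hidden units: MSE {err.mean():.4f} +/- {err.std():.4f}")
```

    Leave-one-out, the comparison method in the paper, would instead refit the network once per observation with a single point held out; the bootstrap yields a full distribution of out-of-bag errors per architecture from a chosen number of refits.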

    Model selection for amplitude analysis

    Model complexity in amplitude analyses is often a priori under-constrained since the underlying theory permits a large number of possible amplitudes to contribute to most physical processes. The use of an overly complex model results in reduced predictive power and worse resolution on unknown parameters of interest. Therefore, it is common to reduce the complexity by removing from consideration some subset of the allowed amplitudes. This paper studies a method for limiting model complexity from the data sample itself through regularization during regression in the context of a multivariate (Dalitz-plot) analysis. The regularization technique applied greatly improves the performance. An outline of how to obtain the significance of a resonance in a multivariate amplitude analysis is also provided.
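    The abstract does not spell out which regularizer is used, but L1 (LASSO-type) penalties are a standard way to zero out unneeded components during regression. A toy sketch with a linear stand-in for the amplitude fit; the basis, coefficients, and penalty strength are all hypothetical:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)

# Toy "amplitude" fit: 10 candidate basis amplitudes, only 3 truly present.
n_events, n_amps = 500, 10
basis = rng.normal(size=(n_events, n_amps))
true_coefs = np.zeros(n_amps)
true_coefs[[0, 3, 7]] = [2.0, -1.5, 1.0]
intensity = basis @ true_coefs + 0.1 * rng.normal(size=n_events)

# The L1 penalty drives the coefficients of spurious amplitudes
# to exactly zero, limiting model complexity from the data itself.
fit = Lasso(alpha=0.05).fit(basis, intensity)
kept = np.flatnonzero(fit.coef_)
print("amplitudes kept:", kept)  # expect [0, 3, 7]
print("coefficients:", fit.coef_[kept].round(2))
```

    Real amplitude analyses fit complex amplitudes with a likelihood rather than least squares, but the mechanism is the same: a penalty that removes components the data sample does not require.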