Model selection in High-Dimensions: A Quadratic-risk based approach
In this article we propose a general class of risk measures which can be used
for data based evaluation of parametric models. The loss function is defined as
generalized quadratic distance between the true density and the proposed model.
These distances are characterized by a simple quadratic form structure that is
adaptable through the choice of a nonnegative definite kernel and a bandwidth
parameter. Using asymptotic results for the quadratic distances we build a
quick-to-compute approximation for the risk function. Its derivation is
analogous to the Akaike Information Criterion (AIC), but unlike AIC, the
quadratic risk is a global comparison tool. The method does not require
resampling, a great advantage when point estimators are expensive to compute.
The method is illustrated using the problem of selecting the number of
components in a mixture model, where it is shown that, by using an appropriate
kernel, the method is computationally straightforward in arbitrarily high data
dimensions. In this same context it is shown that the method has some clear
advantages over AIC and BIC.
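The kernel-and-bandwidth quadratic distance described above can be sketched as a simple V-statistic between a data sample and a sample from the candidate model. This is a minimal illustration assuming a Gaussian kernel; the function name, the bandwidth choice, and the use of a model sample (rather than the paper's exact risk estimator) are all illustrative assumptions.

```python
import numpy as np

def quadratic_distance(x, y, bandwidth=1.0):
    """Kernel-based quadratic distance between two 1-D samples:
    mean K(x,x) + mean K(y,y) - 2 mean K(x,y), with a Gaussian kernel.
    An illustrative sketch, not the paper's exact risk estimator."""
    def kmat(a, b):
        diff = a[:, None] - b[None, :]          # pairwise differences
        return np.exp(-0.5 * (diff / bandwidth) ** 2)
    return kmat(x, x).mean() + kmat(y, y).mean() - 2.0 * kmat(x, y).mean()

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=500)           # "true" density sample
model_good = rng.normal(0.0, 1.0, size=500)     # well-specified model
model_bad = rng.normal(3.0, 1.0, size=500)      # badly misspecified model

# The distance to the well-specified model should be much smaller.
print(quadratic_distance(data, model_good), quadratic_distance(data, model_bad))
```

Because the distance reduces to kernel averages over pairs of points, it scales naturally with dimension once a suitable product kernel is chosen, which is the computational advantage the abstract alludes to.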
Building Combined Classifiers
This chapter covers different approaches that may be taken when building an
ensemble method, through studying specific examples of each approach from research
conducted by the authors. A method called Negative Correlation Learning illustrates a
decision-level combination approach with individual classifiers trained co-operatively. The
model-level combination paradigm is illustrated via a tree combination method. Finally,
another variant of the decision-level paradigm, with individuals trained independently
rather than co-operatively, is discussed as applied to churn prediction in the
telecommunications industry.
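The decision-level combination with independently trained individuals can be illustrated with a toy majority vote: several noisy classifiers, each wrong independently with some probability, combined at the decision level. The synthetic task, the `noisy_classifier` helper, and the flip probability are all hypothetical; this is not Negative Correlation Learning itself, which trains the individuals co-operatively.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical toy task: 1-D feature, true label 1 when the feature is positive.
x = rng.normal(size=200)
y = (x > 0).astype(int)

def noisy_classifier(labels, flip_prob, rng):
    """Simulate an independently trained base classifier that is correct
    on each example with probability 1 - flip_prob."""
    flips = rng.random(labels.size) < flip_prob
    return np.where(flips, 1 - labels, labels)

# Decision-level combination: majority vote over independently trained individuals.
preds = np.array([noisy_classifier(y, 0.3, rng) for _ in range(5)])
vote = (preds.sum(axis=0) >= 3).astype(int)

acc_individual = np.array([(p == y).mean() for p in preds])
acc_vote = (vote == y).mean()
print(acc_individual.mean(), acc_vote)
```

When the individual errors are (approximately) independent, the majority vote is noticeably more accurate than the average individual, which is the basic motivation for decision-level combination.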
A Modern Take on the Bias-Variance Tradeoff in Neural Networks
The bias-variance tradeoff tells us that as model complexity increases, bias
falls and variance increases, leading to a U-shaped test error curve. However,
recent empirical results with over-parameterized neural networks are marked by
a striking absence of the classic U-shaped test error curve: test error keeps
decreasing in wider networks. This suggests that there might not be a
bias-variance tradeoff in neural networks with respect to network width, contrary
to what was originally claimed by, e.g., Geman et al. (1992). Motivated by the shaky
evidence used to support this claim in neural networks, we measure bias and
variance in the modern setting. We find that both bias and variance can
decrease as the number of parameters grows. To better understand this, we
introduce a new decomposition of the variance to disentangle the effects of
optimization and data sampling. We also provide theoretical analysis in a
simplified setting that is consistent with our empirical findings.
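The measurement the abstract describes rests on the classical decomposition MSE = bias² + variance, estimated by Monte Carlo over resampled training sets. The sketch below uses a deliberately biased estimator of a mean rather than a neural network; the estimator, shrinkage factor, and sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 2.0

def estimate(sample, shrink=0.5):
    """A deliberately biased estimator (shrunken sample mean),
    standing in for a trained model's prediction."""
    return shrink * sample.mean()

# Monte Carlo over many independent "training sets" (data-sampling variability).
estimates = np.array([estimate(rng.normal(true_mean, 1.0, size=30))
                      for _ in range(5000)])

bias_sq = (estimates.mean() - true_mean) ** 2    # squared bias across datasets
variance = estimates.var()                        # variance across datasets
mse = ((estimates - true_mean) ** 2).mean()       # mean squared error

# The decomposition MSE = bias^2 + variance holds exactly (up to rounding).
print(bias_sq, variance, mse)
```

The paper's finer decomposition further splits `variance` into optimization- and data-sampling components, which this one-estimator sketch does not attempt.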
Small Area Shrinkage Estimation
The need for small area estimates is increasingly felt in both the public and
private sectors in order to formulate their strategic plans. It is now widely
recognized that direct small area survey estimates are highly unreliable owing
to large standard errors and coefficients of variation. The reason behind this
is that a survey is usually designed to achieve a specified level of accuracy
at a higher level of geography than that of small areas. Lack of additional
resources makes it almost imperative to use the same data to produce small area
estimates. For example, if a survey is designed to estimate per capita income
for a state, the same survey data need to be used to produce similar estimates
for counties, subcounties and census divisions within that state. Thus, by
necessity, small area estimation needs explicit, or at least implicit, use of
models to link these areas. Improved small area estimates are found by
"borrowing strength" from similar neighboring areas. Comment: Published in
Statistical Science (http://www.imstat.org/sts/) at
http://dx.doi.org/10.1214/11-STS374 by the Institute of Mathematical Statistics
(http://www.imstat.org).
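The "borrowing strength" idea can be sketched as a composite (shrinkage) estimator that weights each area's noisy direct estimate against a reliable state-level estimate by their relative variances, in the spirit of Fay-Herriot-type area-level models. All numbers below (county incomes, variances, the between-area variance) are hypothetical, and `composite_estimate` is an illustrative name, not a method from the paper.

```python
import numpy as np

def composite_estimate(direct, direct_var, synthetic, model_var):
    """Variance-weighted compromise between a noisy direct small-area
    estimate and a stable higher-level (synthetic) estimate."""
    w = model_var / (model_var + direct_var)   # trust the direct estimate less when it is noisy
    return w * direct + (1.0 - w) * synthetic

# Hypothetical county per-capita incomes (thousands of dollars).
direct = np.array([31.0, 45.0, 22.0])        # direct survey estimates
direct_var = np.array([16.0, 25.0, 36.0])    # large sampling variances
state_mean = 33.0                            # reliable state-level estimate
model_var = 4.0                              # assumed between-area variance

print(composite_estimate(direct, direct_var, state_mean, model_var))
```

Each composite estimate lies between the county's direct estimate and the state mean, shrinking hardest where the direct estimate is least reliable, which is exactly the stabilization the abstract describes.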
Desiderata for a Predictive Theory of Statistics
In many contexts the predictive validation of models, or of their associated prediction strategies, is of greater importance than model identification, which may be practically impossible. This is particularly so in fields involving complex or high-dimensional data, where model selection, or more generally predictor selection, is the main focus of effort. This paper suggests a unified treatment for predictive analyses based on six 'desiderata'. These desiderata are an effort to clarify what criteria a good predictive theory of statistics should satisfy.