Statistical and Computational Tradeoff in Genetic Algorithm-Based Estimation
When a Genetic Algorithm (GA), or any stochastic algorithm, is employed in a statistical problem, the result is affected both by sampling variability, which arises because only a sample is observed, and by variability due to the stochastic elements of the algorithm. This topic fits naturally into the framework of the statistical and computational tradeoff, a question that has become crucial in recent problems, where statisticians must carefully balance the statistical and computational parts of the analysis under resource or time constraints. In the present work we analyze estimation problems tackled by GAs, for which the variability of the estimates can be decomposed into these two sources, under constraints expressed as cost functions related both to data acquisition and to the runtime of the algorithm. Simulation studies are presented to discuss the statistical and computational tradeoff question.

Comment: 17 pages, 5 figures
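The decomposition described in this abstract can be illustrated with a toy Monte Carlo sketch. Here a random-subsampling "estimator" is a hypothetical stand-in for a GA (it is not the paper's method), and the law of total variance splits the overall variance of the estimate into a sampling component and an algorithmic component:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_estimate(sample, rng, n_sub=30):
    # Toy stand-in for a GA-based estimator: the algorithm's internal
    # randomness is mimicked by estimating the mean from a random subsample.
    idx = rng.choice(len(sample), size=n_sub, replace=False)
    return sample[idx].mean()

n_datasets, n_runs, n_obs, mu = 200, 50, 100, 2.0
per_dataset_means, per_dataset_vars = [], []
for _ in range(n_datasets):
    sample = rng.normal(mu, 1.0, size=n_obs)  # sampling variability: new data each time
    ests = [stochastic_estimate(sample, rng) for _ in range(n_runs)]
    per_dataset_means.append(np.mean(ests))   # E[theta_hat | data]
    per_dataset_vars.append(np.var(ests))     # Var(theta_hat | data): algorithmic part

# Law of total variance:
# Var(theta_hat) = Var(E[theta_hat|data]) + E[Var(theta_hat|data)]
sampling_var = np.var(per_dataset_means)      # variability due to sampling
algorithmic_var = np.mean(per_dataset_vars)   # variability due to the algorithm
total_var = sampling_var + algorithmic_var
```

Under a runtime budget, `n_runs` (more algorithm replications) and `n_obs` (more data) compete for the same resources, which is the tradeoff the abstract refers to.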
Least absolute deviation estimation of linear econometric models: A literature review
Econometricians generally take for granted that the error terms in econometric models are generated by distributions having a finite variance. However, the existence of error distributions with infinite variance has been known since the time of Pareto. Works of many econometricians, notably Meyer & Glauber (1964), Fama (1965) and Mandelbrot (1967), on economic data series such as prices in financial and commodity markets confirm that infinite-variance distributions occur abundantly. The distribution of firms by size, the behaviour of speculative prices and various other recent economic phenomena display similar trends. Further, econometricians generally assume that the disturbance term, which is the influence of innumerably many factors not accounted for in the model, approaches normality according to the Central Limit Theorem. But Bartels (1977) is of the opinion that there are other limit theorems, just as likely to be relevant when considering the sum of the components of a regression disturbance, that lead to non-normal stable distributions characterized by infinite variance. Thus, the possibility exists that the error term follows a non-normal distribution. The Least Squares method of estimating the parameters of linear (regression) models performs well provided that the residuals (disturbances or errors) are well behaved (preferably normally or near-normally distributed and not infested with large outliers) and follow the Gauss-Markov assumptions. However, models whose disturbances are prominently non-normally distributed or contain sizeable outliers fail estimation by the Least Squares method. Intensive research has established that in such cases estimation by the Least Absolute Deviation (LAD) method performs well.
This paper is an attempt to survey the literature on LAD estimation of single- as well as multi-equation linear econometric models.

Keywords: LAD estimator; least absolute deviation estimation; econometric model; minimum absolute deviation; robustness; outliers; L1 estimator; literature review
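The contrast the review draws can be sketched numerically: LAD regression minimizes the sum of absolute residuals rather than squared residuals, which keeps it stable under the heavy-tailed (infinite-variance) errors the abstract discusses. A minimal illustration, using SciPy's generic optimizer on simulated Cauchy-noise data (all names and data are illustrative, not from the paper):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, n)
# Cauchy errors: an infinite-variance stable distribution, the worst case for OLS
y = 1.0 + 2.0 * x + rng.standard_cauchy(n)

X = np.column_stack([np.ones(n), x])

# OLS closed form (may be badly distorted by the heavy tails)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# LAD: minimize the sum of absolute residuals (a convex but nonsmooth objective)
def lad_loss(b):
    return np.sum(np.abs(y - X @ b))

beta_lad = minimize(lad_loss, x0=np.zeros(2), method="Nelder-Mead").x
```

In practice LAD is usually solved by linear programming or iteratively reweighted least squares; the derivative-free Nelder-Mead call above is only the shortest self-contained route.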
Implementation of complex interactions in a Cox regression framework
The standard Cox proportional hazards model has been extended by functionally describable interaction terms. The first of these is related to neural networks in that it adopts the idea of transforming sums of weighted covariables by means of a logistic function. A class of reasonable weight combinations within the logistic transformation is described. Apart from the standard covariable product interaction, a product of logistically transformed covariables has also been included in the analysis of the performance of the new terms. An algorithm combining likelihood ratio tests and the AIC criterion has been defined for model choice. The critical values of the likelihood ratio test statistics had to be corrected in order to guarantee a maximum type I error of 5% for each interaction term. The new class of interaction terms allows functional relationships between covariables to be interpreted with more flexibility and can easily be implemented in standard software packages.
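The core idea, a logistic transform of a weighted covariable sum entering the Cox model as an extra term, can be sketched with a hand-rolled partial likelihood. This is a minimal illustration with made-up weights and no censoring or tie handling, not the authors' algorithm or their weight-selection scheme:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)

def logistic_interaction(x1, x2, w1, w2):
    # Logistic transform of a weighted sum of covariables
    # (weights w1, w2 are hypothetical; the paper describes a class of them)
    return 1.0 / (1.0 + np.exp(-(w1 * x1 + w2 * x2)))

# Simulate exponential survival times whose hazard depends on the transformed term
z_true = logistic_interaction(x1, x2, 1.0, -1.0)
times = rng.exponential(np.exp(-(0.5 * x1 + 1.5 * z_true)))  # scale = 1/hazard

def neg_log_partial_lik(beta, X, times):
    # Cox partial likelihood, assuming every subject has an event and no ties
    order = np.argsort(times)
    eta = (X @ beta)[order]
    eta = eta - eta.max()  # stabilization; the partial likelihood is shift-invariant
    # risk set of subject i = all subjects with time >= t_i
    log_risk = np.log(np.cumsum(np.exp(eta)[::-1])[::-1])
    return -np.sum(eta - log_risk)

# The interaction term is just another column in the design matrix
X = np.column_stack([x1, x2, logistic_interaction(x1, x2, 1.0, -1.0)])
beta_hat = minimize(neg_log_partial_lik, np.zeros(3), args=(X, times)).x
```

This is also why the abstract notes the approach "can easily be implemented in standard software packages": once the transformed term is computed, it enters any Cox routine as an ordinary covariable.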
On Nonregularized Estimation of Psychological Networks.
An important goal for psychological science is developing methods to characterize relationships between variables. Customary approaches use structural equation models to connect latent factors to a number of observed measurements, or test causal hypotheses between observed variables. More recently, regularized partial correlation networks have been proposed as an alternative approach for characterizing relationships among variables through the off-diagonal elements of the precision matrix. While the graphical lasso (glasso) has emerged as the default network estimation method, it was optimized in fields outside of psychology with very different needs, such as high-dimensional data where the number of variables (p) exceeds the number of observations (n). In this article, we describe the glasso method in the context of the fields where it was developed, and then demonstrate that the advantages of regularization diminish in the settings where psychological networks are often fitted (p ≪ n). We first show that improved properties of the precision matrix, such as eigenvalue estimation, and predictive accuracy under cross-validation are not always appreciable. We then introduce nonregularized methods based on multiple regression and a nonparametric bootstrap strategy, after which we characterize their performance with extensive simulations. Our results demonstrate that the nonregularized methods can reduce the false-positive rate compared to glasso, and they appear to provide consistent performance across sparsity levels, sample composition (p/n), and partial correlation size. We end by reviewing recent findings in the statistics literature suggesting that alternative methods often outperform glasso, and by suggesting areas for future research in psychology. The nonregularized methods have been implemented in the R package GGMnonreg.
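In the p ≪ n regime the abstract describes, the simplest nonregularized estimate of a partial correlation network is to invert the sample covariance matrix directly (the paper's own proposals are multiple-regression and bootstrap based; direct inversion is shown here only as the most basic nonregularized baseline):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 500, 10  # psychology-style setting: far more observations than variables
# Generate data with one known nonzero partial correlation (between vars 0 and 1)
prec = np.eye(p)
prec[0, 1] = prec[1, 0] = 0.4
cov = np.linalg.inv(prec)
X = rng.multivariate_normal(np.zeros(p), cov, size=n)

# Nonregularized estimate: invert the sample covariance matrix
# (well-conditioned here precisely because p << n)
S = np.cov(X, rowvar=False)
theta = np.linalg.inv(S)

# Convert the precision matrix to partial correlations:
# pcor_ij = -theta_ij / sqrt(theta_ii * theta_jj)
d = np.sqrt(np.diag(theta))
pcor = -theta / np.outer(d, d)
np.fill_diagonal(pcor, 1.0)
```

The off-diagonal entries of `pcor` are the network's edge weights; with p ≪ n they recover the generating structure without any penalty parameter to tune.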
Penalized Composite Quasi-Likelihood for Ultrahigh-Dimensional Variable Selection
In high-dimensional model selection problems, penalized least-squares approaches have been used extensively. This paper addresses the question of both robustness and efficiency of penalized model selection methods, and proposes a data-driven weighted linear combination of convex loss functions, together with a weighted L1-penalty. The procedure is completely data-adaptive and does not require prior knowledge of the error distribution. The weighted L1-penalty is used both to ensure the convexity of the penalty term and to ameliorate the bias caused by the L1-penalty. In the setting where the dimensionality is much larger than the sample size, we establish a strong oracle property of the proposed method, which possesses both model selection consistency and estimation efficiency for the true non-zero coefficients. As specific examples, we introduce a robust composite L1-L2 method and an optimal composite quantile method, and evaluate their performance on both simulated and real data examples.
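A minimal sketch of the composite L1-L2 idea: a weighted combination of absolute and squared loss, plus an L1 penalty. Here the mixing weight `w` and penalty level `lam` are fixed for illustration, whereas the paper chooses the combination data-adaptively and uses a weighted penalty with its own optimization algorithm:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])  # sparse truth
y = X @ beta_true + rng.standard_t(df=3, size=n)  # heavy-tailed errors

def composite_loss(beta, w=0.5, lam=0.1):
    # Composite L1-L2 loss: convex combination of absolute and squared loss.
    # w = 0.5 and lam = 0.1 are illustrative constants, not data-driven choices.
    r = y - X @ beta
    fit = w * np.mean(np.abs(r)) + (1 - w) * np.mean(r ** 2)
    return fit + lam * np.sum(np.abs(beta))  # L1 penalty for sparsity

beta_hat = minimize(composite_loss, np.zeros(p), method="Powell").x
```

Mixing the two losses keeps the L2 component's efficiency when errors behave well while the L1 component limits the damage from heavy tails, which is the robustness-efficiency tradeoff the abstract targets.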