Public Policies and the Demand for Carbonated Soft Drinks: A Censored Quantile Regression Approach
Heavy consumption of soda may contribute to obesity, strokes, and cardiac problems. From a health perspective, the distribution of the consumption is at least as important as the mean. Censored as well as ordinary quantile regression techniques were used to estimate the demand for sugary soda based on household data from 1989 to 1999. It was found that heavy drinkers are more price- and expenditure-responsive than are light drinkers. The study shows that increasing the taxes on carbonated soft drinks will lead to a small reduction in consumption for small and moderate consumers and a huge reduction for heavy consumers.
Keywords: soda demand, quantile regression, taxes, Agricultural and Food Policy, Food Consumption/Nutrition/Food Safety, D12, I10
Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions
Data subject to heavy-tailed errors are commonly encountered in various scientific fields. To address this problem, procedures based on quantile regression and Least Absolute Deviation (LAD) regression have been developed in recent years. These methods essentially estimate the conditional median (or quantile) function, which can be very different from the conditional mean function, especially when distributions are asymmetric and heteroscedastic. How can we efficiently estimate the mean regression function in the ultra-high dimensional setting when only the second moment exists? To solve this problem, we propose a penalized Huber loss with diverging parameter to reduce biases created by the traditional Huber loss. Such a penalized robust approximate quadratic (RA-quadratic) loss will be called RA-Lasso. In the ultra-high dimensional setting, where the dimensionality can grow exponentially with the sample size, our results reveal that the RA-Lasso estimator is consistent at the same rate as the optimal rate under the light-tail situation. We further study the computational convergence of RA-Lasso and show that the composite gradient descent algorithm indeed produces a solution that admits the same optimal rate after sufficient iterations. As a byproduct, we also establish the concentration inequality for estimating the population mean when only the second moment exists. We compare RA-Lasso with other regularized robust estimators based on quantile regression and LAD regression. Extensive simulation studies demonstrate the satisfactory finite-sample performance of RA-Lasso.
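To make the Huber loss mentioned in this abstract concrete, here is a minimal sketch in Python. It uses the common textbook parametrization with a threshold `delta`, which is an assumption and may differ from the paper's own parametrization; it also omits the penalty term and the regression setting entirely. The helper `huber_location` is a hypothetical name for a simple iteratively reweighted location estimate.

```python
from statistics import median


def huber_loss(r, delta):
    """Textbook Huber loss: quadratic for |r| <= delta, linear beyond."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


def huber_location(ys, delta, iters=100):
    """Location estimate minimizing the summed Huber loss (reweighting sketch)."""
    mu = median(ys)  # robust starting point
    for _ in range(iters):
        # Observations far from mu are down-weighted by delta / |y - mu|.
        weights = [1.0 if abs(y - mu) <= delta else delta / abs(y - mu) for y in ys]
        mu = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return mu


ys = [1.0, 2.0, 3.0, 4.0, 100.0]
print(huber_location(ys, delta=2.0))     # stays close to the bulk of the data
print(huber_location(ys, delta=1000.0))  # large delta: behaves like the sample mean
```

Letting `delta` grow makes the loss approach the squared loss, which is the intuition behind using a diverging parameter to limit the bias of a fixed-threshold Huber estimate of the mean.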
Partial quantile regression
Partial least squares regression (PLSR) is a method of finding a reliable predictor of the response variable when there are more regressors than observations. It does so by eliciting a small number of components from the regressors that are inherently informative about the response. Quantile regression (QR) estimates the quantiles of the response distribution by regression functions of the covariates, and so gives a fuller description of the response than does the usual regression for the mean value of the response. We extend QR to partial quantile regression (PQR) when there are more regressors than observations. For each percentile the method provides a low dimensional approximation to the joint distribution of the covariates and response with a given coverage probability and which, under further linearity assumptions, estimates the corresponding quantile of the conditional distribution. The methodology parallels the procedure for PLSR using a quantile covariance that is appropriate for predicting a quantile rather than the usual covariance which is appropriate for predicting a mean value. The analysis suggests a new measure of risk associated with the quantile regressions. Examples are given that illustrate the methodology and the benefits accrued, based on simulated data and the analysis of spectrometer data.
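As a small illustration of what "estimating a quantile" means in quantile regression, the sketch below uses the standard check (pinball) loss and shows that minimizing it over a constant recovers a sample quantile. It covers only the intercept-only case; the function names are hypothetical and nothing here implements the partial quantile regression procedure itself.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(q, ys, tau):
    """Summed check loss of the constant prediction q."""
    return sum(check_loss(y - q, tau) for y in ys)


def sample_quantile_by_check_loss(ys, tau):
    """Brute-force minimizer over the observed values.

    A minimizer of the summed check loss always exists at a data point,
    so searching the observations is enough for this illustration.
    """
    return min(sorted(ys), key=lambda q: total_check_loss(q, ys, tau))


ys = [1.0, 2.0, 3.0, 4.0, 100.0]
print(sample_quantile_by_check_loss(ys, 0.5))  # the sample median
print(sample_quantile_by_check_loss(ys, 0.9))  # an upper quantile
```

Replacing the constant `q` with a linear function of covariates gives the usual linear quantile regression objective.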
Distributional Regression for Data Analysis
Flexible modeling of how an entire distribution changes with covariates is an
important yet challenging generalization of mean-based regression that has seen
growing interest over the past decades in both the statistics and machine
learning literature. This review outlines selected state-of-the-art statistical
approaches to distributional regression, complemented with alternatives from
machine learning. Topics covered include the similarities and differences
between these approaches, extensions, properties and limitations, estimation
procedures, and the availability of software. In view of the increasing
complexity and availability of large-scale data, this review also discusses the
scalability of traditional estimation methods, current trends, and open
challenges. Illustrations are provided using data on childhood malnutrition in
Nigeria and Australian electricity prices.
Comment: Accepted for publication in Annual Review of Statistics and its
Application
TWO-STAGE HUBER ESTIMATION
In this paper we study how the Huber estimator can be adapted to the presence of endogeneity in a two-stage equations setting similar to that of 2SLS. We propose an estimation procedure that is at the same time relatively (i) simple, (ii) robust and (iii) efficient. Moreover, we deal with the case of random regressors and asymmetric errors, two extensions rarely present in this literature. The preliminary scale correction is implemented with the median absolute deviation estimator, which is consistent with our above criteria and is a very robust estimator of scale. The resulting estimator is termed the Two-Stage Huber (2SH) estimator. We explicitly establish the conditions for consistency and asymptotic normality of the 2SH estimator and we derive the formula of the asymptotic covariance matrix. We conduct Monte Carlo simulations whose results indicate that the 2SH estimator has smaller standard errors than the Two-Stage Least Squares (2SLS) estimator and than the Two-Stage Least Absolute Deviations (2SLAD) estimator in many situations. On the whole, the 2SH estimator appears to be a simple and useful alternative to 2SLS and 2SLAD in cases of two-stage estimation to deal with endogeneity when there are concerns for both robustness and efficiency.
Keywords: Two-stage estimation, Huber estimation, robustness, endogeneity
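The preliminary scale step named in this abstract, the median absolute deviation (MAD), can be sketched in a few lines of Python. The constant 1.4826 is the usual factor that makes the MAD consistent for the standard deviation under normality; whether the paper applies this factor is an assumption, and `mad_scale` is a hypothetical name.

```python
from statistics import median


def mad_scale(xs, consistency=1.4826):
    """Median absolute deviation, scaled (by assumption) for normal data."""
    center = median(xs)
    return consistency * median(abs(x - center) for x in xs)


residuals = [1.0, 2.0, 3.0, 4.0, 100.0]
print(mad_scale(residuals))  # barely affected by the outlier at 100
```

In a Huber-type procedure, residuals would typically be divided by such a scale estimate before the loss threshold is applied.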
Inference for High-Dimensional Sparse Econometric Models
This article is about estimation and inference methods for high dimensional
sparse (HDS) regression models in econometrics. High dimensional sparse models
arise in situations where many regressors (or series terms) are available and
the regression function is well-approximated by a parsimonious, yet unknown set
of regressors. The latter condition makes it possible to estimate the entire
regression function effectively by searching for approximately the right set of
regressors. We discuss methods for identifying this set of regressors and
estimating their coefficients based on $\ell_1$-penalization and describe key
theoretical results. In order to capture realistic practical situations, we
expressly allow for imperfect selection of regressors and study the impact of
this imperfect selection on estimation and inference results. We focus the main
part of the article on the use of HDS models and methods in the instrumental
variables model and the partially linear model. We present a set of novel
inference results for these models and illustrate their use with applications
to returns to schooling and growth regression.
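To illustrate how penalization can select regressors, the sketch below assumes the penalty in question is the lasso-type absolute-value penalty. For a single coefficient, minimizing `0.5 * (z - b)**2 + lam * abs(b)` over `b` gives the soft-thresholding rule, which sets small coefficients exactly to zero. The function name is hypothetical and the example is a one-coefficient simplification, not the article's estimator.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5 * (z - b)**2 + lam * |b| over b."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients; small ones are zeroed out.
coefs = [3.0, -0.5, 0.2, -3.0]
print([soft_threshold(z, 1.0) for z in coefs])
```

Coefficients that survive are shrunk toward zero, which is one reason imperfect selection and post-selection refitting matter for inference.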
Endogenous semiparametric binary choice models with heteroscedasticity
In this paper we consider endogenous regressors in the binary choice model under a weak median exclusion restriction, but without further specification of the distribution of the unobserved random components. Our reduced form specification with heteroscedastic residuals covers various heterogeneous structural binary choice models. As a particularly relevant example of a structural model where no semiparametric estimator has as yet been analyzed, we consider the binary random utility model with endogenous regressors and heterogeneous parameters. We employ a control function IV assumption to establish identification of a slope parameter β by the mean ratio of derivatives of two functions of the instruments. We propose an estimator based on direct sample counterparts, and discuss the large sample behavior of this estimator. In particular, we show √n-consistency and derive the asymptotic distribution. In the same framework, we propose tests for heteroscedasticity, overidentification and endogeneity. We analyze the small sample performance through a simulation study. An application of the model to discrete choice demand data concludes this paper.
Retire: Robust Expectile Regression in High Dimensions
High-dimensional data can often display heterogeneity due to heteroscedastic
variance or inhomogeneous covariate effects. Penalized quantile and expectile
regression methods offer useful tools to detect heteroscedasticity in
high-dimensional data. The former is computationally challenging due to the
non-smooth nature of the check loss, and the latter is sensitive to
heavy-tailed error distributions. In this paper, we propose and study
(penalized) robust expectile regression (retire), with a focus on iteratively
reweighted $\ell_1$-penalization which reduces the estimation bias from
$\ell_1$-penalization and leads to oracle properties. Theoretically, we
establish the statistical properties of the retire estimator under two regimes:
(i) low-dimensional regime in which $d \ll n$; (ii) high-dimensional regime in
which $s \ll n \ll d$ with $s$ denoting the number of significant predictors. In
the high-dimensional setting, we carefully characterize the solution path of
the iteratively reweighted $\ell_1$-penalized retire estimation, adapted from
the local linear approximation algorithm for folded-concave regularization.
Under a mild minimum signal strength condition, we show that after as many as
$\log(\log d)$ iterations the final iterate enjoys the oracle convergence rate.
At each iteration, the weighted $\ell_1$-penalized convex program can be
efficiently solved by a semismooth Newton coordinate descent algorithm.
Numerical studies demonstrate the competitive performance of the proposed
procedure compared with either non-robust or quantile regression based
alternatives.
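For readers unfamiliar with expectiles, the following Python sketch computes a plain sample expectile by minimizing an asymmetric squared loss through a fixed-point iteration. It is only the classical, non-robust and unpenalized version; the robustified loss and the reweighted penalization studied in the paper are not implemented, and `sample_expectile` is a hypothetical name.

```python
def sample_expectile(ys, tau, iters=200, tol=1e-12):
    """Minimize sum of |tau - 1{y <= mu}| * (y - mu)**2 over mu."""
    mu = sum(ys) / len(ys)  # tau = 0.5 gives the mean, so start there
    for _ in range(iters):
        # Points above mu get weight tau, points at or below get 1 - tau.
        weights = [tau if y > mu else 1.0 - tau for y in ys]
        new_mu = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


ys = [1.0, 2.0, 3.0, 4.0, 10.0]
print(sample_expectile(ys, 0.5))  # coincides with the sample mean
print(sample_expectile(ys, 0.9))  # pulled toward the upper tail
```

Because the loss is squared, a single extreme observation can move the expectile substantially, which is the sensitivity to heavy tails that motivates a robust variant.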