    Mobility and the Return to Education: Testing a Roy Model with Multiple Markets

    Self-selected migration presents one potential explanation for why observed returns to a college education in local labor markets vary widely even though U.S. workers are highly mobile. To assess the impact of self-selection on estimated returns, this paper first develops a Roy model of mobility and earnings where workers choose in which of the 50 states (plus the District of Columbia) to live and work. Available estimation methods are either infeasible for a selection model with so many alternatives or place potentially severe restrictions on earnings and the selection process. This paper develops an alternative econometric methodology which combines Lee's (1983) parametric maximum order statistic approach to reduce the dimensionality of the error terms with more recent work on semiparametric estimation of selection models (e.g., Ahn and Powell, 1993). The resulting semiparametric correction is easy to implement and can be adapted to a variety of other polychotomous choice problems. The empirical work, which uses 1990 U.S. Census data, confirms the role of comparative advantage in mobility decisions. The results suggest that self-selection of higher educated individuals to states with higher returns to education generally leads to upward biases in OLS estimates of the returns to education in state-specific labor markets. While the estimated returns to a college education are significantly biased, correcting for the bias does not narrow the range of returns across states. Consistent with the finding that the corrected return to a college education differs across the U.S., the relative state-to-state migration flows of college- versus high school-educated individuals respond strongly to differences in the return to education and amenities across states.
    Keywords: Selection Bias, Polychotomous Choice, Roy Model, Return to Education, Migration

    Brief history of the Lehmann Symposia: Origins, goals and motivation

    The idea of the Lehmann Symposia as platforms to encourage a revival of interest in fundamental questions in theoretical statistics, while keeping in focus issues that arise in contemporary interdisciplinary cutting-edge scientific problems, developed during a conversation that I had with Victor Perez Abreu during one of my visits to Centro de Investigación en Matemáticas (CIMAT) in Guanajuato, Mexico. Our goal was and has been to showcase relevant theoretical work to encourage young researchers and students to engage in such work. The First Lehmann Symposium on Optimality took place in May of 2002 at Centro de Investigación en Matemáticas in Guanajuato, Mexico. A brief account of the Symposium has appeared in Vol. 44 of the Institute of Mathematical Statistics series of Lecture Notes and Monographs. The volume also contains several works presented during the First Lehmann Symposium. All papers were refereed. The program and a picture of the participants can be found on-line at the website http://www.stat.rice.edu/lehmann/lst-Lehmann.html.
    Comment: Published at http://dx.doi.org/10.1214/074921706000000347 in the IMS Lecture Notes--Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Bootstrap applications in proportional hazards models

    Experiments in which the measured responses are times until events occur are common in a variety of fields. When only one response is measured on each subject, the proportional hazards model of Cox (1972) is often used to assess the effects of one or more explanatory variables on the event times. Two new resampling plans are introduced for bootstrapping estimators from this model when explanatory variables are fixed by design. One method resamples from the Uniform (0,1) distribution of the probability integral transformation corresponding to the conditional failure time distribution, and it is easily adapted to a wide variety of censoring schemes. The other method is an analog to the residual-resampling method for regression introduced by Efron (1979), and it admits random censoring from a class of distributions which includes the Koziol-Green model. Multivariate extensions of resampling methods are developed for situations where multiple event times are monitored on individual subjects. Marginal models are fit using an independence working model approach. Resampling procedures are then applied to the joint distribution of the multiple responses or residuals to make bias corrections to the parameter estimates, estimate covariance matrices, and construct confidence intervals. Simulation studies indicate that each of the proposed methods provides substantial improvements in mean squared errors over existing techniques for estimation of model parameters. The proposed methods also provide better estimates of standard errors and more reliable confidence intervals for model parameters than existing methods which rely largely on asymptotic approximations. These methods are demonstrated through applications to data sets available in the literature.
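
The first resampling plan can be illustrated with a minimal sketch. Under a proportional hazards model with fixed covariate x, the conditional survival function is S(t | x) = exp(-Λ0(t) exp(βx)), so U = S(T | x) is Uniform(0,1); drawing U* and inverting S gives a bootstrap event time. The abstract does not say how the baseline cumulative hazard Λ0 is obtained, so the sketch below assumes a Weibull form with hypothetical `shape` and `scale` parameters, a single covariate, and a fixed censoring time. All function and parameter names are illustrative, not the authors'.

```python
import math
import random

def resample_event_times(x, beta, shape, scale, censor_time, rng):
    """Draw one bootstrap sample of (observed_time, event_indicator) pairs.

    Assumes a Weibull baseline cumulative hazard (t / scale) ** shape,
    which is an illustrative stand-in for an estimated baseline.
    """
    sample = []
    for x_i in x:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        # Solve S(t | x_i) = u for the baseline cumulative hazard at t.
        cum_hazard = -math.log(u) / math.exp(beta * x_i)
        # Invert the assumed Weibull baseline to recover the event time.
        t = scale * cum_hazard ** (1.0 / shape)
        # Apply a fixed (type I) censoring time; other schemes would change this line.
        sample.append((min(t, censor_time), t <= censor_time))
    return sample

# Hypothetical design: two fixed treatment groups coded 0 and 1.
design = [0.0] * 5 + [1.0] * 5
boot = resample_event_times(design, beta=0.7, shape=1.5, scale=10.0,
                            censor_time=15.0, rng=random.Random(1))
```

In a full bootstrap, the model would be refit to each resampled data set, and the spread of the refitted estimates would supply standard errors and confidence intervals.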

    Analysis of rabbit doe longevity using a semiparametric log-Normal animal frailty model with time-dependent covariates

    Data on doe longevity in a rabbit population were analysed using a semiparametric log-Normal animal frailty model. Longevity was defined as the time from the first positive pregnancy test to death or culling due to pathological problems. Does culled for other reasons had right censored records of longevity. The model included time dependent covariates associated with year by season, the interaction between physiological state and the number of young born alive, and between order of positive pregnancy test and physiological state. The model also included an additive genetic effect and a residual in log frailty. Properties of marginal posterior distributions of specific parameters were inferred from a full Bayesian analysis using Gibbs sampling. All of the fully conditional posterior distributions defining a Gibbs sampler were easy to sample from, either directly or using adaptive rejection sampling. The marginal posterior mean estimates of the additive genetic variance and of the residual variance in log frailty were 0.247 and 0.690, respectively.

    Estimation of a regression spline sample selection model

    It is often the case that an outcome of interest is observed for a restricted non-randomly selected sample of the population. In such a situation, standard statistical analysis yields biased results. This issue can be addressed using sample selection models, which are based on the estimation of two regressions: a binary selection equation, which determines whether a particular statistical unit is observed, and an outcome equation. Classic sample selection models assume a priori that continuous regressors have a pre-specified linear or non-linear relationship to the outcome, which can lead to erroneous conclusions. In the case of continuous response, methods in which covariate effects are modeled flexibly have been previously proposed, the most recent being based on a Bayesian Markov chain Monte Carlo approach. A frequentist counterpart which has the advantage of being computationally fast is introduced. The proposed algorithm is based on the penalized likelihood estimation framework. The construction of confidence intervals is also discussed. The empirical properties of the existing and proposed methods are studied through a simulation study. The approaches are finally illustrated by analyzing data from the RAND Health Insurance Experiment on annual health expenditures.

    A multivariate semiparametric Bayesian spatial modeling framework for hurricane surface wind fields

    Storm surge, the onshore rush of sea water caused by the high winds and low pressure associated with a hurricane, can compound the effects of inland flooding caused by rainfall, leading to loss of property and loss of life for residents of coastal areas. Numerical ocean models are essential for creating storm surge forecasts for coastal areas. These models are driven primarily by the surface wind forcings. Currently, the gridded wind fields used by ocean models are specified by deterministic formulas that are based on the central pressure and location of the storm center. While these equations incorporate important physical knowledge about the structure of hurricane surface wind fields, they cannot always capture the asymmetric and dynamic nature of a hurricane. A new Bayesian multivariate spatial statistical modeling framework is introduced combining data with physical knowledge about the wind fields to improve the estimation of the wind vectors. Many spatial models assume the data follow a Gaussian distribution. However, this may be overly restrictive for wind field data, which often display erratic behavior, such as sudden changes in time or space. In this paper we develop a semiparametric multivariate spatial model for these data. Our model builds on the stick-breaking prior, which is frequently used in Bayesian modeling to capture uncertainty in the parametric form of an outcome. The stick-breaking prior is extended to the spatial setting by assigning each location a different, unknown distribution, and smoothing the distributions in space with a series of kernel functions. This semiparametric spatial model is shown to improve prediction compared to usual Bayesian Kriging methods for the wind field of Hurricane Ivan.
    Comment: Published at http://dx.doi.org/10.1214/07-AOAS108 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
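
The stick-breaking construction and its kernel-smoothed spatial extension can be sketched as follows. This is an illustration under stated assumptions rather than the paper's model: it uses a one-dimensional location, a Gaussian-shaped kernel, a Beta(1, 2) distribution for the break proportions, and a finite truncation in which the last component takes whatever stick length remains. The names `kernel`, `spatial_stick_breaking`, `knots`, and `bandwidth` are hypothetical.

```python
import math
import random

def kernel(s, knot, bandwidth):
    """Gaussian-shaped kernel with values in (0, 1]; an illustrative choice."""
    return math.exp(-0.5 * ((s - knot) / bandwidth) ** 2)

def spatial_stick_breaking(s, v, knots, bandwidth):
    """Mixture weights at location s from break proportions v and kernel knots."""
    weights = []
    remaining = 1.0  # stick length not yet assigned to a component
    for k, (v_k, knot) in enumerate(zip(v, knots)):
        is_last = k == len(v) - 1
        # The kernel shrinks the break proportion for components far from s.
        # The last component takes the remainder so the weights sum to one.
        frac = 1.0 if is_last else v_k * kernel(s, knot, bandwidth)
        weights.append(frac * remaining)
        remaining *= 1.0 - frac
    return weights

rng = random.Random(0)
n_components = 8
v = [rng.betavariate(1.0, 2.0) for _ in range(n_components)]
knots = [rng.uniform(0.0, 10.0) for _ in range(n_components)]

# The same break proportions give different mixture weights at different sites.
w_west = spatial_stick_breaking(1.0, v, knots, bandwidth=2.0)
w_east = spatial_stick_breaking(9.0, v, knots, bandwidth=2.0)
```

Replacing the kernel with the constant 1 recovers the ordinary (non-spatial) truncated stick-breaking weights, which are then identical at every location.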

    Rank Regression in Stability Analysis

    Stability data are often collected to determine the shelf-life of certain characteristics of a pharmaceutical product, for example, a drug's potency over time. Statistical approaches such as linear regression models are considered appropriate for analyzing stability data. However, most of these regression models, in both theory and practice, rely heavily on their underlying parametric assumptions, such as normality of the continuous characteristics or their transformations. In this article, we propose and study some rank-based regression procedures for stability data when the linear regression models are semiparametric with unspecified error structure. The proposed procedures are demonstrated through numerical studies, including Monte Carlo simulations and practical examples.
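
To make the idea of rank-based regression concrete, the sketch below fits a line with the Theil-Sen estimator (the median of pairwise slopes, which is tied to Kendall's tau). It is one simple member of the rank-based family and is not necessarily among the procedures the article proposes. The potency data and the 90% specification limit are hypothetical.

```python
from statistics import median

def theil_sen(x, y):
    """Return (slope, intercept) from the median of all pairwise slopes."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(len(x))
              for j in range(i + 1, len(x))
              if x[j] != x[i]]
    slope = median(slopes)
    # Intercept: median residual after removing the fitted trend.
    intercept = median(y_i - slope * x_i for x_i, y_i in zip(x, y))
    return slope, intercept

# Hypothetical stability study: potency (% of label claim) by month,
# with one aberrant assay result at month 9.
months = [0, 3, 6, 9, 12, 18, 24]
potency = [100.0, 98.5, 97.0, 80.0, 94.0, 91.0, 88.0]

slope, intercept = theil_sen(months, potency)

# Point estimate of the time at which fitted potency reaches a 90% limit.
shelf_life = (90.0 - intercept) / slope
```

Because the fit depends on medians, the single aberrant result at month 9 does not move the estimated degradation rate, whereas an ordinary least-squares fit would be pulled toward it. A shelf-life claim would normally rest on a confidence bound rather than this point estimate.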