402 research outputs found

    Partial Identification of Probability Distributions with Misclassified Data

    This paper addresses the problem of data errors in discrete variables. When data errors occur, the observed variable is a misclassified version of the variable of interest, whose distribution is not identified. Inferential problems caused by data errors have been conceptualized through convolution and mixture models. This paper introduces the direct misclassification approach. The approach is based on the observation that in the presence of classification errors, the relation between the distribution of the "true" but unobservable variable and its misclassified representation is given by a linear system of simultaneous equations, in which the coefficient matrix is the matrix of misclassification probabilities. Formalizing the problem in these terms allows one to incorporate any prior information (e.g., validation studies, economic theory, social and cognitive psychology) into the analysis through sets of restrictions on the matrix of misclassification probabilities. Such information can have strong identifying power; the direct misclassification approach fully exploits it to derive identification regions for any real functional of the distribution of interest. A method for estimating the identification regions and constructing their confidence sets is given, and illustrated with an empirical analysis of the distribution of pension plan types using data from the Health and Retirement Study.
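
As a rough illustration of the linear-system idea in the abstract above, the sketch below treats the simplest case of a binary variable. It is not the paper's estimator: the function name `identification_region`, the restriction that both misclassification probabilities are at most `max_error`, and the grid search are all assumptions made for this example. With a = P(reported = 1 | true = 0) and b = P(reported = 0 | true = 1), the observed probability satisfies q = a(1 - p) + (1 - b)p, and the set of p values consistent with the restriction is the identification region.

```python
# Illustrative sketch only: binary variable, with the hypothetical restriction
# that each misclassification probability is at most `max_error`.

def identification_region(q_obs, max_error, steps=100):
    """Return (lower, upper) bounds on p = P(true = 1), or None if infeasible.

    q_obs     : observed P(reported = 1)
    max_error : assumed upper bound on a = P(reported = 1 | true = 0)
                and on b = P(reported = 0 | true = 1); must be below 0.5
    steps     : number of grid intervals used for a and b
    """
    if not 0.0 <= max_error < 0.5:
        raise ValueError("max_error must lie in [0, 0.5)")
    grid = [max_error * k / steps for k in range(steps + 1)]
    feasible = []
    for a in grid:
        for b in grid:
            # Binary-case linear system: q_obs = a * (1 - p) + (1 - b) * p,
            # solved for p. The denominator is positive because a + b < 1.
            p = (q_obs - a) / (1.0 - a - b)
            if 0.0 <= p <= 1.0:
                feasible.append(p)
    if not feasible:
        return None
    return min(feasible), max(feasible)


lower, upper = identification_region(q_obs=0.30, max_error=0.10)
print(f"P(true = 1) lies in [{lower:.4f}, {upper:.4f}]")
```

Tightening `max_error` shrinks the region toward the observed value, which is one way to see how prior information on the misclassification matrix carries identifying power.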

    Missing Treatments

    The existing literature on treatment effects assumes perfect observability of the treatments received by the population of interest. Even in cases of imperfect compliance, it is usually assumed that both the assigned and administered treatment are observed (or missing completely at random). This paper abandons such assumptions. Imperfect observability of the received treatment can arise as a result of survey nonresponse in observational studies, or noncompliance with randomly assigned treatments that are not directly monitored. I study the problem in the context of observational studies. I derive sharp worst-case bounds without assuming anything about treatment selection, and I show that the bounds are a function of the available prior information on the distribution of the missing treatments. Under the maintained assumption of monotone treatment response, I show that no prior information on the distribution of missing treatments is necessary to get sharp informative bounds. I apply the methodologies recently proposed by Imbens and Manski (2004) and Chernozhukov, Hong, and Tamer (2004) to derive two types of confidence intervals for the partially identified parameters. The results are illustrated with an empirical analysis of drug use and employment using data from the National Longitudinal Survey of Youth.

    Spatial correlation robust inference with Errors in Location or Distance

    This paper presents results from a Monte Carlo study concerning inference with spatially dependent data. It investigates the impact of location/distance measurement errors upon the accuracy of parametric and nonparametric estimators of asymptotic variances.

    Asymptotic properties for a class of partially identified models

    We propose inference procedures for partially identified population features for which the population identification region can be written as a transformation of the Aumann expectation of a properly defined set valued random variable (SVRV). An SVRV is a mapping that associates a set (rather than a real number) with each element of the sample space. Examples of population features in this class include sample means and best linear predictors with interval outcome data, and parameters of semiparametric binary models with interval regressor data. We extend the analogy principle to SVRVs, and show that the sample analog estimator of the population identification region is given by a transformation of a Minkowski average of SVRVs. Using the results of the mathematics literature on SVRVs, we show that this estimator converges in probability to the identification region of the model with respect to the Hausdorff distance. We then show that the Hausdorff distance between the estimator and the population identification region, when properly normalized by √n, converges in distribution to the supremum of a Gaussian process whose covariance kernel depends on parameters of the population identification region. We provide consistent bootstrap procedures to approximate this limiting distribution. Using arguments similar to those applied for vector valued random variables, we develop a methodology to test assumptions about the true identification region and to calculate the power of the test. We show that these results can be used to construct a confidence collection, that is, a collection of sets that, when specified as the null hypothesis for the true value of the population identification region, cannot be rejected by our test. Keywords: Partial Identification, Confidence Collections, Set-Valued Random Variables.
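
For the interval-outcome case mentioned in the abstract above, a minimal sketch of the two basic objects may help. It assumes one-dimensional outcomes observed only as closed intervals; the function names `minkowski_average` and `hausdorff_distance` and the sample data are made up for illustration and are not taken from the paper. For closed intervals, the Minkowski average is the interval of averaged endpoints, and the Hausdorff distance is the larger of the two endpoint gaps.

```python
# Illustrative sketch only: each observation is a closed interval (low, high)
# known to contain the unobserved outcome.

def minkowski_average(intervals):
    """Minkowski average of closed intervals: average the endpoints separately."""
    if not intervals:
        raise ValueError("need at least one interval")
    for low, high in intervals:
        if low > high:
            raise ValueError("each interval must satisfy low <= high")
    n = len(intervals)
    lower = sum(low for low, _ in intervals) / n
    upper = sum(high for _, high in intervals) / n
    return lower, upper


def hausdorff_distance(a, b):
    """Hausdorff distance between two closed intervals a and b."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


sample = [(0.0, 1.0), (2.0, 4.0), (1.0, 1.0)]   # hypothetical interval data
estimate = minkowski_average(sample)             # set estimate of the mean
print(estimate)
print(hausdorff_distance(estimate, (0.5, 2.5)))  # distance to a candidate region
```

The second call measures how far the set estimate is from a hypothesized identification region, which is the quantity whose limiting distribution the abstract describes.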

    Generalization of a Result on "Regression, Short and Long"

    This paper is concerned with the problem of combining a data set that identifies the conditional distribution P(y|x) with one that identifies the conditional distribution P(z|x), in order to identify the regressions E(y|x, ·) ≡ [E(y|x, z = j), j ∈ Z] when the conditional distribution P(y|x, z) is unknown. Cross and Manski (2002) studied this problem and showed that the identification region of E(y|x, ·) can be precisely calculated when y has finite support. Here we generalize Cross and Manski's result, showing that the identification region can also be precisely calculated when y has infinite support.

    Asymptotic Properties for a Class of Partially Identified Models

    We propose inference procedures for partially identified population features for which the population identification region can be written as a transformation of the Aumann expectation of a properly defined set valued random variable (SVRV). An SVRV is a mapping that associates a set (rather than a real number) with each element of the sample space. Examples of population features in this class include sample means and best linear predictors with interval outcome data, and parameters of semiparametric binary models with interval regressor data. We extend the analogy principle to SVRVs, and show that the sample analog estimator of the population identification region is given by a transformation of a Minkowski average of SVRVs. Using the results of the mathematics literature on SVRVs, we show that this estimator converges in probability to the identification region of the model with respect to the Hausdorff distance. We then show that the Hausdorff distance between the estimator and the population identification region, when properly normalized by √n, converges in distribution to the supremum of a Gaussian process whose covariance kernel depends on parameters of the population identification region. We provide consistent bootstrap procedures to approximate this limiting distribution. Using arguments similar to those applied for vector valued random variables, we develop a methodology to test assumptions about the true identification region and to calculate the power of the test. We show that these results can be used to construct a confidence collection, that is, a collection of sets that, when specified as the null hypothesis for the true value of the population identification region, cannot be rejected by our test.

    Discrete Choice under Risk with Limited Consideration

    This paper is concerned with learning decision makers' preferences using data on observed choices from a finite set of risky alternatives. We propose a discrete choice model with unobserved heterogeneity in consideration sets and in standard risk aversion. We obtain sufficient conditions for the model's semi-nonparametric point identification, including in cases where consideration depends on preferences and on some of the exogenous variables. Our method yields an estimator that is easy to compute and is applicable in markets with large choice sets. We illustrate its properties using a dataset on property insurance purchases. Comment: 76 pages, 9 figures, 15 tables.

    Spatial Correlation Robust Inference with Errors in Location or Distance

    This paper presents results from a Monte Carlo study concerning inference with spatially dependent data. We investigate the impact of location/distance measurement errors upon the accuracy of parametric and nonparametric estimators of asymptotic variances. Nonparametric estimators are quite robust to such errors, method of moments estimators perform surprisingly well, and maximum likelihood estimators perform very poorly. We also present and evaluate a specification test based on a parametric bootstrap that has good power properties for the types of measurement error we consider.

    FIRE Takes on the Nation’s Capital
