165 research outputs found

    Extremum sieve estimation in k-out-of-n system

    The paper considers nonparametric estimation of absolutely continuous distribution functions of independent lifetimes of non-identical components in k-out-of-n systems, 2 ≤ k ≤ n, from the observed "autopsy" data. In economics, ascending "button" or "clock" auctions with n heterogeneous bidders with independent private values present 2-out-of-n systems. Classical competing risks models are examples of n-out-of-n systems. Under weak conditions on the underlying distributions, the estimation problem is shown to be well posed and the suggested extremum sieve estimator is proven to be consistent. The paper considers sieve spaces of Bernstein polynomials, which make it easy to implement constraints on the monotonicity of estimated distribution functions.
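    As an illustration of why Bernstein polynomials make monotonicity constraints easy to impose, the following is a minimal sketch and not the paper's estimator. It assumes the support is rescaled to [0, 1] and uses the standard fact that a Bernstein polynomial with nondecreasing coefficients is itself nondecreasing. The function names and the coefficient values are hypothetical.

```python
from math import comb


def bernstein_cdf(coeffs, x):
    """Evaluate F(x) = sum_j coeffs[j] * C(m, j) * x**j * (1 - x)**(m - j) on [0, 1]."""
    m = len(coeffs) - 1
    return sum(
        c * comb(m, j) * x**j * (1.0 - x) ** (m - j)
        for j, c in enumerate(coeffs)
    )


def is_cdf_coeffs(coeffs, tol=1e-12):
    """Check sufficient conditions for a CDF on [0, 1]: c_0 = 0, c_m = 1, c_j nondecreasing."""
    starts_at_zero = abs(coeffs[0]) <= tol
    ends_at_one = abs(coeffs[-1] - 1.0) <= tol
    nondecreasing = all(b - a >= -tol for a, b in zip(coeffs, coeffs[1:]))
    return starts_at_zero and ends_at_one and nondecreasing


# Hypothetical sieve coefficients for a degree-4 polynomial (illustration only).
coeffs = [0.0, 0.1, 0.5, 0.9, 1.0]
grid = [i / 20 for i in range(21)]
values = [bernstein_cdf(coeffs, x) for x in grid]
```

    In a sieve estimator the coefficients would be chosen by optimizing a criterion function subject to these linear inequality constraints. The sketch only shows that the constraint is a simple ordering of the coefficients.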

    Quantile uncorrelation and instrumental regressions

    We introduce a notion of median uncorrelation that is a natural extension of mean (linear) uncorrelation. A scalar random variable Y is median uncorrelated with a k-dimensional random vector X if and only if the slope from an LAD regression of Y on X is zero. Using this simple definition, we characterize properties of median uncorrelated random variables, and introduce a notion of multivariate median uncorrelation. We provide measures of median uncorrelation that are similar to the linear correlation coefficient and the coefficient of determination. We also extend this median uncorrelation to other loss functions. As two stage least squares exploits mean uncorrelation between an instrument vector and the error to derive consistent estimators for parameters in linear regressions with endogenous regressors, the main result of this paper shows how a median uncorrelation assumption between an instrument vector and the error can similarly be used to derive consistent estimators in these linear models with endogenous regressors. We also show how median uncorrelation can be used in linear panel models with quantile restrictions and in linear models with measurement errors.
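    The definition above is stated for population quantities. A sample analogue can be sketched as follows, assuming a scalar regressor and a small sample. The brute-force search relies on the fact that some LAD minimizer passes through two data points with distinct x values. The function name and the data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def lad_fit(xs, ys):
    """Return (intercept, slope) minimizing sum |y - a - b*x| for scalar x.

    Brute force: some LAD minimizer interpolates two data points with
    distinct x values, so it is enough to search over such pairs.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line is not a regression line
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(abs(y - intercept - slope * x) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


# Hypothetical data: the conditional median of y is 0 at both x values,
# so the LAD slope is zero (the sample analogue of median uncorrelation).
xs = [0, 0, 0, 1, 1, 1]
ys = [-1, 0, 1, -1, 0, 1]
intercept, slope = lad_fit(xs, ys)
```

    The pairwise search costs O(n^3) and is only meant for small illustrations. A linear programming formulation is the usual choice for larger samples.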

    Nonparametric identification in asymmetric second-price auctions: a new approach

    This paper proposes an approach to proving nonparametric identification for distributions of bidders' values in asymmetric second-price auctions. I consider the case when bidders have independent private values and the only available data pertain to the winner's identity and the transaction price. My proof of identification is constructive and is based on establishing the existence and uniqueness of a solution to the system of non-linear differential equations that describes relationships between unknown distribution functions and observable functions. The proof is conducted in two logical steps. First, I prove the existence and uniqueness of a local solution. Then I describe a method that extends this local solution to the whole support. This paper delivers other interesting results. I show how this approach can be applied to obtain identification in more general auction settings, for instance, in auctions with a stochastic number of bidders or weaker support conditions. Furthermore, I demonstrate that my results can be extended to generalized competing risks models. Moreover, contrary to results in classical competing risks (Roy model), I show that in this generalized class of models it is possible to obtain implications that can be used to check whether the risks in a model are dependent. Finally, I provide a sieve minimum distance estimator and show that it consistently estimates the underlying valuation distribution of interest.

    Identification, data combination and the risk of disclosure

    Businesses routinely rely on econometric models to analyze and predict consumer behavior. Estimation of such models may require combining a firm's internal data with external datasets to take into account sample selection, missing observations, omitted variables and errors in measurement within the existing data source. In this paper we point out that these data problems can be addressed when estimating econometric models from combined data using data mining techniques under mild assumptions regarding the data distribution. However, data combination leads to serious threats to the security of consumer data: we demonstrate that point identification of an econometric model from combined data is incompatible with restrictions on the risk of individual disclosure. Consequently, if a consumer model is point identified, the firm would (implicitly or explicitly) reveal the identity of at least some of the consumers in its internal data. More importantly, we provide an argument that unless the firm places a restriction on the individual disclosure risk when combining data, even if the raw combined dataset is not shared with a third party, an adversary or a competitor can gather confidential information regarding some individuals from the estimated model.

    Multivariate ordered discrete response models

    We introduce multivariate ordered discrete response models with general rectangular structures. From the perspective of behavioral economics, these non-lattice models correspond to broad bracketing in decision making, whereas lattice models, which researchers typically estimate in practice, correspond to narrow bracketing. In these models, we specify latent processes as a sum of an index of covariates and an unobserved error, with unobservables for different latent processes potentially correlated. We provide conditions that are sufficient for identification under the independence of errors and covariates and outline an estimation approach. We present simulations and empirical examples, with a particular focus on probit specifications.

    Identification, data combination and the risk of disclosure

    It is commonplace that the data needed for econometric inference are not contained in a single source. In this paper we analyze the problem of parametric inference from combined individual-level data when data combination is based on personal and demographic identifiers such as name, age, or address. Our main question is the identification of the econometric model based on the combined data when the data do not contain exact individual identifiers and no parametric assumptions are imposed on the joint distribution of information that is common across the combined dataset. We demonstrate the conditions on the observable marginal distributions of data in individual datasets that can and cannot guarantee identification of the parameters of interest. We also note that the data combination procedure is essential in the semiparametric setting such as ours. Provided that the (non-parametric) data combination procedure can only be defined in finite samples, we introduce a new notion of identification based on the concept of limits of statistical experiments. Our results apply to the setting where the individual data used for inferences are sensitive and their combination may lead to a substantial increase in the data sensitivity or lead to a de-anonymization of the previously anonymized information. We demonstrate that the point identification of an econometric model from combined data is incompatible with restrictions on the risk of individual disclosure. If the data combination procedure guarantees a bound on the risk of individual disclosure, then the information available from the combined dataset allows one to identify the parameter of interest only partially, and the size of the identification region is inversely related to the upper bound guarantee for the disclosure risk. 
    This result is new in the context of data combination: we note that the quality of links that need to be used in the combined data to assure point identification may be much higher than the average link quality in the entire dataset, and thus point inference requires the use of the most sensitive subset of the data. Our results provide important insights into the ongoing discourse on the empirical analysis of merged administrative records as well as discussions on the disclosive nature of policies implemented by data-driven companies (such as Internet services companies and medical companies using individual patient records for policy decisions).

    On Optimal Set Estimation for Partially Identified Binary Choice Models

    In this paper we reconsider the notion of optimality in estimation of partially identified models. We illustrate the general problem in the context of a semiparametric binary choice model with discrete covariates as an example of a model which is partially identified, as shown in, e.g., Bierens and Hartog (1988). A set estimator for the regression coefficients in the model can be constructed by implementing the Maximum Score procedure proposed by Manski (1975). For many designs this procedure converges to the identified set for these parameters, and so in one sense is optimal. But as shown in Komarova (2013), for other cases the Maximum Score objective function gives an outer region of the identified set. This motivates alternative methods that are optimal in the sense that they converge to the identified region in all designs, and we propose and compare such procedures. One is a Hodges type estimator combining the Maximum Score estimator with existing procedures. A second is a two step estimator using a Maximum Score type objective function in the second step. Lastly we propose a new random set quantile estimator, motivated by definitions introduced in Molchanov (2006). Extensions of these ideas for the cross sectional model to static and dynamic discrete panel data models are also provided. Comment: 71 pages, 4 figures

    Optimization of the Concrete Composition Mix at the Design Stage

    Optimizing the composition of concrete mixes is an urgent problem, because errors at the composition design stage can lead to defects in the concrete during operation, such as delamination and cracking. Reasonable selection of concrete mix components guarantees the required strength of concrete and reinforced concrete structures in the future. This paper investigates the influence of the concrete mix composition on the strength of concrete. First, typical risks that can occur at the composition design stage were identified through expert interviews. Second, these risks were associated with indicators and characteristics that can be tested experimentally. Running several mathematical models made it possible to identify the most important concrete mix parameters and to formulate an empirical equation for the dependence of the strength of the concrete mixture on the coarse aggregate quality factor, the fine aggregate fraction, and the consumption of Portland cement. As a result, a methodology for controlling the quality of concrete at the composition design stage has been formulated. Doi: 10.28991/cej-2021-03091732