16 research outputs found

    Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments

    We propose strategies to estimate and make inference on key features of heterogeneous effects in randomized experiments. These key features include best linear predictors of the effects using machine learning proxies, average effects sorted by impact groups, and average characteristics of most and least impacted units. The approach is valid in high-dimensional settings, where the effects are proxied by machine learning methods. We post-process these proxies into the estimates of the key features. Our approach is generic: it can be used in conjunction with penalized methods, deep and shallow neural networks, canonical and new random forests, boosted trees, and ensemble methods. It does not rely on strong assumptions. In particular, we don't require conditions for consistency of the machine learning methods. Estimation and inference rely on repeated data splitting to avoid overfitting and achieve validity. For inference, we take medians of p-values and medians of confidence intervals, resulting from many different data splits, and then adjust their nominal level to guarantee uniform validity. This variational inference method is shown to be uniformly valid and quantifies the uncertainty coming from both parameter estimation and data splitting. We illustrate the use of the approach with two randomized experiments in development on the effects of microcredit and nudges to stimulate immunization demand. Comment: 53 pages, 6 figures, 15 tables

    Generic machine learning inference on heterogenous treatment effects in randomized experiments

    We propose strategies to estimate and make inference on key features of heterogeneous effects in randomized experiments. These key features include best linear predictors of the effects using machine learning proxies, average effects sorted by impact groups, and average characteristics of most and least impacted units. The approach is valid in high-dimensional settings, where the effects are proxied by machine learning methods. We post-process these proxies into the estimates of the key features. Our approach is generic: it can be used in conjunction with penalized methods, deep and shallow neural networks, canonical and new random forests, boosted trees, and ensemble methods. Our approach is agnostic and does not make unrealistic or hard-to-check assumptions; we don’t require conditions for consistency of the ML methods. Estimation and inference rely on repeated data splitting to avoid overfitting and achieve validity. For inference, we take medians of p-values and medians of confidence intervals, resulting from many different data splits, and then adjust their nominal level to guarantee uniform validity. This variational inference method is shown to be uniformly valid and quantifies the uncertainty coming from both parameter estimation and data splitting. The inference method could be of substantial independent interest in many machine learning applications. An empirical application to the impact of micro-credit on economic development illustrates the use of the approach in randomized experiments. An additional application to the impact of gender discrimination on wages illustrates the potential use of the approach in observational studies, where machine learning methods can be used to condition flexibly on very high-dimensional controls. https://arxiv.org/abs/1712.04802 First author draft

    Essays on production function estimation

    Thesis: Ph.D., Massachusetts Institute of Technology, Department of Economics, May 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 193-201). The first chapter develops a new method for estimating production functions with factor-augmenting technology and assesses its economic implications. The method does not impose parametric restrictions and generalizes prior approaches that rely on the CES production function. I first extend the canonical Olley-Pakes framework to accommodate factor-augmenting technology. Then, I show how to identify output elasticities based on a novel control variable approach and the optimality of input expenditures. I use this method to estimate output elasticities and markups in manufacturing industries in the US and four developing countries. Neglecting labor-augmenting productivity and imposing parametric restrictions mismeasures output elasticities and heterogeneity in the production function. My estimates suggest that standard models (i) underestimate capital elasticity by up to 70 percent and (ii) overestimate labor elasticity by up to 80 percent. These biases propagate into markup estimates inferred from output elasticities: markups are overestimated by 20 percentage points. Finally, heterogeneity in output elasticities also affects estimated trends in markups: my estimates point to a much more muted markup growth (about half) in the US manufacturing sector than recent estimates. The second chapter develops partial identification results that are robust to deviations from the commonly used control function approach assumptions and to measurement errors in inputs. In particular, the model (i) allows for multi-dimensional unobserved heterogeneity, (ii) relaxes strict monotonicity to weak monotonicity, and (iii) accommodates a more flexible timing assumption for capital. I show that under these assumptions production function parameters are partially identified by an 'imperfect proxy' variable via moment inequalities. Using these moment inequalities, I derive bounds on the parameters and propose an estimator. An empirical application is presented to quantify the informativeness of the identified set. The third chapter develops an approach in which endogenous networks are a source of identification in estimation with network data. In particular, I study a linear model where network data can be used to control for unobserved heterogeneity and partially identify the parameters of the linear model. My method does not rely on a parametric model of network formation. Instead, identification is achieved by assuming that the network satisfies latent homophily, the tendency of individuals to be linked with others who are similar to themselves. I first provide two definitions of homophily: weak and strong homophily. Then, based on these definitions, I characterize the identified sets and show that they are bounded under weak conditions. Finally, to illustrate the method in an empirical setting, I estimate the effects of education on risk preferences and peer effects using social network data from 150 Chinese villages. By Mert Demirer. Ph.D., Massachusetts Institute of Technology, Department of Economics

    Semi-Parametric Efficient Policy Learning with Continuous Actions

    © 2019 Neural Information Processing Systems Foundation. All rights reserved. We consider off-policy evaluation and optimization with continuous action spaces. We focus on observational data where the data collection policy is unknown and needs to be estimated. We take a semi-parametric approach where the value function takes a known parametric form in the treatment, but we are agnostic about how it depends on the observed contexts. We propose a doubly robust off-policy estimate for this setting and show that off-policy optimization based on this estimate is robust to estimation errors of the policy function or the regression model. Our results also apply if the model does not satisfy our semi-parametric form; in that case we measure regret in terms of the best projection of the true value function onto this functional space. Our work extends prior approaches to policy optimization from observational data that only considered discrete actions. We provide an experimental evaluation of our method in a synthetic data example motivated by optimal personalized pricing and costly resource allocation.

    Modeling the Antipodal Connectivity Structure of Neural Communities

    Recent studies support the theory that the brain is composed of modules, with certain nodes establishing connections between the modules [1,2,3]. The existence of such connections can only be identified by conducting a detailed investigation with sophisticated tools. Therefore, in this manuscript we provide a new mathematical model to indicate the functional dependency, which supports the idea of information exchange between the neural modules at the highest spatial and hierarchical level of bottom-up processes using EEG (electroencephalography) [4]. The model developed to study the functional dependencies between different regions of the cortex is based on the Borsuk-Ulam antipodal symmetry theorem. It is a mathematical model complemented with an innovative algorithm, called Projection based on Normalized Transformation (PNT), to show the existence of a unique neural activity pattern known as the Antipodal Connectivity. To validate the model, EEG data collected from a total of 50 experiments with the participation of 18 different test subjects were used to measure the effectiveness and accuracy of the method. Using the data collected from the subjects in different stages (active or resting) of the brain, the Antipodal Hub Neurons (AHNs) were captured and compared to determine the ratio of fluctuation under different conditions and whether or not the stimulus has any role in antipodal neural connectivity. Although the preliminary results are not conclusive, we have successfully identified the existence of antipodal behavioral patterns in neural activities.

    Çelebi Ailesinin Türkiye’ye göçü (The Çelebi Family's Migration to Turkey)

    Ankara: İhsan Doğramacı Bilkent Üniversitesi İktisadi, İdari ve Sosyal Bilimler Fakültesi, Tarih Bölümü, 2017. This work is a student project of the Department of History, Faculty of Economics, Administrative and Social Sciences, İhsan Doğramacı Bilkent University. By Ünsal, Mehmet Süha