The ASEAN Free Trade Agreement: Building bloc or stumbling bloc for multilateral trade liberalization?
This paper investigates empirically whether the ASEAN Free Trade Agreement had a building bloc or stumbling bloc effect on subsequent changes in the MFN tariffs of four major ASEAN members. The method resembles the one recently used by Nuno Limão. We use tariff data to test whether MFN tariffs were changed differently for preferential products than for otherwise similar products without a preference. We find a significant building bloc effect for Indonesia, the Philippines and Thailand: MFN tariffs on preferential products were reduced by more than those on non-preferential products. We obtain ambiguous effects for Malaysia. This suggests that, overall, the ASEAN Free Trade Agreement has helped rather than hindered nondiscriminatory trade liberalization.
New evidence on preference utilization
We analyse the degree of preference utilization in four major importing countries (Australia, Canada, EU and US) and provide evidence that preferences are more widely used than previously thought. For Australia and Canada, we have obtained a new dataset on imports by preferential regime that has so far not been publicly available. For the EU and US, we make use of more disaggregated data than previously used in the literature. We empirically test what determines utilization rates. In line with previous studies, we find that utilization increases with both the preferential margin and the volume of exports, suggesting that using preferences can be costly. However, we also find that utilization rates are often very high, even for very small preferential margins and/or very small trade flows, which contradicts numerous estimates that average compliance costs are as high as 2-6%. We extend the existing literature in relation to both data and methodological issues. In particular, we construct pseudo transaction-level data that allow us to assess more precisely when available preferences are utilized. Using this methodology, we obtain a more realistic estimate of what determines utilization. Rather than constituting a percentage share of the trade value, our findings indicate that utilization costs involve an important fixed cost element. We provide estimates for such fixed costs, which appear to be in the range of USD 14 to USD 1,500.
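To make the fixed-cost reading concrete, here is a minimal sketch assuming a simple decision rule that is not taken from the paper: an exporter claims the preference whenever the duty saved on a shipment is at least the fixed compliance cost. The function names and the 2-percentage-point margin are hypothetical; only the USD 14 figure comes from the abstract.

```python
def breakeven_shipment_value(fixed_cost_usd: float, preference_margin: float) -> float:
    """Smallest shipment value at which the duty saved covers a fixed cost.

    preference_margin is the MFN rate minus the preferential rate,
    expressed as a fraction (0.02 means 2 percentage points).
    """
    if preference_margin <= 0:
        raise ValueError("preference_margin must be positive")
    return fixed_cost_usd / preference_margin


def uses_preference(shipment_value_usd: float,
                    preference_margin: float,
                    fixed_cost_usd: float) -> bool:
    """Assumed rule: claim the preference if duty saved >= fixed cost."""
    duty_saved = shipment_value_usd * preference_margin
    return duty_saved >= fixed_cost_usd


if __name__ == "__main__":
    # Hypothetical 2-point margin with the abstract's lower-bound cost of USD 14.
    print(breakeven_shipment_value(14.0, 0.02))
    print(uses_preference(1000.0, 0.02, 14.0))
    print(uses_preference(500.0, 0.02, 14.0))
```

Under this rule a small fixed cost is compatible with high utilization even at small margins, whereas a purely ad valorem cost of 2-6% would rule out using any preference whose margin is below that rate.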
Can online markets make trade more inclusive?
Technology made available by online markets has significantly reduced the cost of entry into international markets for small and medium-sized firms, which can now reach faraway consumers and build a global reputation as a seller at very low cost. Empirical evidence using data from eBay sellers shows that a large share of online firms export, even though they are on average much smaller than traditional offline firms. We show that in a world where income inequality is driven by an uneven distribution of capital rents, online markets help to reduce income inequality by providing smaller firms access to international markets.
Identification and Efficient Estimation of the Natural Direct Effect Among the Untreated
The natural direct effect (NDE), or the effect of an exposure on an outcome if an intermediate variable were set to the level it would have taken in the absence of the exposure, is often of interest to investigators. In general, the statistical parameter associated with the NDE is difficult to estimate in the non-parametric model, particularly when the intermediate variable is continuous or high dimensional. In this paper we introduce a new causal parameter called the natural direct effect among the untreated, discuss identifiability assumptions, and show that this new parameter is equivalent to the NDE in a randomized controlled trial. We also present a targeted minimum loss estimator (TMLE), a locally efficient, doubly robust substitution estimator for the statistical parameter associated with this causal parameter. The TMLE can be applied to problems with continuous and high dimensional intermediate variables, and can be used to estimate the NDE in a randomized controlled trial with such data. Additionally, we define and discuss the estimation of three related causal parameters: the natural direct effect among the treated, the indirect effect among the untreated and the indirect effect among the treated.
Online Targeted Learning
We consider the case in which the data come in sequentially and can be viewed as a sample of independent and identically distributed observations from a fixed data generating distribution. The goal is to estimate a particular pathwise differentiable target parameter of this data generating distribution that is known to be an element of a particular semi-parametric statistical model. We want our estimator to be asymptotically efficient, but we also want it to be computable by updating the current estimator based on the new block of data without having to revisit the past data, so that it is much faster to compute than recomputing a fixed estimator each time new data come in. We refer to such an estimator as an online estimator. These online estimators can also be applied to a large fixed database by dividing the data set into many subsets and enforcing an ordering of these subsets. The current literature provides such online estimators for parametric models, where the online estimators are based on variations of the stochastic gradient descent algorithm.
For that purpose we propose a new online one-step estimator, which is proven to be asymptotically efficient under regularity conditions. This estimator takes as input online estimators of the relevant part of the data generating distribution and the nuisance parameter that are required for efficient estimation of the target parameter. These estimators could be an online stochastic gradient descent estimator based on large parametric models as developed in the current literature, but we also propose other online data adaptive estimators that do not rely on the specification of a particular parametric model.
We also present a targeted version of this online one-step estimator that presumably minimizes the one-step correction and thereby might be more robust in finite samples. These online one-step estimators are not substitution estimators and might therefore be unstable in finite samples if the target parameter is borderline identifiable.
Therefore we also develop an online targeted minimum loss-based estimator, which updates the initial estimator of the relevant part of the data generating distribution by updating the current initial estimator with the new block of data, and estimates the target parameter with the corresponding plug-in estimator. The online substitution estimator is also proven to be asymptotically efficient under the same regularity conditions required for asymptotic normality of the online one-step estimator.
The online one-step estimator, targeted online one-step estimator, and online TMLE are demonstrated for estimation of a causal effect of a binary treatment on an outcome based on a dynamic database that gets regularly updated, a common scenario for the analysis of electronic medical record databases.
Finally, we extend these online estimators to a group sequential adaptive design in which certain components of the data generating experiment are continuously fine-tuned based on past data, and the new data generating distribution is then used to generate the next block of data.
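The defining property of an online estimator, updating on each new block without revisiting past data, can be illustrated with a deliberately simple case. The sketch below is not the one-step estimator or the online TMLE from the abstract; it is a block-updated sample mean, chosen because its online and batch versions coincide exactly. The class name `OnlineMean` is hypothetical.

```python
class OnlineMean:
    """Running mean that is updated one block at a time.

    Only the count and the current estimate are stored, so earlier
    blocks never need to be read again.
    """

    def __init__(self) -> None:
        self.n = 0        # number of observations seen so far
        self.mean = 0.0   # current estimate

    def update(self, block) -> float:
        m = len(block)
        if m == 0:
            return self.mean
        block_mean = sum(block) / m
        self.n += m
        # Move the estimate toward the block mean, weighted by the
        # block's share of all observations seen so far.
        self.mean += (m / self.n) * (block_mean - self.mean)
        return self.mean


if __name__ == "__main__":
    est = OnlineMean()
    est.update([1.0, 2.0, 3.0])
    est.update([4.0, 5.0])
    print(est.mean)  # agrees with the mean of all five values
```

The same pattern applies to a large fixed database: split it into ordered subsets and feed them to `update` one at a time.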
Balancing Score Adjusted Targeted Minimum Loss-based Estimation
Adjusting for a balancing score is sufficient for bias reduction when estimating causal effects including the average treatment effect and effect among the treated. Estimators that adjust for the propensity score in a nonparametric way, such as matching on an estimate of the propensity score, can be consistent when the estimated propensity score is not consistent for the true propensity score but converges to some other balancing score. We call this property the balancing score property, and discuss a class of estimators that have this property. We introduce a targeted minimum loss-based estimator (TMLE) for a treatment specific mean with the balancing score property that is additionally locally efficient and doubly robust. We investigate the new estimator's performance relative to other estimators, including another TMLE, a propensity score matching estimator, an inverse probability of treatment weighted estimator, and a regression based estimator in simulation studies.
Online Cross-Validation-Based Ensemble Learning
Online estimators update a current estimate with a new incoming batch of data without having to revisit past data, thereby providing streaming estimates that are scalable to big data. We develop flexible, ensemble-based online estimators of an infinite-dimensional target parameter, such as a regression function, in the setting where data are generated sequentially by a common conditional data distribution given summary measures of the past. This setting encompasses a wide range of time-series models and, as a special case, models for independent and identically distributed data. Our estimator considers a large library of candidate online estimators and uses online cross-validation to identify the algorithm with the best performance. We show that by basing estimates on the cross-validation-selected algorithm, we are asymptotically guaranteed to perform as well as the true, unknown best-performing algorithm. We provide extensions of this approach including online estimation of the optimal ensemble of candidate online estimators. We illustrate the practical performance of our methods using simulations and a real data example where we make streaming predictions of infectious disease incidence using data from a large database.
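A toy sketch of the selection step may help. It assumes a common reading of online cross-validation: each incoming batch is first used to score every candidate's current fit, and only afterwards to update that candidate, so losses are always out-of-sample. The two candidates (`RunningMean`, `LastValue`), the squared-error loss and the function `online_cv_select` are hypothetical illustrations, not the paper's library or implementation.

```python
class RunningMean:
    """Predicts the mean of everything seen so far."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def predict(self) -> float:
        return self.mean

    def update(self, batch) -> None:
        for y in batch:
            self.n += 1
            self.mean += (y - self.mean) / self.n


class LastValue:
    """Predicts the most recent observation."""

    def __init__(self) -> None:
        self.last = 0.0

    def predict(self) -> float:
        return self.last

    def update(self, batch) -> None:
        if batch:
            self.last = batch[-1]


def online_cv_select(candidates, batches):
    """Return the candidate name with the lowest cumulative validation loss."""
    losses = {name: 0.0 for name in candidates}
    for batch in batches:
        for name, model in candidates.items():
            pred = model.predict()  # fit uses past batches only
            losses[name] += sum((y - pred) ** 2 for y in batch)
            model.update(batch)     # now absorb the new batch
    best = min(losses, key=losses.get)
    return best, losses


if __name__ == "__main__":
    stream = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]  # upward-trending toy data
    models = {"running_mean": RunningMean(), "last_value": LastValue()}
    print(online_cv_select(models, stream))
```

On trending data the last-value predictor accumulates less loss, so it is selected; on stationary noise the running mean would typically win.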
Propensity score prediction for electronic healthcare databases using Super Learner and High-dimensional Propensity Score Methods
The optimal learner for prediction modeling varies depending on the underlying data-generating distribution. Super Learner (SL) is a generic ensemble learning algorithm that uses cross-validation to select among a library of candidate prediction models. The SL is not restricted to a single prediction model, but uses the strengths of a variety of learning algorithms to adapt to different databases. While the SL has been shown to perform well in a number of settings, it has not been thoroughly evaluated in large electronic healthcare databases that are common in pharmacoepidemiology and comparative effectiveness research. In this study, we applied the SL and evaluated its ability to predict treatment assignment using three electronic healthcare databases. We considered a library of algorithms that consisted of both nonparametric and parametric models. We also considered a novel strategy for prediction modeling that combines the SL with the high-dimensional propensity score (hdPS) variable selection algorithm. Predictive performance was assessed using three metrics: the negative log-likelihood, area under the curve (AUC), and time complexity. Results showed that the best individual algorithm, in terms of predictive performance, varied across datasets. The SL was able to adapt to the given dataset and optimize predictive performance relative to any individual learner. Combining the SL with the hdPS was the most consistent prediction method and may be promising for PS estimation and prediction modeling in electronic healthcare databases.
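Two of the three metrics named above can be written down directly. The following is a minimal pure-Python sketch for a binary treatment indicator and predicted propensity scores. Averaging the negative log-likelihood over observations and clipping probabilities at `eps` are assumptions made here, not details given in the abstract, and the function names are hypothetical.

```python
import math


def negative_log_likelihood(y_true, p_pred, eps=1e-15):
    """Average Bernoulli negative log-likelihood (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def auc(y_true, p_pred):
    """Probability that a random treated unit is scored above a random
    untreated unit, counting ties as one half."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one unit in each class")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y = [0, 0, 1, 1]
    p = [0.1, 0.4, 0.35, 0.8]
    print(negative_log_likelihood(y, p))
    print(auc(y, p))
```

The pairwise AUC above is quadratic in the sample size, so for databases of the scale discussed in the abstract a rank-based computation would be the practical choice.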