8,298 research outputs found

    Nonparametric Covariate Adjustment for Receiver Operating Characteristic Curves

    Full text link
    The accuracy of a diagnostic test is typically characterised using the receiver operating characteristic (ROC) curve. Summarising indexes such as the area under the ROC curve (AUC) are used to compare different tests as well as to measure the difference between two populations. Often additional information is available on some of the covariates which are known to influence the accuracy of such measures. We propose nonparametric methods for covariate adjustment of the AUC. Models with normal errors and non-normal errors are discussed and analysed separately. Nonparametric regression is used for estimating mean and variance functions in both scenarios. In the general noise case we propose a covariate-adjusted Mann-Whitney estimator for AUC estimation which effectively uses available data to construct working samples at any covariate value of interest and is computationally efficient for implementation. This provides a generalisation of the Mann-Whitney approach for comparing two populations by taking covariate effects into account. We derive asymptotic properties for the AUC estimators in both settings, including asymptotic normality, optimal strong uniform convergence rates and MSE consistency. The usefulness of the proposed methods is demonstrated through simulated and real data examples

    Debiased inference for a covariate-adjusted regression function

    Full text link
    In this article, we study nonparametric inference for a covariate-adjusted regression function. This parameter captures the average association between a continuous exposure and an outcome after adjusting for other covariates. In particular, under certain causal conditions, this parameter corresponds to the average outcome had all units been assigned to a specific exposure level, known as the causal dose-response curve. We propose a debiased local linear estimator of the covariate-adjusted regression function, and demonstrate that our estimator converges pointwise to a mean-zero normal limit distribution. We use this result to construct asymptotically valid confidence intervals for function values and differences thereof. In addition, we use approximation results for the distribution of the supremum of an empirical process to construct asymptotically valid uniform confidence bands. Our methods do not require undersmoothing, permit the use of data-adaptive estimators of nuisance functions, and our estimator attains the optimal rate of convergence for a twice differentiable function. We illustrate the practical performance of our estimator using numerical studies and an analysis of the effect of air pollution exposure on cardiovascular mortality

    Binscatter Regressions

    Full text link
    We introduce the \texttt{Stata} (and \texttt{R}) package \textsf{Binsreg}, which implements the binscatter methods developed in \citet*{Cattaneo-Crump-Farrell-Feng_2019_Binscatter}. The package includes the commands \texttt{binsreg}, \texttt{binsregtest}, and \texttt{binsregselect}. The first command (\texttt{binsreg}) implements binscatter for the regression function and its derivatives, offering several point estimation, confidence intervals and confidence bands procedures, with particular focus on constructing binned scatter plots. The second command (\texttt{binsregtest}) implements hypothesis testing procedures for parametric specification and for nonparametric shape restrictions of the unknown regression function. Finally, the third command (\texttt{binsregselect}) implements data-driven number of bins selectors for binscatter implementation using either quantile-spaced or evenly-spaced binning/partitioning. All the commands allow for covariate adjustment, smoothness restrictions, weighting and clustering, among other features. A companion \texttt{R} package with the same capabilities is also available

    On Binscatter

    Full text link
    Binscatter is very popular in applied microeconomics. It provides a flexible, yet parsimonious way of visualizing and summarizing large data sets in regression settings, and it is often used for informal evaluation of substantive hypotheses such as linearity or monotonicity of the regression function. This paper presents a foundational, thorough analysis of binscatter: we give an array of theoretical and practical results that aid both in understanding current practices (i.e., their validity or lack thereof) and in offering theory-based guidance for future applications. Our main results include principled number of bins selection, confidence intervals and bands, hypothesis tests for parametric and shape restrictions of the regression function, and several other new methods, applicable to canonical binscatter as well as higher-order polynomial, covariate-adjusted and smoothness-restricted extensions thereof. In particular, we highlight important methodological problems related to covariate adjustment methods used in current practice. We also discuss extensions to clustered data. Our results are illustrated with simulated and real data throughout. Companion general-purpose software packages for \texttt{Stata} and \texttt{R} are provided. Finally, from a technical perspective, new theoretical results for partitioning-based series estimation are obtained that may be of independent interest
    • …
    corecore