Nonparametric Covariate Adjustment for Receiver Operating Characteristic Curves
The accuracy of a diagnostic test is typically characterised using the
receiver operating characteristic (ROC) curve. Summary indices such as the
area under the ROC curve (AUC) are used to compare different tests as well as
to measure the difference between two populations. Often additional information
is available on some of the covariates which are known to influence the
accuracy of such measures. We propose nonparametric methods for covariate
adjustment of the AUC. Models with normal errors and non-normal errors are
discussed and analysed separately. Nonparametric regression is used for
estimating mean and variance functions in both scenarios. In the general noise
case we propose a covariate-adjusted Mann-Whitney estimator for AUC estimation
which effectively uses available data to construct working samples at any
covariate value of interest and is computationally efficient for
implementation. This provides a generalisation of the Mann-Whitney approach for
comparing two populations by taking covariate effects into account. We derive
asymptotic properties for the AUC estimators in both settings, including
asymptotic normality, optimal strong uniform convergence rates and MSE
consistency. The usefulness of the proposed methods is demonstrated through
simulated and real data examples.
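
For orientation, the unadjusted quantities that this abstract generalises can be sketched as follows. This is a minimal illustration, not the authors' covariate-adjusted estimator: `mann_whitney_auc` is the classical empirical AUC (ties counted as one half), and `binormal_auc` is the standard closed form under normal errors, which could be evaluated at a covariate value once mean and variance functions have been estimated. The function names, and the convention that larger test values indicate disease, are assumptions made for this example.

```python
import math

import numpy as np


def mann_whitney_auc(diseased, healthy):
    """Classical Mann-Whitney estimate of AUC = P(Y_D > Y_H) + 0.5 * P(Y_D = Y_H).

    Hypothetical helper: no covariate adjustment is performed here.
    """
    d = np.asarray(diseased, dtype=float)[:, None]  # column vector
    h = np.asarray(healthy, dtype=float)[None, :]   # row vector
    # Compare every diseased value with every healthy value.
    return float(np.mean((d > h) + 0.5 * (d == h)))


def binormal_auc(mu_d, mu_h, sigma_d, sigma_h):
    """AUC under normal errors: Phi((mu_d - mu_h) / sqrt(sigma_d^2 + sigma_h^2)).

    The arguments may be values of fitted mean and standard deviation
    functions at a covariate value of interest (an assumption of this sketch).
    """
    z = (mu_d - mu_h) / math.sqrt(sigma_d**2 + sigma_h**2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A covariate-adjusted version would replace the raw samples or the fixed means and variances with quantities that depend on the covariate value; how the working samples are constructed is described in the paper and is not reproduced here.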
Debiased inference for a covariate-adjusted regression function
In this article, we study nonparametric inference for a covariate-adjusted
regression function. This parameter captures the average association between a
continuous exposure and an outcome after adjusting for other covariates. In
particular, under certain causal conditions, this parameter corresponds to the
average outcome had all units been assigned to a specific exposure level, known
as the causal dose-response curve. We propose a debiased local linear estimator
of the covariate-adjusted regression function, and demonstrate that our
estimator converges pointwise to a mean-zero normal limit distribution. We use
this result to construct asymptotically valid confidence intervals for function
values and differences thereof. In addition, we use approximation results for
the distribution of the supremum of an empirical process to construct
asymptotically valid uniform confidence bands. Our methods do not require
undersmoothing, permit the use of data-adaptive estimators of nuisance
functions, and our estimator attains the optimal rate of convergence for a
twice differentiable function. We illustrate the practical performance of our
estimator using numerical studies and an analysis of the effect of air
pollution exposure on cardiovascular mortality.
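
As background for the local linear building block mentioned in this abstract, a plain local linear smoother can be sketched as below. This is only the textbook estimator for a single regressor: it includes neither the covariate adjustment nor the debiasing step that the abstract proposes, and the function name, the Gaussian kernel, and the bandwidth argument are choices made for this illustration.

```python
import numpy as np


def local_linear(x0, x, y, bandwidth):
    """Plain local linear estimate of E[Y | X = x0] (hypothetical helper).

    Fits a kernel-weighted least-squares line centred at x0 and returns
    its intercept, which is the fitted value at x0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (x - x0) / bandwidth
    w = np.exp(-0.5 * u**2)                         # Gaussian kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])  # intercept and slope columns
    WX = X * w[:, None]                             # rows of X scaled by weights
    beta = np.linalg.solve(X.T @ WX, WX.T @ y)      # weighted least squares
    return float(beta[0])
```

Because the fit is locally linear, the smoother reproduces an exactly linear relationship without bias, which makes a convenient sanity check.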
Binscatter Regressions
We introduce the \texttt{Stata} (and \texttt{R}) package \textsf{Binsreg},
which implements the binscatter methods developed in
\citet*{Cattaneo-Crump-Farrell-Feng_2019_Binscatter}. The package includes the
commands \texttt{binsreg}, \texttt{binsregtest}, and \texttt{binsregselect}.
The first command (\texttt{binsreg}) implements binscatter for the regression
function and its derivatives, offering several procedures for point estimation,
confidence intervals, and confidence bands, with particular focus on
constructing binned scatter plots. The second command (\texttt{binsregtest})
implements hypothesis testing procedures for parametric specification and for
nonparametric shape restrictions of the unknown regression function. Finally,
the third command (\texttt{binsregselect}) implements data-driven selectors
for the number of bins, using either quantile-spaced or
evenly-spaced binning/partitioning. All the commands allow for covariate
adjustment, smoothness restrictions, weighting and clustering, among other
features. A companion \texttt{R} package with the same capabilities is also
available.
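
The idea behind a canonical binned scatter plot can be sketched in a few lines. The code below is not the \textsf{Binsreg} interface and does not reproduce any of its options: it is a minimal illustration, with a hypothetical function name and a fixed number of bins supplied by the caller, of quantile-spaced binning followed by within-bin averaging.

```python
import numpy as np


def binscatter_means(x, y, n_bins=10):
    """Within-bin means of x and y over quantile-spaced bins of x.

    Hypothetical helper: no covariate adjustment, no inference, and no
    data-driven choice of the number of bins.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Bin edges at equally spaced quantiles of x.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    # Use interior edges only, so every observation falls in bin 0..n_bins-1.
    idx = np.searchsorted(edges[1:-1], x, side="right")
    # Heavy ties in x can leave a bin empty; such bins are skipped.
    bins = [b for b in range(n_bins) if np.any(idx == b)]
    x_means = np.array([x[idx == b].mean() for b in bins])
    y_means = np.array([y[idx == b].mean() for b in bins])
    return x_means, y_means
```

Plotting `y_means` against `x_means` gives the familiar dots of a binned scatter plot; evenly-spaced binning would only change how `edges` is computed.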
On Binscatter
Binscatter is very popular in applied microeconomics. It provides a flexible,
yet parsimonious way of visualizing and summarizing large data sets in
regression settings, and it is often used for informal evaluation of
substantive hypotheses such as linearity or monotonicity of the regression
function. This paper presents a foundational, thorough analysis of binscatter:
we give an array of theoretical and practical results that aid both in
understanding current practices (i.e., their validity or lack thereof) and in
offering theory-based guidance for future applications. Our main results
include principled selection of the number of bins, confidence intervals and bands,
hypothesis tests for parametric and shape restrictions of the regression
function, and several other new methods, applicable to canonical binscatter as
well as higher-order polynomial, covariate-adjusted and smoothness-restricted
extensions thereof. In particular, we highlight important methodological
problems related to covariate adjustment methods used in current practice. We
also discuss extensions to clustered data. Our results are illustrated with
simulated and real data throughout. Companion general-purpose software packages
for \texttt{Stata} and \texttt{R} are provided. Finally, from a technical
perspective, new theoretical results for partitioning-based series estimation
are obtained that may be of independent interest.