Estimating spatial quantile regression with functional coefficients: A robust semiparametric framework
This paper considers an estimation of semiparametric functional
(varying)-coefficient quantile regression with spatial data. A general robust
framework is developed that treats quantile regression for spatial data in a
natural semiparametric way. The local M-estimators of the unknown
functional-coefficient functions are proposed by using local linear
approximation, and their asymptotic distributions are then established under
weak spatial mixing conditions allowing the data processes to be either
stationary or nonstationary with spatial trends. Application to a soil data set
is demonstrated with interesting findings that go beyond traditional analysis. Comment: Published at http://dx.doi.org/10.3150/12-BEJ480 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
Blaming the exogenous environment? Conditional efficiency estimation with continuous and discrete exogenous variables
This paper proposes a fully nonparametric framework to estimate the relative efficiency of entities while accounting for a mixed set of continuous and discrete (both ordered and unordered) exogenous variables. Using robust partial frontier techniques, the probabilistic and conditional characterization of the production process, and insights from recent developments in nonparametric econometrics, we present a generalized approach for conditional efficiency measurement. To do so, we utilize a tailored mixed kernel function with data-driven bandwidth selection. So far, only descriptive analysis has been suggested for studying the effect of heterogeneity in conditional efficiency estimation. We show how to use and interpret nonparametric bootstrap-based significance tests in a generalized conditional efficiency framework. This allows us to study the statistical significance of continuous and discrete exogenous variables on the production process. The proposed approach is illustrated using simulated examples as well as a sample of British pupils from the OECD PISA data set. The results of the empirical application show that several exogenous discrete factors have a statistically significant effect on the educational process. Keywords: Nonparametric estimation, Conditional efficiency measures, Exogenous factors, Generalized kernel function, Education
Marginal integration for nonparametric causal inference
We consider the problem of inferring the total causal effect of a single
variable intervention on a (response) variable of interest. We propose a
certain marginal integration regression technique for a very general class of
potentially nonlinear structural equation models (SEMs) with known structure,
or at least known superset of adjustment variables: we call the procedure
S-mint regression. We easily derive that it achieves the same convergence rate as
nonparametric regression: for example, single variable intervention effects
can be estimated with convergence rate $n^{-2/5}$ assuming smoothness with
twice differentiable functions. Our result can also be seen as a major
robustness property with respect to model misspecification which goes much
beyond the notion of double robustness. Furthermore, when the structure of the
SEM is not known, we can estimate (the equivalence class of) the directed
acyclic graph corresponding to the SEM, and then proceed by using S-mint based
on these estimates. We empirically compare the S-mint regression method with
more classical approaches and argue that the former is indeed more robust, more
reliable and substantially simpler. Comment: 40 pages, 14 figures
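The marginal integration idea in this abstract can be illustrated with a small sketch. It rests on the standard adjustment identity E[Y | do(X = x)] = ∫ E[Y | X = x, S = s] dP(s), where S is a valid adjustment set. The code below is not the authors' S-mint implementation: the Nadaraya-Watson smoother, the fixed bandwidth `h`, the function names, and the simulated linear model are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return np.exp(-0.5 * u ** 2)


def marginal_integration_effect(x0, X, S, Y, h=0.3):
    """Estimate E[Y | do(X = x0)] for a scalar adjustment variable S.

    Step 1: Nadaraya-Watson estimate m_hat(x0, S_j) of E[Y | X = x0, S = S_j]
            at every observed S_j (product Gaussian kernel, bandwidth h).
    Step 2: average m_hat(x0, S_j) over j, i.e. integrate over the
            empirical distribution of S.
    """
    kx = gaussian_kernel((x0 - X) / h)                    # shape (n,)
    ks = gaussian_kernel((S[:, None] - S[None, :]) / h)   # shape (n, n)
    w = ks * kx[None, :]                                  # row j: weights at (x0, S_j)
    m_hat = (w @ Y) / w.sum(axis=1)
    return float(m_hat.mean())


def simulate(n=1500, seed=0):
    """Hypothetical SEM: S -> X, S -> Y, X -> Y, so S confounds X and Y."""
    rng = np.random.default_rng(seed)
    S = rng.normal(size=n)
    X = 0.5 * S + rng.normal(size=n)
    Y = 2.0 * X + S + 0.1 * rng.normal(size=n)
    return X, S, Y


if __name__ == "__main__":
    X, S, Y = simulate()
    for x0 in (-0.5, 0.0, 0.5):
        print(x0, marginal_integration_effect(x0, X, S, Y))
```

In this simulated model the true intervention effect is 2x, whereas regressing Y on X alone would mix in the confounding through S. The local-constant smoother carries some bias, so the sketch should only be expected to land near the truth, not on it.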
Blaming the exogenous environment? Conditional efficiency estimation with continuous and discrete environmental variables
This paper proposes a fully nonparametric framework to estimate the relative efficiency of entities while accounting for a mixed set of continuous and discrete (both ordered and unordered) exogenous variables. Using robust partial frontier techniques, the probabilistic and conditional characterization of the production process, and insights from recent developments in nonparametric econometrics, we present a generalized approach for conditional efficiency measurement. To do so, we utilize a tailored mixed kernel function with data-driven bandwidth selection. So far, only descriptive analysis has been suggested for studying the effect of heterogeneity in conditional efficiency estimation. We show how to use and interpret nonparametric bootstrap-based significance tests in a generalized conditional efficiency framework. This allows us to study the statistical significance of continuous and discrete environmental variables. The proposed approach is illustrated by a sample of British pupils from the OECD PISA data set. The results show that several exogenous discrete factors have a significant effect on the educational process.
Interpretable statistics for complex modelling: quantile and topological learning
As the complexity of our data has increased exponentially in the last decades, so has our
need for interpretable features. This thesis revolves around two paradigms to approach
this quest for insights.
In the first part we focus on parametric models, where the problem of interpretability
can be seen as a "parametrization selection". We introduce a quantile-centric
parametrization and we show the advantages of our proposal in the context of regression,
where it allows us to bridge the gap between classical generalized linear (mixed)
models and increasingly popular quantile methods.
The second part of the thesis, concerned with topological learning, tackles the
problem from a non-parametric perspective. As topology can be thought of as a way
of characterizing data in terms of their connectivity structure, it allows us to represent
complex and possibly high-dimensional data through a few features, such as the number of
connected components, loops and voids. We illustrate how the emerging branch of
statistics devoted to recovering topological structures in the data, Topological Data
Analysis, can be exploited both for exploratory and inferential purposes with a special
emphasis on kernels that preserve the topological information in the data.
Finally, we show with an application how these two approaches can borrow strength
from one another in the identification and description of brain activity through fMRI
data from the ABIDE project.