25,555 research outputs found
Non parametric estimation of the structural expectation of a stochastic increasing function
This article introduces a non parametric warping model for functional data. When the outcome of an experiment is a sample of curves, data can be seen as realizations of a stochastic process, which takes into account the small variations between the different observed curves. The aim of this work is to define a mean pattern which represents the main behaviour of the set of all the realizations. To this end, we define the structural expectation of the underlying stochastic function. Then we provide empirical estimators of this structural expectation and of each individual warping function. Consistency and asymptotic normality of these estimators are proved.
Non parametric estimation of the structural expectation of a stochastic increasing function
This article introduces a non parametric warping model for functional data. When the outcome of an experiment is a sample of curves, data can be seen as realizations of a stochastic process, which takes into account the variations between the different observed curves. The aim of this work is to define a mean pattern which represents the main behaviour of the set of all the realizations. So, we define the structural expectation of the underlying stochastic function. Then, we provide empirical estimators of this structural expectation and of each individual warping function. Consistency and asymptotic normality for such estimators are proved.
Endogeneity and Instrumental Variables in Dynamic Models
The objective of the paper is to develop the theory of endogeneity in dynamic models in discrete and continuous time, in particular for diffusions and counting processes. We first provide an extension of the separable set-up to a separable dynamic framework given in terms of a semi-martingale decomposition. Then we define our function of interest as a stopping time for an additional noise process, whose role is played by a Brownian motion for diffusions, and a Poisson process for counting processes.
Non Parametric Models with Instrumental Variables
This paper gives a survey of econometric models characterized by a relation between observable and unobservable random elements, where the unobservable terms are assumed to be independent of another set of observable variables called instrumental variables. This kind of specification is useful for addressing, for example, endogeneity or selection bias. These models are treated non parametrically, and in all the examples we consider, the functional parameter of interest is defined as the solution of a linear or non linear integral equation. The estimation procedure then requires solving a (generally ill-posed) inverse problem. We illustrate the main questions (construction of the equation, identification, numerical solution, asymptotic properties, selection of the regularization parameter) through the different models we present.
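As a hedged illustration of the kind of integral equation and regularization this abstract refers to, the block below writes out one common special case. The additive model, the symbols $Y$, $Z$, $W$, $\varphi$, $T$, $r$, $\alpha$, and the choice of Tikhonov regularization are assumptions made for the sketch; the abstract itself does not specify them.

```latex
% Assumed additive model with endogenous regressor Z and instrument W:
%   Y = \varphi(Z) + U, \qquad E[U \mid W] = 0 .
% Taking conditional expectations given W yields a linear integral equation in \varphi:
\[
  (T\varphi)(w) \;:=\; E[\varphi(Z) \mid W = w] \;=\; E[Y \mid W = w] \;=:\; r(w).
\]
% T is typically compact, so inverting it is ill-posed. A Tikhonov-regularized
% solution with regularization parameter \alpha > 0 and adjoint T^{*} is
\[
  \varphi_{\alpha} \;=\; \bigl(\alpha I + T^{*}T\bigr)^{-1} T^{*} r .
\]
```

In this sketch, the "selection of the regularization parameter" mentioned in the abstract corresponds to the choice of $\alpha$: smaller values reduce bias but amplify estimation noise in $T$ and $r$.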
A Primer on Causality in Data Science
Many questions in Data Science are fundamentally causal in that our objective
is to learn the effect of some exposure, randomized or not, on an outcome of
interest. Even studies that are seemingly non-causal, such as those with the
goal of prediction or prevalence estimation, have causal elements, including
differential censoring or measurement. As a result, we, as Data Scientists,
need to consider the underlying causal mechanisms that gave rise to the data,
rather than simply the pattern or association observed in those data. In this
work, we review the 'Causal Roadmap' of Petersen and van der Laan (2014) to
provide an introduction to some key concepts in causal inference. Similar to
other causal frameworks, the steps of the Roadmap include clearly stating the
scientific question, defining the causal model, translating the scientific
question into a causal parameter, assessing the assumptions needed to express
the causal parameter as a statistical estimand, implementing statistical
estimators including parametric and semi-parametric methods, and interpreting
our findings. We believe that using such a framework in Data Science will
help to ensure that our statistical analyses are guided by the scientific
question driving our research, while avoiding over-interpreting our results. We
focus on the effect of an exposure occurring at a single time point and
highlight the use of targeted maximum likelihood estimation (TMLE) with Super
Learner.
Comment: 26 pages (with references); 4 figures
Importance Sampling and its Optimality for Stochastic Simulation Models
We consider the problem of estimating an expected outcome from a stochastic
simulation model. Our goal is to develop a theoretical framework on importance
sampling for such estimation. By investigating the variance of an importance
sampling estimator, we propose a two-stage procedure that involves a regression
stage and a sampling stage to construct the final estimator. We introduce a
parametric and a nonparametric regression estimator in the first stage and
study how the allocation between the two stages affects the performance of the
final estimator. We analyze the variance reduction rates and derive oracle
properties of both methods. We evaluate the empirical performances of the
methods using two numerical examples and a case study on wind turbine
reliability evaluation.
Comment: 37 pages, 6 figures, 2 tables. Accepted to the Electronic Journal of
Statistics
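For readers unfamiliar with the basic technique this abstract builds on, the following is a minimal sketch of a plain importance sampling estimator. It is not the two-stage procedure proposed in the paper. The target (a standard normal tail probability), the shifted-normal proposal, and the function names `normal_pdf`, `importance_sampling_tail`, and `exact_tail` are all assumptions chosen for illustration.

```python
import math
import random


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def importance_sampling_tail(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1).

    Samples are drawn from the proposal N(threshold, 1), which puts about half
    of its mass in the rare region, and each sample is reweighted by the
    likelihood ratio target_density / proposal_density.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal
        if x > threshold:              # indicator of the event of interest
            total += normal_pdf(x) / normal_pdf(x, mean=threshold)
    return total / n_samples


def exact_tail(threshold):
    """Closed-form P(X > threshold) for X ~ N(0, 1), used only as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    estimate = importance_sampling_tail(3.0, 20000, seed=1)
    print("importance sampling estimate:", estimate)
    print("exact value:", exact_tail(3.0))
```

Sampling directly from N(0, 1) would see the event only about once in every 740 draws, so most of the simulation budget would be wasted. Shifting the proposal toward the rare region is the variance reduction idea that the abstract studies in a more general setting.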
Causal inference for social network data
We describe semiparametric estimation and inference for causal effects using
observational data from a single social network. Our asymptotic result is the
first to allow for dependence of each observation on a growing number of other
units as sample size increases. While previous methods have generally
implicitly focused on one of two possible sources of dependence among social
network observations, we allow for both dependence due to transmission of
information across network ties, and for dependence due to latent similarities
among nodes sharing ties. We describe estimation and inference for new causal
effects that are specifically of interest in social network settings, such as
interventions on network ties and network structure. Using our methods to
reanalyze the Framingham Heart Study data used in one of the most influential
and controversial causal analyses of social network data, we find that after
accounting for network structure there is no evidence for the causal effects
claimed in the original paper.