Kernel-based measures of association between inputs and outputs based on ANOVA
The ANOVA decomposition of a function with random input variables provides ANOVA
functionals (AFs), which contain information about the contributions of the
input variables to the output variable(s). By embedding AFs into an appropriate
reproducing kernel Hilbert space regarding their distributions, we propose an
efficient statistical test of independence between the input variables and
output variable(s). The resulting test statistic leads to new dependent
measures of association between inputs and outputs that allow for i) dealing
with any distribution of AFs, including the Cauchy distribution, ii) accounting
for the necessary or desirable moments of AFs and the interactions among the
input variables. In uncertainty quantification for mathematical models, a
number of existing measures are special cases of this framework. We then
provide unified and general global sensitivity indices and their consistent
estimators, including asymptotic distributions. For Gaussian-distributed AFs,
we obtain Sobol' indices and dependent generalized sensitivity indices using
quadratic kernels.
Derivative-based global sensitivity measures: general links with Sobol' indices and numerical tests
The estimation of variance-based importance measures (called Sobol' indices)
of the input variables of a numerical model can require a large number of model
evaluations. This becomes unacceptable for high-dimensional models involving a
large number of input variables (typically more than ten). Recently, Sobol and
Kucherenko have proposed the Derivative-based Global Sensitivity Measures
(DGSM), defined as the integral of the squared derivatives of the model output,
showing that it can help to solve the problem of dimensionality in some cases.
We provide a general inequality link between DGSM and total Sobol' indices for
input variables belonging to the class of Boltzmann probability measures, thus
extending the previous results of Sobol and Kucherenko for uniform and normal
measures. The special case of log-concave measures is also described. This link
provides a DGSM-based upper bound for the total Sobol' indices. Numerical
tests show the performance of the bound and its usefulness in practice.
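The inequality described in this abstract can be illustrated with a minimal sketch for the uniform case only. It assumes independent inputs uniform on [0, 1], for which the Sobol and Kucherenko bound reads D_i^tot <= nu_i / pi^2, where nu_i is the DGSM (the mean squared partial derivative) and D_i^tot is the unnormalized total-effect variance. The test function `model`, the sample size, the finite-difference step, and the use of the Jansen formula for D_i^tot are all illustrative assumptions, not taken from the paper.

```python
import math
import random


def model(x):
    # Hypothetical test function (an assumption, not from the paper).
    return x[0] + 2.0 * x[1] + x[0] * x[2]


def dgsm_and_total_variance(f, dim, n_samples=20000, h=1e-5, seed=0):
    """Monte Carlo estimates for inputs uniform on [0, 1]^dim.

    Returns (nu, d_tot), where nu[i] estimates E[(df/dx_i)^2] and
    d_tot[i] estimates the unnormalized total-effect variance of x_i.
    """
    rng = random.Random(seed)
    nu = [0.0] * dim
    d_tot = [0.0] * dim
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        fx = f(x)
        for i in range(dim):
            # Central finite difference for the partial derivative.
            x_plus = list(x)
            x_plus[i] += h
            x_minus = list(x)
            x_minus[i] -= h
            deriv = (f(x_plus) - f(x_minus)) / (2.0 * h)
            nu[i] += deriv * deriv
            # Jansen formula: resample x_i only, keep the other inputs.
            x_resampled = list(x)
            x_resampled[i] = rng.random()
            d_tot[i] += 0.5 * (fx - f(x_resampled)) ** 2
    nu = [v / n_samples for v in nu]
    d_tot = [v / n_samples for v in d_tot]
    return nu, d_tot


nu, d_tot = dgsm_and_total_variance(model, dim=3)
bounds = [v / math.pi ** 2 for v in nu]
for i in range(3):
    print(f"x{i}: D_tot ~ {d_tot[i]:.4f}, DGSM bound ~ {bounds[i]:.4f}")
```

Dividing both sides by the total output variance turns the inequality into a bound on the total Sobol' index itself. For this test function the bound is fairly tight, because the function is linear in each input taken separately.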
Optimal estimators of cross-partial derivatives and surrogates of functions
Computing cross-partial derivatives using fewer model runs is relevant in modeling tasks such as stochastic approximation, derivative-based ANOVA, exploring complex models, and active subspaces. This paper introduces surrogates of all the cross-partial derivatives of functions by evaluating such functions at randomized points and using a set of constraints. Randomized points rely on independent, central, and symmetric variables. The associated estimators, based on model runs, reach the optimal rates of convergence, and the biases of our approximations do not suffer from the curse of dimensionality for a wide class of functions. Such results are used for i) computing the main and upper bounds of sensitivity indices, and ii) deriving emulators of simulators or surrogates of functions thanks to the derivative-based ANOVA. Simulations are presented to show the accuracy of our emulators and estimators of sensitivity indices. The plug-in estimates of indices using the U-statistics of one sample are numerically very stable.
Sensitivity analysis for dynamic models used in agronomy and the environment (Analyse de sensibilité pour les modèles dynamiques utilisés en agronomie et environnement)
Dynamic models are often used to simulate the impact of agricultural practices and sometimes to test decision rules. These models include many uncertain parameters, and it is sometimes difficult or impossible to estimate all of them. A common practice in the literature is to select key parameters using sensitivity indices computed by simulation and then to estimate only the most influential parameters. Although this approach is intuitive, its real value and its consequences for the predictive quality of models are not well known. Our research aims to evaluate this modelling practice by establishing a relationship between the sensitivity indices of model parameters and model quality measures such as the MSEP (Mean Square Error of Prediction) and the MSE (Mean Square Error), which are often used in agronomy. Establishing such a relationship requires a Sensitivity Analysis (SA) method that provides a single index per factor and takes into account correlations between the model outputs obtained at different dates. We propose a new global sensitivity index that synthesizes the effects of uncertain factors on all the dynamic outputs simulated by a model. Several methods for computing the new indices are presented in this thesis. The performance of these methods is evaluated on two dynamic agricultural models: Azodyn and WWDM. We also establish a formal relationship between the MSE, the MSEP, and sensitivity indices in the case of a linear model, and an empirical relationship between the MSEP and the new synthetic index in the case of a nonlinear dynamic model, CERES-EGC. These relationships show that parameter selection using sensitivity indices improves model performance only under certain conditions.
Derivative formulas and gradient of functions with non-independent variables
Stochastic characterizations of functions subject to constraints result in treating them as functions with non-independent variables. Using the distribution function or copula of the input variables that comply with such constraints, we derive two types of partial derivatives of functions with non-independent variables (i.e., actual and dependent derivatives) and argue in favor of the latter. Dependent partial derivatives of functions with non-independent variables rely on the dependent Jacobian matrix of non-independent variables, which is also used to define a tensor metric. The differential geometric framework allows for deriving the gradient, Hessian, and Taylor-type expansion of functions with non-independent variables.
Optimal and Efficient Approximations of Gradients of Functions with Nonindependent Variables
Gradients of smooth functions with nonindependent variables are relevant for exploring complex models and for the optimization of functions subject to constraints. In this paper, we investigate new and simple approximations and computations of such gradients by making use of independent, central, and symmetric variables. Such approximations are well suited for applications in which the computations of the gradients are too expensive or impossible. The derived upper bounds of the biases of our approximations do not suffer from the curse of dimensionality for any 2-smooth function, and they theoretically improve the known results. Also, our estimators of such gradients reach the optimal (mean squared error) rates of convergence (i.e., O(N^{-1})) for the same class of functions. Numerical comparisons based on a test case and a high-dimensional PDE model show the efficiency of our approach.
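The idea of approximating a gradient from model runs at randomized points can be sketched with a generic randomized finite-difference estimator. This is not the estimator from the paper: it treats the variables as independent and uses Rademacher directions (each entry is -1 or +1 with equal probability), which are one simple example of independent, central, and symmetric variables. The function `f`, the evaluation point, the step `h`, and the sample size are hypothetical choices made for illustration.

```python
import random


def f(x):
    # Hypothetical smooth test function (an assumption, not from the paper).
    # Its exact gradient is (2*x0 + x1, x0 + 3, -x2).
    return x[0] ** 2 + x[0] * x[1] + 3.0 * x[1] - 0.5 * x[2] ** 2


def randomized_gradient(func, x, n_samples=20000, h=1e-3, seed=0):
    """Average of symmetric differences along random +/-1 directions.

    Since E[V V^T] is the identity for Rademacher V, the average of
    ((func(x + h V) - func(x - h V)) / (2 h)) * V approximates the gradient.
    """
    rng = random.Random(seed)
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(n_samples):
        v = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        x_plus = [xi + h * vi for xi, vi in zip(x, v)]
        x_minus = [xi - h * vi for xi, vi in zip(x, v)]
        slope = (func(x_plus) - func(x_minus)) / (2.0 * h)
        for i in range(dim):
            grad[i] += slope * v[i]
    return [g / n_samples for g in grad]


point = [1.0, 2.0, -1.0]
estimate = randomized_gradient(f, point)
exact = [2.0 * point[0] + point[1], point[0] + 3.0, -point[2]]
print("estimate:", [round(g, 3) for g in estimate])
print("exact:   ", exact)
```

Each sample costs two model runs regardless of the dimension, and the mean squared error of the average decays at the rate 1/N in the number of samples, which is the kind of rate the abstract refers to.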