51 research outputs found

    Bayesian sequential design of computer experiments to estimate reliable sets

    Full text link
    We consider an unknown multivariate function representing a system, such as a complex numerical simulator, taking both deterministic and uncertain inputs. Our objective is to estimate the set of deterministic inputs leading to outputs whose probability (with respect to the distribution of the uncertain inputs) of belonging to a given set is controlled by a given threshold. To solve this problem, we propose a Bayesian strategy based on the Stepwise Uncertainty Reduction (SUR) principle to sequentially choose the points at which the function should be evaluated in order to approximate the set of interest. We illustrate its performance and interest in several numerical experiments.
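As a naive baseline for the estimation problem described above (without the SUR-based sequential design itself, which the paper contributes), the set of interest can be approximated by brute-force Monte Carlo over a grid of deterministic inputs. The simulator `f`, the probability threshold, the critical output set, and the input distribution below are all hypothetical illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, u):
    # Hypothetical toy simulator: deterministic input x, uncertain input u.
    return x * np.sin(x) + u

alpha = 0.9                  # probability threshold to satisfy
critical_set = (-1.0, 1.0)   # outputs must fall in this interval
n_mc = 10_000                # Monte Carlo draws for the uncertain input

x_grid = np.linspace(0.0, 2.0, 201)
u_samples = rng.normal(0.0, 0.2, size=n_mc)

# Keep each deterministic input x whose output lands in the critical
# set with probability at least alpha (estimated by Monte Carlo).
reliable = []
for x in x_grid:
    y = f(x, u_samples)
    p = np.mean((y > critical_set[0]) & (y < critical_set[1]))
    if p >= alpha:
        reliable.append(x)

reliable = np.array(reliable)
```

The SUR strategy of the paper replaces this exhaustive double loop with a Gaussian-process surrogate and a sequential acquisition criterion, which is what makes the approach affordable for expensive simulators.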

    Some Bayesian insights for statistical tolerance analysis

    Get PDF
    The functionality of assembled products mostly relies on the manufacturer's ability to produce under given quality requirements. Parts which do not meet these requirements represent manufacturing waste that can cause substantial losses in terms of money and credibility. Quality control and defect detection are two key points of predictive process management. At the design stage, a statistical tolerance analysis can be performed to predict the process quality. This implies estimating a so-called defect probability, which quantifies the probability that the final assembly does not meet the functional requirements. In general, this quantity depends on a number of process specifications (tolerances, capability levels) set a priori by the manufacturer, but also on the monitoring of the process itself, since the process parameters (mean shift value and standard deviation) vary statistically across batches. In this paper, we give an alternative point of view on an existing method, the Advanced Probability-based Tolerance Analysis of products (APTA), proposed in the literature to estimate the defect probability. This method, originally relying on a double-loop sampling strategy, is revisited within the Bayesian framework, and an augmented approach is proposed to estimate the defect probability more efficiently. The efficiency of the augmented approach for solving tolerancing problems with APTA is illustrated on a linear reference test case.
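The double-loop sampling strategy that APTA originally relies on can be sketched on a toy linear stack-up: an outer loop draws the batch-level process parameters (mean shift, spread), an inner loop draws part dimensions within the batch, and the defect probability averages the per-batch defect rates. All dimensions, tolerances, and parameter distributions below are hypothetical, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear stack-up: gap = a - (b + c); defect if gap < 0.
n_outer = 500    # batches: process parameters vary batch to batch
n_inner = 2000   # parts produced within each batch

defect_rates = np.empty(n_outer)
for k in range(n_outer):
    # Outer loop: batch-level mean shifts and spreads, drawn at random.
    mu_shift = rng.normal(0.0, 0.02, size=3)
    sigma = rng.uniform(0.01, 0.03, size=3)
    # Inner loop: part-level dimensions within the batch.
    a = rng.normal(10.0 + mu_shift[0], sigma[0], n_inner)
    b = rng.normal(4.95 + mu_shift[1], sigma[1], n_inner)
    c = rng.normal(4.95 + mu_shift[2], sigma[2], n_inner)
    defect_rates[k] = np.mean(a - (b + c) < 0.0)

# Defect probability = expectation of the batch defect rate.
defect_probability = defect_rates.mean()
```

The augmented Bayesian approach of the paper avoids re-running this costly inner loop for every outer draw, which is where the efficiency gain comes from.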

    Reliability-oriented sensitivity analysis under probabilistic model uncertainty - Application to aerospace systems

    Get PDF
    Aerospace systems are complex engineering systems whose reliability must be guaranteed from the early design phase, especially given the tremendous damage and costs that any failure could induce. Moreover, managing the various sources of uncertainty, whether they affect the behavior of the systems ("aleatory" uncertainty, due to the natural variability of physical phenomena) or their modeling and simulation ("epistemic" uncertainty, due to lack of knowledge and modeling choices), is a cornerstone of reliability assessment for such systems. Uncertainty quantification therefore proceeds in several phases. First, the uncertainties are modeled and propagated through the computer model, which is treated as a "black box". Second, a quantity of interest relevant to the goal of the study, here a failure probability, has to be estimated. For highly safe systems, the sought failure probability is very low and may be costly to estimate. Third, a sensitivity analysis of the quantity of interest can be set up in order to identify and rank the influential sources of input uncertainty. The probabilistic modeling of the input variables (epistemic uncertainty) may therefore strongly influence the value of the failure probability estimate obtained during the reliability analysis, and the robustness of this estimate with respect to such uncertainty must be investigated more deeply. This thesis addresses the problem of accounting for the probabilistic modeling uncertainty of the stochastic inputs. Within the probabilistic framework, a "bi-level" input uncertainty has to be modeled and propagated through all the steps of the uncertainty quantification methodology.
In this thesis, the uncertainties are modeled within a Bayesian framework in which the lack of knowledge about the distribution parameters is characterized by the choice of a prior probability density function. In a first phase, after propagating the bi-level input uncertainty, the predictive failure probability is estimated and used as the reliability measure in place of the standard failure probability. In a second phase, a local reliability-oriented sensitivity analysis based on score functions is carried out to study the impact of the prior's hyper-parameterization on the predictive failure probability estimate. Finally, a global reliability-oriented sensitivity analysis based on Sobol' indices of the failure indicator function, adapted to the bi-level input uncertainty, is proposed. All the proposed methodologies are tested and challenged on a representative industrial aerospace test case simulating the fallout of an expendable space launcher.
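The predictive failure probability mentioned above is the expectation of the classical failure probability under the prior on the distribution parameters, which suggests a nested (bi-level) Monte Carlo propagation: an outer loop over parameter draws from the prior, an inner loop over inputs given those parameters. The limit-state function, the prior, and all numbers below are hypothetical, chosen only to make the two levels visible:

```python
import numpy as np

rng = np.random.default_rng(2)

def g(x):
    # Hypothetical limit-state function: failure occurs when g(x) < 0.
    return 6.0 - x.sum(axis=1)

n_theta = 300   # draws of distribution parameters from the prior
n_x = 5000      # input draws per parameter value

p_f = np.empty(n_theta)
for k in range(n_theta):
    # Epistemic level: the input mean is itself uncertain (prior on theta).
    theta = rng.normal(1.0, 0.1, size=2)
    # Aleatory level: inputs drawn conditionally on theta.
    x = rng.normal(theta, 1.0, size=(n_x, 2))
    p_f[k] = np.mean(g(x) < 0.0)

# Predictive failure probability: expectation of P_f under the prior.
predictive_p_f = p_f.mean()
```

The thesis's score-function and Sobol'-based sensitivity analyses then quantify how this predictive estimate reacts to the hyper-parameters of the prior, without naively re-running the whole nested loop.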

    Statistical developments for target and conditional sensitivity analysis: application to safety studies for a nuclear reactor

    No full text
    Numerical simulators are essential for understanding, modeling and predicting physical phenomena. However, the available information about some of the input variables is often limited or uncertain. Global sensitivity analysis (GSA) then aims at determining, qualitatively or quantitatively, how the variability of the inputs affects the model output. From reliability and risk-management perspectives, however, GSA may be insufficient to capture the influence of the inputs on a restricted domain of the output (e.g., a distribution tail). To remedy this, we define and use target sensitivity analysis (TSA) and conditional sensitivity analysis (CSA), which aim respectively at measuring the influence of the inputs on the occurrence of a critical event, and on the output within the critical domain (ignoring what happens outside it). As the applications illustrate, these two notions can differ widely. Building on existing GSA measures, we propose new operational tools for TSA and CSA. We first focus on the popular Sobol' indices and show their practical limitations for both TSA and CSA. We then consider the Hilbert-Schmidt Independence Criterion (HSIC), a dependence measure recently adapted for GSA purposes and well suited to small datasets. TSA and CSA adaptations of the Sobol' and HSIC indices, and associated statistical estimators, are defined. Alternative CSA Sobol' indices are defined to overcome the input dependence induced by the conditioning. Moreover, to cope with the loss of information (especially when the critical domain has a low probability) and to reduce the variability of the estimators, a transformation of the output using weight functions is also proposed. These new TSA and CSA tools are tested and compared on analytical examples; the efficiency of the HSIC-based indices clearly appears, as does the relevance of the smooth relaxation. Finally, the latter indices are applied and interpreted on a nuclear engineering use case simulating a severe accident scenario on a pressurized water reactor.
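The HSIC dependence measure highlighted above admits a simple plug-in estimator: the biased V-statistic trace(KHLH)/n² with kernel Gram matrices K, L and the centering matrix H. The sketch below uses Gaussian kernels with the common median-distance bandwidth heuristic on a hypothetical toy model; it is a minimal illustration, not the estimators contributed by the thesis:

```python
import numpy as np

rng = np.random.default_rng(3)

def gaussian_gram(v):
    # Gaussian-kernel Gram matrix, bandwidth set by the median heuristic.
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / np.median(d2[d2 > 0]))

def hsic(x, y):
    # Biased V-statistic estimator: HSIC = trace(K H L H) / n^2.
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n
    K, L = gaussian_gram(x), gaussian_gram(y)
    return np.trace(K @ H @ L @ H) / n**2

n = 300
x1 = rng.normal(size=n)       # influential input
x2 = rng.normal(size=n)       # inert input
y = x1**2 + 0.1 * rng.normal(size=n)   # non-monotonic dependence on x1

score1, score2 = hsic(x1, y), hsic(x2, y)
```

Note that the dependence of y on x1 is purely non-monotonic, which correlation-based screening would miss; for the target (TSA) variant, the same estimator can be applied with y replaced by a (possibly smoothed) indicator of the critical event.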

    Variance-based importance measures for machine learning model interpretability

    No full text
    Machine learning algorithms are enjoying an unprecedented boost in the industrial world, in particular in support of decision-making for critical systems. However, their lack of "interpretability" remains a challenge to overcome in order to make these tools fully intelligible and auditable. This paper surveys and synthesizes a panel of interpretability metrics (called "importance measures") whose aim is to quantify the impact of each predictor on the variance of the statistical model's output. It is shown that the choice of a relevant metric must be guided by the constraints imposed by the data and the considered model (linear vs. nonlinear phenomenon of interest, input dimension, input dependence), as well as by the type of study the user wants to perform (detecting influential variables, ranking them, etc.). Finally, these metrics are estimated and analyzed on a public dataset to illustrate some of their theoretical and empirical properties.
    Keywords: statistical learning, interpretability, sensitivity analysis, Shapley effects, Sobol' indices.
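The canonical variance-based importance measure in this family is the first-order Sobol' index S_i = Var(E[Y|X_i]) / Var(Y), estimable by the classical pick-freeze scheme with two independent sample matrices. The additive test model below is hypothetical, chosen so the true indices are easy to check; this is a sketch of the generic estimator, not the paper's benchmark:

```python
import numpy as np

rng = np.random.default_rng(4)

def model(x):
    # Hypothetical additive model with unequal input weights,
    # so the true importance ranking is S1 > S2 > S3.
    return x[:, 0] + 0.5 * x[:, 1] + 0.1 * x[:, 2]

n, d = 100_000, 3
A = rng.normal(size=(n, d))   # first independent sample matrix
B = rng.normal(size=(n, d))   # second independent sample matrix
yA, yB = model(A), model(B)
var_y = yA.var()

S = np.empty(d)
for i in range(d):
    # Pick-freeze: copy B but "freeze" column i from A.
    ABi = B.copy()
    ABi[:, i] = A[:, i]
    # First-order Sobol' index estimator: Cov(f(A), f(AB_i)) / Var(Y).
    S[i] = np.mean(yA * (model(ABi) - yB)) / var_y
```

For dependent inputs, where Sobol' indices lose their clean variance-decomposition interpretation, the paper points to alternatives such as Shapley effects.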
