
    Open TURNS: An industrial software for uncertainty quantification in simulation

    The need to assess robust performance for complex systems and to answer tighter regulatory processes (security, safety, environmental control, health impacts, etc.) has led to a new industrial simulation challenge: taking uncertainties into account when dealing with complex numerical simulation frameworks. A generic methodology has therefore emerged from the joint effort of several industrial companies and academic institutions. EDF R&D, Airbus Group and Phimeca Engineering started a collaboration at the beginning of 2005, joined by IMACS in 2014, to develop an open-source software platform dedicated to uncertainty propagation by probabilistic methods, named OpenTURNS for Open source Treatment of Uncertainty, Risk 'N Statistics. OpenTURNS addresses the specific industrial challenges attached to uncertainties: transparency, genericity, modularity and multi-accessibility. This paper focuses on OpenTURNS and presents its main features: OpenTURNS is an open-source software under the LGPL license, which presents itself as a C++ library and a Python TUI and works under Linux and Windows environments. All the methodological tools are described in the different sections of this paper: uncertainty quantification, uncertainty propagation, sensitivity analysis and metamodeling. A section also explains the generic wrapper mechanism used to link OpenTURNS to any external code. The paper illustrates the methodological tools as much as possible on an educational example that simulates the height of a river and compares it to the height of a dyke protecting industrial facilities. Finally, it gives an overview of the main developments planned for the next few years.
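A minimal sketch of the kind of uncertainty-propagation study described above, using the OpenTURNS Python interface. The distributions and the toy water-height model below are illustrative placeholders, not the paper's calibrated flood model.

```python
import openturns as ot

# 1. Quantify input uncertainty: flow rate Q and friction Ks (illustrative laws, not the paper's)
Q = ot.Normal(1000.0, 200.0)    # river flow rate [m^3/s]
Ks = ot.Uniform(20.0, 40.0)     # Strickler friction coefficient
inputs = ot.ComposedDistribution([Q, Ks])

# 2. Wrap the simulation code as a function (here a crude analytical stand-in)
def height(x):
    q, ks = x
    return [(q / (ks * 100.0)) ** 0.6]   # toy water-height surrogate [m]

model = ot.PythonFunction(2, 1, height)

# 3. Propagate uncertainty by Monte Carlo sampling
input_sample = inputs.getSample(10000)
output_sample = model(input_sample)
print("mean height:", output_sample.computeMean())

# 4. Empirical probability that the water height exceeds a hypothetical 3 m dyke
exceed = sum(1 for h in output_sample if h[0] > 3.0) / output_sample.getSize()
print("P(height > dyke):", exceed)
```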

    Advanced maximum entropy approaches for medical and microscopy imaging

    The maximum entropy framework is a cornerstone of statistical inference, employed at a growing rate for constructing models capable of describing and predicting biological systems, particularly complex ones, from empirical datasets. In these high-yield applications, determining exact probability distribution functions with only minimal information about data characteristics and without relying on human subjectivity is of particular interest. In this thesis, an automated procedure of this kind for univariate and bivariate data is developed by combining the maximum entropy method with an appropriate optimization method. The only necessary characteristics of the random variables are that they are continuous and can be approximated as independent and identically distributed. We present two numerical probabilistic algorithms and apply them to estimate univariate and bivariate models of the available data. In the first case, a combination of the maximum entropy method, Newton's method, and the Bayesian maximum a posteriori approach leads to the estimation of kinetic parameters with arterial input functions (AIFs) in cases without any measurement of the AIF. The results show that the AIF can be reliably determined from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) data by the maximum entropy method, after which the kinetic parameters can be obtained. The developed method achieves a good data fit and thus a more accurate prediction of the kinetic parameters, which in turn leads to a more reliable application of DCE-MRI. In the bivariate case, we consider colocalization as a quantitative analysis in fluorescence microscopy imaging. The method proposed here combines the Maximum Entropy Method (MEM) with a Gaussian copula, which we call the Maximum Entropy Copula (MEC). This novel method measures the spatial and nonlinear correlation of signals to obtain the colocalization of markers in fluorescence microscopy images. Based on the results, MEC can identify co- and anti-colocalization even in high-background situations. The main point is that determining the joint distribution from its marginals is an important inverse problem, which has a unique solution once a proper copula is chosen, according to Sklar's theorem. The developed combination of a Gaussian copula with univariate maximum entropy marginal distributions therefore determines a unique bivariate distribution, and a colocalization parameter can be obtained via Kendall's tau, which is commonly employed in the copula literature. In general, the value of applying these algorithms to biological data lies in their higher accuracy, faster computation, and lower cost in comparison with alternative solutions. Their broad applicability and success in various contexts rest on their conceptual plainness and mathematical validity. Finally, a probability density is estimated by iteratively refining trial cumulative distribution functions, where better estimates are identified by a scoring function that recognizes irregular fluctuations. This criterion resists under- and overfitting of the data as an alternative to the Bayesian criterion. Uncertainty induced by statistical fluctuations in random samples is reflected by multiple estimates of the probability density.
In addition, scaled quantile residual plots are introduced as a useful diagnostic for visualizing the quality of the estimated probability densities. The Kullback-Leibler divergence is an appropriate measure of the convergence of the probability density function (PDF) estimates to the actual PDF as the sample size increases. The findings indicate the general applicability of this method to high-yield statistical inference.
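A minimal sketch of the copula-based colocalization idea described above: map each channel to uniform scores (here with empirical CDFs standing in for the thesis's maximum-entropy marginals), fit a Gaussian copula, and report Kendall's tau. The variable names and synthetic two-channel data are illustrative, not taken from the thesis.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(0)
# Synthetic two-channel fluorescence intensities with positive dependence
latent = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=5000)
green, red = np.exp(latent[:, 0]), np.exp(latent[:, 1])

# Probability-integral transform of each marginal (empirical stand-in for the MEM marginals)
u = rankdata(green) / (len(green) + 1)
v = rankdata(red) / (len(red) + 1)

# Gaussian copula: correlation of the normal scores
z_u, z_v = norm.ppf(u), norm.ppf(v)
rho = np.corrcoef(z_u, z_v)[0, 1]

# Kendall's tau implied by a Gaussian copula: tau = (2/pi) * arcsin(rho)
tau = (2 / np.pi) * np.arcsin(rho)
print(f"copula correlation rho = {rho:.3f}, Kendall's tau = {tau:.3f}")
```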

    The Global Joint Distribution of Income and Health

    We investigate the evolution of global welfare in two dimensions: income per capita and life expectancy. First, we estimate the marginal distributions of income and life expectancy separately. More importantly, in contrast to previous univariate approaches, we consider income and life expectancy jointly and estimate their bivariate global distribution for 137 countries during 1970-2000. We reach several conclusions: the global joint distribution has evolved from a bimodal into a unimodal one, the evolution of the health distribution has preceded that of income, global inequality and poverty have decreased over time, and the evolution of the global distribution has been welfare improving. Our decomposition of overall welfare indicates that global inequality would be underestimated if within-country inequality were not taken into account. Moreover, global inequality and poverty would be substantially underestimated if the dependence between the income and health distributions were ignored.
    Keywords: Income; Health; Global Distribution; Inequality; Poverty
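A minimal sketch, on synthetic data, of estimating a bivariate distribution over the two welfare dimensions the abstract describes, using a Gaussian kernel density estimate; the paper's own estimator and country data are not reproduced here.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
# Synthetic cross-section of 137 "countries" (illustrative, not the paper's data)
log_income = rng.normal(8.5, 1.0, size=137)                   # log GDP per capita
life_exp = 50 + 5 * log_income + rng.normal(0, 4, size=137)   # life expectancy in years

# Joint density estimate over income and health
joint = gaussian_kde(np.vstack([log_income, life_exp]))

# Evaluate the fitted density at one point, e.g. median income and 70 years of life expectancy
print(joint([[np.median(log_income)], [70.0]]))
```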

    Copula-Based Dependence Characterizations and Modeling for Time Series

    This paper develops a new unified approach to copula-based modeling and characterizations for time series and stochastic processes. We obtain complete characterizations of many time series dependence structures in terms of copulas corresponding to their finite-dimensional distributions. In particular, we focus on copula-based representations for Markov chains of arbitrary order, m-dependent and r-independent time series, as well as martingales and conditionally symmetric processes. Our results provide new methods for modeling time series that have prescribed dependence structures, such as higher-order Markov processes as well as non-Markovian processes that nevertheless satisfy Chapman-Kolmogorov stochastic equations. We also focus on the construction and analysis of new classes of copulas that are flexible enough to combine many different dependence properties for time series. Among other results, we present a study of new classes of copulas based on expansions by linear functions (Eyraud-Farlie-Gumbel-Morgenstern copulas), power functions (power copulas) and Fourier polynomials (Fourier copulas) and introduce methods for modeling time series using these classes of dependence functions. We also study the weak convergence of empirical copula processes in the time series context and obtain new results on the asymptotic Gaussianity of such processes for a wide class of beta-mixing sequences.
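A minimal sketch of one idea mentioned above: using a copula to drive a stationary first-order Markov chain. Here an Eyraud-Farlie-Gumbel-Morgenstern (FGM) copula C(u, v) = uv[1 + theta(1-u)(1-v)] links consecutive states, and each new value is drawn by inverting the conditional copula. The parameter values are illustrative; this is not the paper's construction in full generality.

```python
import numpy as np

def fgm_conditional_sample(u, theta, rng):
    """Draw v ~ C_{2|1}(. | u) for the FGM copula by inverting the conditional CDF."""
    w = rng.uniform()
    a = theta * (1.0 - 2.0 * u)
    if abs(a) < 1e-12:            # independence limit
        return w
    # Conditional CDF v + a*v*(1-v) = w  <=>  a*v^2 - (1+a)*v + w = 0; take the root in [0, 1]
    return ((1.0 + a) - np.sqrt((1.0 + a) ** 2 - 4.0 * a * w)) / (2.0 * a)

def fgm_markov_chain(n, theta=0.5, seed=0):
    """Stationary Markov chain with Uniform(0,1) marginals and FGM copula dependence."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = rng.uniform()
    for t in range(1, n):
        x[t] = fgm_conditional_sample(x[t - 1], theta, rng)
    return x

chain = fgm_markov_chain(10000, theta=0.8)
print("lag-1 correlation:", np.corrcoef(chain[:-1], chain[1:])[0, 1])
```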

    A dynamic copula approach to recovering the index implied volatility skew

    Equity index implied volatility functions are known to be excessively skewed in comparison with implied volatility at the single-stock level. We study this stylized fact for a major German stock index, the DAX, by recovering index implied volatility from simulations of the 30-dimensional return system of all DAX constituents. Option prices are computed after risk neutralization of the multivariate process, which is estimated under the physical probability measure. The multivariate models belong to the class of copula asymmetric dynamic conditional correlation models. We show that moderate tail dependence coupled with an asymmetric correlation response to negative news is essential to explain the index implied volatility skew. Standard dynamic correlation models with zero tail dependence fail to generate a sufficiently steep implied volatility skew.
    Keywords: Copula Dynamic Conditional Correlation, Basket Options, Multivariate GARCH Models, Change of Measure, Esscher Transform
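A minimal sketch of the final step described above: recovering an implied volatility curve from Monte Carlo option prices. The terminal index values here come from a toy lognormal-mixture simulation that preserves the martingale condition, not from the paper's copula-DCC model of the DAX constituents.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

rng = np.random.default_rng(42)
S0, r, T = 100.0, 0.01, 0.5

# Toy risk-neutral terminal index values: a two-regime volatility mixture (illustrative)
z = rng.standard_normal(200000)
sigma_path = np.where(rng.uniform(size=z.size) < 0.2, 0.45, 0.15)
ST = S0 * np.exp((r - 0.5 * sigma_path**2) * T + sigma_path * np.sqrt(T) * z)

def bs_call(sigma, K):
    """Black-Scholes call price used only to invert Monte Carlo prices into implied vols."""
    d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

for K in [80, 90, 100, 110, 120]:
    price = np.exp(-r * T) * np.mean(np.maximum(ST - K, 0.0))   # Monte Carlo call price
    iv = brentq(lambda s: bs_call(s, K) - price, 1e-4, 3.0)      # implied volatility
    print(f"strike {K}: implied vol {iv:.3f}")
```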

    Approximate Bayesian inference in semiparametric copula models

    We describe a simple method for making inference on a functional of a multivariate distribution. The method is based on a copula representation of the multivariate distribution and on the properties of an approximate Bayesian Monte Carlo algorithm, in which the proposed values of the functional of interest are weighted in terms of their empirical likelihood. This method is particularly useful when the "true" likelihood function associated with the working model is too costly to evaluate or when the working model is only partially specified.
    Comment: 27 pages, 18 figures
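A generic sketch in the spirit of the scheme described above: proposed values of a copula functional (here Spearman's rho of a bivariate sample) are weighted by how well pseudo-data generated under each proposal reproduce the observed statistic. A simple distance-based Gaussian kernel stands in for the paper's empirical-likelihood weights, which are not reproduced here; the data and prior are illustrative.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)

# Observed data from a working bivariate model (Gaussian with rho = 0.5, illustrative)
obs = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=300)
rho_obs = spearmanr(obs[:, 0], obs[:, 1])[0]

# Propose values of the functional from a flat prior and weight each proposal
proposals = rng.uniform(-0.95, 0.95, size=2000)
weights = np.empty_like(proposals)
for i, rho in enumerate(proposals):
    pearson = 2 * np.sin(np.pi * rho / 6)          # Gaussian copula: Spearman -> Pearson
    sim = rng.multivariate_normal([0, 0], [[1, pearson], [pearson, 1]], size=300)
    rho_sim = spearmanr(sim[:, 0], sim[:, 1])[0]
    weights[i] = np.exp(-0.5 * ((rho_sim - rho_obs) / 0.05) ** 2)   # kernel weight

posterior_mean = np.sum(weights * proposals) / np.sum(weights)
print("weighted posterior mean of Spearman's rho:", round(posterior_mean, 3))
```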
