OpenTURNS: An industrial software for uncertainty quantification in simulation
The need to assess robust performance for complex systems and to meet
tighter regulatory requirements (security, safety, environmental control,
health impacts, etc.) has led to the emergence of a new industrial simulation
challenge: taking uncertainties into account when dealing with complex
numerical simulation frameworks. A generic methodology has therefore emerged
from the joint effort of several industrial companies and academic
institutions. EDF R&D, Airbus Group and Phimeca Engineering started a
collaboration at the beginning of 2005, joined by IMACS in 2014, to develop
an open source software platform dedicated to uncertainty propagation by
probabilistic methods, named OpenTURNS for Open source Treatment of
Uncertainty, Risk 'N Statistics. OpenTURNS addresses the specific industrial
challenges attached to uncertainties: transparency, genericity, modularity
and multi-accessibility. This paper focuses on OpenTURNS and presents its
main features: OpenTURNS is an open source software package under the LGPL
license, available as a C++ library and a Python TUI, and running under Linux
and Windows environments. All the methodological tools are described in the
different sections of this paper: uncertainty quantification, uncertainty
propagation, sensitivity analysis and metamodeling. A section also explains
the generic wrapper mechanism used to link OpenTURNS to any external code.
The paper illustrates the methodological tools as much as possible on an
educational example that simulates the height of a river and compares it to
the height of a dyke protecting industrial facilities. Finally, it gives an
overview of the main developments planned for the next few years.
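The river/dyke example mentioned above boils down to Monte Carlo uncertainty propagation: sample the uncertain inputs, push each draw through the hydraulic model, and estimate the overflow probability. The sketch below uses plain NumPy rather than the OpenTURNS API, and all distributions and parameter values are assumptions chosen for illustration, not the paper's exact settings.

```python
import numpy as np

# Illustrative Monte Carlo propagation for a river/dyke overflow study.
# Distributions and parameters are hypothetical stand-ins.
rng = np.random.default_rng(0)
n = 100_000

q = rng.gumbel(loc=1013.0, scale=558.0, size=n)     # river flow rate [m^3/s]
q = np.clip(q, 0.0, None)                           # flow cannot be negative
ks = rng.normal(30.0, 7.5, size=n)                  # Strickler roughness coefficient
ks = np.clip(ks, 1.0, None)

width, slope = 300.0, 5.0 / 5000.0                  # assumed channel geometry
h = (q / (ks * width * np.sqrt(slope))) ** 0.6      # Manning-Strickler water height [m]

dyke_height = 5.5                                   # hypothetical dyke crest [m]
p_overflow = np.mean(h > dyke_height)
print(f"mean river height: {h.mean():.2f} m, P(overflow): {p_overflow:.4f}")
```

In OpenTURNS itself, the inputs would be declared as distribution objects, composed into a joint distribution, and propagated through a wrapped model function; the statistical logic is the same as in this sketch.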
Advanced maximum entropy approaches for medical and microscopy imaging
The maximum entropy framework is a cornerstone of statistical inference, employed at a growing rate for constructing models capable of describing and predicting biological systems, particularly complex ones, from empirical datasets. In these high-yield applications, determining exact probability distribution functions with only minimal information about the data and without relying on human subjectivity is of particular interest. In this thesis, an automated procedure of this kind for univariate and bivariate data is used to reach this objective by combining the maximum entropy method with an appropriate optimization method. The only requirements on the random variables are that they are continuous and can be approximated as independent and identically distributed. We concisely present two numerical probabilistic algorithms and apply them to estimate univariate and bivariate models of the available data.
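The core maximum-entropy step can be sketched numerically: given only a few sample moments of a continuous variable, find the density of maximal entropy subject to those moments by minimizing the convex dual (log-partition minus constraint term). The grid bounds, moment choices, and optimizer below are illustrative assumptions, not the thesis's exact procedure.

```python
import numpy as np
from scipy.optimize import minimize

# Max-entropy density from the first two sample moments.
# With these constraints the optimum is the matching Gaussian.
rng = np.random.default_rng(42)
data = rng.normal(1.0, 2.0, size=5000)
m = np.array([data.mean(), np.mean(data**2)])   # moment constraints

x = np.linspace(-15.0, 15.0, 2001)              # integration grid (assumed support)
phi = np.vstack([x, x**2])                      # sufficient statistics

def dual(lam):
    # convex dual of the max-entropy problem: log Z(lambda) - lambda . m
    logits = lam @ phi
    shift = logits.max()                        # numerical stabilization
    log_z = np.log(np.trapz(np.exp(logits - shift), x)) + shift
    return log_z - lam @ m

lam = minimize(dual, x0=np.zeros(2)).x          # BFGS with numeric gradient
p = np.exp(lam @ phi)
p /= np.trapz(p, x)                             # normalized max-entropy density
mean_est = np.trapz(x * p, x)
print(f"recovered mean: {mean_est:.3f} (sample mean {data.mean():.3f})")
```

At the optimum the fitted density reproduces the imposed moments, which is the defining property of the maximum-entropy solution.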
In the first case, a combination of the maximum entropy method, Newton's method, and the Bayesian maximum a posteriori approach leads to the estimation of kinetic parameters with arterial input functions (AIFs) in cases without any measurement of the AIF. The results show that the AIF can reliably be determined from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) data by the maximum entropy method; the kinetic parameters can then be obtained. The developed method achieves a good data fit and thus a more accurate prediction of the kinetic parameters, which in turn leads to a more reliable application of DCE-MRI.
In the bivariate case, we consider colocalization as a quantitative analysis in fluorescence microscopy imaging. The method proposed here combines the Maximum Entropy Method (MEM) with a Gaussian copula, a combination we call the Maximum Entropy Copula (MEC). This novel method measures the spatial and nonlinear correlation of signals to obtain the colocalization of markers in fluorescence microscopy images. Based on the results, MEC can identify co- and anti-colocalization even in high-background situations. The main point is that determining a joint distribution from its marginals is an important inverse problem that has a unique solution once a proper copula is chosen, according to Sklar's theorem. The developed combination of a Gaussian copula with univariate maximum entropy marginal distributions therefore determines a unique bivariate distribution, from which a colocalization parameter can be obtained via Kendall's τ, a measure commonly employed in the copula literature.
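The construction behind MEC can be sketched as follows: couple two arbitrary continuous marginals through a Gaussian copula and read off the dependence via Kendall's τ, which is invariant under the choice of marginals. The exponential and gamma marginals below are stand-ins for the maximum-entropy marginals of the thesis, and the correlation value is an arbitrary example.

```python
import numpy as np
from scipy import stats

# Bivariate distribution from a Gaussian copula plus arbitrary marginals.
rng = np.random.default_rng(0)
rho = 0.7
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=20_000)
u = stats.norm.cdf(z)                        # Gaussian copula samples in [0,1]^2

x = stats.expon.ppf(u[:, 0])                 # plug in the desired marginals
y = stats.gamma.ppf(u[:, 1], a=2.0)

tau_emp, _ = stats.kendalltau(x, y)
tau_theory = (2.0 / np.pi) * np.arcsin(rho)  # Kendall's tau of a Gaussian copula
print(f"empirical tau: {tau_emp:.3f}, theoretical: {tau_theory:.3f}")
```

Because τ depends only on the copula, the same dependence parameter is recovered whatever marginals are plugged in, which is exactly what makes it a convenient colocalization measure.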
In general, the value of applying these algorithms to biological data lies in their higher accuracy, faster computation, and lower cost compared with alternative solutions. Their broad applicability and success in various contexts rest on their conceptual simplicity and mathematical validity.
Afterward, a probability density is estimated by iteratively refining trial cumulative distribution functions, with candidate estimates scored by a function that detects irregular fluctuations. This criterion resists under- and overfitting of the data and serves as an alternative to a Bayesian criterion. Uncertainty induced by statistical fluctuations in random samples is reflected by multiple estimates of the probability density. In addition, scaled quantile residual plots are introduced as a useful diagnostic for visualizing the quality of the estimated probability densities. The Kullback-Leibler divergence is an appropriate measure of the convergence of the estimated probability density function (PDF) to the actual PDF as the sample size grows. The findings indicate the general applicability of this method to high-yield statistical inference.
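The Kullback-Leibler convergence check described in the abstract can be illustrated directly: as the sample grows, the divergence between an estimated density and the true one should shrink. A Gaussian kernel density estimate stands in here for the thesis's iterative CDF-based estimator, and the grid and sample sizes are arbitrary illustrative choices.

```python
import numpy as np
from scipy import stats

# KL divergence between a density estimate and the true standard normal,
# evaluated for a small and a large sample.
rng = np.random.default_rng(7)
x = np.linspace(-6.0, 6.0, 1201)
true_pdf = stats.norm.pdf(x)

def kl_to_truth(n):
    est = stats.gaussian_kde(rng.standard_normal(n))(x)
    est = np.maximum(est, 1e-12)                 # avoid log(0)
    return np.trapz(true_pdf * np.log(true_pdf / est), x)

kl_small, kl_large = kl_to_truth(100), kl_to_truth(10_000)
print(f"KL(n=100): {kl_small:.4f}, KL(n=10000): {kl_large:.4f}")
```

The divergence for the larger sample is markedly smaller, which is the convergence diagnostic the thesis relies on.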
The Global Joint Distribution of Income and Health
We investigate the evolution of global welfare in two dimensions: income per capita and life expectancy. First, we estimate the marginal distributions of income and life expectancy separately. More importantly, in contrast to previous univariate approaches, we consider income and life expectancy jointly and estimate their bivariate global distribution for 137 countries during 1970-2000. We reach several conclusions: the global joint distribution has evolved from a bimodal into a unimodal one, the evolution of the health distribution has preceded that of income, global inequality and poverty have decreased over time, and the evolution of the global distribution has been welfare improving. Our decomposition of overall welfare indicates that global inequality would be underestimated if within-country inequality were not taken into account. Moreover, global inequality and poverty would be substantially underestimated if the dependence between the income and health distributions were ignored.
Keywords: Income; Health; Global Distribution; Inequality; Poverty
Copula-Based Dependence Characterizations and Modeling for Time Series
This paper develops a new unified approach to copula-based modeling and characterizations for time series and stochastic processes. We obtain complete characterizations of many time series dependence structures in terms of copulas corresponding to their finite-dimensional distributions. In particular, we focus on copula-based representations for Markov chains of arbitrary order, m-dependent and r-independent time series, as well as martingales and conditionally symmetric processes. Our results provide new methods for modeling time series that have prescribed dependence structures, such as higher-order Markov processes as well as non-Markovian processes that nevertheless satisfy Chapman-Kolmogorov stochastic equations. We also focus on the construction and analysis of new classes of copulas that have the flexibility to combine many different dependence properties for time series. Among other results, we present a study of new classes of copulas based on expansions by linear functions (Eyraud-Farlie-Gumbel-Morgenstern copulas), power functions (power copulas) and Fourier polynomials (Fourier copulas) and introduce methods for modeling time series using these classes of dependence functions. We also focus on the study of weak convergence of empirical copula processes in the time series context and obtain new results on asymptotic Gaussianity of such processes for a wide class of beta-mixing sequences.
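The copula-based view of Markov chains can be sketched concretely: the transition law of a stationary first-order chain is specified entirely by the bivariate copula linking consecutive values (U_t, U_{t+1}). The Gaussian copula below is a simple concrete instance chosen for illustration, not one of the paper's specific new classes.

```python
import numpy as np
from scipy import stats

# Stationary Markov chain with Uniform(0,1) marginals whose lag-1
# dependence is exactly a Gaussian copula with parameter rho.
rng = np.random.default_rng(1)
rho, n = 0.6, 50_000

z = np.empty(n)
z[0] = rng.standard_normal()
eps = rng.standard_normal(n - 1)
for t in range(1, n):
    # conditional step of the Gaussian copula, taken on the normal scale
    z[t] = rho * z[t - 1] + np.sqrt(1.0 - rho**2) * eps[t - 1]

u = stats.norm.cdf(z)                        # uniform-marginal chain
tau_emp, _ = stats.kendalltau(u[:-1], u[1:])
tau_theory = (2.0 / np.pi) * np.arcsin(rho)  # Kendall's tau of the copula
print(f"lag-1 Kendall tau: {tau_emp:.3f} (theory {tau_theory:.3f})")
```

Replacing the Gaussian copula with another family (e.g. one of the Eyraud-Farlie-Gumbel-Morgenstern, power, or Fourier copulas studied in the paper) changes the dependence structure of the chain while leaving the marginals uniform, which is the flexibility the paper exploits.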
A dynamic copula approach to recovering the index implied volatility skew
Equity index implied volatility functions are known to be excessively skewed in comparison with implied volatility at the single-stock level. We study this stylized fact for a major German stock index, the DAX, by recovering index implied volatility from simulation of the 30-dimensional return system of all DAX constituents. Option prices are computed after risk-neutralization of the multivariate process, which is estimated under the physical probability measure. The multivariate models belong to the class of copula asymmetric dynamic conditional correlation models. We show that moderate tail dependence coupled with an asymmetric correlation response to negative news is essential to explain the index implied volatility skew. Standard dynamic correlation models with zero tail dependence fail to generate a sufficiently steep implied volatility skew.
Keywords: Copula Dynamic Conditional Correlation; Basket Options; Multivariate GARCH Models; Change of Measure; Esscher Transform
Approximate Bayesian inference in semiparametric copula models
We describe a simple method for making inference on a functional of a
multivariate distribution. The method is based on a copula representation of
the multivariate distribution and on the properties of an approximate
Bayesian Monte Carlo algorithm, in which proposed values of the functional of
interest are weighted by their empirical likelihood. This method is
particularly useful when the "true" likelihood function associated with the
working model is too costly to evaluate or when the working model is only
partially specified.
Comment: 27 pages, 18 figures