Level Set Methods for Stochastic Discontinuity Detection in Nonlinear Problems
Stochastic physical problems governed by nonlinear conservation laws are
challenging due to solution discontinuities in stochastic and physical space.
In this paper, we present a level set method to track discontinuities in
stochastic space by solving a Hamilton-Jacobi equation. By introducing a speed
function that vanishes at discontinuities, the iso-zero of the level set
problem coincides with the discontinuities of the conservation law. The level
set problem is solved on a sequence of successively finer grids in stochastic
space. The method is adaptive in the sense that costly evaluations of the
conservation law of interest are only performed in the vicinity of the
discontinuities during the refinement stage. In regions of stochastic space
where the solution is smooth, a surrogate method replaces expensive evaluations
of the conservation law. The proposed method is tested in conjunction with
different sets of localized orthogonal basis functions on simplex elements, as
well as frames based on piecewise polynomials conforming to the level set
function. The performance of the proposed method is compared to existing
adaptive multi-element generalized polynomial chaos methods.
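The mechanism described above, a speed function that vanishes at discontinuities so that the zero level set locks onto them, can be sketched in one dimension. Everything below (the jump location, grid size, and the particular form of the speed function) is an illustrative assumption, not the paper's scheme:

```python
import numpy as np

# Toy 1-D sketch: evolve a level set function phi with a speed F that
# vanishes where the solution u has a large gradient, so the zero level
# set stalls at the discontinuity.  u has a jump at x = 0.3 and the
# front starts at x = 0.9.
n = 201
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
u = np.where(x < 0.3, 1.0, -1.0)             # "solution" with a discontinuity

F = 1.0 / (1.0 + np.abs(np.gradient(u, dx)) ** 2)  # ~1 when smooth, ~0 at the jump

phi = 0.9 - x                                 # zero level set initially at x = 0.9
dt = 0.5 * dx                                 # CFL-stable step for |F| <= 1
for _ in range(800):
    # Godunov upwind discretization of phi_t + F * |phi_x| = 0 for F >= 0
    dm = (phi - np.roll(phi, 1)) / dx         # backward difference
    dp = (np.roll(phi, -1) - phi) / dx        # forward difference
    dm[0], dp[-1] = dm[1], dp[-2]             # one-sided at the boundaries
    grad = np.sqrt(np.maximum(dm, 0.0) ** 2 + np.minimum(dp, 0.0) ** 2)
    phi -= dt * F * grad

i = int(np.argmax(phi <= 0.0))                # first cell past the front
x_front = x[i - 1] + phi[i - 1] * dx / (phi[i - 1] - phi[i])
print(f"front stalled near x = {x_front:.3f}")
```

The front sweeps in from x = 0.9 at unit speed and halts within a cell or two of the jump, which is the behavior the adaptive refinement exploits: expensive evaluations are then needed only near the stalled iso-zero.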
Validation of Soft Classification Models using Partial Class Memberships: An Extended Concept of Sensitivity & Co. applied to the Grading of Astrocytoma Tissues
We use partial class memberships in soft classification to model uncertain
labelling and mixtures of classes. Partial class memberships are not restricted
to predictions, but may also occur in reference labels (ground truth, gold
standard diagnosis) for training and validation data.
Classifier performance is usually expressed as fractions of the confusion
matrix, such as sensitivity, specificity, negative and positive predictive
values. We extend this concept to soft classification and discuss the bias and
variance properties of the extended performance measures. Ambiguity in
reference labels translates to differences between best-case, expected and
worst-case performance. We introduce a second set of measures comparing expected
and ideal performance; these are closely related to regression performance
measures, namely the root mean squared error (RMSE) and the mean absolute error (MAE).
All calculations apply to classical crisp classification as well as to soft
classification (partial class memberships and/or one-class classifiers). The
proposed performance measures allow testing of classifiers with actual borderline
cases. In addition, hardening of, e.g., posterior probabilities into class labels
is not necessary, avoiding the corresponding information loss and increase in
variance.
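One plausible instantiation of the extended measures (a plain NumPy sketch, not the softclassval API) is sensitivity for a single class when both reference and prediction are partial memberships in [0, 1], with three conjunction operators yielding the best-case, expected, and worst-case values mentioned above. The sample numbers are invented:

```python
import numpy as np

def soft_sensitivity(r, p, conjunction):
    """Fraction of reference class membership that the classifier recovers."""
    return conjunction(r, p).sum() / r.sum()

best   = lambda r, p: np.minimum(r, p)            # weak conjunction: optimistic
expect = lambda r, p: r * p                       # product: expected agreement
worst  = lambda r, p: np.maximum(r + p - 1.0, 0)  # Lukasiewicz: pessimistic

# Three samples; the last is a genuine borderline case (membership 0.5).
r = np.array([1.0, 0.0, 0.5])   # reference memberships for the class
p = np.array([0.8, 0.1, 0.5])   # predicted memberships

for name, conj in [("best", best), ("expected", expect), ("worst", worst)]:
    print(name, soft_sensitivity(r, p, conj))
```

No hardening into crisp labels is needed: the borderline sample contributes its half membership directly, and the spread between the three values quantifies the ambiguity of the reference labels.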
We implement the proposed performance measures in the R package
"softclassval", which is available from CRAN and at
http://softclassval.r-forge.r-project.org.
Our reasoning as well as the importance of partial memberships for
chemometric classification is illustrated by a real-world application:
astrocytoma brain tumor tissue grading (80 patients, 37000 spectra) for finding
surgical excision borders. As borderline cases are the actual target of the
analytical technique, samples which are diagnosed to be borderline cases must
be included in the validation.
Comment: The manuscript is accepted for publication in Chemometrics and
Intelligent Laboratory Systems. Supplementary figures and tables are at the
end of the PDF.
Bayesian emulation for optimization in multi-step portfolio decisions
We discuss the Bayesian emulation approach to computational solution of
multi-step portfolio studies in financial time series. "Bayesian emulation for
decisions" involves mapping the technical structure of a decision analysis
problem to that of Bayesian inference in a purely synthetic "emulating"
statistical model. This provides access to standard posterior analytic,
simulation and optimization methods that yield indirect solutions of the
decision problem. We develop this in time series portfolio analysis using
classes of economically and psychologically relevant multi-step ahead portfolio
utility functions. Studies with multivariate currency, commodity and stock
index time series illustrate the approach and show some of the practical
utility and benefits of the Bayesian emulation methodology.Comment: 24 pages, 7 figures, 2 table
Gaussian process regression for virtual metrology of microchip quality and the resulting strategic sampling scheme
Manufacturing of integrated circuits involves many sequential processes, often executed to nanoscale tolerances, and the yield depends on the often unmeasured quality of intermediate steps. In the high-throughput industry of fabricating microelectronics on semi-conducting wafers, scheduling measurements of product quality before the electrical test of the complete IC can be expensive. We therefore seek to predict metrics of product quality based on sensor readings describing the environment within the relevant tool during the processing of each wafer, or to apply the concept of virtual metrology (VM) to monitor these intermediate steps. We model the data using Gaussian process regression (GPR), adapted to simultaneously learn the nonlinear dynamics that govern the quality characteristic, as well as their operating space, expressed by a linear embedding of the sensor traces' features. Such Bayesian models predict a distribution for the target metric, such as a critical dimension, so one may assess the model's credibility through its predictive uncertainty. Assuming measurements of the quality characteristic of interest are budgeted, we seek to hasten convergence of the GPR model to a credible form through an active sampling scheme, whereby the predictive uncertainty informs which wafer's quality to measure next. We evaluate this convergence when predicting and updating online, as if in a factory, using a large dataset for plasma-enhanced chemical vapor deposition (PECVD), with measured thicknesses for ~32,000 wafers. By approximately optimizing the information extracted from this seemingly repetitive data describing a tightly controlled process, GPR achieves ~10% greater accuracy on average than a baseline linear model based on partial least squares (PLS). In a derivative study, we seek to discern the degree of drift in the process over the several months the data spans.
We express this drift by how unusual the relevant features, as embedded by the GPR model, appear as the inputs compensate for degrading conditions. This method detects the onset of consistently unusual behavior that extends to a bimodal thickness fault, anticipating its flagging by as much as two days.
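The uncertainty-guided sampling loop can be sketched in plain NumPy. The thesis model also learns a linear embedding of the sensor-trace features; here a 1-D toy input stands in for that embedded space, and sin(4x) stands in for the quality characteristic (e.g. a deposited thickness), so every name and number below is an assumption:

```python
import numpy as np

def gp_posterior(X_tr, y_tr, X_te, ell=0.25, sf2=1.0, noise=1e-4):
    """Posterior mean and variance of a GP with an RBF kernel."""
    def k(A, B):
        d = A[:, None] - B[None, :]
        return sf2 * np.exp(-0.5 * (d / ell) ** 2)
    K = k(X_tr, X_tr) + noise * np.eye(len(X_tr))
    Ks = k(X_te, X_tr)
    mean = Ks @ np.linalg.solve(K, y_tr)
    var = sf2 - np.einsum("ij,ij->i", Ks, np.linalg.solve(K, Ks.T).T)
    return mean, np.maximum(var, 0.0)

f = lambda x: np.sin(4.0 * x)            # stand-in quality characteristic
X_pool = np.linspace(0.0, 1.0, 101)      # candidate wafers (embedded inputs)
measured = [0, 100]                      # start from the two extremes
for _ in range(6):                       # budgeted quality measurements
    mean, var = gp_posterior(X_pool[measured], f(X_pool[measured]), X_pool)
    measured.append(int(np.argmax(var))) # measure where the GP is least sure

mean, _ = gp_posterior(X_pool[measured], f(X_pool[measured]), X_pool)
rmse = float(np.sqrt(np.mean((mean - f(X_pool)) ** 2)))
print(f"RMSE after {len(measured)} measurements: {rmse:.4f}")
```

Each budgeted measurement is spent where the predictive variance is largest, which is how the scheme hastens convergence relative to measuring wafers in arrival order.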
Galton's Error and the Under-Representation of Systematic Risk
Our methodology of 'complete identification,' using simple algebraic geometry, throws new light on the continued commission of Galton's Error in finance and the resulting misinformation of investors. Mutual funds conventionally advertise their relative systematic market risk, or 'betas,' to potential investors based on incomplete measurement by unidirectional bivariate projections: they commit Galton's Error by under-representing their systematic risk. Consequently, far too many mutual funds are marketed as 'defensive' and too few as 'aggressive.' Using our new methodology we found that, out of a total of 3,217 mutual funds, 2,047 funds (63.7%) claimed to be defensive based on the current industry standard methodology, but only 608 (18.9%) actually are. This under-representation of systematic risk leads to inefficiencies in the capital allocation process, since biased betas lead to mis-pricing of mutual funds. Our complete bivariate projection produces a correct representation of the epistemic uncertainty inherent in the bivariate measurement of relative market risk. Our conclusions also have serious consequences for the proper 'benchmarking' and recent regulatory proposals for the mutual funds industry.
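A numeric illustration of the effect, not the authors' exact 'complete identification' algebra: with noise in the measured market return, the conventional one-directional projection attenuates beta toward zero (making the fund look 'defensive'), while combining both projection directions, here via the classical geometric-mean (diagonal) regression, restores an 'aggressive' slope. All numbers are made up for the toy:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000
m_true = rng.normal(0.0, 0.04, n)               # latent market excess returns
fund = 1.2 * m_true + rng.normal(0.0, 0.02, n)  # truly aggressive: beta = 1.2
market = m_true + rng.normal(0.0, 0.02, n)      # observed market, with noise

cov = np.cov(market, fund)
beta_conv = cov[0, 1] / cov[0, 0]   # conventional fund-on-market projection
beta_rev = cov[0, 1] / cov[1, 1]    # reverse market-on-fund projection
# Geometric mean of the two implied slopes (diagonal regression).
beta_gm = np.sign(cov[0, 1]) * np.sqrt(beta_conv / beta_rev)

print(f"conventional beta {beta_conv:.3f}, bidirectional beta {beta_gm:.3f}")
```

The conventional projection reports a beta below 1 for a fund whose true slope is 1.2, which is the under-representation of systematic risk the abstract describes; the two projections together bracket the true relationship.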
Instrumental Variable Bayesian Model Averaging via Conditional Bayes Factors
We develop a method to perform model averaging in two-stage linear regression
systems subject to endogeneity. Our method extends an existing Gibbs sampler
for instrumental variables to incorporate a component of model uncertainty.
Direct evaluation of model probabilities is intractable in this setting. We
show that by nesting model moves inside the Gibbs sampler, model comparison can
be performed via conditional Bayes factors, leading to straightforward
calculations. This new Gibbs sampler is only slightly more involved than the
original algorithm and exhibits no evidence of mixing difficulties. We conclude
with a study of two different modeling challenges: incorporating uncertainty
into the determinants of macroeconomic growth, and estimating a demand function
by instrumenting wholesale on retail prices.
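The two-stage linear system the Gibbs sampler targets can be illustrated with plain two-stage least squares (2SLS), which shows why instrumenting matters; the paper's Bayesian model-averaging machinery is not reproduced, and all data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
z = rng.normal(size=n)                       # instrument (e.g. wholesale price)
u = rng.normal(size=n)                       # confounder driving endogeneity
x = 0.8 * z + u + rng.normal(size=n)         # endogenous regressor (retail price)
y = 2.0 * x + 3.0 * u + rng.normal(size=n)   # outcome (demand), true slope 2

# Naive OLS is biased because x is correlated with the error through u.
beta_ols = (x @ y) / (x @ x)

# 2SLS: project x on z in the first stage, then regress y on the fitted x.
x_hat = z * ((z @ x) / (z @ z))
beta_2sls = (x_hat @ y) / (x_hat @ x_hat)

print(f"OLS {beta_ols:.3f} (biased), 2SLS {beta_2sls:.3f} (near the true 2.0)")
```

The Gibbs sampler in the paper plays the Bayesian role of this two-stage structure, with model moves nested inside it so that conditional Bayes factors handle the otherwise intractable model probabilities.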