Data-driven satisficing measure and ranking
We propose a computational framework for real-time risk assessment and
prioritization of random outcomes without prior information on their probability
distributions. The basic model is built on the satisficing measure (SM), which
yields a single index for risk comparison. Since the SM is a dual representation
of a family of risk measures, we consider problems constrained by general
convex risk measures and, in particular, by Conditional Value-at-Risk (CVaR). Starting
from offline optimization, we apply the sample average approximation technique and
analyse the convergence rate and the validation of optimal solutions. In the online
stochastic optimization case, we develop primal-dual stochastic approximation
algorithms for general risk-constrained problems and derive their
regret bounds. For both the offline and online cases, we illustrate the
relationship between risk ranking accuracy and sample size (or number of iterations).

Comment: 26 pages, 6 figures
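As a rough illustration of the sample average approximation idea mentioned in the abstract (not the paper's own algorithm; the function name and toy distributions are our own), a minimal empirical CVaR estimator based on the Rockafellar-Uryasev representation might look like this:

```python
import numpy as np

def empirical_cvar(losses, alpha=0.95):
    """Sample average approximation of Conditional Value-at-Risk (CVaR).

    Uses the Rockafellar-Uryasev representation
        CVaR_alpha(L) = min_t { t + E[(L - t)_+] / (1 - alpha) },
    whose empirical minimiser t is the alpha-quantile of the sample.
    """
    losses = np.asarray(losses, dtype=float)
    t = np.quantile(losses, alpha)            # empirical Value-at-Risk at level alpha
    excess = np.maximum(losses - t, 0.0)      # losses exceeding VaR
    return t + excess.mean() / (1.0 - alpha)

# Toy comparison: rank two loss distributions by estimated tail risk.
rng = np.random.default_rng(0)
light = rng.normal(0.0, 1.0, size=10_000)
heavy = rng.standard_t(df=3, size=10_000)     # heavier-tailed losses
print(empirical_cvar(light), empirical_cvar(heavy))   # heavy should rank as riskier
```

Here the single CVaR index plays the role of the ranking score; the estimate converges to the true tail risk as the sample size grows, which is the regime the paper's convergence analysis addresses.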
Approximate Models and Robust Decisions
Decisions based partly or solely on predictions from probabilistic models may
be sensitive to model misspecification. Statisticians are taught from an early
stage that "all models are wrong", but little formal guidance exists on how to
assess the impact of model approximation on decision making, or how to proceed
when optimal actions appear sensitive to model fidelity. This article presents
an overview of recent developments across different disciplines to address
this. We review diagnostic techniques, including graphical approaches and
summary statistics, to help highlight decisions made through minimised expected
loss that are sensitive to model misspecification. We then consider formal
methods for decision making under model misspecification by quantifying
stability of optimal actions to perturbations to the model within a
neighbourhood of model space. This neighbourhood is defined in one of
two ways: first, in a strong sense, via an information (Kullback-Leibler)
divergence around the approximating model; or, second, via a nonparametric model
extension, again centred at the approximating model, in order to `average out'
over possible misspecifications. This is presented in the context of recent
work in the robust control, macroeconomics and financial mathematics
literature. We adopt a Bayesian approach throughout, although the methods are
agnostic to this position.
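As a loose illustration of the Kullback-Leibler neighbourhood idea (not code from the article; the toy decision problem and all names are assumptions), the sketch below bounds the worst-case expected loss of an action over all models within a KL ball around the approximating model, using the standard convex-duality representation:

```python
import numpy as np
from scipy.special import logsumexp

def worst_case_expected_loss(loss_samples, eps, lam_grid=np.logspace(-2, 3, 400)):
    """Bound on sup { E_Q[loss] : KL(Q || P) <= eps }, estimated from draws of P.

    Uses the convex-duality identity
        sup_Q E_Q[l] = inf_{lam > 0} lam * eps + lam * log E_P[exp(l / lam)].
    """
    l = np.asarray(loss_samples, dtype=float)
    log_mgf = np.array([logsumexp(l / lam) - np.log(l.size) for lam in lam_grid])
    return np.min(lam_grid * (eps + log_mgf))

# Hypothetical decision problem: two candidate actions scored under the approximating model P.
rng = np.random.default_rng(1)
theta = rng.normal(1.0, 0.5, size=5_000)       # draws from the approximating model
loss_a = (theta - 1.0) ** 2                    # loss of action a
loss_b = np.abs(theta - 0.8)                   # loss of action b

for eps in (0.0, 0.1, 0.5):                    # growing neighbourhood of models
    print(eps, worst_case_expected_loss(loss_a, eps), worst_case_expected_loss(loss_b, eps))
```

If the preferred action changes as eps grows, the decision is sensitive to model misspecification in the sense discussed above.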
Towards Machine Wald
The past century has seen a steady increase in the need to estimate and
predict complex systems and to make (possibly critical) decisions with
limited information. Although computers have made possible the numerical
evaluation of sophisticated statistical models, these models are still designed
\emph{by humans} because there is currently no known recipe or algorithm for
dividing the design of a statistical model into a sequence of arithmetic
operations. Indeed, enabling computers to \emph{think} as \emph{humans} do when
faced with uncertainty is challenging in several major ways:
(1) Finding optimal statistical models remains to be formulated as a well-posed
problem when information on the system of interest is incomplete and comes in
the form of a complex combination of sample data, partial knowledge of
constitutive relations and a limited description of the distribution of input
random variables. (2) The space of admissible scenarios along with the space of
relevant information, assumptions, and/or beliefs, tend to be infinite
dimensional, whereas calculus on a computer is necessarily discrete and finite.
With this purpose in mind, this paper explores the foundations of a rigorous framework
for the scientific computation of optimal statistical estimators/models and
reviews their connections with Decision Theory, Machine Learning, Bayesian
Inference, Stochastic Optimization, Robust Optimization, Optimal Uncertainty
Quantification and Information-Based Complexity.

Comment: 37 pages
A General Framework for Updating Belief Distributions
We propose a framework for general Bayesian inference. We argue that a valid
update of a prior belief distribution to a posterior can be made for parameters
which are connected to observations through a loss function rather than the
traditional likelihood function, which is recovered under the special case of
using the self-information loss. Modern application areas make it increasingly
challenging for Bayesians to attempt to model the true data-generating
mechanism. Moreover, when the object of interest is low dimensional, such as a
mean or median, it is cumbersome to have to achieve this via a complete model
for the whole data distribution. More importantly, there are settings where the
parameter of interest does not directly index a family of density functions and
thus the Bayesian approach to learning about such parameters is currently
regarded as problematic. Our proposed framework uses loss-functions to connect
information in the data to functionals of interest. The updating of beliefs
then follows from a decision theoretic approach involving cumulative loss
functions. Importantly, the procedure coincides with Bayesian updating when a
true likelihood is known, yet provides coherent subjective inference in much
more general settings. Connections to other inference frameworks are
highlighted.

Comment: This is the pre-peer-reviewed version of the article "A General
Framework for Updating Belief Distributions", which has been accepted for
publication in the Journal of the Royal Statistical Society, Series B. This article
may be used for non-commercial purposes in accordance with Wiley Terms and
Conditions for Self-Archiving.
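As a hedged sketch of the kind of loss-based update described above (a generic Gibbs-posterior computation on a one-dimensional grid, not code from the paper; the loss weight w and all names are our own), one might write:

```python
import numpy as np

def loss_based_update(theta_grid, log_prior, data, loss, w=1.0):
    """Generic loss-based belief update on a one-dimensional parameter grid.

    posterior(theta)  propto  prior(theta) * exp(-w * sum_i loss(theta, x_i)).
    With loss equal to the negative log-likelihood and w = 1 this is Bayes' rule.
    """
    cum_loss = np.array([sum(loss(t, x) for x in data) for t in theta_grid])
    log_post = log_prior - w * cum_loss
    log_post -= log_post.max()                        # stabilise before exponentiating
    post = np.exp(log_post)
    return post / (post.sum() * (theta_grid[1] - theta_grid[0]))   # normalise on the grid

# Updating beliefs about a median via the absolute-error loss, with no full data model.
grid = np.linspace(-3.0, 3.0, 601)
log_prior = -0.5 * grid ** 2                          # N(0, 1) prior, up to a constant
data = np.random.default_rng(2).standard_t(df=2, size=50) + 0.7
post = loss_based_update(grid, log_prior, data, loss=lambda t, x: abs(x - t), w=0.5)
print(grid[np.argmax(post)])                          # posterior mode near the sample median
```

The example targets a median directly through its loss function, without specifying a complete model for the whole data distribution, which is the situation the abstract highlights.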
A review of domain adaptation without target labels
Domain adaptation has become a prominent problem setting in machine learning
and related fields. This review asks the question: how can a classifier learn
from a source domain and generalize to a target domain? We present a
categorization of approaches, divided into what we refer to as sample-based,
feature-based and inference-based methods. Sample-based methods focus on
weighting individual observations during training based on their importance to
the target domain. Feature-based methods revolve around mapping, projecting,
and representing features such that a source classifier performs well on the
target domain, while inference-based methods incorporate adaptation into the
parameter estimation procedure, for instance through constraints on the
optimization procedure. Additionally, we review a number of conditions that
allow for formulating bounds on the cross-domain generalization error. Our
categorization highlights recurring ideas and raises questions important to
further research.

Comment: 20 pages, 5 figures
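As a small, hypothetical illustration of one common sample-based method (importance weighting via a domain classifier under covariate shift; not an algorithm taken from the review, and the data and names are our own), one might weight source observations like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Hypothetical covariate-shift setup: labelled source data, unlabelled target data.
X_src = rng.normal(0.0, 1.0, size=(500, 2))
y_src = (X_src[:, 0] + 0.5 * X_src[:, 1] > 0).astype(int)
X_tgt = rng.normal(1.0, 1.0, size=(500, 2))           # shifted inputs, labels unavailable

# Estimate p_target(x) / p_source(x) with a probabilistic domain classifier.
domain = LogisticRegression().fit(
    np.vstack([X_src, X_tgt]),
    np.concatenate([np.zeros(len(X_src)), np.ones(len(X_tgt))]),
)
p_tgt = domain.predict_proba(X_src)[:, 1]
weights = p_tgt / (1.0 - p_tgt)                       # importance weights for source points

# Re-train the source classifier with those weights so it emphasises target-like inputs.
clf = LogisticRegression().fit(X_src, y_src, sample_weight=weights)
print(clf.predict(X_tgt[:5]))                         # predictions for unlabelled target data
```

Feature-based and inference-based methods differ in where the adaptation happens (in the representation or in the estimation procedure), but this weighting scheme captures the sample-based category described above.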
- …