Comment on `Tainted evidence: cosmological model selection versus fitting', by Eric V. Linder and Ramon Miquel (astro-ph/0702542v2)
In astro-ph/0702542v2, Linder and Miquel seek to criticize the use of
Bayesian model selection for data analysis and for survey forecasting and
design. Their discussion is based on three serious misunderstandings of the
conceptual underpinnings and application of model-level Bayesian inference,
which invalidate all their main conclusions. Their paper includes numerous
further inaccuracies, including an erroneous calculation of the Bayesian
Information Criterion. Here we seek to set the record straight.
Comment: 6 pages RevTeX
Bayesian optimization for materials design
We introduce Bayesian optimization, a technique developed for optimizing
time-consuming engineering simulations and for fitting machine learning models
on large datasets. Bayesian optimization guides the choice of experiments
during materials design and discovery to find good material designs in as few
experiments as possible. We focus on the case when materials designs are
parameterized by a low-dimensional vector. Bayesian optimization is built on a
statistical technique called Gaussian process regression, which allows
predicting the performance of a new design based on previously tested designs.
After providing a detailed introduction to Gaussian process regression, we
introduce two Bayesian optimization methods: expected improvement, for design
problems with noise-free evaluations; and the knowledge-gradient method, which
generalizes expected improvement and may be used in design problems with noisy
evaluations. Both methods are derived using a value-of-information analysis,
and enjoy one-step Bayes-optimality.
Domain adaptation of weighted majority votes via perturbed variation-based self-labeling
In machine learning, the domain adaptation problem arises when the test
(target) and the train (source) data are generated from different
distributions. A key applied issue is thus the design of algorithms able to
generalize on a new distribution, for which we have no label information. We
focus on learning classification models defined as a weighted majority vote
over a set of real-valued functions. In this context, Germain et al. (2013)
have shown that a measure of disagreement between these functions is crucial to
control. The core of this measure is a theoretical bound--the C-bound (Lacasse
et al., 2007)--which involves the disagreement and leads to a well-performing
majority vote learning algorithm in the usual non-adaptive supervised setting:
MinCq. In this work, we propose a framework to extend MinCq to a domain
adaptation scenario. This procedure takes advantage of the recent perturbed
variation divergence between distributions proposed by Harel and Mannor (2012).
Justified by a theoretical bound on the target risk of the vote, we provide
MinCq with a target sample labeled by a perturbed variation-based
self-labeling procedure focused on the regions where the source and target
marginals appear similar. We also study the influence of our self-labeling, from which we
deduce an original process for tuning the hyperparameters. Finally, our
framework, called PV-MinCq, shows very promising results on a synthetic
rotation and translation problem.
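For readers unfamiliar with the C-bound mentioned in this abstract, the sketch below computes an empirical version of it on a labeled sample: one minus the squared first moment of the majority-vote margin divided by its second moment. This is only an illustration under stated assumptions, not the MinCq or PV-MinCq algorithm. The function name `empirical_c_bound` is hypothetical, labels are assumed to be in {-1, +1}, voter outputs in [-1, 1], and the weights are assumed to sum to one.

```python
import numpy as np


def empirical_c_bound(voter_outputs, labels, weights):
    """Empirical C-bound of a weighted majority vote.

    voter_outputs: (n_samples, n_voters) real-valued outputs in [-1, 1]
    labels:        (n_samples,) labels in {-1, +1}
    weights:       (n_voters,) voter weights summing to one
    """
    voter_outputs = np.asarray(voter_outputs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Margin of the weighted vote on each example.
    margin = labels * (voter_outputs @ weights)
    first_moment = margin.mean()
    # The second moment depends only on the vote, hence on voter disagreement.
    second_moment = (margin**2).mean()

    if first_moment <= 0.0:
        raise ValueError("the C-bound requires a positive first moment of the margin")
    return 1.0 - first_moment**2 / second_moment


# Hypothetical example: two voters, two positive examples, uniform weights.
bound = empirical_c_bound([[1.0, 1.0], [1.0, -1.0]], [1.0, 1.0], [0.5, 0.5])
```

For a fixed first moment, a smaller second moment (more disagreement between voters) gives a smaller bound, which is why controlling the disagreement matters.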