Ambiguity helps: classification with disagreements in crowdsourced annotations
Imagine we show an image to a person and ask them to decide whether the scene in the image is warm or not, and whether it is easy or not to spot a squirrel in the image. For exactly the same image, the answers to those questions are likely to differ from person to person, because the task is inherently ambiguous. Such an ambiguous, and therefore challenging, task pushes the boundary of computer vision by probing what can and cannot be learned from visual data. Crowdsourcing has been invaluable for collecting annotations, particularly for a task that goes beyond a clear-cut dichotomy, as multiple human judgments per image are needed to reach a consensus. This paper makes conceptual and technical contributions. On the conceptual side, we define disagreements among annotators as privileged information about the data instance. On the technical side, we propose a framework to incorporate annotation disagreements into classifiers. The proposed framework is simple, relatively fast, and outperforms classifiers that do not take the disagreements into account, especially when tested on high-confidence annotations.
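Below is a minimal sketch of the conceptual idea only (not the authors' actual framework): annotator disagreement, available only at training time, is turned into per-instance weights so that ambiguous examples contribute less to the loss. The toy data, the agreement measure, and the use of scikit-learn's `sample_weight` are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: 6 images x 4 features, 5 crowd annotators each.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
votes = np.array([[1, 1, 1, 1, 1],   # unanimous -> unambiguous
                  [1, 1, 1, 1, 0],
                  [1, 1, 0, 0, 1],
                  [0, 1, 0, 1, 0],   # split vote -> highly ambiguous
                  [0, 0, 0, 0, 1],
                  [0, 0, 0, 0, 0]])

y = (votes.mean(axis=1) >= 0.5).astype(int)       # majority-vote label
agreement = np.abs(votes.mean(axis=1) - 0.5) * 2  # 0 = split, 1 = unanimous

# Privileged information is used only at training time:
# low-agreement (ambiguous) instances contribute less to the loss.
clf = LogisticRegression().fit(X, y, sample_weight=agreement)

# At test time no annotator information is needed.
print(clf.predict(X))
```

In the same spirit as the abstract's finding, the classifier ends up trusting high-confidence annotations more than contested ones.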
Deep Gaussian processes for regression using approximate expectation propagation
Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations
of Gaussian processes (GPs) and are formally equivalent to neural networks with
multiple, infinitely wide hidden layers. DGPs are nonparametric probabilistic
models and as such are arguably more flexible, have a greater capacity to
generalise, and provide better calibrated uncertainty estimates than
alternative deep models. This paper develops a new approximate Bayesian
learning scheme that enables DGPs to be applied to a range of medium to large
scale regression problems for the first time. The new method uses an
approximate Expectation Propagation procedure and a novel and efficient
extension of the probabilistic backpropagation algorithm for learning. We
evaluate the new method for non-linear regression on eleven real-world
datasets, showing that it always outperforms GP regression and is almost always
better than state-of-the-art deterministic and sampling-based approximate
inference methods for Bayesian neural networks. As a by-product, this work
provides a comprehensive analysis of six approximate Bayesian methods for
training neural networks.
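The moment-matching forward pass at the heart of probabilistic backpropagation can be sketched for the simplest case: a single Bayesian linear layer with factorised Gaussian weights and an independent Gaussian input. The activation-function moments and the EP updates of the full algorithm are omitted here; this is a sketch, not the paper's implementation.

```python
import numpy as np

def linear_layer_moments(m_in, v_in, M, V):
    """Exact mean/variance of z = W @ x for independent Gaussians.

    m_in, v_in : mean and variance of the input x    (shape [d_in])
    M, V       : mean and variance of the weights W  (shape [d_out, d_in])
    """
    m_out = M @ m_in
    # Var(w_j * x_j) = M_j^2 v_j + V_j m_j^2 + V_j v_j for independent terms.
    v_out = (M ** 2) @ v_in + V @ (m_in ** 2) + V @ v_in
    return m_out, v_out

m, v = linear_layer_moments(np.array([1.0, -0.5]),
                            np.array([0.1, 0.2]),
                            np.array([[0.3, 0.7], [-0.2, 0.4]]),
                            np.array([[0.05, 0.05], [0.05, 0.05]]))
print(m, v)
```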
Recommended from our members
Black-Box α-divergence minimization
Black-box alpha (BB-α) is a new approximate inference method based on the minimization of α-divergences. BB-α scales to large datasets because it can be implemented using stochastic gradient descent. BB-α can be applied to complex probabilistic models with little effort since it only requires as input the likelihood function and its gradients. These gradients can be easily obtained using automatic differentiation. By changing the divergence parameter α, the method is able to interpolate between variational Bayes (VB) (α → 0) and an algorithm similar to expectation propagation (EP) (α = 1). Experiments on probit regression and neural network regression and classification problems show that BB-α with non-standard settings of α, such as α = 0.5, usually produces better predictions than with α → 0 (VB) or α = 1 (EP).
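A simplified form of the BB-α energy that circulates in follow-up work (assumed here, not quoted from this paper) is $L_\alpha(q) = \mathrm{KL}[q\,\|\,p_0] - \frac{1}{\alpha}\sum_n \log \mathbb{E}_{\theta\sim q}[p(y_n\mid\theta)^\alpha]$. The data term has a simple Monte Carlo estimate that needs only log-likelihoods of samples from $q$:

```python
import numpy as np
from scipy.special import logsumexp

def bb_alpha_data_term(loglik, alpha):
    """MC estimate of -(1/alpha) * sum_n log E_q[p(y_n|theta)^alpha].

    loglik : array [K, N] of log p(y_n | theta_k) for K samples theta_k ~ q.
    """
    K = loglik.shape[0]
    # log E_q[p^alpha] ~= logsumexp_k(alpha * loglik) - log K
    log_exp = logsumexp(alpha * loglik, axis=0) - np.log(K)
    return -np.sum(log_exp) / alpha

# The KL term against the prior is omitted from this sketch. As alpha -> 0
# the data term approaches -sum_n E_q[log p(y_n|theta)], i.e. the VB data
# term; alpha = 1 behaves like EP, matching the interpolation above.
loglik = np.random.default_rng(1).normal(-1.0, 0.3, size=(100, 50))
print(bb_alpha_data_term(loglik, alpha=0.5))
```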
Training Deep Gaussian Processes using Stochastic Expectation Propagation and Probabilistic Backpropagation
Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations
of Gaussian processes (GPs) and are formally equivalent to neural networks with
multiple, infinitely wide hidden layers. DGPs are probabilistic and
non-parametric and as such are arguably more flexible, have a greater capacity
to generalise, and provide better calibrated uncertainty estimates than
alternative deep models. The focus of this paper is scalable approximate
Bayesian learning of these networks. The paper develops a novel and efficient
extension of probabilistic backpropagation, a state-of-the-art method for
training Bayesian neural networks, that can be used to train DGPs. The new
method leverages a recently proposed method for scaling Expectation
Propagation, called stochastic Expectation Propagation. The method is able to
automatically discover useful input warping, expansion or compression, and it
is therefore a flexible form of Bayesian kernel design. We demonstrate the
success of the new method for supervised learning on several real-world
datasets, showing that it typically outperforms GP regression and is never much
worse.
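Stochastic EP's key saving is replacing EP's N stored site factors with a single tied factor f(θ), so that q(θ) ∝ p₀(θ)f(θ)^N and memory is constant in N. A toy sketch for inferring a Gaussian mean, where the moment-matching step is exact because the model is conjugate (illustrative only, not the paper's DGP machinery):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(1.5, 1.0, size=200)   # data, likelihood N(x_n | theta, 1)
N = len(x)

# Natural parameters (precision, precision * mean). Prior N(0, 10^2).
prior = np.array([1 / 100.0, 0.0])
f = np.array([0.0, 0.0])             # single tied (average) site factor

for _ in range(50):
    n = rng.integers(N)
    q = prior + N * f                # q ∝ p0(theta) * f(theta)^N
    cavity = q - f                   # remove one copy of the tied site
    # Tilted distribution cavity * N(x_n | theta, 1) is Gaussian here, so
    # moment matching is exact: just add the likelihood's natural params.
    # (In non-conjugate models this step is an approximate projection.)
    tilted = cavity + np.array([1.0, x[n]])
    f_new = tilted - cavity          # implied site update
    f += (f_new - f) / N             # damped update of the average factor

q = prior + N * f
print("posterior mean ~", q[1] / q[0], "posterior var ~", 1 / q[0])
```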
On the impact of covariance functions in multi-objective Bayesian optimization for engineering design
Multi-objective Bayesian optimization (BO) is a highly useful class of methods that can effectively solve computationally expensive engineering design optimization problems with multiple objectives. However, the impact of the covariance function, an important component of multi-objective BO, is rarely studied in the context of engineering optimization. We aim to shed light on this issue by performing numerical experiments on engineering design optimization problems, primarily low-fidelity problems, so that we are able to statistically evaluate the performance of BO methods with various covariance functions. In this paper, we performed the study using a set of subsonic airfoil optimization cases as benchmark problems. Expected hypervolume improvement was used as the acquisition function to enrich the experimental design. Results show that the choice of covariance function has a notable impact on the performance of multi-objective BO. In this regard, Kriging models with the Matérn-3/2 covariance function are the most robust in terms of diversity and convergence to the Pareto front, and can handle problems of various complexities.
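For reference, the Matérn-3/2 covariance function that the study finds most robust has the standard form $k(x, x') = \sigma^2 (1 + \sqrt{3}\,r/\ell)\exp(-\sqrt{3}\,r/\ell)$ with $r = \lVert x - x' \rVert$; the signal variance $\sigma^2$ and lengthscale $\ell$ are hyperparameters assumed here. A minimal NumPy version:

```python
import numpy as np

def matern32(X1, X2, lengthscale=1.0, variance=1.0):
    """Matern-3/2 kernel matrix between rows of X1 [n, d] and X2 [m, d]."""
    r = np.linalg.norm(X1[:, None, :] - X2[None, :, :], axis=-1)
    s = np.sqrt(3.0) * r / lengthscale
    return variance * (1.0 + s) * np.exp(-s)

X = np.random.default_rng(3).uniform(size=(5, 2))
K = matern32(X, X, lengthscale=0.5)
print(K.shape, np.allclose(K, K.T))  # symmetric Gram matrix
```

The √3 r/ℓ form makes sample paths once differentiable, a common middle ground between the rough exponential kernel and the very smooth squared-exponential kernel.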
Sequence tutor: Conservative fine-tuning of sequence generation models with KL-control
This paper proposes a general method for improving the structure and quality
of sequences generated by a recurrent neural network (RNN), while maintaining
information originally learned from data, as well as sample diversity. An RNN
is first pre-trained on data using maximum likelihood estimation (MLE), and the
probability distribution over the next token in the sequence learned by this
model is treated as a prior policy. Another RNN is then trained using
reinforcement learning (RL) to generate higher-quality outputs that account for
domain-specific incentives while retaining proximity to the prior policy of the
MLE RNN. To formalize this objective, we derive novel off-policy RL methods for
RNNs from KL-control. The effectiveness of the approach is demonstrated on two
applications: 1) generating novel musical melodies, and 2) computational
molecular generation. For both problems, we show that the proposed method
improves the desired properties and structure of the generated sequences, while
maintaining information learned from data.
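The KL-control objective can be read as reward shaping: the task reward is combined with the log-probability of the chosen action under the pre-trained prior policy, so the fine-tuned model is rewarded for staying close to what it learned from data. A hedged one-step sketch (the trade-off constant c and the toy logits are illustrative, not the paper's exact parameterisation):

```python
import numpy as np

def kl_control_reward(task_reward, prior_logits, action, c=1.0):
    """Shaped reward for one generation step.

    task_reward  : domain-specific reward for emitting `action`
    prior_logits : pre-trained MLE model's logits over the vocabulary
    c            : trade-off; larger c weights the task reward more
    """
    z = prior_logits - np.max(prior_logits)          # stable log-softmax
    log_prior = z - np.log(np.sum(np.exp(z)))
    # Staying near the prior policy earns reward; tokens the MLE model
    # considers unlikely are penalised, preserving learned structure.
    return task_reward / c + log_prior[action]

prior_logits = np.array([2.0, 0.5, -1.0])  # toy vocabulary of 3 tokens
print(kl_control_reward(task_reward=1.0, prior_logits=prior_logits, action=0))
```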
Stellar equilibrium configurations of white dwarfs in the $f(R,T)$ gravity
In this work we investigate the equilibrium configurations of white dwarfs in
a modified gravity theory, namely, $f(R,T)$ gravity, for which $R$ and $T$
stand for the Ricci scalar and the trace of the energy-momentum tensor,
respectively. Considering the functional form $f(R,T) = R + 2\lambda T$, with
$\lambda$ being a constant, we obtain the hydrostatic equilibrium equation for
the theory. Some physical properties of white dwarfs, such as mass, radius,
pressure and energy density, as well as their dependence on the parameter
$\lambda$, are derived. More massive and larger white dwarfs are found for
negative values of $\lambda$ as it decreases. The equilibrium configurations
predict a maximum mass limit for white dwarfs slightly above the Chandrasekhar
limit, with larger radii and lower central densities when compared to standard
gravity outcomes. The most important effect of the $f(R,T)$ theory for massive
white dwarfs is the increase of the radius in comparison with GR and also
$f(R)$ results. By comparing our results with some observational data of
massive white dwarfs we also find a lower limit for $\lambda$.
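For context, a minimal sketch of the setup assumed above, in conventions common to the $f(R,T)$ literature (geometric units $G = c = 1$; the paper's own derivation may differ in detail):

\[
S = \frac{1}{16\pi}\int f(R,T)\,\sqrt{-g}\,\mathrm{d}^{4}x
  + \int \mathcal{L}_{m}\,\sqrt{-g}\,\mathrm{d}^{4}x,
\qquad
f(R,T) = R + 2\lambda T .
\]

For a perfect fluid with $\mathcal{L}_{m} = -p$, the field equations reduce to

\[
G_{\mu\nu} = 8\pi T_{\mu\nu} + \lambda T g_{\mu\nu}
  + 2\lambda\,\bigl(T_{\mu\nu} + p\,g_{\mu\nu}\bigr),
\]

which recovers general relativity as $\lambda \to 0$; the modified hydrostatic equilibrium equation then follows from the (non-standard) conservation law of this theory.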
GRB 170817A-GW170817-AT 2017gfo and the observations of NS-NS, NS-WD and WD-WD mergers
The LIGO-Virgo Collaboration has announced the detection of GW170817 and has
associated it with GRB 170817A. These signals have been followed after 11 hours
by the optical and infrared emission of AT 2017gfo. The origin of this complex
phenomenon has been attributed to a neutron star-neutron star (NS-NS) merger.
In order to probe this association we confront our current understanding of the
gravitational waves and associated electromagnetic radiation with four observed
GRBs originating in binaries composed of different combinations of NSs and
white dwarfs (WDs). We consider: 1) GRB 090510, the prototype of an NS-NS
merger leading to a black hole (BH); 2) GRB 130603B, the prototype of an
NS-NS merger leading to a massive NS (MNS) with an associated kilonova;
3) GRB 060614, the prototype of an NS-WD merger leading to an MNS with an
associated kilonova candidate; and 4) GRB 170817A, the prototype of a WD-WD
merger leading to a massive WD with an associated AT 2017gfo-like emission.
None of these systems supports the above-mentioned association. The clear
association between GRB 170817A and AT 2017gfo has led us to introduce a new
model based on a new subfamily of GRBs originating from WD-WD mergers. We show
how this novel model is in agreement with the exceptional observations in the
optical, infrared, X-rays and gamma-rays of GRB 170817A-AT 2017gfo.