Ambiguity helps: classification with disagreements in crowdsourced annotations
Imagine we show an image to a person and ask them to decide whether the scene in the image is warm or not, and whether it is easy or not to spot a squirrel in the image. For exactly the same image, the answers to those questions are likely to differ from person to person, because the task is inherently ambiguous. Such an ambiguous, and therefore challenging, task pushes the boundary of computer vision by probing what can and cannot be learned from visual data. Crowdsourcing has been invaluable for collecting annotations, particularly for a task that goes beyond a clear-cut dichotomy, as multiple human judgments per image are needed to reach a consensus. This paper makes conceptual and technical contributions. On the conceptual side, we define disagreements among annotators as privileged information about the data instance. On the technical side, we propose a framework to incorporate annotation disagreements into classifiers. The proposed framework is simple, relatively fast, and outperforms classifiers that do not take the disagreements into account, especially when tested on high-confidence annotations.
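Below is a minimal sketch of the conceptual idea only (not the authors' actual framework): annotator disagreement, available only at training time, is turned into per-instance weights so that ambiguous examples contribute less to the loss. The toy data, the agreement measure, and the use of scikit-learn's `sample_weight` are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: 6 images x 4 features, 5 crowd annotators each.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
votes = np.array([[1, 1, 1, 1, 1],   # unanimous -> unambiguous
                  [1, 1, 1, 1, 0],
                  [1, 1, 0, 0, 1],
                  [0, 1, 0, 1, 0],   # split vote -> highly ambiguous
                  [0, 0, 0, 0, 1],
                  [0, 0, 0, 0, 0]])

y = (votes.mean(axis=1) >= 0.5).astype(int)       # majority-vote label
agreement = np.abs(votes.mean(axis=1) - 0.5) * 2  # 0 = split, 1 = unanimous

# Privileged information is used only at training time:
# low-agreement (ambiguous) instances contribute less to the loss.
clf = LogisticRegression().fit(X, y, sample_weight=agreement)

# At test time no annotator information is needed.
print(clf.predict(X))
```

In the same spirit as the abstract's finding, the classifier ends up trusting high-confidence annotations more than contested ones.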
Deep Gaussian processes for regression using approximate expectation propagation
Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations
of Gaussian processes (GPs) and are formally equivalent to neural networks with
multiple, infinitely wide hidden layers. DGPs are nonparametric probabilistic
models and as such are arguably more flexible, have a greater capacity to
generalise, and provide better calibrated uncertainty estimates than
alternative deep models. This paper develops a new approximate Bayesian
learning scheme that enables DGPs to be applied to a range of medium to large
scale regression problems for the first time. The new method uses an
approximate Expectation Propagation procedure and a novel and efficient
extension of the probabilistic backpropagation algorithm for learning. We
evaluate the new method for non-linear regression on eleven real-world
datasets, showing that it always outperforms GP regression and is almost always
better than state-of-the-art deterministic and sampling-based approximate
inference methods for Bayesian neural networks. As a by-product, this work
provides a comprehensive analysis of six approximate Bayesian methods for
training neural networks.
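The moment-matching forward pass at the heart of probabilistic backpropagation can be sketched for the simplest case: a single Bayesian linear layer with factorised Gaussian weights and an independent Gaussian input. The activation-function moments and the EP updates of the full algorithm are omitted here; this is a sketch, not the paper's implementation.

```python
import numpy as np

def linear_layer_moments(m_in, v_in, M, V):
    """Exact mean/variance of z = W @ x for independent Gaussians.

    m_in, v_in : mean and variance of the input x    (shape [d_in])
    M, V       : mean and variance of the weights W  (shape [d_out, d_in])
    """
    m_out = M @ m_in
    # Var(w_j * x_j) = M_j^2 v_j + V_j m_j^2 + V_j v_j for independent terms.
    v_out = (M ** 2) @ v_in + V @ (m_in ** 2) + V @ v_in
    return m_out, v_out

m, v = linear_layer_moments(np.array([1.0, -0.5]),
                            np.array([0.1, 0.2]),
                            np.array([[0.3, 0.7], [-0.2, 0.4]]),
                            np.array([[0.05, 0.05], [0.05, 0.05]]))
print(m, v)
```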
Recommended from our members
Black-Box α-divergence minimization
Black-box alpha (BB-α) is a new approximate inference method based on the minimization of α-divergences. BB-α scales to large datasets because it can be implemented using stochastic gradient descent. BB-α can be applied to complex probabilistic models with little effort since it only requires as input the likelihood function and its gradients. These gradients can be easily obtained using automatic differentiation. By changing the divergence parameter α, the method is able to interpolate between variational Bayes (VB) (α → 0) and an algorithm similar to expectation propagation (EP) (α = 1). Experiments on probit regression and neural network regression and classification problems show that BB-α with non-standard settings of α, such as α = 0.5, usually produces better predictions than with α → 0 (VB) or α = 1 (EP).
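A simplified form of the BB-α energy that circulates in follow-up work (assumed here, not quoted from this paper) is $L_\alpha(q) = \mathrm{KL}[q\,\|\,p_0] - \frac{1}{\alpha}\sum_n \log \mathbb{E}_{\theta\sim q}[p(y_n\mid\theta)^\alpha]$. The data term has a simple Monte Carlo estimate that needs only log-likelihoods of samples from $q$:

```python
import numpy as np
from scipy.special import logsumexp

def bb_alpha_data_term(loglik, alpha):
    """MC estimate of -(1/alpha) * sum_n log E_q[p(y_n|theta)^alpha].

    loglik : array [K, N] of log p(y_n | theta_k) for K samples theta_k ~ q.
    """
    K = loglik.shape[0]
    # log E_q[p^alpha] ~= logsumexp_k(alpha * loglik) - log K
    log_exp = logsumexp(alpha * loglik, axis=0) - np.log(K)
    return -np.sum(log_exp) / alpha

# The KL term against the prior is omitted from this sketch. As alpha -> 0
# the data term approaches -sum_n E_q[log p(y_n|theta)], i.e. the VB data
# term; alpha = 1 behaves like EP, matching the interpolation above.
loglik = np.random.default_rng(1).normal(-1.0, 0.3, size=(100, 50))
print(bb_alpha_data_term(loglik, alpha=0.5))
```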
Training Deep Gaussian Processes using Stochastic Expectation Propagation and Probabilistic Backpropagation
Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations
of Gaussian processes (GPs) and are formally equivalent to neural networks with
multiple, infinitely wide hidden layers. DGPs are probabilistic and
non-parametric and as such are arguably more flexible, have a greater capacity
to generalise, and provide better calibrated uncertainty estimates than
alternative deep models. The focus of this paper is scalable approximate
Bayesian learning of these networks. The paper develops a novel and efficient
extension of probabilistic backpropagation, a state-of-the-art method for
training Bayesian neural networks, that can be used to train DGPs. The new
method leverages a recently proposed method for scaling Expectation
Propagation, called stochastic Expectation Propagation. The method is able to
automatically discover useful input warping, expansion or compression, and it
is therefore a flexible form of Bayesian kernel design. We demonstrate the
success of the new method for supervised learning on several real-world
datasets, showing that it typically outperforms GP regression and is never much
worse.
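Stochastic EP's key saving is replacing EP's N stored site factors with a single tied factor f(θ), so that q(θ) ∝ p₀(θ)f(θ)^N and memory is constant in N. A toy sketch for inferring a Gaussian mean, where the moment-matching step is exact because the model is conjugate (illustrative only, not the paper's DGP machinery):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(1.5, 1.0, size=200)   # data, likelihood N(x_n | theta, 1)
N = len(x)

# Natural parameters (precision, precision * mean). Prior N(0, 10^2).
prior = np.array([1 / 100.0, 0.0])
f = np.array([0.0, 0.0])             # single tied (average) site factor

for _ in range(50):
    n = rng.integers(N)
    q = prior + N * f                # q ∝ p0(theta) * f(theta)^N
    cavity = q - f                   # remove one copy of the tied site
    # Tilted distribution cavity * N(x_n | theta, 1) is Gaussian here, so
    # moment matching is exact: just add the likelihood's natural params.
    # (In non-conjugate models this step is an approximate projection.)
    tilted = cavity + np.array([1.0, x[n]])
    f_new = tilted - cavity          # implied site update
    f += (f_new - f) / N             # damped update of the average factor

q = prior + N * f
print("posterior mean ~", q[1] / q[0], "posterior var ~", 1 / q[0])
```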
On the impact of covariance functions in multi-objective Bayesian optimization for engineering design
Multi-objective Bayesian optimization (BO) is a highly useful class of methods that can effectively solve computationally expensive engineering design optimization problems with multiple objectives. However, the impact of the covariance function, an important component of multi-objective BO, is rarely studied in the context of engineering optimization. We aim to shed light on this issue by performing numerical experiments on engineering design optimization problems, primarily low-fidelity problems, so that we are able to statistically evaluate the performance of BO methods with various covariance functions. In this paper, we performed the study using a set of subsonic airfoil optimization cases as benchmark problems. Expected hypervolume improvement was used as the acquisition function to enrich the experimental design. Results show that the choice of covariance function has a notable impact on the performance of multi-objective BO. In this regard, Kriging models with the Matérn-3/2 covariance function are the most robust in terms of diversity and convergence to the Pareto front, and can handle problems of various complexities.
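For reference, the Matérn-3/2 covariance function that the study finds most robust has the standard form $k(x, x') = \sigma^2 (1 + \sqrt{3}\,r/\ell)\exp(-\sqrt{3}\,r/\ell)$ with $r = \lVert x - x' \rVert$; the signal variance $\sigma^2$ and lengthscale $\ell$ are hyperparameters assumed here. A minimal NumPy version:

```python
import numpy as np

def matern32(X1, X2, lengthscale=1.0, variance=1.0):
    """Matern-3/2 kernel matrix between rows of X1 [n, d] and X2 [m, d]."""
    r = np.linalg.norm(X1[:, None, :] - X2[None, :, :], axis=-1)
    s = np.sqrt(3.0) * r / lengthscale
    return variance * (1.0 + s) * np.exp(-s)

X = np.random.default_rng(3).uniform(size=(5, 2))
K = matern32(X, X, lengthscale=0.5)
print(K.shape, np.allclose(K, K.T))  # symmetric Gram matrix
```

The √3 r/ℓ form makes sample paths once differentiable, a common middle ground between the rough exponential kernel and the very smooth squared-exponential kernel.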
Sequence tutor: Conservative fine-tuning of sequence generation models with KL-control
This paper proposes a general method for improving the structure and quality
of sequences generated by a recurrent neural network (RNN), while maintaining
information originally learned from data, as well as sample diversity. An RNN
is first pre-trained on data using maximum likelihood estimation (MLE), and the
probability distribution over the next token in the sequence learned by this
model is treated as a prior policy. Another RNN is then trained using
reinforcement learning (RL) to generate higher-quality outputs that account for
domain-specific incentives while retaining proximity to the prior policy of the
MLE RNN. To formalize this objective, we derive novel off-policy RL methods for
RNNs from KL-control. The effectiveness of the approach is demonstrated on two
applications: 1) generating novel musical melodies, and 2) computational
molecular generation. For both problems, we show that the proposed method
improves the desired properties and structure of the generated sequences, while
maintaining information learned from data.
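The KL-control objective can be read as reward shaping: the task reward is combined with the log-probability of the chosen action under the pre-trained prior policy, so the fine-tuned model is rewarded for staying close to what it learned from data. A hedged one-step sketch (the trade-off constant c and the toy logits are illustrative, not the paper's exact parameterisation):

```python
import numpy as np

def kl_control_reward(task_reward, prior_logits, action, c=1.0):
    """Shaped reward for one generation step.

    task_reward  : domain-specific reward for emitting `action`
    prior_logits : pre-trained MLE model's logits over the vocabulary
    c            : trade-off; larger c weights the task reward more
    """
    z = prior_logits - np.max(prior_logits)          # stable log-softmax
    log_prior = z - np.log(np.sum(np.exp(z)))
    # Staying near the prior policy earns reward; tokens the MLE model
    # considers unlikely are penalised, preserving learned structure.
    return task_reward / c + log_prior[action]

prior_logits = np.array([2.0, 0.5, -1.0])  # toy vocabulary of 3 tokens
print(kl_control_reward(task_reward=1.0, prior_logits=prior_logits, action=0))
```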
Stellar equilibrium configurations of white dwarfs in the $f(R,T)$ gravity
In this work we investigate the equilibrium configurations of white dwarfs in
a modified gravity theory, namely, $f(R,T)$ gravity, for which $R$ and $T$
stand for the Ricci scalar and the trace of the energy-momentum tensor,
respectively. Considering the functional form $f(R,T) = R + 2\lambda T$, with
$\lambda$ being a constant, we obtain the hydrostatic equilibrium equation for
the theory. Some physical properties of white dwarfs, such as mass, radius,
pressure and energy density, as well as their dependence on the parameter
$\lambda$, are derived. More massive and larger white dwarfs are found for
negative values of $\lambda$ as it decreases. The equilibrium configurations
predict a maximum mass limit for white dwarfs slightly above the Chandrasekhar
limit, with larger radii and lower central densities when compared to standard
gravity outcomes. The most important effect of the $f(R,T)$ theory for massive
white dwarfs is the increase of the radius in comparison with GR and also
$f(R)$ results. By comparing our results with some observational data of
massive white dwarfs we also find a lower limit for $\lambda$.
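For context, a minimal sketch of the setup assumed above, in conventions common to the $f(R,T)$ literature (geometric units $G = c = 1$; the paper's own derivation may differ in detail):

\[
S = \frac{1}{16\pi}\int f(R,T)\,\sqrt{-g}\,\mathrm{d}^{4}x
  + \int \mathcal{L}_{m}\,\sqrt{-g}\,\mathrm{d}^{4}x,
\qquad
f(R,T) = R + 2\lambda T .
\]

For a perfect fluid with $\mathcal{L}_{m} = -p$, the field equations reduce to

\[
G_{\mu\nu} = 8\pi T_{\mu\nu} + \lambda T g_{\mu\nu}
  + 2\lambda\,\bigl(T_{\mu\nu} + p\,g_{\mu\nu}\bigr),
\]

which recovers general relativity as $\lambda \to 0$; the modified hydrostatic equilibrium equation then follows from the (non-standard) conservation law of this theory.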
GRB 170817A-GW170817-AT 2017gfo and the observations of NS-NS, NS-WD and WD-WD mergers
The LIGO-Virgo Collaboration has announced the detection of GW170817 and has
associated it with GRB 170817A. These signals have been followed after 11 hours
by the optical and infrared emission of AT 2017gfo. The origin of this complex
phenomenon has been attributed to a neutron star-neutron star (NS-NS) merger.
In order to probe this association we confront our current understanding of the
gravitational waves and associated electromagnetic radiation with four observed
GRBs originating in binaries composed of different combinations of NSs and
white dwarfs (WDs). We consider: 1) GRB 090510, the prototype of an NS-NS
merger leading to a black hole (BH); 2) GRB 130603B, the prototype of an
NS-NS merger leading to a massive NS (MNS) with an associated kilonova;
3) GRB 060614, the prototype of an NS-WD merger leading to an MNS with an
associated kilonova candidate; and 4) GRB 170817A, the prototype of a WD-WD
merger leading to a massive WD with an associated AT 2017gfo-like emission.
None of these systems supports the above-mentioned association. The clear
association between GRB 170817A and AT 2017gfo has led us to introduce a new
model based on a new subfamily of GRBs originating from WD-WD mergers. We show
how this novel model is in agreement with the exceptional observations in the
optical, infrared, X-rays and gamma-rays of GRB 170817A-AT 2017gfo.