Search CORE

3,213 research outputs found

Equations of States in Statistical Learning for a Nonparametrizable and Regular Case

Author: Watanabe Sumio
Publication venue: 'Institute of Electronics, Information and Communications Engineers (IEICE)'
Publication date: 02/06/2009
Field of study

Many learning machines that have hierarchical structure or hidden variables are now being used in information science, artificial intelligence, and bioinformatics. However, several learning machines used in such fields are not regular but singular statistical models, hence their generalization performance is still left unknown. To overcome these problems, in the previous papers, we proved new equations in statistical learning, by which we can estimate the Bayes generalization loss from the Bayes training loss and the functional variance, on the condition that the true distribution is a singularity contained in a learning machine. In this paper, we prove that the same equations hold even if a true distribution is not contained in a parametric model. Also we prove that, the proposed equations in a regular case are asymptotically equivalent to the Takeuchi information criterion. Therefore, the proposed equations are always applicable without any condition on the unknown true distribution

arXiv.org e-Print Archive

Crossref

A Bayesian information criterion for singular models

Author: Drton Mathias
Plummer Martyn
Publication venue
Publication date: 23/03/2016
Field of study

We consider approximate Bayesian model choice for model selection problems that involve models whose Fisher-information matrices may fail to be invertible along other competing submodels. Such singular models do not obey the regularity conditions underlying the derivation of Schwarz's Bayesian information criterion (BIC) and the penalty structure in BIC generally does not reflect the frequentist large-sample behavior of their marginal likelihood. While large-sample theory for the marginal likelihood of singular models has been developed recently, the resulting approximations depend on the true parameter value and lead to a paradox of circular reasoning. Guided by examples such as determining the number of components of mixture models, the number of factors in latent factor models or the rank in reduced-rank regression, we propose a resolution to this paradox and give a practical extension of BIC for singular model selection problems

arXiv.org e-Print Archive

CiteSeerX

Warwick Research Archives Portal Repository

Distributed multi-agent Gaussian regression via finite-dimensional approximations

Author: Pillonetto Gianluigi
Schenato Luca
Varagnolo Damiano
Publication venue
Publication date: 10/05/2018
Field of study

We consider the problem of distributedly estimating Gaussian processes in multi-agent frameworks. Each agent collects few measurements and aims to collaboratively reconstruct a common estimate based on all data. Agents are assumed with limited computational and communication capabilities and to gather

M

noisy measurements in total on input locations independently drawn from a known common probability density. The optimal solution would require agents to exchange all the

M

input locations and measurements and then invert an

M \times M

matrix, a non-scalable task. Differently, we propose two suboptimal approaches using the first

E

orthonormal eigenfunctions obtained from the \ac{KL} expansion of the chosen kernel, where typically

E \ll M

. The benefits are that the computation and communication complexities scale with

E

and not with

M

, and computing the required statistics can be performed via standard average consensus algorithms. We obtain probabilistic non-asymptotic bounds that determine a priori the desired level of estimation accuracy, and new distributed strategies relying on Stein's unbiased risk estimate (SURE) paradigms for tuning the regularization parameters and applicable to generic basis functions (thus not necessarily kernel eigenfunctions) and that can again be implemented via average consensus. The proposed estimators and bounds are finally tested on both synthetic and real field data

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

A Survey on Bayesian Deep Learning

Author: Wang Hao
Yeung Dit-Yan
Publication venue
Publication date: 01/07/2020
Field of study

A comprehensive artificial intelligence system needs to not only perceive the environment with different `senses' (e.g., seeing and hearing) but also infer the world's conditional (or even causal) relations and corresponding uncertainty. The past decade has seen major advances in many perception tasks such as visual object recognition and speech recognition using deep learning models. For higher-level inference, however, probabilistic graphical models with their Bayesian nature are still more powerful and flexible. In recent years, Bayesian deep learning has emerged as a unified probabilistic framework to tightly integrate deep learning and Bayesian models. In this general framework, the perception of text or images using deep learning can boost the performance of higher-level inference and in turn, the feedback from the inference process is able to enhance the perception of text or images. This survey provides a comprehensive introduction to Bayesian deep learning and reviews its recent applications on recommender systems, topic models, control, etc. Besides, we also discuss the relationship and differences between Bayesian deep learning and other related topics such as Bayesian treatment of neural networks.Comment: To appear in ACM Computing Surveys (CSUR) 202

arXiv.org e-Print Archive

DSpace@MIT

Research and Education in Computational Science and Engineering

Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments---including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers---is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade.Comment: Major revision, to appear in SIAM Revie

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne