
    Alchemical and structural distribution based representation for improved QML

    We introduce a representation of any atom in any chemical environment for the generation of efficient quantum machine learning (QML) models of common electronic ground-state properties. The representation is based on scaled distribution functions explicitly accounting for elemental and structural degrees of freedom. Resulting QML models afford very favorable learning curves for properties of out-of-sample systems including organic molecules, non-covalently bonded protein side-chains, (H₂O)₄₀ clusters, as well as diverse crystals. The elemental components help to lower the learning curves and, through interpolation across the periodic table, even enable "alchemical extrapolation" to covalent bonding between elements not part of training, as evinced for single, double, and triple bonds among main-group elements.
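    As a hedged illustration of the idea of an element-weighted ("alchemical") smeared distribution over neighbor distances, the sketch below builds a toy radial representation for one atom. The function name, grid, smearing width, and nuclear-charge weighting are illustrative assumptions, not the published representation.

```python
# Minimal sketch, assuming a simple nuclear-charge weighting and a Gaussian
# smearing width of 0.2 A; the actual representation uses richer elemental
# scalings and additional structural (angular) terms.
import numpy as np

def radial_representation(positions, charges, center_idx,
                          grid=np.linspace(0.5, 6.0, 64), sigma=0.2):
    """Element-weighted, Gaussian-smeared distribution of neighbor distances."""
    center = positions[center_idx]
    rep = np.zeros_like(grid)
    for j, (pos, z) in enumerate(zip(positions, charges)):
        if j == center_idx:
            continue
        r = np.linalg.norm(pos - center)
        # weight each neighbor by its nuclear charge: one crude "alchemical" scaling
        rep += z * np.exp(-(grid - r) ** 2 / (2.0 * sigma ** 2))
    return rep

# toy environment: a single water molecule, oxygen as the central atom
pos = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
Z = np.array([8.0, 1.0, 1.0])
print(radial_representation(pos, Z, center_idx=0).round(3))
```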

    A Kernel Perspective for Regularizing Deep Neural Networks

    We propose a new point of view for regularizing deep neural networks by using the norm of a reproducing kernel Hilbert space (RKHS). Even though this norm cannot be computed, it admits upper and lower approximations leading to various practical strategies. Specifically, this perspective (i) provides a common umbrella for many existing regularization principles, including spectral norm and gradient penalties or adversarial training, (ii) leads to new effective regularization penalties, and (iii) suggests hybrid strategies combining lower and upper bounds to get better approximations of the RKHS norm. We experimentally show this approach to be effective when learning on small datasets or for obtaining adversarially robust models.
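    As a hedged illustration (not the authors' implementation), the sketch below adds one practical surrogate of the kind mentioned in the abstract, a penalty on the input gradient of the network, to an ordinary training loss; the architecture, penalty weight, and data are arbitrary placeholders.

```python
# Minimal sketch of a gradient-penalty regularizer (one lower-bound-style
# surrogate for the RKHS norm discussed above); model, lam, and data are
# illustrative assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))

def penalized_loss(x, y, lam=0.1):
    x = x.clone().requires_grad_(True)
    out = model(x)
    base = nn.functional.mse_loss(out, y)
    # penalty: squared norm of the gradient of the output w.r.t. the input
    grad = torch.autograd.grad(out.sum(), x, create_graph=True)[0]
    penalty = grad.pow(2).sum(dim=1).mean()
    return base + lam * penalty

x, y = torch.randn(32, 20), torch.randn(32, 1)
loss = penalized_loss(x, y)
loss.backward()   # gradients now include the regularization term
```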

    Model Selection for Support Vector Machine Classification

    We address the problem of model selection for Support Vector Machine (SVM) classification. For a fixed functional form of the kernel, model selection amounts to tuning the kernel parameters and the slack penalty coefficient C. We begin by reviewing a recently developed probabilistic framework for SVM classification. An extension to the case of SVMs with quadratic slack penalties is given and a simple approximation for the evidence is derived, which can be used as a criterion for model selection. We also derive the exact gradients of the evidence in terms of posterior averages and describe how they can be estimated numerically using Hybrid Monte Carlo techniques. Though computationally demanding, the resulting gradient ascent algorithm is a useful baseline tool for probabilistic SVM model selection, since it can locate maxima of the exact (unapproximated) evidence. We then perform extensive experiments on several benchmark data sets. The aim of these experiments is to compare the performance of probabilistic model selection criteria with alternatives based on estimates of the test error, namely the so-called "span estimate" and Wahba's Generalized Approximate Cross-Validation (GACV) error. We find that all the "simple" model selection criteria (Laplace evidence approximations, and the span and GACV error estimates) exhibit multiple local optima with respect to the hyperparameters. While some of these give performance that is competitive with results from other approaches in the literature, a significant fraction lead to rather higher test errors. The results for the evidence gradient ascent method show that the exact evidence also exhibits local optima, but these give test errors which are much less variable and also consistently lower than for the simpler model selection criteria.
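    For context, a common baseline for the hyperparameters in question (the slack penalty C and the kernel parameters) is plain cross-validated grid search; the sketch below shows that baseline, not the evidence-based or span/GACV criteria studied in the paper, and the dataset and grid values are illustrative.

```python
# Minimal sketch: cross-validated grid search over the SVM hyperparameters
# discussed above (slack penalty C and RBF kernel width gamma). This is a
# generic baseline, not the paper's probabilistic model selection.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
param_grid = {
    "C": [0.1, 1.0, 10.0, 100.0],   # slack penalty coefficient
    "gamma": [0.01, 0.1, 1.0],      # RBF kernel width parameter
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```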

    Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data

    Due to their causal semantics, Bayesian networks (BNs) have been widely employed to discover underlying data relationships in exploratory studies, such as brain research. Despite their success in modeling the probability distribution of variables, BNs are naturally generative models, which are not necessarily discriminative. This may cause subtle but critical network changes that are of investigative value across populations to be overlooked. In this paper, we propose to improve the discriminative power of BN models for continuous variables from two different perspectives, leading to two general discriminative learning frameworks for Gaussian Bayesian networks (GBNs). In the first framework, we employ the Fisher kernel to bridge the generative models of GBNs and the discriminative classifiers of SVMs, and convert GBN parameter learning into Fisher kernel learning by minimizing a generalization error bound of SVMs. In the second framework, we employ the max-margin criterion and build it directly upon GBN models to explicitly optimize the classification performance of the GBNs. The advantages and disadvantages of the two frameworks are discussed and experimentally compared. Both demonstrate strong power in learning discriminative parameters of GBNs for neuroimaging-based brain network analysis, while maintaining reasonable representation capacity. The contributions of this paper also include a new Directed Acyclic Graph (DAG) constraint with a theoretical guarantee to ensure the graph validity of the GBN.
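    The toy sketch below illustrates only the Fisher-kernel idea from the first framework, using a diagonal-Gaussian generative model in place of a full GBN; the model, data, and kernel construction are assumptions for illustration, not the paper's method.

```python
# Minimal sketch of a Fisher kernel: map each sample to the gradient of the
# generative log-likelihood w.r.t. the model parameters, then take inner
# products between these scores. A diagonal Gaussian stands in for the GBN.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))                  # toy data
mu, var = X.mean(axis=0), X.var(axis=0)       # fitted generative parameters

def fisher_score(x):
    d_mu = (x - mu) / var                               # d log N / d mu
    d_var = ((x - mu) ** 2 - var) / (2.0 * var ** 2)    # d log N / d sigma^2
    return np.concatenate([d_mu, d_var])

scores = np.stack([fisher_score(x) for x in X])
K = scores @ scores.T    # Fisher kernel Gram matrix, ready for a kernel SVM
print(K.shape)
```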