Search CORE

2,836 research outputs found

On Excess Risk Convergence Rates of Neural Network Classifiers

Author: Huo Xiaoming
Ko Hyunouk
Suh Namjoon
Publication date: 26/09/2023

The recent success of neural networks in pattern recognition and classification problems suggests that neural networks possess qualities distinct from other more classical classifiers such as SVMs or boosting classifiers. This paper studies the performance of plug-in classifiers based on neural networks in a binary classification setting as measured by their excess risks. Compared to the typical settings imposed in the literature, we consider a more general scenario that resembles actual practice in two respects: first, the function class to be approximated includes the Barron functions as a proper subset, and second, the neural network classifier constructed is the minimizer of a surrogate loss instead of the 0-1 loss so that gradient descent-based numerical optimizations can be easily applied. While the class of functions we consider is so large that optimal rates cannot be faster than $n^{-\frac{1}{3}}$, it is a regime in which dimension-free rates are possible and the approximation power of neural networks can be taken advantage of. In particular, we analyze the estimation and approximation properties of neural networks to obtain a dimension-free, uniform rate of convergence for the excess risk. Finally, we show that the rate obtained is in fact minimax optimal up to a logarithmic factor, and the minimax lower bound shows the effect of the margin assumption in this regime.
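As a rough illustration of the setup the abstract describes (a plug-in classifier obtained by minimizing a surrogate loss rather than the 0-1 loss), the sketch below fits a score function by gradient descent on the logistic loss and then classifies by its sign. Everything here is an assumption made for illustration: the abstract does not name the surrogate loss, the paper works with neural networks rather than the one-dimensional linear score used here, and the names `fit_linear_score`, `xs`, and `ys` are hypothetical.

```python
import math

def logistic_loss(y, f):
    """Surrogate loss for a label y in {-1, +1} and a real-valued score f."""
    return math.log1p(math.exp(-y * f))

def zero_one_loss(y, f):
    """0-1 loss of the plug-in decision sign(f); ties are sent to +1."""
    prediction = 1 if f >= 0 else -1
    return 0 if prediction == y else 1

def fit_linear_score(xs, ys, steps=500, lr=0.5):
    """Gradient descent on the average logistic loss of f(x) = w * x + b.

    A linear score stands in for the neural network purely to keep the
    sketch short (assumption, not the paper's model).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            # d/df log(1 + exp(-y f)) = -y / (1 + exp(y f))
            g = -y / (1.0 + math.exp(y * (w * x + b)))
            grad_w += g * x / n
            grad_b += g / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy, linearly separable data (hypothetical).
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [-1, -1, -1, 1, 1, 1]
w, b = fit_linear_score(xs, ys)

# Empirical 0-1 risk of the plug-in classifier sign(w * x + b).
empirical_risk = sum(zero_one_loss(y, w * x + b) for x, y in zip(xs, ys)) / len(xs)
```

The point of the surrogate is visible in `fit_linear_score`: the logistic loss has a usable gradient everywhere, whereas the 0-1 loss is piecewise constant and gives gradient descent nothing to follow.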

arXiv.org e-Print Archive

Importance Weighted Adversarial Nets for Partial Domain Adaptation

Author: Ding Zewei
Li Wanqing
Ogunbona Philip
Zhang Jing
Publication date: 01/01/2018

This paper proposes an importance weighted adversarial nets-based method for unsupervised domain adaptation, specifically for partial domain adaptation, where the target domain has fewer classes than the source domain. Previous domain adaptation methods generally assume identical label spaces, such that reducing the distribution divergence leads to feasible knowledge transfer. However, such an assumption is no longer valid in a more realistic scenario that requires adaptation from a larger and more diverse source domain to a smaller target domain with fewer classes. This paper extends adversarial nets-based domain adaptation and proposes a novel adversarial nets-based partial domain adaptation method that identifies the source samples potentially from the outlier classes and, at the same time, reduces the shift of shared classes between domains.

arXiv.org e-Print Archive

Crossref

Research Online

Right for the Right Reason: Training Agnostic Networks

Author: A Halevy
Aylin Caliskan
J Li
MD Zeiler
N Cristianini
O Russakovsky
W Chu
Y Ganin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/06/2018

We consider the problem of a neural network being requested to classify images (or other inputs) without making implicit use of a "protected concept", that is a concept that should not play any role in the decision of the network. Typically these concepts include information such as gender or race, or other contextual information such as image backgrounds that might be implicitly reflected in unknown correlations with other variables, making it insufficient to simply remove them from the input features. In other words, making accurate predictions is not good enough if those predictions rely on information that should not be used: predictive performance is not the only important metric for learning systems. We apply a method developed in the context of domain adaptation to address this problem of "being right for the right reason", where we request a classifier to make a decision in a way that is entirely 'agnostic' to a given protected concept (e.g. gender, race, background, etc.), even if this could be implicitly reflected in other attributes via unknown correlations. After defining the concept of an 'agnostic model', we demonstrate how the Domain-Adversarial Neural Network can remove unwanted information from a model using a gradient reversal layer.
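The gradient reversal layer mentioned in the abstract can be sketched in a few lines. It is a minimal, framework-free illustration rather than the authors' implementation: the layer passes features through unchanged in the forward pass and multiplies the incoming gradient by a negative factor in the backward pass, so the feature extractor is pushed to make the protected concept harder to predict. The class name `GradientReversal`, the scaling parameter `lam`, and the example numbers are all hypothetical.

```python
class GradientReversal:
    """Identity in the forward pass; scales the gradient by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        # lam is an assumed trade-off weight between the two training objectives.
        self.lam = lam

    def forward(self, features):
        # Features reach the protected-concept classifier unchanged.
        return list(features)

    def backward(self, grad_output):
        # The sign flip turns the concept classifier's descent direction
        # into an ascent direction for the layers below.
        return [-self.lam * g for g in grad_output]

grl = GradientReversal(lam=0.5)
features = [0.2, -1.0, 3.0]
passed_through = grl.forward(features)

# Hypothetical gradient of the concept classifier's loss w.r.t. its input.
grad_from_concept_head = [1.0, -2.0, 4.0]
grad_to_feature_extractor = grl.backward(grad_from_concept_head)
```

In an automatic-differentiation framework the same behaviour would normally be written as a custom function with a user-defined backward rule; the manual `forward`/`backward` pair above only makes the two passes explicit.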

arXiv.org e-Print Archive

Crossref

Explore Bristol Research