Search CORE

1,792 research outputs found

Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks

Author: Oono Kenta
Suzuki Taiji
Publication venue
Publication date: 06/01/2021
Field of study

Convolutional neural networks (CNNs) have been shown to achieve optimal approximation and estimation error rates (in minimax sense) in several function classes. However, previous analyzed optimal CNNs are unrealistically wide and difficult to obtain via optimization due to sparse constraints in important function classes, including the H\"older class. We show a ResNet-type CNN can attain the minimax optimal error rates in these classes in more plausible situations -- it can be dense, and its width, channel size, and filter size are constant with respect to sample size. The key idea is that we can replicate the learning ability of Fully-connected neural networks (FNNs) by tailored CNNs, as long as the FNNs have \textit{block-sparse} structures. Our theory is general in a sense that we can automatically translate any approximation rate achieved by block-sparse FNNs into that by CNNs. As an application, we derive approximation and estimation error rates of the aformentioned type of CNNs for the Barron and H\"older classes with the same strategy.Comment: 8 pages + References 2 pages + Supplemental material 18 page

arXiv.org e-Print Archive

Demystifying Deep Learning: A Geometric Approach to Iterative Projections

Author: Dai Liyi
Krim Hamid
Panahi Ashkan
Publication venue
Publication date: 01/01/2018
Field of study

Parametric approaches to Learning, such as deep learning (DL), are highly popular in nonlinear regression, in spite of their extremely difficult training with their increasing complexity (e.g. number of layers in DL). In this paper, we present an alternative semi-parametric framework which foregoes the ordinarily required feedback, by introducing the novel idea of geometric regularization. We show that certain deep learning techniques such as residual network (ResNet) architecture are closely related to our approach. Hence, our technique can be used to analyze these types of deep learning. Moreover, we present preliminary results which confirm that our approach can be easily trained to obtain complex structures.Comment: To be appeared in the ICASSP 2018 proceeding

arXiv.org e-Print Archive

Crossref

Chalmers Research