Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks
Convolutional neural networks (CNNs) have been shown to achieve optimal
approximation and estimation error rates (in the minimax sense) in several function
classes. However, previously analyzed optimal CNNs are unrealistically wide and
difficult to obtain via optimization because of sparsity constraints in important
function classes, including the H\"older class. We show that a ResNet-type CNN can
attain the minimax optimal error rates in these classes in more plausible
situations -- it can be dense, and its width, channel size, and filter size are
constant with respect to sample size. The key idea is that we can replicate the
learning ability of fully-connected neural networks (FNNs) with tailored CNNs, as
long as the FNNs have \textit{block-sparse} structures. Our theory is general
in the sense that we can automatically translate any approximation rate achieved
by block-sparse FNNs into one achieved by CNNs. As an application, we derive
approximation and estimation error rates for the aforementioned type of CNNs for
the Barron and H\"older classes with the same strategy. (Comment: 8 pages + references 2 pages + supplemental material 18 pages)
A Gradient Boosting Approach for Training Convolutional and Deep Neural Networks
Deep learning has revolutionized the computer vision and image classification
domains. In this context, Convolutional Neural Network (CNN) based
architectures are the most widely applied models. In this article, we
introduce two procedures for training Convolutional Neural Networks (CNNs) and
Deep Neural Networks (DNNs) based on Gradient Boosting (GB), namely GB-CNN and GB-DNN.
These models are trained to fit the gradient of the loss function, i.e., the
pseudo-residuals of the previous models. At each iteration, the proposed method
adds one dense layer to an exact copy of the previous deep NN model. The
weights of the dense layers trained in previous iterations are frozen to
prevent over-fitting, allowing the model to fit the new dense layer as well as to
fine-tune the convolutional layers (for GB-CNN) while still utilizing the
information already learned. Through extensive experimentation on different
2D-image classification and tabular datasets, the presented models show
superior performance in terms of classification accuracy with respect to
standard CNNs and DNNs with the same architectures.
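The stagewise procedure the abstract describes — fit a new component to the pseudo-residuals of the frozen ensemble, then freeze it — can be illustrated with a minimal, hypothetical sketch. This is not the authors' GB-CNN/GB-DNN implementation: it replaces dense neural layers with simple 1-D linear fits so the boosting loop itself is visible, and all names (`fit_linear`, `gb_fit`, `lr`) are illustrative.

```python
# Hypothetical sketch of gradient-boosted stagewise fitting: each stage
# fits a new 1-D linear "layer" to the pseudo-residuals of the frozen
# ensemble under squared loss (where pseudo-residuals = y - prediction).

def fit_linear(xs, rs):
    """Closed-form least-squares fit r ~ a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    mr = sum(rs) / n
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (r - mr) for x, r in zip(xs, rs))
    a = cov / var if var else 0.0
    b = mr - a * mx
    return a, b

def gb_fit(xs, ys, n_stages=5, lr=0.5):
    """Additively build an ensemble; earlier stages stay frozen."""
    stages = []               # frozen (a, b) pairs from earlier iterations
    preds = [0.0] * len(xs)   # current ensemble prediction
    for _ in range(n_stages):
        # pseudo-residuals of the squared loss w.r.t. current predictions
        residuals = [y - p for y, p in zip(ys, preds)]
        a, b = fit_linear(xs, residuals)   # fit the new stage to residuals
        stages.append((a, b))              # freeze it
        preds = [p + lr * (a * x + b) for p, x in zip(preds, xs)]
    return stages, preds
```

On a linear target each stage removes a fixed fraction of the remaining residual, so the training error shrinks geometrically with the number of stages — the same additive-correction behavior the article exploits with dense layers.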
Predicting human eye fixations via an LSTM-Based saliency attentive model
Data-driven saliency has recently gained a lot of attention thanks to the use of convolutional neural networks for predicting gaze fixations. In this paper, we go beyond standard approaches to saliency prediction, in which gaze maps are computed with a feed-forward network, and present a novel model which can predict accurate saliency maps by incorporating neural attentive mechanisms. The core of our solution is a convolutional long short-term memory that focuses on the most salient regions of the input image to iteratively refine the predicted saliency map. In addition, to tackle the center bias typical of human eye fixations, our model can learn a set of prior maps generated with Gaussian functions. We show, through an extensive evaluation, that the proposed architecture outperforms the current state-of-the-art on public saliency prediction datasets. We further study the contribution of each key component to demonstrate their robustness in different scenarios.
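The Gaussian prior maps mentioned above can be sketched directly. This is a hypothetical illustration, not the paper's code: it builds one fixed center-bias prior from a 2-D Gaussian, whereas the model learns the Gaussian parameters for a set of such maps; `gaussian_prior` and its parameters are assumed names.

```python
# Hypothetical sketch: a center-bias prior map from a 2-D Gaussian,
# of the kind the saliency model combines with its predicted maps.
import math

def gaussian_prior(h, w, sigma_y, sigma_x):
    """Return an h x w map peaking at the image center."""
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return [[math.exp(-(((y - cy) ** 2) / (2 * sigma_y ** 2)
                        + ((x - cx) ** 2) / (2 * sigma_x ** 2)))
             for x in range(w)]
            for y in range(h)]
```

The map is 1.0 at the center and decays toward the borders, mirroring the empirical tendency of human fixations to cluster near the image center.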