Understanding deep neural networks from the perspective of piecewise linear property
In recent years, deep learning models have been widely used and are behind major breakthroughs across many fields. Deep learning models are usually considered black boxes due to their large model structures and complicated hierarchical nonlinear transformations. As deep learning technology continues to develop, understanding deep learning models, including their training and prediction behaviors and their internal mechanisms, has attracted growing attention. In this thesis, we study the problem of understanding deep neural networks from the perspective of the piecewise linear property. First, we introduce the piecewise linear property. Next, we review the role and progress of deep learning understanding from the perspective of the piecewise linear property. The piecewise linear property reveals that deep neural networks with piecewise linear activation functions generally divide the input space into a number of small disjoint regions, on each of which the network computes a local linear function. We then investigate two typical understanding problems, namely model interpretation and model complexity. In particular, we provide a series of derivations and analyses of the piecewise linear property of deep neural networks with piecewise linear activation functions. We propose an approach for interpreting the predictions of such models based on the piecewise linear property, and a method that provides local interpretations of a black-box deep model by mimicking it with a piecewise linear approximation. We then study deep neural networks with curve activation functions, aiming to provide piecewise linear approximations of these networks so that they can benefit from the piecewise linear property. After proposing a piecewise linear approximation framework, we investigate model complexity and model interpretation using the approximation. The thesis concludes by discussing future directions for understanding deep neural networks from the perspective of the piecewise linear property.
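As a concrete illustration of the piecewise linear property described in this abstract (a minimal sketch of the general idea, not code from the thesis; the toy two-layer network and all names below are my own): for a ReLU network, the pattern of active units at an input point identifies the linear region containing that point, and the local linear function on that region can be read off in closed form.

```python
# Minimal sketch: recover the local linear map A x + b that a small ReLU
# network computes on the entire linear region containing a given input.
import numpy as np

rng = np.random.default_rng(0)

# A toy two-layer ReLU network: x -> W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(8, 3)), rng.normal(size=8)
W2, b2 = rng.normal(size=(1, 8)), rng.normal(size=1)

def forward(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def local_linear_map(x):
    """Return (A, b) of the affine function the network equals near x."""
    pattern = (W1 @ x + b1 > 0).astype(float)  # activation pattern = region id
    D = np.diag(pattern)                       # zeroes out inactive units
    A = W2 @ D @ W1                            # effective linear weights
    b = W2 @ D @ b1 + b2                       # effective bias
    return A, b

x0 = rng.normal(size=3)
A, b = local_linear_map(x0)
assert np.allclose(forward(x0), A @ x0 + b)    # exact on the whole region
print("local gradient:", A, "offset:", b)
```

The recovered pair (A, b) is exact on the whole region sharing x0's activation pattern, which is what makes region-wise local interpretation of such models possible.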
Optimal approximation of piecewise smooth functions using deep ReLU neural networks
We study the necessary and sufficient complexity of ReLU neural networks, in terms of depth and number of weights, which is required for approximating classifier functions in the $L^2$ sense. As a model class, we consider the set $\mathcal{E}^\beta(\mathbb{R}^d)$ of possibly discontinuous piecewise $C^\beta$ functions $f : [-1/2, 1/2]^d \to \mathbb{R}$, where the different smooth regions of $f$ are separated by $C^\beta$ hypersurfaces. For dimension $d \geq 2$, regularity $\beta > 0$, and accuracy $\varepsilon > 0$, we construct artificial neural networks with ReLU activation function that approximate functions from $\mathcal{E}^\beta(\mathbb{R}^d)$ up to an $L^2$ error of $\varepsilon$. The constructed networks have a fixed number of layers, depending only on $d$ and $\beta$, and they have $O(\varepsilon^{-2(d-1)/\beta})$ many nonzero weights, which we prove to be optimal. In addition to the optimality in terms of the number of weights, we show that in order to achieve the optimal approximation rate, one needs ReLU networks of a certain minimal depth. Precisely, for piecewise $C^\beta(\mathbb{R}^d)$ functions, this minimal depth is given, up to a multiplicative constant, by $\beta/d$. Up to a log factor, our constructed networks match this bound. This partly explains the benefits of depth for ReLU networks by showing that deep networks are necessary to achieve efficient approximation of (piecewise) smooth functions. Finally, we analyze approximation in high-dimensional spaces where the function $f$ to be approximated can be factorized into a smooth dimension reducing feature map $\tau$ and a classifier function $g$, defined on a low-dimensional feature space, as $f = g \circ \tau$. We show that in this case the approximation rate depends only on the dimension of the feature space and not the input dimension. Comment: Generalized some estimates to $L^p$ norms for $0 < p < \infty$.
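A small numerical illustration of the setting (my own 1D toy, not the paper's construction, which works in dimension $d \geq 2$): a one-hidden-layer ReLU network can exactly encode the piecewise linear interpolant of a function on a grid of knots, and for a piecewise smooth function with a jump the $L^2$ error is dominated by the cell containing the discontinuity.

```python
# Toy sketch: encode a piecewise linear interpolant as a ReLU network
# g(x) = c0 + a0*x + sum_k c_k * relu(x - t_k) and watch the L^2 error
# decay as the number of knots grows; the jump limits the rate.
import numpy as np

def f(x):  # piecewise smooth with one jump at x = 0.5
    return np.where(x < 0.5, np.sin(2 * np.pi * x), 2.0 + np.cos(2 * np.pi * x))

def relu_interpolant(knots, values):
    """One-hidden-layer ReLU weights that reproduce the PL interpolant."""
    slopes = np.diff(values) / np.diff(knots)
    c0 = values[0] - slopes[0] * knots[0]  # affine part matches first segment
    a0 = slopes[0]
    c = np.diff(slopes)                    # slope changes at interior knots
    t = knots[1:-1]
    def g(x):
        return c0 + a0 * x + np.maximum(x[:, None] - t[None, :], 0.0) @ c
    return g

xs = np.linspace(0.0, 1.0, 20001)          # fine grid for the L^2 norm
for n in (8, 32, 128):
    knots = np.linspace(0.0, 1.0, n + 1)
    g = relu_interpolant(knots, f(knots))
    err = np.sqrt(np.mean((f(xs) - g(xs)) ** 2))  # L^2([0,1]) error
    print(f"{n + 1:4d} knots, L2 error = {err:.4f}")
```

The continuous interpolant pays an $O(1)$ error on the shrinking cell around the jump, so the $L^2$ error decays only like the square root of the cell width; handling such discontinuities efficiently is exactly where the paper's deeper constructions come in.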
Universal Approximation of Parametric Optimization via Neural Networks with Piecewise Linear Policy Approximation
Parametric optimization solves a family of optimization problems as a function of parameters. It is a critical component in situations where optimal decision making is repeatedly performed for updated parameter values, but computation becomes challenging when complex problems need to be solved in real time. In this study, we therefore present theoretical foundations for approximating the optimal policy of a parametric optimization problem with neural networks, and we derive conditions under which the Universal Approximation Theorem applies to parametric optimization problems by explicitly constructing a piecewise linear policy approximation. This study fills a gap by formally analyzing the constructed piecewise linear approximation in terms of feasibility and optimality, and shows that neural networks with ReLU activations are valid approximators for this approximation in terms of generalization and approximation error. Furthermore, based on the theoretical results, we propose a strategy to improve the feasibility of approximated solutions and discuss training with suboptimal solutions. Comment: 17 pages, 2 figures, preprint, under review
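To make the idea of a piecewise linear optimal policy concrete (a toy setup of my own, not the paper's): the parametric problem $\min_x (x - \theta)^2$ subject to $0 \leq x \leq 1$ has the piecewise linear policy $x^*(\theta) = \mathrm{clip}(\theta, 0, 1)$, which a small ReLU network can learn from sampled parameter-solution pairs.

```python
# Toy sketch: train a ReLU network to approximate the piecewise linear
# optimal policy x*(theta) = clip(theta, 0, 1) of a parametric problem.
import torch

torch.manual_seed(0)
theta = torch.linspace(-2.0, 3.0, 1024).unsqueeze(1)  # sampled parameters
x_star = theta.clamp(0.0, 1.0)                        # known optimal policy

policy = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.ReLU(),
    torch.nn.Linear(16, 1),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = torch.mean((policy(theta) - x_star) ** 2)  # fit the policy
    loss.backward()
    opt.step()

print(f"final mean-squared policy error: {loss.item():.2e}")
```

In this toy case the target policy is itself piecewise linear, so a ReLU network can in principle represent it exactly; the paper's contribution is analyzing when such approximations remain feasible and near-optimal for genuinely hard parametric problems.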
Approximation of Nonlinear Functionals Using Deep ReLU Networks
In recent years, functional neural networks have been proposed and studied in order to approximate nonlinear continuous functionals defined on $L^p([-1, 1]^s)$ for integers $s \geq 1$ and $1 \leq p < \infty$. However, their theoretical properties are largely unknown beyond universality of approximation, or the existing analysis does not apply to the rectified linear unit (ReLU) activation function. To fill this void, we investigate here the approximation power of functional deep neural networks associated with the ReLU activation function by constructing a continuous piecewise linear interpolation under a simple triangulation. In addition, we establish rates of approximation of the proposed functional deep ReLU networks under mild regularity conditions. Finally, our study may also shed some light on the understanding of functional data learning algorithms.
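A hedged sketch of the general recipe behind functional networks (my own toy setup, not the paper's construction): discretize each input function on a fixed grid, then train a deep ReLU network on the sampled values to approximate a nonlinear functional, here $F(f) = \int_0^1 f(t)^2\, dt$ evaluated on random trigonometric inputs with a known closed-form value.

```python
# Toy sketch: approximate the nonlinear functional F(f) = int_0^1 f(t)^2 dt
# with a deep ReLU network acting on grid samples of f.
import torch

torch.manual_seed(0)
s = 64                                   # grid size for discretizing f
t = torch.linspace(0.0, 1.0, s)

def sample_functions(n):
    """f(t) = a*sin(2*pi*t) + b*cos(2*pi*t) + c, sampled on the grid."""
    a, b, c = (torch.randn(n, 1) for _ in range(3))
    vals = a * torch.sin(2 * torch.pi * t) + b * torch.cos(2 * torch.pi * t) + c
    F = 0.5 * a.squeeze(1) ** 2 + 0.5 * b.squeeze(1) ** 2 + c.squeeze(1) ** 2
    return vals, F.unsqueeze(1)          # exact integral of f^2 over [0, 1]

net = torch.nn.Sequential(
    torch.nn.Linear(s, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(3000):
    X, y = sample_functions(256)
    opt.zero_grad()
    loss = torch.mean((net(X) - y) ** 2)
    loss.backward()
    opt.step()

X_test, y_test = sample_functions(1000)
print(f"test MSE: {torch.mean((net(X_test) - y_test) ** 2).item():.2e}")
```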