On Neural Networks with Minimal Weights
Linear threshold elements are the basic building blocks of artificial
neural networks. A linear threshold element computes a function
that is the sign of a weighted sum of the input variables. The weights
are arbitrary integers; in fact, they can be very big,
exponential in the number of input variables. In practice, however,
it is difficult to implement big weights. The present
literature distinguishes between two extreme cases:
linear threshold functions with polynomial-size weights as opposed
to those with exponential-size weights. The main contribution of
this paper is to fill this gap by further refining that separation.
Namely, we prove that the class of linear threshold functions with
polynomial-size weights can be divided into subclasses according
to the degree of the polynomial. In fact, we prove a more general
result: there exists a minimal-weight linear threshold function
for any number of inputs and any weight size. To prove
these results, we have developed a novel technique for constructing
linear threshold functions with minimal weights.
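To make the object concrete, here is a minimal sketch of a linear threshold element; the majority function, weights, and threshold below are illustrative examples, not the paper's construction.

```python
import numpy as np

def linear_threshold(x, w, t):
    """Linear threshold element: the sign of a weighted sum of the inputs.

    x: 0/1 input vector, w: integer weights, t: integer threshold.
    Returns +1 if w . x >= t, else -1.
    """
    return 1 if int(np.dot(w, x)) >= t else -1

# Majority of three inputs: realizable with the minimal weights (1, 1, 1).
w, t = np.array([1, 1, 1]), 2
for x in [(0, 0, 1), (0, 1, 1), (1, 1, 1)]:
    print(x, "->", linear_threshold(np.array(x), w, t))
```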
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
Two aspects of neural networks that have been extensively studied in the
recent literature are their function approximation properties and their
training by gradient descent methods. The approximation problem seeks accurate
approximations with a minimal number of weights. In most of the current
literature these weights are fully or partially hand-crafted, showing the
capabilities of neural networks but not necessarily their practical
performance. In contrast, optimization theory for neural networks heavily
relies on an abundance of weights in over-parametrized regimes.
This paper balances these two demands and provides an approximation result
for shallow networks in $1d$ with non-convex weight optimization by gradient
descent. We consider finite width networks and infinite sample limits, which is
the typical setup in approximation theory. Technically, this problem is not
over-parametrized; however, some form of redundancy reappears as a loss in
approximation rate compared to the best possible rates.
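A minimal sketch of the setup described above, under assumed details the abstract does not fix: a finite-width shallow ReLU network in 1d, trained by plain gradient descent on a least-squares loss, with a dense grid standing in for the infinite-sample limit. The width, step size, and target function are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite-width shallow ReLU network in 1d: f(x) = sum_k a_k * relu(w_k * x + b_k)
m = 50                                    # width (illustrative)
w, b = rng.normal(size=m), rng.normal(size=m)
a = rng.normal(size=m) / m

X = np.linspace(-1.0, 1.0, 2000)          # dense grid as infinite-sample stand-in
y = np.abs(X)                             # illustrative target function

lr = 0.05
for step in range(2000):
    pre = np.outer(X, w) + b              # (n, m) pre-activations
    act = np.maximum(pre, 0.0)            # ReLU
    r = act @ a - y                       # residuals
    g = r[:, None] * (pre > 0) * a        # shared factor in the w, b gradients
    # Gradient descent on the (non-convex) half mean-squared error
    a -= lr * (act.T @ r) / len(X)
    w -= lr * (g * X[:, None]).sum(axis=0) / len(X)
    b -= lr * g.sum(axis=0) / len(X)

pred = np.maximum(np.outer(X, w) + b, 0.0) @ a
print("final RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```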
SPFQ: A Stochastic Algorithm and Its Error Analysis for Neural Network Quantization
Quantization is a widely used compression method that effectively reduces
redundancies in over-parameterized neural networks. However, existing
quantization techniques for deep neural networks often lack a comprehensive
error analysis due to the presence of non-convex loss functions and nonlinear
activations. In this paper, we propose a fast stochastic algorithm for
quantizing the weights of fully trained neural networks. Our approach leverages
a greedy path-following mechanism in combination with a stochastic quantizer.
Its computational complexity scales only linearly with the number of weights in
the network, thereby enabling the efficient quantization of large networks.
Importantly, we establish, for the first time, full-network error bounds, under
an infinite alphabet condition and minimal assumptions on the weights and input
data. As an application of this result, we prove that when quantizing a
multi-layer network having Gaussian weights, the relative square quantization
error exhibits a linear decay as the degree of over-parametrization increases.
Furthermore, we demonstrate that it is possible to achieve error bounds
equivalent to those obtained in the infinite alphabet case, using on the order
of a mere $\log\log N$ bits per weight, where $N$ represents the largest number
of neurons in a layer.
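As an illustration of the primitive at the core of such schemes, here is a sketch of unbiased stochastic rounding onto a uniform alphabet. This is only the building block, not SPFQ's greedy path-following step, which additionally corrects each weight using the already-quantized ones; the sizes and step below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def stochastic_quantize(w, delta):
    """Unbiased stochastic rounding of w onto the alphabet delta * Z.

    Rounds each entry up or down at random with probabilities chosen
    so that E[q] = w.
    """
    lo = np.floor(w / delta) * delta
    p_up = (w - lo) / delta               # probability of rounding up
    return lo + delta * (rng.random(w.shape) < p_up)

W = rng.normal(size=(512, 512))           # Gaussian weights, as in the result above
Q = stochastic_quantize(W, delta=0.05)

# Relative squared quantization error of the weight matrix itself
rel = np.linalg.norm(W - Q) ** 2 / np.linalg.norm(W) ** 2
print(f"relative squared error: {rel:.5f}")
```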
Optimal approximation of piecewise smooth functions using deep ReLU neural networks
We study the necessary and sufficient complexity of ReLU neural networks---in
terms of depth and number of weights---which is required for approximating
classifier functions in $L^2$. As a model class, we consider the set
$\mathcal{E}^\beta(\mathbb{R}^d)$ of possibly discontinuous piecewise $C^\beta$
functions $f : [-1/2, 1/2]^d \to \mathbb{R}$, where the different smooth regions
of $f$ are separated by $C^\beta$ hypersurfaces. For dimension $d \geq 2$,
regularity $\beta > 0$, and accuracy $\varepsilon > 0$, we construct artificial
neural networks with ReLU activation function that approximate functions from
$\mathcal{E}^\beta(\mathbb{R}^d)$ up to an $L^2$ error of $\varepsilon$. The
constructed networks have a fixed number of layers, depending only on $d$ and
$\beta$, and they have $O(\varepsilon^{-2(d-1)/\beta})$ many nonzero weights,
which we prove to be optimal. In addition to the optimality in terms of the
number of weights, we show that in order to achieve the optimal approximation
rate, one needs ReLU networks of a certain minimal depth. Precisely, for piecewise
$C^\beta(\mathbb{R}^d)$ functions, this minimal depth is given---up to a
multiplicative constant---by $\beta/d$. Up to a log factor, our constructed
networks match this bound. This partly explains the benefits of depth for ReLU
networks by showing that deep networks are necessary to achieve efficient
approximation of (piecewise) smooth functions. Finally, we analyze
approximation in high-dimensional spaces where the function $f$ to be
approximated can be factorized into a smooth dimension reducing feature map
$\tau$ and classifier function $g$---defined on a low-dimensional feature
space---as $f = g \circ \tau$. We show that in this case the approximation rate
depends only on the dimension of the feature space and not the input dimension.
Comment: Generalized some estimates to $L^p$ norms for $0 < p < \infty$.
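A toy sketch of the factorization at the end of the abstract: two small ReLU networks composed as $g \circ \tau$, where the feature-space dimension $k$, not the input dimension $d$, governs the achievable rate. The architectures and random parameters below are illustrative placeholders, not the constructions from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def relu_net(params, x):
    """Forward pass of a fully connected ReLU network with a linear last layer."""
    for W, b in params[:-1]:
        x = np.maximum(W @ x + b, 0.0)
    W, b = params[-1]
    return W @ x + b

def random_params(widths):
    return [(rng.normal(size=(n, m)) / np.sqrt(m), rng.normal(size=n))
            for m, n in zip(widths[:-1], widths[1:])]

d, k = 100, 2                          # input dimension vs. feature-space dimension
tau = random_params([d, 32, k])        # dimension-reducing feature map tau: R^d -> R^k
g = random_params([k, 32, 1])          # classifier on the low-dimensional feature space

x = rng.normal(size=d)
print(relu_net(g, relu_net(tau, x)))   # realizes an approximation of f = g o tau
```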
Dissecting the Biological Motherboard (Systems Biology and Beyond)
Genome-scale molecular networks, including gene pathways, gene regulatory networks and protein interactions, are central to the investigation of the nascent disciplines of systems biology and bio-complexity. Dissecting these genome-scale molecular networks in all their possible manifestations is paramount in our quest for a genotype-input, phenotype-output application that also takes environment-genome interactions into account.

Machine learning approaches are now increasingly being used for reverse engineering such networks. Our work stresses the importance of a systems approach in biological research and shows how artificial neural networks are at the forefront of the Artificial Intelligence techniques increasingly being used to construct, as well as dissect, molecular networks, the building blocks of the living system.

Our paper will show the application of artificial neural networks to reverse engineer a temporal gene pathway. We will also explore the pruning of nodes of these artificial neural networks to simulate gene silencing and thus generate novel biological insight into these molecular networks (the Biological Motherboard).

The research described is novel in that this may be the first time the application of neural networks to temporal gene expression data has been described. It will be shown that a trained artificial neural network, with pruning, can also be described as a gene network with minimal re-interpretation, where the weights on links between nodes reflect the probability of one gene affecting another gene over time.
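A sketch of the pruning-as-silencing idea, under assumptions the abstract leaves open (the network shape, activation, and expression data are synthetic placeholders): zero out the links leaving one input gene in a trained temporal network and observe the shift in the predicted next-time-step expression.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for a trained network mapping the expression of n genes at time t
# to their expression at time t+1, through one hidden layer.
n_genes, n_hidden = 5, 4
W1 = rng.normal(size=(n_hidden, n_genes))     # gene -> hidden links
W2 = rng.normal(size=(n_genes, n_hidden))     # hidden -> gene links

def step(x, W1, W2):
    return W2 @ np.tanh(W1 @ x)               # one time step: t -> t+1

x_t = rng.random(n_genes)                     # synthetic expression profile at time t
baseline = step(x_t, W1, W2)

# "Silence" gene 2 by pruning every link leaving it.
W1_pruned = W1.copy()
W1_pruned[:, 2] = 0.0                         # gene 2 no longer influences the network
silenced = step(x_t, W1_pruned, W2)

print("shift in predicted expression:", silenced - baseline)
```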
Energy Efficient Hardware Acceleration of Neural Networks with Power-of-Two Quantisation
Deep neural networks virtually dominate the domain of most modern vision
systems, providing high performance at a cost of increased computational
complexity. Since those systems are often required to operate both in
real time and with minimal energy consumption (e.g., for wearable devices or
autonomous vehicles, edge Internet of Things (IoT), sensor networks), various
network optimisation techniques are used, e.g., quantisation, pruning, or
dedicated lightweight architectures. Due to the logarithmic distribution of
weights in neural network layers, a method that provides high performance at
significantly reduced computational precision (4-bit weights and below) is
Power-of-Two (PoT) quantisation, whose quantisation levels likewise follow a
logarithmic distribution. This method makes it possible to replace the
Multiply and ACcumulate (MAC) units typical of neural networks (performing,
e.g., convolution operations) with more energy-efficient Bitshift and
ACcumulate (BAC) units. In this paper, we show that a hardware neural network
accelerator with PoT weights implemented on the Zynq UltraScale+ MPSoC ZCU104
SoC FPGA can be at least $1.4\times$ more energy efficient than the uniform
quantisation version. To further reduce the actual power requirement by
omitting part of the computation for zero weights, we also propose a new
pruning method adapted to logarithmic quantisation.
Comment: Accepted for the ICCVG 2022 conference.
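A sketch of the BAC substitution enabled by PoT weights: once a weight is a signed power of two, multiplying an integer activation reduces to a bit shift. The quantiser range and the fixed-point handling of negative exponents below are illustrative choices, not the accelerator's actual datapath.

```python
import numpy as np

def pot_quantize(w, min_exp=-4, max_exp=0):
    """Quantize weights to signed powers of two: w -> sign(w) * 2^e."""
    sign = np.sign(w)
    e = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), min_exp, max_exp).astype(int)
    return sign, e

def bac_dot(x_int, sign, e):
    """Bitshift-and-ACcumulate (BAC) dot product with integer activations.

    Multiplying by +/- 2^e is a shift, so no hardware multiplier is needed.
    Negative exponents are handled via a fixed-point scale of 2^min(e).
    """
    min_e = e.min()
    acc = 0
    for xi, si, ei in zip(x_int, sign, e):
        acc += int(si) * (int(xi) << int(ei - min_e))   # pure shift-and-add
    return acc * (2.0 ** min_e)                         # undo the fixed-point scale

w = np.array([0.31, -0.08, 0.9, -0.5])
sign, e = pot_quantize(w)
x = np.array([3, 1, 2, 4])                              # integer (quantised) activations
print(bac_dot(x, sign, e), "vs float:", float(x @ (sign * 2.0 ** e)))
```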
Geometry and Expressive Power of Conditional Restricted Boltzmann Machines
Conditional restricted Boltzmann machines are undirected stochastic neural
networks with a layer of input and output units connected bipartitely to a
layer of hidden units. These networks define models of conditional probability
distributions on the states of the output units given the states of the input
units, parametrized by interaction weights and biases. We address the
representational power of these models, proving results on their ability to
represent conditional Markov random fields and conditional distributions with
restricted supports, the minimal size of universal approximators, the maximal
model approximation errors, and the dimension of the set of representable
conditional distributions. We contribute new tools for investigating
conditional probability models, which allow us to improve the results that can
be derived from existing work on restricted Boltzmann machine probability
models.
Comment: 30 pages, 5 figures, 1 algorithm.
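A sketch of how such a model produces conditional samples, with hypothetical sizes and random parameters; the bipartite connectivity makes the hidden units conditionally independent given the visible (input and output) units, and vice versa, which is what makes Gibbs sampling cheap.

```python
import numpy as np

rng = np.random.default_rng(4)

# Conditional RBM: input units x and output units y, each connected
# bipartitely to hidden units h; weights and biases parametrize p(y | x).
n_x, n_y, n_h = 4, 3, 8
Wxh = rng.normal(scale=0.5, size=(n_x, n_h))   # input-hidden interactions
Wyh = rng.normal(scale=0.5, size=(n_y, n_h))   # output-hidden interactions
b_y, b_h = np.zeros(n_y), np.zeros(n_h)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def sample_y_given_x(x, n_steps=200):
    """Gibbs sampling from the model's conditional distribution p(y | x)."""
    y = rng.integers(0, 2, size=n_y)
    for _ in range(n_steps):
        # Hidden units given input and current output states
        h = (rng.random(n_h) < sigmoid(x @ Wxh + y @ Wyh + b_h)).astype(int)
        # Output units given hidden states
        y = (rng.random(n_y) < sigmoid(h @ Wyh.T + b_y)).astype(int)
    return y

x = np.array([1, 0, 1, 0])
print(sample_y_given_x(x))
```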