171 research outputs found
RadiX-Net: Structured Sparse Matrices for Deep Neural Networks
The sizes of deep neural networks (DNNs) are rapidly outgrowing the capacity
of hardware to store and train them. Research over the past few decades has
explored the prospect of sparsifying DNNs before, during, and after training by
pruning edges from the underlying topology. The resulting neural network is
known as a sparse neural network. More recent work has demonstrated the
remarkable result that certain sparse DNNs can train to the same precision as
dense DNNs at lower runtime and storage cost. An intriguing class of these
sparse DNNs is the X-Nets, which are initialized and trained upon a sparse
topology with neither reference to a parent dense DNN nor subsequent pruning.
We present an algorithm that deterministically generates RadiX-Nets: sparse DNN
topologies that, as a whole, are much more diverse than X-Net topologies, while
preserving X-Nets' desired characteristics. We further present a
functional-analytic conjecture based on the longstanding observation that
sparse neural network topologies can attain the same expressive power as dense
counterparts.
Comment: 7 pages, 8 figures, accepted at IEEE IPDPS 2019 GrAPL workshop. arXiv
admin note: substantial text overlap with arXiv:1809.0524
Neural networks in geophysical applications
Neural networks are increasingly popular in geophysics.
Because they are universal approximators, these
tools can approximate any continuous function to
arbitrary precision. Hence, they may yield important
contributions to finding solutions to a variety of geophysical applications.
However, knowledge of many methods and techniques
recently developed to increase the performance
and to facilitate the use of neural networks does not seem
to be widespread in the geophysical community. Therefore,
the power of these tools has not yet been explored to
their full extent. In this paper, techniques are described
for faster training, better overall performance (i.e., generalization), and the automatic estimation of network size
and architecture.
Toward a More Robust Pruning Procedure for MLP Networks
Choosing a proper neural network architecture is a problem of great practical importance. Smaller models mean not only simpler designs but also lower variance for parameter estimation and network prediction. The widespread utilization of neural networks in modeling highlights an issue in human factors. The procedure of building neural models should find an appropriate level of model complexity in a more or less automatic fashion to make it less prone to human subjectivity. In this paper we present a Singular Value Decomposition-based node elimination technique and an enhanced implementation of the Optimal Brain Surgeon algorithm. Combining both methods creates a powerful pruning engine that can be used for tuning feedforward connectionist models. The performance of the proposed method is demonstrated by adjusting the structure of a multi-input multi-output model used to calibrate a six-component wind tunnel strain gage
Automated Architecture Design for Deep Neural Networks
Machine learning has made tremendous progress in recent years and received
large amounts of public attention. Though we are still far from designing a
full artificially intelligent agent, machine learning has brought us many
applications in which computers solve human learning tasks remarkably well.
Much of this progress comes from a recent trend within machine learning, called
deep learning. Deep learning models are responsible for many state-of-the-art
applications of machine learning. Despite their success, deep learning models
are hard to train, very difficult to understand, and often so complex
that training is only possible on very large GPU clusters. Lots of work has
been done on enabling neural networks to learn efficiently. However, the design
and architecture of such neural networks is often done manually through trial
and error and expert knowledge. This thesis inspects different approaches,
existing and novel, to automate the design of deep feedforward neural networks
in an attempt to create less complex models with good performance that take
away the burden of deciding on an architecture and make it more efficient to
design and train such deep networks.
Comment: Undergraduate Thesis