    Direct Feedback Alignment with Sparse Connections for Local Learning

    Recent advances in deep neural networks (DNNs) owe their success to training algorithms that use backpropagation and gradient descent. Backpropagation, while highly effective on von Neumann architectures, becomes inefficient when scaled to large networks. In what is commonly referred to as the weight transport problem, each neuron depends on the weights and errors of neurons located deeper in the network, requiring exhaustive data movement that is a key obstacle to improving the performance and energy efficiency of machine-learning hardware. In this work, we propose a bio-plausible alternative to backpropagation, drawing on advances in feedback alignment algorithms, in which the error computation at a single synapse reduces to the product of three scalar values. Using a sparse feedback matrix, we show that a neuron needs only a fraction of the information previously used by feedback alignment algorithms. Consequently, memory and compute can be partitioned and distributed in whichever way produces the most efficient forward pass, so long as a single error can be delivered to each neuron. Our results show orders-of-magnitude improvement in data movement and a 2× improvement in multiply-and-accumulate operations over backpropagation. Like previous work, we observe that any variant of feedback alignment suffers significant losses in classification accuracy on deep convolutional neural networks. By transferring trained convolutional layers and training the fully connected layers with direct feedback alignment, we demonstrate that direct feedback alignment can obtain results competitive with backpropagation. Furthermore, we observe that using an extremely sparse feedback matrix, rather than a dense one, results in only a small accuracy drop while yielding hardware advantages. All code and results are available at https://github.com/bcrafton/ssdfa
    Comment: 15 pages, 8 figures
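
    As a concrete illustration of the idea, the sketch below trains a one-hidden-layer network with direct feedback alignment in NumPy: the output error is sent to the hidden layer through a fixed, sparsified random matrix rather than through the transposed forward weights. Layer sizes, the 10% feedback density, and all variable names are illustrative assumptions, not the authors' reference implementation (see the repository above for that).

        import numpy as np

        rng = np.random.default_rng(0)

        n_in, n_hid, n_out = 784, 256, 10
        W1 = rng.normal(0, 0.05, (n_in, n_hid))   # forward weights, layer 1
        W2 = rng.normal(0, 0.05, (n_hid, n_out))  # forward weights, layer 2

        # Fixed random feedback matrix that carries the output error straight
        # to the hidden layer; masking most entries makes it sparse, so each
        # neuron's error reduces to a product of a few scalar values.
        B = rng.normal(0, 0.05, (n_out, n_hid))
        B *= rng.random(B.shape) < 0.1            # keep ~10% of connections

        def dfa_step(x, y_onehot, lr=0.01):
            """One training step on a batch x with one-hot labels."""
            global W1, W2
            a1 = x @ W1
            h = np.maximum(a1, 0.0)               # ReLU hidden layer
            logits = h @ W2
            p = np.exp(logits - logits.max(axis=1, keepdims=True))
            p /= p.sum(axis=1, keepdims=True)     # softmax
            e = p - y_onehot                      # output error

            # DFA: the hidden error arrives through the fixed sparse B, never
            # W2.T, so no forward weights are "transported" backwards.
            dh = (e @ B) * (a1 > 0)

            W2 -= lr * h.T @ e / len(x)
            W1 -= lr * x.T @ dh / len(x)

    Because the hidden-layer error is just e @ B, the backward pass never touches W2, which is what removes the weight transport requirement and lets memory and compute be laid out for the forward pass alone.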

    Improving neural networks by preventing co-adaptation of feature detectors

    When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random "dropout" gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
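
    A minimal sketch of this procedure in NumPy, assuming the 50% drop rate described above; the test-time scaling shown is equivalent to the paper's recipe of halving the outgoing weights:

        import numpy as np

        rng = np.random.default_rng(0)

        def dropout(h, p_drop=0.5, train=True):
            """Randomly zero each feature detector with probability p_drop
            during training; at test time keep all units and scale by
            (1 - p_drop), equivalent to halving the outgoing weights."""
            if train:
                mask = rng.random(h.shape) >= p_drop  # keep roughly half
                return h * mask
            return h * (1.0 - p_drop)

        # e.g. h = dropout(hidden_activations)               # training pass
        #      h = dropout(hidden_activations, train=False)  # test pass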

    Review of Deep Learning Algorithms and Architectures

    Deep learning (DL) is playing an increasingly important role in our lives. It has already made a huge impact in areas such as cancer diagnosis, precision medicine, self-driving cars, predictive forecasting, and speech recognition. The painstakingly handcrafted feature extractors used in traditional learning, classification, and pattern recognition systems do not scale to large data sets. In many cases, depending on the problem complexity, DL can also overcome the limitations of earlier shallow networks that prevented efficient training and the abstraction of hierarchical representations of multi-dimensional training data. A deep neural network (DNN) uses multiple (deep) layers of units with highly optimized algorithms and architectures. This paper reviews several optimization methods that improve training accuracy and reduce training time. We delve into the math behind the training algorithms used in recent deep networks, and we describe current shortcomings, enhancements, and implementations. The review also covers different types of deep architectures, such as deep convolutional networks, deep residual networks, recurrent neural networks, reinforcement learning, variational autoencoders, and others.
    https://doi.org/10.1109/ACCESS.2019.291220

    Solar Irradiance Forecasting Using Dynamic Ensemble Selection

    Solar irradiance forecasting is an essential topic in renewable energy generation. Forecasting is an important task because it can improve the planning and operation of photovoltaic systems, resulting in economic advantages. Traditionally, single models are employed for this task. However, selecting an inappropriate model, misspecification, or random fluctuations in the solar irradiance series can cause this approach to underperform. This paper proposes a heterogeneous ensemble dynamic selection model, named HetDS, to forecast solar irradiance. For each unseen test pattern, HetDS chooses the most suitable forecasting model from a pool of seven well-known methods from the literature: ARIMA, support vector regression (SVR), multilayer perceptron neural network (MLP), extreme learning machine (ELM), deep belief network (DBN), random forest (RF), and gradient boosting (GB). The experimental evaluation was performed on four data sets of hourly solar irradiance measurements in Brazil. The proposed model attained overall accuracy superior to that of the single models in terms of five well-known error metrics.
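
    The per-pattern selection step might look like the sketch below. The abstract does not spell out HetDS's competence criterion, so scoring each pool member by its error on the k nearest validation patterns, a common dynamic-selection rule, is an assumption here, as are all names and parameters:

        import numpy as np

        def dynamic_select(models, X_val, y_val, x_test, k=5):
            """Pick the pool member with the lowest local error near x_test.

            models : list of fitted regressors with a .predict(X) method,
                     standing in for the SVR, MLP, RF, ... pool named above.
            X_val, y_val : held-out patterns used to judge local competence.
            x_test : one test pattern (1-D array of lagged measurements).
            """
            # Region of competence: the k validation patterns nearest x_test.
            dists = np.linalg.norm(X_val - x_test, axis=1)
            region = np.argsort(dists)[:k]

            # Score every model by mean absolute error on that region and
            # forecast with the locally most accurate one.
            errors = [np.mean(np.abs(m.predict(X_val[region]) - y_val[region]))
                      for m in models]
            best = models[int(np.argmin(errors))]
            return best.predict(x_test[None, :])[0]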