Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

Author: Agrawal Shivani
Evci Utku
Kao Sheng-Chun
Krishna Tushar
Subramanian Suvinay
Yazdanbakhsh Amir
Publication date: 15/09/2022

Sparsity has become one of the promising methods to compress and accelerate Deep Neural Networks (DNNs). Among the different categories of sparsity, structured sparsity has gained more attention because it executes efficiently on modern accelerators. In particular, N:M sparsity is attractive because existing hardware accelerator architectures can already leverage certain forms of N:M structured sparsity to yield higher compute efficiency. In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for it in terms of the trade-off between model accuracy and compute cost (FLOPs). Building upon this study, we propose two new decay-based pruning methods, namely "pruning mask decay" and "sparse structure decay". Our evaluations indicate that these methods consistently deliver state-of-the-art (SOTA) model accuracy, comparable to unstructured sparsity, on a Transformer-based model for a translation task. The increase in the accuracy of the sparse model using the new training recipes comes at the cost of a marginal increase in the total training compute (FLOPs). Comment: 11 pages, 2 figures, and 9 tables. Published at the ICML Workshop on Sparsity in Neural Networks Advancing Understanding and Practice, 2022. First two authors contributed equally.
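To make the N:M pattern in the abstract above concrete, here is a minimal NumPy sketch: within every group of M consecutive weights, only the N largest-magnitude entries are kept. The `decayed_mask` helper is a loose illustration of the idea of decaying a pruning mask instead of zeroing weights at once. Its linear schedule and both function names are assumptions for illustration, not the authors' recipe.

```python
import numpy as np

def nm_mask(weights, n=2, m=4):
    """Binary mask keeping the n largest-magnitude entries in each group of m."""
    flat = weights.reshape(-1, m)  # assumes weights.size is divisible by m
    # Column indices of the n largest |w| in every group (row).
    keep = np.argsort(-np.abs(flat), axis=1)[:, :n]
    mask = np.zeros_like(flat, dtype=float)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return mask.reshape(weights.shape)

def decayed_mask(weights, step, total_steps, n=2, m=4):
    """Soft mask: pruned positions fade from 1 to 0 (assumed linear schedule)."""
    hard = nm_mask(weights, n, m)
    beta = max(0.0, 1.0 - step / total_steps)  # 1 at step 0, 0 at the end
    return hard + (1.0 - hard) * beta

w = np.array([[0.9, -0.1, 0.4, 0.05],
              [-0.2, 0.7, 0.3, -0.8]])
print(nm_mask(w))               # 2:4 pattern: two nonzeros per group of four
print(decayed_mask(w, 5, 10))   # pruned positions scaled by 0.5 halfway through
```

Multiplying the weights by the soft mask during training lets pruned weights shrink gradually. By the last step the soft mask equals the hard 2:4 mask.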

arXiv.org e-Print Archive

A Generalized Method for Integrating Rule-based Knowledge into Inductive Methods Through Virtual Sample Creation

Author: Iqbal Ridwan Al
Publication date: 25/01/2011

Hybrid learning methods use theoretical knowledge of a domain and a set of classified examples to develop a method for classification. Methods that use domain knowledge have been shown to perform better than purely inductive learners. However, there is no general method for incorporating domain knowledge into all inductive learning algorithms, as all hybrid methods are highly specialized for a particular algorithm. We present an algorithm that takes domain knowledge in the form of propositional rules, generates artificial examples from the rules, and also removes instances likely to be flawed. This enriched dataset can then be used by any learning algorithm. Experimental results for different scenarios demonstrate that this method is more effective than simple inductive learning.
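The abstract above does not spell out the procedure, so the following is only a hypothetical sketch of the general idea. A propositional rule is represented as a pair of required attribute values and a class label. Virtual samples are drawn by fixing the rule's attributes and sampling the rest, and training instances that contradict the rule are dropped. The function names, the rule encoding, and the uniform sampling are all assumptions.

```python
import random

def virtual_samples(rule, domains, k, seed=0):
    """Generate k artificial examples that satisfy a rule (hypothetical encoding)."""
    conditions, label = rule
    rng = random.Random(seed)
    samples = []
    for _ in range(k):
        # Sample every attribute uniformly, then force the rule's conditions.
        row = {attr: rng.choice(values) for attr, values in domains.items()}
        row.update(conditions)
        samples.append((row, label))
    return samples

def drop_contradictions(data, rule):
    """Remove instances that match the rule's conditions but carry another label."""
    conditions, label = rule

    def contradicts(row, y):
        matches = all(row.get(a) == v for a, v in conditions.items())
        return matches and y != label

    return [(row, y) for row, y in data if not contradicts(row, y)]

domains = {"outlook": ["sunny", "rainy"], "windy": [True, False]}
rule = ({"outlook": "rainy"}, "stay_in")  # IF outlook = rainy THEN stay_in
data = [({"outlook": "rainy", "windy": False}, "go_out"),   # contradicts the rule
        ({"outlook": "sunny", "windy": True}, "go_out")]
enriched = drop_contradictions(data, rule) + virtual_samples(rule, domains, k=3)
```

The `enriched` list is ordinary labeled data, so it can be passed to any learner without changing the learning algorithm itself.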

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive

Spectral pruning of fully connected layers

Author: Buffoni Lorenzo
Chicchi Lorenzo
Civitelli Enrico
Fanelli Duccio
Giambagli Lorenzo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/08/2021

Training of neural networks can be reformulated in spectral space by allowing the eigenvalues and eigenvectors of the network to act as the target of the optimization instead of the individual weights. Working in this setting, we show that the eigenvalues can be used to rank the nodes' importance within the ensemble. Indeed, we will prove that sorting the nodes by their associated eigenvalues enables effective pre- and post-processing pruning strategies that yield massively compacted networks (in terms of the number of composing neurons) with virtually unchanged performance. The proposed methods are tested on different architectures, with a single or multiple hidden layers, and on distinct classification tasks of general interest. Comment: 16 pages, 11 figures. Sections rearranged in v

arXiv.org e-Print Archive

PubMed Central

Repository of the University of Namur

Theoretical Interpretations and Applications of Radial Basis Function Networks

Author: Blanzieri Enrico
Publication date: 01/05/2003

Medical applications have usually used Radial Basis Function Networks (RBFNs) merely as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several ways: as Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, or Instance-Based Learners. A survey of these interpretations and of their corresponding learning algorithms is provided, as well as a brief survey of dynamic learning algorithms. The interpretations of RBFNs can suggest applications that are particularly interesting in medical domains.
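For readers unfamiliar with the model, a minimal sketch of an RBFN forward pass follows. It assumes Gaussian basis functions, which is the most common choice but is not stated in the abstract. The function and parameter names are illustrative.

```python
import numpy as np

def rbf_forward(x, centers, widths, out_weights):
    """Output of an RBF network: a weighted sum of Gaussian bumps around centers."""
    # Squared distance from every input to every center: (n_samples, n_centers).
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * widths ** 2))  # hidden-layer activations
    return phi @ out_weights                 # linear output layer

centers = np.array([[0.0, 0.0], [10.0, 10.0]])
widths = np.array([1.0, 1.0])
out_weights = np.array([2.0, 3.0])
x = np.array([[0.0, 0.0], [10.0, 10.0]])
y = rbf_forward(x, centers, widths, out_weights)
```

Each hidden unit responds only near its center. This locality is what allows the same network to be read as a kernel estimator or an instance-based learner, as the abstract notes.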

Unitn-eprints Research

Analyzing Learned Molecular Representations for Property Prediction

Author: Barzilay Regina
Coley Connor
Eiden Philipp
Gao Hua
Guzman-Perez Angel
Hopper Timothy
Jaakkola Tommi
Jensen Klavs
Jin Wengong
Kelley Brian
Mathea Miriam
Palmer Andrew
Settels Volker
Swanson Kyle
Yang Kevin
Publication date: 03/04/2019

Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors, and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to existing employed models. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial datasets spanning a wide variety of chemical endpoints. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary datasets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.

arXiv.org e-Print Archive

DSpace@MIT

Crossref

FigShare

Non-linear Autoregressive Neural Networks to Forecast Short-Term Solar Radiation for Photovoltaic Energy Predictions

Author: A Madanchi
A Qazi
A Tealab
AG Expósito
AK Yadav
AS Weigend
C Voyant
CA Gueymard
CJ Willmott
DC Montgomery
DP Mandic
DR Legates
E Dickinson
G Chandrashekar
GE Box
H Rahimi-Eichi
H Xing
HT Siegelmann
IH Witten
J Aghaei
JD Hamilton
JS Vardakas
L Bottaccioli
LK Hansen
M Hosenuzzaman
M Kubat
M Norgaard
N Srivastava
P Refaeilzadeh
P Siano
PJ Brockwell
R Rajamani
S Haykin
S Makridakis
S Rajakaruna
S Weckx
V Badescu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Green energy is now considered a viable way to curb CO2 emissions and greenhouse effects. Indeed, Renewable Energy Sources (RES) are expected to cover 40% of total energy demand by 2040. This will advance decentralized and cooperative power distribution systems, also called smart grids. Among RES, solar energy will play a crucial role. However, reliable models and tools are needed to forecast and estimate renewable energy production with good accuracy over short-term periods. These tools will unlock new services for smart grid management. In this paper, we propose an innovative methodology for implementing two different non-linear autoregressive neural networks to forecast Global Horizontal Solar Irradiance (GHI) over short-term horizons (i.e., from 15 to 120 min ahead). Both neural networks have been implemented, trained, and validated using a dataset of four years of solar radiation values collected by a real weather station. We also present the experimental results, discussing and comparing the accuracy of both neural networks. The resulting GHI forecast is then given as input to a photovoltaic simulator to predict energy production over short-term periods. Finally, we present the results of this photovoltaic energy estimation and discuss their accuracy.
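An autoregressive forecaster of the kind described above predicts a future value from a window of past values. The sketch below shows only the data-preparation step: turning a series into lagged input windows and targets at a chosen horizon. The helper name `make_lagged`, the window length, and the toy series are assumptions. The networks trained on such pairs are not reproduced here.

```python
import numpy as np

def make_lagged(series, n_lags, horizon=1):
    """Build (X, y): each row of X holds n_lags past values, y lies horizon steps ahead."""
    series = np.asarray(series, dtype=float)
    n = len(series) - n_lags - horizon + 1
    if n <= 0:
        raise ValueError("series too short for the requested lags and horizon")
    X = np.stack([series[i:i + n_lags] for i in range(n)])
    first_target = n_lags + horizon - 1  # index of the target for the first window
    y = series[first_target:first_target + n]
    return X, y

# Toy series standing in for 15-minute irradiance readings (values are made up).
ghi = np.arange(10.0)
X, y = make_lagged(ghi, n_lags=3, horizon=2)  # horizon=2 means 30 min ahead here
```

With 15-minute sampling, horizons of 1 to 8 steps would span the 15 to 120 min range mentioned in the abstract. Any regressor, neural or otherwise, can then be fitted on `X` and `y`.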

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Imitation Learning for Vision-based Lane Keeping Assistance

Author: Innocenti Christopher
Lindén Henrik
Mohammadiha Nasser
Panahandeh Ghazaleh
Svensson Lennart
Publication date: 12/09/2017

This paper investigates direct imitation learning from human drivers for the task of lane keeping assistance on highway and country roads, using grayscale images from a single front-view camera. The method uses a convolutional neural network (CNN) as the policy that drives the vehicle. The policy is successfully learned via imitation learning using real-world data collected from human drivers and is evaluated in closed-loop simulated environments, demonstrating good driving behaviour and robustness to domain changes. Evaluation is based on two proposed performance metrics that measure how well the vehicle is positioned in its lane and the smoothness of the driven trajectory. Comment: International Conference on Intelligent Transportation Systems (ITSC

arXiv.org e-Print Archive

Chalmers Research