Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask
Sparsity has become one of the promising methods to compress and accelerate
Deep Neural Networks (DNNs). Among different categories of sparsity, structured
sparsity has gained more attention due to its efficient execution on modern
accelerators. Particularly, N:M sparsity is attractive because there are
already hardware accelerator architectures that can leverage certain forms of
N:M structured sparsity to yield higher compute-efficiency. In this work, we
focus on N:M sparsity and extensively study and evaluate various training
recipes for N:M sparsity in terms of the trade-off between model accuracy and
compute cost (FLOPs). Building upon this study, we propose two new decay-based
pruning methods, namely "pruning mask decay" and "sparse structure decay". Our
evaluations indicate that these proposed methods consistently deliver
state-of-the-art (SOTA) model accuracy, comparable to unstructured sparsity, on
a Transformer-based model for a translation task. The increase in the accuracy
of the sparse model using the new training recipes comes at the cost of a
marginal increase in the total training compute (FLOPs).
Comment: 11 pages, 2 figures, and 9 tables. Published at the ICML Workshop on
Sparsity in Neural Networks Advancing Understanding and Practice, 2022. First
two authors contributed equally
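As a rough illustration of what N:M structured sparsity means, the sketch below builds a magnitude-based mask that keeps the N largest-magnitude weights in every group of M consecutive weights. The function name `nm_mask` and the sample values are hypothetical; the abstract does not describe how "pruning mask decay" or "sparse structure decay" schedule the mask, so those methods are not sketched here.

```python
def nm_mask(weights, n, m):
    """Return a 0/1 mask keeping the n largest-magnitude entries in each group of m.

    Hypothetical helper for illustration; not the training recipe from the paper.
    """
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    mask = [0] * len(weights)
    for start in range(0, len(weights), m):
        group = range(start, start + m)
        # Indices of the n entries with the largest absolute value in this group.
        keep = sorted(group, key=lambda i: abs(weights[i]), reverse=True)[:n]
        for i in keep:
            mask[i] = 1
    return mask


# Example: 2:4 sparsity on a flat list of eight weights (made-up values).
weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
mask = nm_mask(weights, n=2, m=4)
pruned = [w * k for w, k in zip(weights, mask)]
print(mask)  # [1, 0, 0, 1, 0, 1, 1, 0]
```

Every group of four weights ends up with exactly two non-zeros, which is the regular pattern that hardware with N:M support can exploit.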
A Generalized Method for Integrating Rule-based Knowledge into Inductive Methods Through Virtual Sample Creation
Hybrid learning methods use theoretical knowledge of a domain and a set of classified examples to develop a method for classification. Methods that use domain knowledge have been shown to perform better than inductive learners. However, there is no general method for including domain knowledge in all inductive learning algorithms, as all hybrid methods are highly specialized for a particular algorithm. We present an algorithm that takes domain knowledge in the form of propositional rules, generates artificial examples from the rules, and also removes instances likely to be flawed. This enriched dataset can then be used by any learning algorithm. Experimental results for different scenarios are shown that demonstrate this method to be more effective than simple inductive learning.
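The idea of enriching a dataset from propositional rules can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm from the paper: a rule is assumed to be a set of attribute/value conditions plus a class label, virtual samples are assumed to be drawn by filling unconstrained attributes at random, and a "flawed" instance is assumed to be one that satisfies a rule's conditions but carries a different label. All names and the toy data are hypothetical.

```python
import random


def matches(conditions, example):
    """True if the example satisfies every attribute/value condition of a rule."""
    return all(example[attr] == value for attr, value in conditions.items())


def virtual_samples(conditions, label, domains, count, rng):
    """Create `count` artificial examples that satisfy the rule (assumed scheme)."""
    samples = []
    for _ in range(count):
        # Fill every attribute at random, then force the rule's conditions.
        example = {attr: rng.choice(values) for attr, values in domains.items()}
        example.update(conditions)
        samples.append((example, label))
    return samples


def drop_contradicting(dataset, conditions, label):
    """Remove instances that match the rule but disagree with its label (assumed criterion)."""
    return [(x, y) for x, y in dataset if not (matches(conditions, x) and y != label)]


# Toy domain and rule: IF fever AND cough THEN flu.
domains = {"fever": [True, False], "cough": [True, False], "rash": [True, False]}
conditions, label = {"fever": True, "cough": True}, "flu"
real = [
    ({"fever": True, "cough": True, "rash": False}, "healthy"),   # contradicts the rule
    ({"fever": False, "cough": False, "rash": False}, "healthy"),
]

rng = random.Random(0)
cleaned = drop_contradicting(real, conditions, label)
enriched = cleaned + virtual_samples(conditions, label, domains, count=5, rng=rng)
```

Because the result is an ordinary list of labelled examples, it can be passed to any inductive learner without changing the learner itself.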
Spectral pruning of fully connected layers
Training of neural networks can be reformulated in spectral space, by
allowing eigenvalues and eigenvectors of the network to act as target of the
optimization instead of the individual weights. Working in this setting, we
show that the eigenvalues can be used to rank the nodes' importance within the
ensemble. Indeed, we will prove that sorting the nodes based on their
associated eigenvalues enables effective pre- and post-processing pruning
strategies to yield massively compacted networks (in terms of the number of
composing neurons) with virtually unchanged performance. The proposed methods
are tested for different architectures, with just a single or multiple hidden
layers, and against distinct classification tasks of general interest.
Comment: 16 pages, 11 figures. Sections rearranged in v
Theoretical Interpretations and Applications of Radial Basis Function Networks
Medical applications have usually used Radial Basis Function Networks (RBFNs) simply as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several ways: as Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, or Instance-Based Learners. A survey of these interpretations and of their corresponding learning algorithms is provided, as well as a brief survey of dynamic learning algorithms. The interpretations of RBFNs can suggest applications that are particularly interesting in medical domains.
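For readers unfamiliar with the model, the output of a radial basis function network is a weighted sum of basis functions centred on prototype points. The sketch below assumes Gaussian basis functions, which is one common choice; the abstract does not say which basis the survey uses, and the function name and parameter values are hypothetical.

```python
import math


def rbf_output(x, centers, widths, weights, bias=0.0):
    """Evaluate a Gaussian RBF network: bias + sum_i w_i * exp(-||x - c_i||^2 / (2 s_i^2))."""
    total = bias
    for center, width, weight in zip(centers, widths, weights):
        squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        total += weight * math.exp(-squared_distance / (2.0 * width * width))
    return total


# Two hidden units in a 2-D input space (made-up parameters).
centers = [(0.0, 0.0), (1.0, 1.0)]
widths = [0.5, 0.5]
weights = [2.0, -1.0]

y_at_first_center = rbf_output((0.0, 0.0), centers, widths, weights)
y_far_away = rbf_output((10.0, 10.0), centers, widths, weights)
```

Each hidden unit responds most strongly to inputs near its centre, which is what allows the same model to be read as a kernel estimator or an instance-based learner.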
Analyzing Learned Molecular Representations for Property Prediction
Advancements in neural machinery have led to a wide range of algorithmic
solutions for molecular property prediction. Two classes of models in
particular have yielded promising results: neural networks applied to computed
molecular fingerprints or expert-crafted descriptors, and graph convolutional
neural networks that construct a learned molecular representation by operating
on the graph structure of the molecule. However, recent literature has yet to
clearly determine which of these two methods is superior when generalizing to
new chemical space. Furthermore, prior research has rarely examined these new
models in industry research settings in comparison to existing employed models.
In this paper, we benchmark models extensively on 19 public and 16 proprietary
industrial datasets spanning a wide variety of chemical endpoints. In addition,
we introduce a graph convolutional model that consistently matches or
outperforms models using fixed molecular descriptors as well as previous graph
neural architectures on both public and proprietary datasets. Our empirical
findings indicate that while approaches based on these representations have yet
to reach the level of experimental reproducibility, our proposed model
nevertheless offers significant improvements over models currently used in
industrial workflows.
Non-linear Autoregressive Neural Networks to Forecast Short-Term Solar Radiation for Photovoltaic Energy Predictions
Nowadays, green energy is considered a viable solution to curb CO2 emissions and greenhouse effects. Indeed, it is expected that Renewable Energy Sources (RES) will cover 40% of the total energy demand by 2040. This will drive the adoption of decentralized and cooperative power distribution systems, also called smart grids. Among RES, solar energy will play a crucial role. However, reliable models and tools are needed to forecast and estimate renewable energy production with good accuracy over short-term time periods. These tools will unlock new services for smart grid management.
In this paper, we propose an innovative methodology for implementing two different non-linear autoregressive neural networks to forecast Global Horizontal Solar Irradiance (GHI) over short-term time periods (i.e., from 15 to 120 min ahead). Both neural networks have been implemented, trained, and validated using a dataset consisting of four years of solar radiation values collected by a real weather station. We also present the experimental results, discussing and comparing the accuracy of both neural networks. The resulting GHI forecast is then given as input to a Photovoltaic simulator to predict energy production over short-term time periods. Finally, we present the results of this Photovoltaic energy estimation and discuss their accuracy.
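An autoregressive forecaster of this kind is trained on pairs of past values and a future target. The sketch below shows one way such pairs could be built from a time series; it only prepares the data and does not implement the neural networks. The helper name, the number of lags, and the assumption of a 15-minute sampling interval (so that horizons 1 to 8 would correspond to 15 to 120 min) are illustrative and not taken from the paper.

```python
def make_lagged_pairs(series, n_lags, horizon):
    """Build (inputs, target) pairs for autoregressive forecasting.

    inputs: n_lags consecutive past values
    target: the value `horizon` steps after the last input
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")
    pairs = []
    last_start = len(series) - n_lags - horizon
    for t in range(last_start + 1):
        inputs = series[t:t + n_lags]
        target = series[t + n_lags + horizon - 1]
        pairs.append((inputs, target))
    return pairs


# Made-up irradiance-like values, assumed to be sampled every 15 minutes.
ghi = [0, 1, 2, 3, 4, 5, 6, 7]
pairs = make_lagged_pairs(ghi, n_lags=3, horizon=2)  # horizon 2 = 30 min ahead here
print(pairs[0])   # ([0, 1, 2], 4)
print(pairs[-1])  # ([3, 4, 5], 7)
```

A separate set of pairs can be built for each horizon, so that one model per horizon (or one multi-output model) can be trained on the same history.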
Imitation Learning for Vision-based Lane Keeping Assistance
This paper aims to investigate direct imitation learning from human drivers
for the task of lane keeping assistance in highway and country roads using
grayscale images from a single front view camera. The employed method utilizes
convolutional neural networks (CNN) to act as a policy that is driving a
vehicle. The policy is successfully learned via imitation learning using
real-world data collected from human drivers and is evaluated in closed-loop
simulated environments, demonstrating good driving behaviour and robustness
to domain changes. Evaluation is based on two proposed performance metrics
measuring how well the vehicle is positioned in a lane and the smoothness of
the driven trajectory.
Comment: International Conference on Intelligent Transportation Systems (ITSC
- …