
    Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

    Sparsity has become one of the promising methods to compress and accelerate Deep Neural Networks (DNNs). Among the different categories of sparsity, structured sparsity has gained more attention due to its efficient execution on modern accelerators. N:M sparsity is particularly attractive because hardware accelerator architectures already exist that can leverage certain forms of N:M structured sparsity for higher compute efficiency. In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for N:M sparsity in terms of the trade-off between model accuracy and compute cost (FLOPs). Building upon this study, we propose two new decay-based pruning methods, namely "pruning mask decay" and "sparse structure decay". Our evaluations indicate that these proposed methods consistently deliver state-of-the-art (SOTA) model accuracy, comparable to unstructured sparsity, on a Transformer-based model for a translation task. The increase in the accuracy of the sparse model with the new training recipes comes at the cost of a marginal increase in the total training compute (FLOPs). Comment: 11 pages, 2 figures, and 9 tables. Published at the ICML Workshop on Sparsity in Neural Networks: Advancing Understanding and Practice, 2022. First two authors contributed equally.
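    The paper's exact recipes are not reproduced here, but the core mechanism of an N:M mask combined with a gradual decay can be sketched as follows. This is an illustrative simplification, assuming a 2:4 pattern and a linear decay schedule; the helper names and the schedule are not the authors'.

```python
# Illustrative sketch of N:M (here 2:4) structured sparsity with a gradual
# "mask decay": pruned weights are scaled by a decaying factor instead of
# being zeroed immediately. Hypothetical helpers, not the paper's code.
import numpy as np

def nm_mask(weights: np.ndarray, n: int = 2, m: int = 4) -> np.ndarray:
    """Binary mask keeping the n largest-magnitude weights in every group of m."""
    flat = weights.reshape(-1, m)                    # group consecutive weights
    keep = np.argsort(-np.abs(flat), axis=1)[:, :n]  # indices of the n largest per group
    mask = np.zeros_like(flat)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return mask.reshape(weights.shape)

def apply_decayed_mask(weights, mask, step, total_steps):
    """Scale pruned weights by a factor that decays linearly from 1 to 0."""
    decay = max(0.0, 1.0 - step / total_steps)
    return weights * (mask + (1.0 - mask) * decay)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))
m = nm_mask(w)
for step in (0, 500, 1000):
    w_eff = apply_decayed_mask(w, m, step, total_steps=1000)
    pruned_mean = np.abs(w_eff[m == 0]).mean()       # magnitude of "pruned" weights
    print(f"step {step}: mean |pruned weight| = {pruned_mean:.3f}")
```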

    A Generalized Method for Integrating Rule-based Knowledge into Inductive Methods Through Virtual Sample Creation

    Hybrid learning methods use theoretical knowledge of a domain and a set of classified examples to develop a method for classification. Methods that use domain knowledge have been shown to perform better than purely inductive learners. However, there is no general method for including domain knowledge in all inductive learning algorithms, as existing hybrid methods are highly specialized for a particular algorithm. We present an algorithm that takes domain knowledge in the form of propositional rules, generates artificial examples from the rules, and also removes instances likely to be flawed. The enriched dataset can then be used by any learning algorithm. Experimental results for different scenarios demonstrate that this method is more effective than simple inductive learning.
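    As a rough illustration of the virtual-sample idea (not the paper's exact procedure), the sketch below fixes the feature values mentioned by each propositional rule, randomizes the remaining binary features, and labels the sample with the rule's class. The rule format and feature names are hypothetical.

```python
# Minimal sketch: generate virtual training samples from propositional rules,
# assuming binary features. Illustrative only; not the paper's algorithm.
import random

# Each rule: (required feature values, class label)
rules = [
    ({"fever": 1, "cough": 1}, "flu"),
    ({"fever": 0, "sneezing": 1}, "allergy"),
]
features = ["fever", "cough", "sneezing", "headache"]

def virtual_samples(rules, features, per_rule=5, seed=0):
    rng = random.Random(seed)
    samples = []
    for conditions, label in rules:
        for _ in range(per_rule):
            # Fix the features mentioned by the rule, randomize the rest.
            x = {f: conditions.get(f, rng.randint(0, 1)) for f in features}
            samples.append((x, label))
    return samples

for x, y in virtual_samples(rules, features, per_rule=2):
    print(x, "->", y)
```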

    Spectral pruning of fully connected layers

    Training of neural networks can be reformulated in spectral space, by allowing the eigenvalues and eigenvectors of the network to act as the targets of the optimization instead of the individual weights. Working in this setting, we show that the eigenvalues can be used to rank the nodes' importance within the ensemble. Indeed, we prove that sorting the nodes based on their associated eigenvalues enables effective pre- and post-processing pruning strategies that yield massively compacted networks (in terms of the number of composing neurons) with virtually unchanged performance. The proposed methods are tested on different architectures, with a single or multiple hidden layers, and against distinct classification tasks of general interest. Comment: 16 pages, 11 figures. Sections rearranged in v
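    A heavily simplified sketch of the spectral-pruning idea follows: each hidden node carries an eigenvalue-like scalar that scales its activation, nodes are ranked by the magnitude of that scalar, and the lowest-ranked nodes are dropped. This is an assumption-laden stand-in for the paper's actual spectral parametrization, for illustration only.

```python
# Post-training pruning of a one-hidden-layer network by eigenvalue-like
# node importances (simplified; not the paper's spectral formulation).
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 10, 32, 3
W1 = rng.normal(size=(n_hidden, n_in))      # input -> hidden
W2 = rng.normal(size=(n_out, n_hidden))     # hidden -> output
lam = rng.normal(size=n_hidden)             # spectral-style node importances

def forward(x, W1, W2, lam):
    h = np.tanh(W1 @ x)
    return W2 @ (lam * h)                   # eigenvalue-like scaling per node

def prune_nodes(W1, W2, lam, keep_ratio=0.5):
    k = int(len(lam) * keep_ratio)
    keep = np.argsort(-np.abs(lam))[:k]     # keep nodes with largest |lambda|
    return W1[keep], W2[:, keep], lam[keep]

x = rng.normal(size=n_in)
W1p, W2p, lamp = prune_nodes(W1, W2, lam)
print(forward(x, W1, W2, lam).shape, forward(x, W1p, W2p, lamp).shape)
```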

    Theoretical Interpretations and Applications of Radial Basis Function Networks

    Medical applications have usually treated Radial Basis Function Networks (RBFNs) simply as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several ways: as Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, or Instance-Based Learners. A survey of these interpretations and of their corresponding learning algorithms is provided, as well as a brief survey of dynamic learning algorithms. RBFNs' interpretations can suggest applications that are particularly interesting in medical domains.
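    To make the "Artificial Neural Network" reading of an RBFN concrete, here is a minimal sketch: Gaussian units centred on a fixed grid followed by a linear output layer fitted by least squares. The centre placement, width, and toy regression data are illustrative choices, not drawn from the paper.

```python
# Minimal RBFN: Gaussian hidden units plus a least-squares linear output layer.
import numpy as np

def rbf_design(X, centres, width):
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))   # one Gaussian response per centre

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

centres = np.linspace(-3, 3, 10)[:, None]          # fixed centres on a grid
Phi = rbf_design(X, centres, width=0.8)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)        # linear output weights

pred = Phi @ w
print("train MSE:", float(np.mean((pred - y) ** 2)))
```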

    Analyzing Learned Molecular Representations for Property Prediction

    Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors, and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to the models already in use. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial datasets spanning a wide variety of chemical endpoints. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors, as well as previous graph neural architectures, on both public and proprietary datasets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.
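    The graph-convolution idea the benchmark builds on can be illustrated with a toy example: atom features are aggregated from neighbouring atoms via the molecular adjacency matrix and then summed into a molecule-level vector for a property head. This generic sketch is not the authors' message-passing architecture; all shapes and parameters are placeholders.

```python
# Toy graph convolution over a 5-atom ring molecule with random features.
import numpy as np

rng = np.random.default_rng(3)
n_atoms, d_in, d_hidden = 5, 8, 16
X = rng.normal(size=(n_atoms, d_in))              # per-atom features
A = np.zeros((n_atoms, n_atoms))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # ring-shaped bond structure
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

W_msg = rng.normal(size=(d_in, d_hidden)) * 0.1
W_out = rng.normal(size=(d_hidden, 1)) * 0.1

def graph_conv_predict(X, A, W_msg, W_out):
    A_hat = A + np.eye(len(A))                    # include self-connections
    H = np.maximum(A_hat @ X @ W_msg, 0.0)        # aggregate neighbours, ReLU
    readout = H.sum(axis=0)                       # molecule-level representation
    return (readout @ W_out).item()               # scalar property prediction

print(graph_conv_predict(X, A, W_msg, W_out))
```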

    Non-linear Autoregressive Neural Networks to Forecast Short-Term Solar Radiation for Photovoltaic Energy Predictions

    Nowadays, green energy is considered a viable solution to curb CO2 emissions and the greenhouse effect. Indeed, Renewable Energy Sources (RES) are expected to cover 40% of the total energy demand by 2040. This will advance decentralized and cooperative power distribution systems, also called smart grids. Among RES, solar energy will play a crucial role. However, reliable models and tools are needed to forecast and estimate renewable energy production with good accuracy over short time horizons. Such tools will unlock new services for smart grid management. In this paper, we propose an innovative methodology for implementing two different non-linear autoregressive neural networks to forecast Global Horizontal Solar Irradiance (GHI) over short-term horizons (i.e. 15 to 120 minutes ahead). Both neural networks have been implemented, trained, and validated on a dataset consisting of four years of solar radiation values collected by a real weather station. We present the experimental results, discussing and comparing the accuracy of both neural networks. The resulting GHI forecast is then given as input to a photovoltaic simulator to predict energy production over the same short-term horizons. Finally, we present the results of this photovoltaic energy estimation, also discussing their accuracy.
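    The non-linear autoregressive setup can be sketched as follows: the next irradiance value is predicted from the previous few observations through a small non-linear model. The synthetic irradiance series, the number of lags, and the random-feature regressor below are all illustrative assumptions, not the paper's network or data.

```python
# Non-linear autoregressive (NAR) sketch: predict the next value of a
# synthetic irradiance-like series from its last `lags` observations.
import numpy as np

def make_nar_dataset(series, lags):
    """Sliding windows of past values as inputs, next value as target."""
    X = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
    y = series[lags:]
    return X, y

rng = np.random.default_rng(4)
t = np.arange(2000)                                  # 15-minute steps
ghi = np.clip(np.sin(2 * np.pi * t / 96), 0, None) + 0.05 * rng.normal(size=t.size)

X, y = make_nar_dataset(ghi, lags=8)
# Tiny non-linear model: fixed random hidden layer + least-squares readout.
W_h = rng.normal(size=(8, 64))
H = np.tanh(X @ W_h)
w_out, *_ = np.linalg.lstsq(H, y, rcond=None)
pred = H @ w_out
print("one-step RMSE:", float(np.sqrt(np.mean((pred - y) ** 2))))
```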

    Imitation Learning for Vision-based Lane Keeping Assistance

    This paper investigates direct imitation learning from human drivers for the task of lane keeping assistance on highway and country roads, using grayscale images from a single front-view camera. The employed method uses convolutional neural networks (CNNs) as a policy that drives the vehicle. The policy is successfully learned via imitation learning on real-world data collected from human drivers and is evaluated in closed-loop simulated environments, demonstrating good driving behaviour and robustness to domain changes. Evaluation is based on two proposed performance metrics measuring how well the vehicle is positioned in the lane and the smoothness of the driven trajectory. Comment: International Conference on Intelligent Transportation Systems (ITSC)
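    A behavioural-cloning sketch of this setup is shown below: a small CNN maps a grayscale front-camera frame to a steering command and is regressed onto recorded human steering. The architecture, image resolution, and randomly generated "demonstrations" are placeholders, not the paper's network or dataset.

```python
# Behavioural cloning of a steering policy from (frame, steering) pairs.
import torch
import torch.nn as nn

policy = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=5, stride=2), nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=5, stride=2), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 13 * 13, 64), nn.ReLU(),
    nn.Linear(64, 1),                       # steering command
)

# Stand-in "human driver" data: 32 grayscale 64x64 frames and steering angles.
frames = torch.randn(32, 1, 64, 64)
steering = torch.randn(32, 1)

opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for _ in range(5):                          # a few imitation-learning steps
    opt.zero_grad()
    loss = loss_fn(policy(frames), steering)
    loss.backward()
    opt.step()
print("imitation loss:", loss.item())
```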