Dynamic Optimal Training for Competitive Neural Networks
This paper introduces an unsupervised learning algorithm for the optimal training of competitive neural networks. The learning rule of this algorithm is derived from the minimization of a new objective criterion using the gradient descent technique. Its learning rate and competition difficulty are adjusted dynamically over the iterations. Numerical results illustrating the performance of this algorithm in unsupervised pattern classification and image compression are presented, discussed, and compared with those of other well-known algorithms on several examples of real test data.
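The abstract names the ingredients (competition between units, gradient-style updates, a dynamically adjusted learning rate) without spelling out the objective criterion, so the sketch below shows only generic winner-take-all competitive learning. The unit count and the 1/(1+epoch) schedule are illustrative assumptions, not the paper's exact rule.

    import numpy as np

    def competitive_train(X, n_units=4, n_epochs=50, eta0=0.5, seed=0):
        # Winner-take-all competitive learning with a decaying learning rate.
        rng = np.random.default_rng(seed)
        W = X[rng.choice(len(X), n_units, replace=False)].copy()  # prototypes
        for epoch in range(n_epochs):
            eta = eta0 / (1.0 + epoch)  # dynamically adjusted learning rate
            for x in X[rng.permutation(len(X))]:
                j = np.argmin(np.linalg.norm(W - x, axis=1))  # winning unit
                W[j] += eta * (x - W[j])  # pull the winner toward the sample
        return W

    # Toy usage: recover two cluster centers from 2-D data.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 1.0, (100, 2)) for c in ([0, 0], [5, 5])])
    prototypes = competitive_train(X, n_units=2)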
Exploring the Potential of Flexible 8-bit Format: Design and Algorithm
Neural network quantization is widely used to reduce model inference
complexity in real-world deployments. However, traditional integer
quantization suffers from accuracy degradation when adapting to various
dynamic ranges. Recent research has focused on a new 8-bit format, FP8, with
hardware support for both training and inference of neural networks, but
guidance for hardware design is still lacking. In this paper, we analyze the
benefits of using FP8 quantization and provide a comprehensive comparison of
FP8 with INT quantization. We then propose a flexible mixed-precision
quantization framework that supports various number systems, enabling
selection of the most appropriate quantization format for different neural
network architectures. Experimental results demonstrate that our proposed
framework achieves performance competitive with full precision on various
tasks, including image classification, object detection, segmentation, and
natural language understanding. Our work furnishes critical insights into the
tangible benefits and feasibility of employing FP8 quantization, paving the
way for improved neural network efficiency in real-world scenarios. Our code
is available in the supplementary material.
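One way to make the FP8-versus-INT contrast concrete is fake quantization in NumPy: FP8's exponent bits give it roughly constant relative precision across a wide dynamic range, while INT8's absolute step size is fixed by a single per-tensor scale. The E4M3-style rounding below is a simplified sketch (it ignores NaN encoding and exact special values) and is not the paper's framework.

    import numpy as np

    def quantize_fp8_e4m3(x):
        # Round to the nearest value representable in an E4M3-style format
        # (4 exponent bits, 3 mantissa bits); simplified, no NaN handling.
        x = np.asarray(x, dtype=np.float64)
        sign, mag = np.sign(x), np.abs(x)
        out = np.zeros_like(mag)
        nz = mag > 0
        e = np.clip(np.floor(np.log2(mag[nz])), -6, 8)  # exponent range, bias 7
        step = 2.0 ** (e - 3)  # 3 mantissa bits: 8 steps per octave
        out[nz] = np.minimum(np.round(mag[nz] / step) * step, 448.0)  # max normal
        return sign * out

    def quantize_int8(x, scale):
        # Uniform symmetric INT8 quantization with one per-tensor scale.
        return np.clip(np.round(np.asarray(x) / scale), -127, 127) * scale

    w = np.array([0.001, 0.05, 1.3, 200.0])
    print(quantize_fp8_e4m3(w))                 # small values keep relative precision
    print(quantize_int8(w, scale=200.0 / 127))  # small values collapse to ~0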
Regularized Evolutionary Algorithm for Dynamic Neural Topology Search
Designing neural networks for object recognition requires considerable
architecture engineering. As a remedy, neuro-evolutionary network
architecture search, which automatically searches for optimal network
architectures using evolutionary algorithms, has recently become very
popular. Although very effective, evolutionary algorithms rely heavily on
having a large population of individuals (i.e., network architectures) and
are therefore memory-expensive. In this work, we propose a Regularized
Evolutionary Algorithm with a low memory footprint to evolve a dynamic image
classifier. In detail, we introduce novel custom operators that regularize
the evolutionary process of a micro-population of 10 individuals. We conduct
experiments on three different digit datasets (MNIST, USPS, SVHN) and show
that our evolutionary method obtains results competitive with the current
state of the art.
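The paper's custom regularizing operators are not described in the abstract, so the sketch below shows only the generic backbone such methods build on: aging-style regularized evolution (in the spirit of Real et al.) over a micro-population of 10, where the oldest individual is retired each cycle so no architecture survives on early luck. The bit-string "architectures", mutation rate, and sample size are toy assumptions.

    import random
    from collections import deque

    def regularized_evolution(fitness_fn, random_arch, mutate,
                              pop_size=10, sample_size=3, cycles=200):
        # Each cycle: sample a few individuals, mutate the fittest, and
        # retire the oldest member (the regularization step).
        population, history = deque(), []
        for _ in range(pop_size):
            arch = random_arch()
            population.append((arch, fitness_fn(arch)))
            history.append(population[-1])
        for _ in range(cycles):
            sample = random.sample(list(population), sample_size)
            parent = max(sample, key=lambda ind: ind[1])
            child = mutate(parent[0])
            population.append((child, fitness_fn(child)))
            history.append(population[-1])
            population.popleft()  # remove the oldest, not the worst
        return max(history, key=lambda ind: ind[1])

    # Toy usage: "architectures" are bit strings, fitness counts ones.
    best = regularized_evolution(
        fitness_fn=sum,
        random_arch=lambda: [random.randint(0, 1) for _ in range(16)],
        mutate=lambda a: [b ^ 1 if random.random() < 0.1 else b for b in a],
    )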
Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing
This work considers the trade-off between accuracy and test-time
computational cost of deep neural networks (DNNs) via \emph{anytime}
predictions from auxiliary predictors. Specifically, we optimize auxiliary
losses jointly in an \emph{adaptive} weighted sum, where the weights are
inversely proportional to the average of each loss. Intuitively, this
balances the losses to have the same scale. We present theoretical
considerations that motivate this approach from multiple viewpoints,
including connecting it to optimizing the geometric mean of the expectation
of each loss, an objective that ignores the scale of losses. Experimentally,
the adaptive weights induce more competitive anytime predictions on multiple
recognition datasets and models than non-adaptive approaches, including
weighting all losses equally. In particular, anytime neural networks (ANNs)
can achieve the same accuracy faster using adaptive weights on a small
network than using static constant weights on a large one. For problems with
high performance saturation, we also show that a sequence of exponentially
deepening ANNs can achieve near-optimal anytime results at any budget, at the
cost of a constant fraction of extra computation.
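The weighting rule stated in the abstract is simple enough to write down directly: scale each loss by the inverse of its running average, so all terms contribute at roughly the same magnitude; term by term, the resulting gradient matches that of log E[loss_i], i.e. the scale-free geometric-mean objective. The EMA decay and epsilon below are illustrative assumptions, not values from the paper.

    import numpy as np

    class AdaptiveLossBalancer:
        # Weight each auxiliary loss inversely to its running average.
        def __init__(self, n_losses, decay=0.99, eps=1e-8):
            self.avg = np.ones(n_losses)  # running mean of each loss
            self.decay, self.eps = decay, eps

        def combine(self, losses):
            losses = np.asarray(losses, dtype=np.float64)
            self.avg = self.decay * self.avg + (1 - self.decay) * losses
            weights = 1.0 / (self.avg + self.eps)  # inverse-average weights
            # Treating the averages as constants, the gradient of
            # sum_i loss_i / E[loss_i] matches that of sum_i log E[loss_i],
            # the geometric-mean objective, which is blind to each loss's scale.
            return float(np.sum(weights * losses))

    balancer = AdaptiveLossBalancer(n_losses=3)
    total = balancer.combine([2.5, 0.03, 40.0])  # depths produce different scales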
- …