
    A new Sigma-Pi-Sigma neural network based on L_1 and L_2 regularization and applications

    As one of the important higher-order neural networks developed in the last decade, the Sigma-Pi-Sigma neural network has more powerful nonlinear mapping capabilities than other popular neural networks. This paper is concerned with a new Sigma-Pi-Sigma neural network trained by a batch gradient method with combined L_1 and L_2 regularization; numerical experiments on classification and regression problems show that the proposed algorithm is effective and has better properties compared with other classical penalization methods. The proposed model combines the sparsity-inducing tendency of the L_1 norm with the efficiency benefits of the L_2 norm, which regulates the complexity of the network and prevents overfitting. Moreover, the numerical oscillation induced by the non-differentiability of the L_1 plus L_2 regularization term at the origin is eliminated by a smoothing technique that approximates the objective function.
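    A minimal sketch of the regularization idea described above: the snippet assumes a generic weight vector w, a sqrt(w^2 + eps) smoothing of the L_1 term, and illustrative penalty weights lam1/lam2; the paper's actual smoothing function and Sigma-Pi-Sigma network structure are not given in the abstract, so this is only indicative.

```python
import numpy as np

def smoothed_l1_l2_penalty(w, lam1=1e-3, lam2=1e-3, eps=1e-6):
    """Combined penalty: a smoothed L1 term (differentiable at the origin) plus squared L2."""
    smooth_l1 = np.sum(np.sqrt(w ** 2 + eps))  # smooth surrogate for sum(|w_i|)
    l2 = np.sum(w ** 2)
    return lam1 * smooth_l1 + lam2 * l2

def penalty_gradient(w, lam1=1e-3, lam2=1e-3, eps=1e-6):
    """Gradient of the smoothed penalty, added to the data-fitting gradient in each batch step."""
    return lam1 * w / np.sqrt(w ** 2 + eps) + 2.0 * lam2 * w

def batch_gradient_step(w, data_grad, lr=0.01):
    """One batch gradient update with the regularization gradient included."""
    return w - lr * (data_grad + penalty_gradient(w))
```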

    Theoretical Explanation of Activation Sparsity through Flat Minima and Adversarial Robustness

    A recent empirical observation of activation sparsity in MLP layers offers an opportunity to drastically reduce computation costs for free. Although several works attribute it to training dynamics, theoretical explanations of the emergence of activation sparsity are restricted to shallow networks, small numbers of training steps, and modified training protocols, even though the sparsity has been found in deep models trained with vanilla protocols for large numbers of steps. To fill these three gaps, we propose the notion of gradient sparsity as the source of activation sparsity, together with a theoretical explanation in which gradient sparsity, and then activation sparsity, arise as necessary steps towards adversarial robustness w.r.t. hidden features and parameters, which is approximately the flatness of minima for well-learned models. The theory applies to standardly trained LayerNorm-ed pure MLPs, and further to Transformers or other architectures if noise is added to the weights during training. To eliminate other sources of flatness when arguing the necessity of sparsity, we identify the phenomenon of spectral concentration, i.e., the ratio between the largest and the smallest non-zero singular values of the weight matrices is small. We use random matrix theory (RMT) as a powerful theoretical tool to analyze stochastic gradient noise and discuss the emergence of spectral concentration. With these insights, we propose two plug-and-play modules for both training from scratch and sparsity finetuning, as well as one more radical modification that applies only to from-scratch training. Another module, targeting both sparsity and flatness but still under testing, also follows from our theories. Validation experiments are conducted to verify our explanation, and practical experiments demonstrate the modifications' improvement in sparsity, indicating further potential cost reductions in both training and inference.
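    Two of the quantities named above, activation sparsity and spectral concentration, can be measured with a short sketch; the functions below are illustrative utilities only, not the paper's proposed training modules.

```python
import numpy as np

def activation_sparsity(activations, tol=1e-6):
    """Fraction of (near-)zero entries in a layer's post-activation output."""
    return float(np.mean(np.abs(activations) <= tol))

def spectral_concentration(weight, tol=1e-10):
    """Ratio of the largest to the smallest non-zero singular value of a weight matrix;
    a value close to 1 means the non-zero spectrum is tightly concentrated."""
    s = np.linalg.svd(weight, compute_uv=False)
    s_nonzero = s[s > tol]
    return float(s_nonzero.max() / s_nonzero.min())

# Toy example: a random MLP layer with ReLU activations
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512))
x = rng.normal(size=(32, 512))
h = np.maximum(x @ W.T, 0.0)
print(activation_sparsity(h), spectral_concentration(W))
```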

    Tree-Structure Expectation Propagation for LDPC Decoding over the BEC

    We present the tree-structure expectation propagation (Tree-EP) algorithm to decode low-density parity-check (LDPC) codes over discrete memoryless channels (DMCs). EP generalizes belief propagation (BP) in two ways. First, it can be used with any exponential-family distribution over the cliques in the graph. Second, it can impose additional constraints on the marginal distributions. We use this second property to impose pairwise marginal constraints over pairs of variables connected to a check node of the LDPC code's Tanner graph. Thanks to these additional constraints, the Tree-EP marginal estimates for each variable in the graph are more accurate than those provided by BP. We also reformulate the Tree-EP algorithm for the binary erasure channel (BEC) as a peeling-type algorithm (TEP) and show that the algorithm has the same computational complexity as BP while decoding a higher fraction of errors. We describe the TEP decoding process by a set of differential equations that represent the expected evolution of the residual graph as a function of the code parameters. The solution of these equations is used to predict the TEP decoder performance over the BEC in both the asymptotic and the finite-length regimes. While the asymptotic threshold of the TEP decoder is the same as that of the BP decoder for regular and optimized codes, we propose a scaling law (SL) for finite-length LDPC codes that accurately approximates the improved performance of the TEP decoder and facilitates its optimization.
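    As a point of reference for the peeling-type reformulation mentioned above, the sketch below implements the standard BP peeling decoder over the BEC, i.e. the baseline that the TEP decoder extends by also handling check nodes with two erased variables; the interface (a dense 0/1 parity-check matrix and None for erasures) is an illustrative assumption.

```python
import numpy as np

def peeling_decode(H, y):
    """Peeling decoder for the BEC: repeatedly solve check equations with one erasure.
    This is the standard BP peeling baseline; TEP additionally processes checks with
    two erased variables, which is not shown here.
    H: 0/1 parity-check matrix (numpy array); y: received word with None marking erasures."""
    x = list(y)
    erased = {i for i, v in enumerate(x) if v is None}
    progress = True
    while erased and progress:
        progress = False
        for row in H:
            positions = np.flatnonzero(row)
            unknown = [i for i in positions if i in erased]
            if len(unknown) == 1:
                known_sum = sum(x[i] for i in positions if i not in erased)
                x[unknown[0]] = known_sum % 2   # parity constraint fixes the erased bit
                erased.discard(unknown[0])
                progress = True
    return x, erased  # residual erasures are non-empty if the decoder stalls
```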

    Design and Management of Manufacturing Systems

    Although the design and management of manufacturing systems have been explored in the literature for many years, they remain topical problems in current scientific research. Changing market trends, globalization, constant pressure to reduce production costs, and technical and technological progress make it necessary to search for new manufacturing methods and new ways of organizing them, and to modify manufacturing system design paradigms. This book presents current research in different areas connected with the design and management of manufacturing systems and covers subject areas such as: methods supporting the design of manufacturing systems, methods of improving maintenance processes in companies, the design and improvement of manufacturing processes, the control of production processes in modern manufacturing systems, production methods and techniques used in modern manufacturing systems, and the environmental aspects of production and their impact on the design and management of manufacturing systems. The wide range of research findings reported in this book confirms that the design of manufacturing systems is a complex problem, and that achieving the goals set for modern manufacturing systems requires interdisciplinary knowledge, the simultaneous design of the product, process and system, and knowledge of modern manufacturing and organizational methods and techniques.

    Fitness Landscape Analysis of Feed-Forward Neural Networks

    Neural network training is a highly non-convex optimisation problem with poorly understood properties. Due to the inherent high dimensionality, neural network search spaces cannot be intuitively visualised, thus other means to establish search space properties have to be employed. Fitness landscape analysis encompasses a selection of techniques designed to estimate the properties of a search landscape associated with an optimisation problem. Applied to neural network training, fitness landscape analysis can be used to establish a link between the properties of the error landscape and various neural network hyperparameters. This study applies fitness landscape analysis to investigate the influence of the search space boundaries, regularisation parameters, loss functions, activation functions, and feed-forward neural network architectures on the properties of the resulting error landscape. A novel gradient-based sampling technique is proposed, together with a novel method to quantify and visualise stationary points and the associated basins of attraction in neural network error landscapes. Thesis (PhD), University of Pretoria, 2019.
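    The thesis's specific gradient-based sampling technique is not detailed in the abstract; the sketch below only illustrates the general idea of sampling an error landscape along gradient walks started inside assumed search space bounds, with a crude landscape descriptor computed from the visited loss values.

```python
import numpy as np

def gradient_walk(loss_fn, grad_fn, dim, steps=100, lr=0.05, bound=1.0, seed=0):
    """Sample the error landscape along a gradient-based walk from a random start
    inside the box [-bound, bound]^dim; returns the sequence of visited loss values."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-bound, bound, size=dim)
    trace = []
    for _ in range(steps):
        trace.append(loss_fn(w))
        w = w - lr * grad_fn(w)
    return np.array(trace)

def average_step_change(trace):
    """Mean absolute change in loss between consecutive samples, a crude landscape descriptor."""
    return float(np.mean(np.abs(np.diff(trace))))

# Toy landscape: a simple quadratic error surface
loss = lambda w: float(np.sum(w ** 2))
grad = lambda w: 2.0 * w
trace = gradient_walk(loss, grad, dim=10)
print(trace[0], trace[-1], average_step_change(trace))
```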

    Adaptive control and neural network control of nonlinear discrete-time systems

    Ph.D. thesis.