15 research outputs found

    Preprint: Norm Loss: An efficient yet effective regularization method for deep neural networks

    Convolutional neural network training can suffer from diverse issues such as exploding or vanishing gradients, scaling-based weight-space symmetry, and covariate shift. To address these issues, researchers have developed weight regularization and activation normalization methods. In this work we propose a weight soft-regularization method based on the Oblique manifold. The proposed method uses a loss function that pushes each weight vector to have a norm close to one, i.e., the weight matrix is smoothly steered toward the so-called Oblique manifold. We evaluate our method on the popular CIFAR-10, CIFAR-100, and ImageNet 2012 datasets using two state-of-the-art architectures, namely ResNet and wide-ResNet. Our method introduces negligible computational overhead, and the results show that it is competitive with the state of the art and in some cases superior to it. In addition, the results are less sensitive to hyperparameter settings such as batch size and regularization factor.
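    The core idea of the abstract, a soft penalty that steers each weight vector's norm toward one, is easy to sketch. The snippet below is a minimal illustrative implementation, not the authors' code: the function name `norm_loss`, the layer selection, and the regularization factor `lam` are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

def norm_loss(model, lam=1e-4):
    """Soft regularizer pushing each weight vector's L2 norm toward 1.

    Illustrative sketch of the idea in the abstract: for every conv/linear
    layer, treat each output filter as a row vector and penalize the squared
    deviation of its norm from 1 (a soft pull toward the Oblique manifold).
    """
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for module in model.modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            w = module.weight.flatten(start_dim=1)   # one row per output unit
            norms = w.norm(dim=1)                    # L2 norm of each row
            penalty = penalty + ((norms - 1.0) ** 2).sum()
    return lam * penalty

# Usage: add the penalty to the task loss before the backward pass, e.g.
# loss = criterion(model(x), y) + norm_loss(model, lam=1e-4)
```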

    Towards Accelerating Training of Batch Normalization: A Manifold Perspective

    Batch normalization (BN) has become a crucial component of diverse deep neural networks. A network with BN is invariant to positive linear re-scaling of its weights, so there exist infinitely many functionally equivalent networks with different weight scales. However, optimizing these equivalent networks with a first-order method such as stochastic gradient descent converges to different local optima owing to the different gradients encountered during training. To alleviate this, we propose a quotient manifold, the PSI manifold, on which all equivalent weights of a network with BN are regarded as the same element. We then construct gradient descent and stochastic gradient descent on the PSI manifold. The two algorithms guarantee that every group of equivalent weights (related by positive re-scaling) converges to equivalent optima. Furthermore, we give the convergence rate of the proposed algorithms on the PSI manifold and show that they accelerate training compared with the corresponding algorithms on the Euclidean weight space. Empirical studies show that our algorithms consistently achieve better performance across various experimental settings.
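    The abstract does not spell out the PSI-manifold update, but the general pattern of quotient-manifold optimization under BN-induced scale invariance can be sketched: pick the unit-norm representative of each equivalence class, project the Euclidean gradient onto the tangent space, step, and retract. The code below is a generic sphere-manifold step under these assumptions, not the paper's exact algorithm; `sphere_sgd_step` and the dummy data are hypothetical.

```python
import torch

def sphere_sgd_step(w, grad, lr):
    """One illustrative Riemannian SGD step for a scale-invariant weight.

    A BN layer makes the network invariant to positive rescaling of `w`, so
    each equivalence class can be represented by its unit-norm member. This
    is a generic sphere-manifold step (assumed, not the PSI-manifold
    construction from the paper):
      1. project the Euclidean gradient onto the tangent space at w,
      2. take a gradient step,
      3. retract back to the unit sphere by renormalizing.
    """
    w_unit = w / w.norm()
    riem_grad = grad - torch.dot(grad, w_unit) * w_unit  # tangent projection
    w_new = w_unit - lr * riem_grad
    return w_new / w_new.norm()                          # retraction

# Example: a single flattened filter vector and a dummy gradient.
w = torch.randn(64)
g = torch.randn(64)
w = sphere_sgd_step(w, g, lr=0.1)
print(w.norm())  # stays ~1.0, i.e. a fixed representative of the scale class
```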

    Joint Communication and Sensing in RIS-enabled mmWave Networks

    Empowering cellular networks with augmented sensing capabilities is one of the key research areas for 6G communication systems. Recently, we have witnessed a plethora of efforts to devise solutions that integrate sensing capabilities into communication systems, i.e., joint communication and sensing (JCAS). However, most prior works do not consider the impact of reconfigurable intelligent surfaces (RISs) on JCAS systems, especially at millimeter-wave (mmWave) bands. Given that RISs are expected to become an integral part of cellular systems, it is important to investigate their potential in cellular networks beyond communication goals. In this paper, we study mmWave orthogonal frequency-division multiplexing (OFDM) JCAS systems in the presence of RISs. Specifically, we jointly design the hybrid beamforming and the RIS phase shifts to guarantee the sensing functionality by minimizing a chordal-distance metric, subject to signal-to-interference-plus-noise ratio (SINR) and power constraints. The non-convexity of the resulting problem poses a challenge, which we address with a solution based on the penalty method and a manifold-based alternating direction method of multipliers (ADMM). Simulation results demonstrate that, under various settings, both sensing and communication performance improve when the RIS is adequately designed. In addition, we discuss the tradeoff between sensing and communication.
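    The manifold part of the ADMM above comes from the RIS constraint that every phase-shift element has unit modulus. The sketch below shows a single Riemannian gradient step on that complex-circle manifold under stated assumptions; it is not the paper's full penalty/ADMM algorithm, and `circle_manifold_step`, `egrad`, and the step size are hypothetical names chosen for the example.

```python
import numpy as np

def circle_manifold_step(theta, egrad, step):
    """One illustrative Riemannian gradient step on the unit-modulus manifold.

    RIS phase-shift vectors must satisfy |theta_i| = 1. Manifold-based
    solvers handle this by projecting the Euclidean gradient onto the
    tangent space of the complex circle manifold and retracting back.
    `egrad` is the Euclidean gradient of whatever smooth objective
    (e.g., a chordal-distance term) is being minimized.
    """
    # Tangent-space projection at theta: remove the radial component.
    rgrad = egrad - np.real(egrad * np.conj(theta)) * theta
    theta_new = theta - step * rgrad
    # Retraction: renormalize each entry back to unit modulus.
    return theta_new / np.abs(theta_new)

# Example: 32 RIS elements with random phases and a dummy gradient.
rng = np.random.default_rng(0)
theta = np.exp(1j * rng.uniform(0, 2 * np.pi, 32))
egrad = rng.standard_normal(32) + 1j * rng.standard_normal(32)
theta = circle_manifold_step(theta, egrad, step=0.05)
print(np.max(np.abs(np.abs(theta) - 1.0)))  # ~0: unit-modulus constraint holds
```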