17 research outputs found

    Preprint: Norm Loss: An efficient yet effective regularization method for deep neural networks

Convolutional neural network training can suffer from diverse issues such as exploding or vanishing gradients, scaling-based weight-space symmetry, and covariate shift. To address these issues, researchers have developed weight regularization and activation normalization methods. In this work we propose a weight soft-regularization method based on the Oblique manifold. The proposed method uses a loss function that pushes each weight vector to have a norm close to one, i.e., the weight matrix is smoothly steered toward the so-called Oblique manifold. We evaluate our method on the popular CIFAR-10, CIFAR-100, and ImageNet 2012 datasets using two state-of-the-art architectures, namely ResNet and Wide ResNet. Our method introduces negligible computational overhead, and the results show that it is competitive with the state of the art and in some cases superior to it. Additionally, the results are less sensitive to hyperparameter settings such as batch size and regularization factor.
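
    The penalty the abstract describes can be written in a few lines. Below is a minimal sketch in PyTorch, assuming one weight vector per output unit and a placeholder `reg_factor` hyperparameter; the paper's exact implementation may differ in detail.

    ```python
    import torch

    def norm_loss(weight: torch.Tensor, reg_factor: float = 1e-2) -> torch.Tensor:
        """Soft penalty pushing each weight vector's L2 norm toward 1,
        smoothly steering the weight matrix toward the Oblique manifold."""
        # One weight vector per output unit: a conv weight of shape
        # [C_out, C_in, k, k] becomes C_out row vectors.
        w = weight.reshape(weight.shape[0], -1)
        norms = w.norm(dim=1)
        return reg_factor * ((norms - 1.0) ** 2).sum()

    # Usage sketch: add the penalty over all conv/linear layers to the task loss.
    # total = task_loss + sum(norm_loss(m.weight) for m in model.modules()
    #                         if isinstance(m, (torch.nn.Conv2d, torch.nn.Linear)))
    ```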

    Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks

It is a highly desirable property for deep networks to be robust against small input changes. One popular way to achieve this property is by designing networks with a small Lipschitz constant. In this work, we propose a new technique for constructing such Lipschitz networks that has a number of desirable properties: it can be applied to any linear network layer (fully-connected or convolutional), it provides formal guarantees on the Lipschitz constant, it is easy to implement and efficient to run, and it can be combined with any training objective and optimization method. In fact, our technique is the first in the literature to achieve all of these properties simultaneously. Our main contribution is a rescaling-based weight matrix parametrization that guarantees each network layer has a Lipschitz constant of at most 1 and results in learned weight matrices that are close to orthogonal. Hence we call such layers almost-orthogonal Lipschitz (AOL). Experiments and ablation studies in the context of image classification with certified robust accuracy confirm that AOL layers achieve results on par with most existing methods. Yet they are simpler to implement and more broadly applicable, because they do not require computationally expensive matrix orthogonalization or inversion steps as part of the network architecture. We provide code at https://github.com/berndprach/AOL.
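
    To make the rescaling idea concrete, here is a minimal PyTorch sketch of the fully-connected case: the weight is W = PD with diagonal entries d_j = (sum_k |P^T P|_jk)^(-1/2), the form reported in the paper. The convolutional case needs the paper's extended construction, and the linked repository is authoritative.

    ```python
    import torch

    def aol_rescale(p: torch.Tensor) -> torch.Tensor:
        """Return W = P @ D, with D diagonal chosen so that the linear map
        is guaranteed 1-Lipschitz (W^T W <= I) while staying close to P."""
        t = (p.t() @ p).abs().sum(dim=1)           # row sums of |P^T P|, one per input dim
        d = torch.where(t > 0, t.rsqrt(), torch.ones_like(t))
        return p * d                               # scales input column j of P by d_j

    class AOLLinear(torch.nn.Module):
        """Fully-connected layer with the almost-orthogonal parametrization."""
        def __init__(self, in_features: int, out_features: int):
            super().__init__()
            self.p = torch.nn.Parameter(torch.randn(out_features, in_features) * 0.05)
            self.bias = torch.nn.Parameter(torch.zeros(out_features))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x @ aol_rescale(self.p).t() + self.bias
    ```

    Because the rescaling is applied inside the forward pass, gradients flow through it, so the constraint is enforced by parametrization rather than by projection after each update.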

    Covert Communication in Autoencoder Wireless Systems

The broadcast nature of wireless communications presents security and privacy challenges. Covert communication is a wireless security practice that focuses on intentionally hiding transmitted information. Recently, wireless systems have experienced significant growth, including the emergence of autoencoder-based models. These models, like other DNN architectures, are vulnerable to adversarial attacks, highlighting the need to study their susceptibility to covert communication. While there is ample research on covert communication in traditional wireless systems, the investigation of autoencoder wireless systems remains scarce. Furthermore, many existing covert methods are either detectable analytically or difficult to adapt to diverse wireless systems. The first part of this thesis provides a comprehensive examination of autoencoder-based communication systems in various scenarios and channel conditions. It begins with an introduction to autoencoder communication systems, followed by a detailed discussion of our own implementation and evaluation results. This serves as a foundation for the second part of the thesis, where we propose a GAN-based covert communication model. By treating the covert sender, covert receiver, and observer as generator, decoder, and discriminator neural networks, respectively, we conduct joint training in an adversarial setting to develop a covert communication scheme that can be integrated into any normal autoencoder. Our proposal minimizes the impact on ongoing normal communication, addressing the shortcomings of previous works. We also introduce a training algorithm that allows for the desired tradeoff between covertness and reliability. Numerical results demonstrate the establishment of a reliable and undetectable channel between covert users, regardless of the cover signal or channel condition, with minimal disruption to normal system operation.
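
    The adversarial setup described in the abstract can be summarized in a short sketch. The code below is purely illustrative, assuming a PyTorch setting: all network shapes, the standard GAN losses, and the tradeoff weight `alpha` are placeholders, not the thesis's actual architectures or training algorithm.

    ```python
    import torch
    import torch.nn.functional as F

    def mlp(i, o):  # placeholder networks; the thesis's architectures differ
        return torch.nn.Sequential(torch.nn.Linear(i, 64), torch.nn.ReLU(), torch.nn.Linear(64, o))

    sender, receiver, observer = mlp(4, 8), mlp(8, 4), mlp(8, 1)
    opt_cov = torch.optim.Adam([*sender.parameters(), *receiver.parameters()], lr=1e-3)
    opt_obs = torch.optim.Adam(observer.parameters(), lr=1e-3)
    alpha = 0.5  # hypothetical covertness-vs-reliability tradeoff weight

    for step in range(1000):
        cover = torch.randn(64, 8)                  # stand-in for the normal autoencoder signal
        msg = torch.randint(0, 2, (64, 4)).float()  # covert message bits
        stego = cover + sender(msg)                 # covert perturbation added to the cover

        # Observer (discriminator): tell clean cover signals from covert ones.
        obs_loss = (F.binary_cross_entropy_with_logits(observer(cover), torch.ones(64, 1))
                    + F.binary_cross_entropy_with_logits(observer(stego.detach()), torch.zeros(64, 1)))
        opt_obs.zero_grad(); obs_loss.backward(); opt_obs.step()

        # Covert users: decode the message reliably while fooling the observer.
        rec_loss = F.binary_cross_entropy_with_logits(receiver(stego), msg)
        fool_loss = F.binary_cross_entropy_with_logits(observer(stego), torch.ones(64, 1))
        loss = rec_loss + alpha * fool_loss
        opt_cov.zero_grad(); loss.backward(); opt_cov.step()
    ```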

    1-Lipschitz Neural Networks are more expressive with N-Activations

A crucial property for achieving secure, trustworthy and interpretable deep learning systems is their robustness: small changes to a system's inputs should not result in large changes to its outputs. Mathematically, this means one strives for networks with a small Lipschitz constant. Several recent works have focused on how to construct such Lipschitz networks, typically by imposing constraints on the weight matrices. In this work, we study an orthogonal aspect, namely the role of the activation function. We show that commonly used activation functions, such as MaxMin, as well as all piecewise-linear ones with two segments, unnecessarily restrict the class of representable functions, even in the simplest one-dimensional setting. We furthermore introduce the new N-activation function, which is provably more expressive than currently popular activation functions. We provide code at https://github.com/berndprach/NActivation.
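
    For reference, MaxMin, the commonly used activation the abstract mentions, has a simple standard definition, sketched in PyTorch below. This is not the paper's new N-activation, whose learnable parametrization is given in the linked repository.

    ```python
    import torch

    def maxmin(x: torch.Tensor) -> torch.Tensor:
        """MaxMin activation: split the channels into two halves and return
        the elementwise (max, min) of the pairs. It is 1-Lipschitz and
        gradient-norm preserving, hence its popularity in Lipschitz networks.
        Requires an even number of channels in the last dimension."""
        a, b = x.chunk(2, dim=-1)
        return torch.cat([torch.maximum(a, b), torch.minimum(a, b)], dim=-1)
    ```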

    Applying Deep Learning for Phase-Array Antenna Design

Master of Engineering (Electrical Engineering), 2021. Hybrid beamforming (HBF) can provide rapid data transmission rates while reducing the complexity and cost of massive multiple-input multiple-output (MIMO) systems. However, channel state information (CSI) is imperfect in realistic downlink channels, which complicates HBF design. Proper labels for HBF design are also hard to obtain: if the optimized output of a traditional algorithm is used as the label, the neural network can only be trained to approximate that algorithm, not to outperform it. This thesis proposes a hybrid beamforming neural network based on unsupervised deep learning (USDNN) that avoids the labeling overhead of supervised learning and improves the achievable sum rate under imperfect CSI. Compared with traditional HBF methods, the unsupervised approach avoids labeling overhead while achieving better performance than the traditional algorithms. The network consists of 5 dense layers, 4 batch normalization (BN) layers, and 5 activation functions; after training, it directly outputs the optimized beamforming vector. The simulation results show that our proposed method is 74% better than manifold optimization (MO) and 120% better than orthogonal matching pursuit (OMP) systems. Furthermore, our proposed USDNN can achieve near-optimal performance.
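
    A minimal sketch of the described architecture in PyTorch follows. The hidden width, the choice of ReLU, and the input/output dimensions are illustrative placeholders; the abstract only fixes the layer counts (5 dense, 4 BN, 5 activations).

    ```python
    import torch

    def make_usdnn(csi_dim: int, bf_dim: int, hidden: int = 256) -> torch.nn.Sequential:
        """5 dense layers, a BN layer after each of the first 4,
        and an activation after every dense layer (5 in total)."""
        dims = [csi_dim, hidden, hidden, hidden, hidden, bf_dim]
        layers = []
        for i in range(5):
            layers.append(torch.nn.Linear(dims[i], dims[i + 1]))
            if i < 4:
                layers.append(torch.nn.BatchNorm1d(dims[i + 1]))
            layers.append(torch.nn.ReLU())  # placeholder; the thesis's activations may differ
        return torch.nn.Sequential(*layers)

    # Unsupervised training: no labels. The loss is the negative achievable
    # sum rate computed from the network's beamforming output and the
    # imperfect CSI, so the network is not capped at imitating a labeled baseline:
    # loss = -sum_rate(model(csi), channel)   # sum_rate as defined in the thesis
    ```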