There has been recent interest in neural network (NN)-based communication algorithms, which have been shown to achieve state-of-the-art (and sometimes better) performance for a variety of problems or to reduce implementation complexity. However, most work on this topic is simulation-based, and implementation on specialized hardware for fast inference, such as field-programmable gate arrays (FPGAs), is widely ignored. For practical use in particular, NN weights should be quantized and inference carried out using fixed-point instead of the floating-point arithmetic widely used in consumer-class computers and graphics processing units (GPUs). Moving to such representations enables higher inference rates and complexity reductions, at the cost of a loss in precision. We demonstrate that it is possible to implement NN-based algorithms in fixed-point arithmetic with quantized weights at negligible performance loss and with a hardware complexity compatible with practical systems, such as FPGAs and application-specific integrated circuits (ASICs).
I. INTRODUCTION
Inspired by the success of deep learning (DL) in fields such as computer vision and natural language processing (NLP), NN-based communication systems have recently gained a lot of attention. Approaches leveraging DL have emerged for channel coding [1], multiple-input multiple-output (MIMO) systems [2], orthogonal frequency-division multiplexing (OFDM) [3], and so forth. However, most of these contributions are simulation-based, and only a few have considered the hardware implementation of DL-based approaches and the issues it raises. Indeed, one of the biggest obstacles to the use of NNs is their high memory requirement and computational complexity. Hardware acceleration is needed to achieve reasonable inference times, and most previous contributions leverage graphics processing units (GPUs), which come at a monetary and energy cost that is not viable for communication systems. Moreover, the inference times required at the physical layer are on the order of microseconds to nanoseconds, at least one order of magnitude faster than in other applications, e.g., autonomous cars.
Recently, multiple methods have been proposed to compress NNs and reduce their complexity, such as weight quantization [4], weight pruning [5], or more efficient implementations of the conventional floating-point operators [6]. Nevertheless, such methods have mostly been considered for flagship machine learning tasks, such as image classification and speech recognition. In this paper, we consider the efficient implementation of NN-based communication algorithms in fixed-point arithmetic. Our aim is not to reduce the memory footprint of the NN by learning a compressed representation of the architecture, but to reduce its computational complexity, as we assume the NN is implemented in hardware. Fixed-point compute units are faster and consume less hardware resources and energy than conventional floating-point units [4], [6]. We consider the implementation of an NN-based receiver as shown in Fig. 1. However, the approach used in this paper can be applied to a wide variety of NN-based communication algorithms, e.g., fully learned transceiver implementations as done in [7]. The NN-based receiver, trained without constraints and left uncompressed, achieves block error rates (BLERs) close to those of maximum likelihood (ML) detection, as shown in Fig. 2. In this figure, we assume a transmitter implementing the Agrell scheme [8] with eight-bit blocks and four channel uses, transmitting over an additive white Gaussian noise (AWGN) channel. We aim to quantize the weights of the NN-based receiver so that they take values in a finite codebook, enabling a more efficient implementation. A straightforward approach is to first train the receiver without constraints and then quantize its weights. This naive approach usually leads to a significant performance loss, as shown in Fig. 2, which motivates the use of quantization-aware algorithms. In this paper, we leverage a two-stage approach to achieve an efficient implementation without significant performance loss:
1) First, the weights are quantized using the learning-compression (LC) algorithm [9] so that they take values in a predefined finite codebook. Training using LC is done in regular high-precision floating-point arithmetic.
2) Next, the quantized NN is implemented on a fixed-point arithmetic system, less precise than the floating-point system used to train it, but with a reduced complexity.
We show that the quantized NN-based receiver, trained on a 32-bit floating-point arithmetic system but implemented on a 14-bit fixed-point arithmetic system, achieves BLERs close to those of ML detection while being more than 60% less complex.
To the best of our knowledge, only a few papers have considered implementations of NN-based communication algorithms. NN-based transceivers [7] were implemented on a field-programmable gate array (FPGA) in [10]. The implementation of a DL-based modulation classifier on an FPGA was described in [11]. However, in neither of these contributions did the authors attempt to reduce the model complexity to make inference more efficient. In [12], a recurrent NN was considered to decode polar codes. After each training epoch, the weights were quantized in a two-step process: the weights were first rounded to the nearest fixed-point value that can be represented by a predefined number of bits, and then assigned values from a codebook based on the frequency with which the rounded weights appear. The authors showed that weight quantization enables a significant decrease in memory footprint and computational complexity without significant performance loss. However, evaluation of the NN in fixed-point arithmetic was not performed.
Notations: Boldface upper- and lower-case letters denote matrices and column vectors, respectively. R and C denote the sets of real and complex numbers, respectively. Z denotes the set of integers.
II. BACKGROUND ON NN COMPRESSION
A. Fixed-point arithmetic
Conventional hardware used in machine learning, such as GPUs, relies on floating-point arithmetic. In this scheme, the real numbers that can be represented exactly are of the form
sign(s) · |s| · 2^e,    (1)

where s and e are integers called the significand and the exponent, respectively, and sign is the sign function. The numbers of bits used for the significand and the exponent control the range and precision of the representation. A floating-point number is stored as the sign bit, the exponent field, and the significand field. A key feature of the floating-point representation is that it does not form a uniformly-spaced grid, as the spacing between consecutive numbers grows with the exponent. This enables the representation of a wide range of values.
In fixed-point arithmetic, only real numbers that can be written as

z = sign(z) · Σ_{i=−K_F}^{K_I−1} B_i(z) · 2^i,    (2)

where B_i(z) ∈ {0, 1} denotes the ith binary digit of |z|, can be represented. K_I and K_F are non-negative integers that correspond to the numbers of bits of the integer and fractional parts, respectively. Notice that (2) corresponds to writing z in the binary numeral system and constraining the numbers of bits allowed for the integer and fractional parts to finite values. One can see that, contrary to the floating-point representation, the representable numbers form a uniformly-spaced grid whose range and precision are controlled by K_I and K_F, respectively. Two consecutive numbers are spaced by 2^{−K_F}. A fixed-point number is typically stored using K = K_I + K_F + 1 bits (the additional bit is required to handle negative numbers), as shown in Fig. 3. The number is represented as a K-bit integer, with an implicit factor 2^{−K_F} that does not need to be stored as it is fixed. With regard to complexity, fixed-point operators are typically of lower complexity than floating-point operators, as no additional processing steps need to be taken, which motivates their use to reduce the memory footprint and computational requirements of NNs, e.g., [13]. We adopt the same approach in this paper by implementing an NN-based receiver in fixed-point arithmetic.
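As a concrete illustration of (2), the following Python sketch (our own, not from the paper; the helper names are hypothetical) quantizes a real number onto a fixed-point grid with K_I integer bits, K_F fractional bits, and one sign bit:

```python
def to_fixed_point(x, k_i=5, k_f=8):
    """Quantize x to a fixed-point grid with k_i integer bits,
    k_f fractional bits, and one sign bit (K = k_i + k_f + 1)."""
    scale = 1 << k_f                        # implicit factor 2**k_f
    max_int = (1 << (k_i + k_f)) - 1        # largest representable magnitude
    q = round(x * scale)                    # nearest grid point
    q = max(-max_int - 1, min(max_int, q))  # saturate on overflow
    return q                                # stored as a K-bit integer

def from_fixed_point(q, k_f=8):
    """Recover the real value represented by the stored integer q."""
    return q / (1 << k_f)

# The grid spacing is 2**-k_f = 1/256 with the default settings.
print(from_fixed_point(to_fixed_point(3.14159)))  # ~3.140625
```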
B. The LC algorithm
Moving to fixed-point arithmetic is not the only way to reduce the resources required by an NN. Another approach, used in this paper jointly with the implementation in fixed-point arithmetic, is quantization of the weights. Quantizing the weights of an NN means forcing them to take values in a discrete codebook.
It is well known that multiplication is significantly more computationally demanding than addition in fixed-point arithmetic. Indeed, a fixed-point multiplication of K-bit operands requires up to K bit shifts and K − 1 additions. By forcing the weights to take values in a well-chosen codebook, the cost of multiplications can be drastically reduced. For example, using the codebook {−1, 0, 1} reduces multiplications to zeroing or sign changes. Also, choosing the codebook to be a set of powers of two reduces multiplications to bit shifts in fixed-point arithmetic, as multiplying by 2^q is equivalent to moving the radix point by q digits to the left or to the right depending on the sign of q, as illustrated in Fig. 3.
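To make the bit-shift argument concrete, here is a small Python sketch (our own illustration, reusing the fixed-point convention above) that multiplies a stored fixed-point integer by a power-of-two weight using only a shift:

```python
def mul_pow2_fixed(q_val, q_exp):
    """Multiply a fixed-point integer q_val by the weight 2**q_exp
    using a bit shift instead of a true multiplication."""
    if q_exp >= 0:
        return q_val << q_exp   # shift left: radix point moves right
    return q_val >> (-q_exp)    # shift right: radix point moves left

# Example with k_f = 8: 1.5 is stored as 384; multiplying by 2**-2 (= 0.25)
# is a right shift by 2 positions, giving 96, i.e., 0.375.
print(mul_pow2_fixed(384, -2))  # 96
```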
A key question is how to train an NN while forcing its weights to take values in a given codebook. Let us denote by f_ψ the mapping implemented by an NN with parameters ψ ∈ R^P, by L(ψ) the loss function, by C the quantization codebook, and by ψ̂ ∈ C^P the quantized weights. A naive approach is to first train the NN with no constraints on its weights by solving arg min_ψ L(ψ), and then to quantize the model by choosing for each weight the closest value in the codebook, i.e., by solving arg min_{ψ̂ ∈ C^P} ‖ψ − ψ̂‖². This simple approach, referred to as direct compression (DC), typically does not lead to satisfactory results. To circumvent this issue, compression-aware algorithms are typically used. One such algorithm is LC [9], in which compression of an NN is cast as a constrained optimization problem, solved by applying alternating optimization to the augmented Lagrangian. LC is guaranteed to converge to a local optimum in some cases [9, Section 3.2]. Each iteration of LC performs two steps, a learning step and a quantization step. The learning step updates the unquantized weights ψ by solving

ψ ← arg min_ψ { L(ψ) + (µ/2) ‖ψ − ψ̂ − (1/µ)λ‖² },    (3)
where µ is a parameter of the algorithm, which is increased at each iteration following a predefined schedule, and λ is the Lagrange multiplier estimate. One can see that a regularization term is added to the loss, which ensures that the unquantized weights stay close to ψ̂ + (1/µ)λ. The quantization step then updates the quantized weights by solving

ψ̂ ← arg min_{ψ̂ ∈ C^P} ‖ψ − (1/µ)λ − ψ̂‖²,    (4)

which corresponds to the quantization of ψ − (1/µ)λ to the codebook C. The LC algorithm is depicted in Algorithm 1. The parameter µ is typically increased following a multiplicative schedule µ^(k) = a µ^(k−1), where µ^(k) is the value of µ at the kth iteration and a is a parameter of the algorithm larger than one. ψ and ψ̂ are initialized by training the NN without any constraints and then quantizing the weights (lines 1 and 2). In practice, the learning step (line 5) is approximately solved using stochastic gradient descent (SGD) or a variant.

Fig. 4. Architecture of the receiver.
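The sketch below (our own simplified NumPy illustration, not the authors' implementation) shows the structure of these iterations for a generic codebook: an approximate L-step by gradient descent on the penalized loss, a C-step by nearest-codeword projection, the multiplier update, and the multiplicative schedule for µ:

```python
import numpy as np

def project_to_codebook(w, codebook):
    """C-step: map each weight to its nearest codeword."""
    w = np.asarray(w, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    idx = np.argmin(np.abs(w[..., None] - codebook), axis=-1)
    return codebook[idx]

def lc_quantize(w, loss_grad, codebook, mu0=1e-3, a=1.1, lr=1e-2,
                n_outer=30, n_inner=100):
    """Simplified LC loop (sketch): alternate an approximate L-step
    (gradient descent on the penalized loss) with the C-step (projection),
    then update the Lagrange multiplier estimate and increase mu."""
    w = np.asarray(w, dtype=float)
    lam = np.zeros_like(w)                    # Lagrange multiplier estimate
    w_hat = project_to_codebook(w, codebook)  # direct-compression init
    mu = mu0
    for _ in range(n_outer):
        for _ in range(n_inner):              # approximate L-step, eq. (3)
            g = loss_grad(w) + mu * (w - w_hat - lam / mu)
            w = w - lr * g
        w_hat = project_to_codebook(w - lam / mu, codebook)  # C-step, eq. (4)
        lam = lam - mu * (w - w_hat)          # multiplier update
        mu *= a                               # multiplicative schedule
    return w_hat, w
```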

III. QUANTIZATION OF NN-BASED RECEIVER
In this section, an efficient implementation of an NN-based receiver is achieved using a two-stage approach. First, quantization of the NN-based receiver is performed by training it with LC on a usual floating-point arithmetic system. Next, the quantized NN is implemented and evaluated on a fixed-point system. While this paper focuses on an NN-based receiver, the approach can be applied to a wide variety of NN-based communication algorithms.
A. NN-based receiver
In a point-to-point communication system, two nodes aim to reliably exchange information over a stochastic channel, as shown in Fig. 1. The output of the channel y follows a probability distribution conditioned on its input x, i.e., y ∼ P(y|x). The transmitter aims to communicate messages m drawn from a finite set M = {1, . . . , M}, while the task of the receiver is to detect the sent messages m from the received signal y. The receiver is implemented as an NN (see Fig. 1) f_{θ_R} : C^N → [0, 1]^M, where θ_R is the set of parameters and N is the number of channel uses. Its purpose is to estimate the conditional probability P(m|y), which corresponds to a supervised learning task. Once trained, the receiver can be deployed for practical use.
A communication system operating over an AWGN channel is considered in this work, with M = 256 and N = 4. The transmitter implements the Agrell scheme [8], a subset of the E8 lattice designed by numerical optimization to approximately solve the sphere packing problem for M = 256 in eight dimensions (corresponding to four channel uses). Normalization is performed to ensure that E[‖x‖²] = N·e_s, where e_s is the energy per complex symbol. The receiver is implemented by a C2R layer, mapping the N received complex symbols to 2N real numbers, followed by a dense layer of 64 units with ReLU activation, a dense layer of 32 units with ReLU activation, and finally a dense layer of M units with softmax activation, as shown in Fig. 4. All the dense layers but the last use biases. Hard decoding is performed by taking the message with the highest probability.
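A minimal TensorFlow/Keras sketch of this architecture (our own reconstruction from the description above; the C2R mapping is shown as a separate preprocessing function) could look as follows:

```python
import tensorflow as tf

M, N = 256, 4  # number of messages and complex channel uses

def c2r(y):
    """C2R mapping: N complex symbols -> 2N real numbers."""
    return tf.concat([tf.math.real(y), tf.math.imag(y)], axis=-1)

# Dense 64 (ReLU) -> Dense 32 (ReLU) -> Dense M (softmax, no bias)
receiver = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(2 * N,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(M, activation="softmax", use_bias=False),
])

# Usage: y is a batch of received complex signals of shape (batch_size, N).
# probabilities = receiver(c2r(y))   # estimates P(m|y)
```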
B. Implementation considerations
Regarding implementation considerations, the ReLU activation was chosen as it requires minimal overhead. Indeed, its implementation requires neither an approximation using a lookup table nor arithmetic operations. Therefore, it incurs neither computational overhead nor arithmetic errors due to approximation. Moreover, the softmax activation of the output layer does not need to be implemented at deployment, as hard decoding can be performed based on the pre-activations.
C. Weight quantization
Quantization of the NN-based receiver was done by training the NN using the LC algorithm presented in Section II-B, on usual GPUs with floating-point arithmetic. Different codebooks were used for the weights and the biases. Regarding the weights, the codebook was

C_W = {0} ∪ {±2^q : q ∈ Z, |q| ≤ K − 2},

where K = K_I + K_F + 1. The choice of this codebook was motivated by the much higher complexity of multiplications compared to additions. With this codebook, all multiplications are reduced to either zeroing or bit shifting. Moreover, multiplications by 2^q with |q| ≥ K − 1 lead to zeroing on a K-bit fixed-point system. Therefore, the codebook was restricted to powers of two with exponent less than K − 1 in absolute value. Biases were constrained to take values in the codebook defined by the set of fixed-point numbers with K_I bits for the integer part and K_F bits for the fractional part, i.e., the set of real numbers that can be represented as in (2). To evaluate the impact of receiver quantization on the BLER, a comparison was made between the ML receiver and the unquantized, LC-quantized, and DC-quantized NN-based receivers. When quantization was performed, K_I was set to 5, as it was experimentally found to be the smallest value large enough to avoid overflows on a fixed-point system. Trainings and evaluations were performed for K_F values of 2, 4, 8, and 12 bits.
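For concreteness, the following sketch (our own illustration; the exact codebook definition is reconstructed from the description above, assuming zero and both signs are allowed) builds the power-of-two weight codebook for given K_I and K_F and projects a weight matrix onto it:

```python
import numpy as np

def power_of_two_codebook(k_i=5, k_f=8):
    """Weight codebook {0} U {+/-2^q : |q| <= K - 2}, with K = k_i + k_f + 1."""
    k = k_i + k_f + 1
    exponents = np.arange(-(k - 2), k - 1)   # |q| <= K - 2
    powers = 2.0 ** exponents
    return np.concatenate(([0.0], powers, -powers))

def quantize_weights(w, codebook):
    """Project each weight onto its nearest codeword (used for DC and as
    the C-step of LC)."""
    w = np.asarray(w, dtype=float)
    idx = np.argmin(np.abs(w[..., None] - codebook), axis=-1)
    return codebook[idx]

codebook = power_of_two_codebook()
print(quantize_weights([[0.3, -1.7], [0.001, 0.06]], codebook))
# e.g., 0.3 -> 0.25, -1.7 -> -2, 0.001 -> 2**-10, 0.06 -> 0.0625
```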
The signal-to-noise ratio (SNR) is defined as

SNR = E[‖x‖²] / (N σ²) = e_s / σ²,

where σ² is the per-complex-symbol noise variance, and the equality results from the energy constraint ensured by the transmitter normalization layer. σ² was set to −80 dB, and the SNR was controlled by setting e_s. Evaluations were done using the TensorFlow [14] framework, and training was performed with the Adam [15] variant of SGD. Fig. 5 shows the BLERs achieved by the compared schemes for SNR values ranging from −2 dB to 11 dB. Evaluations were performed on a floating-point system. Only results with K_F = 8 bits are shown for readability, as other values of K_F lead to almost identical BLERs. One can see that quantization using the naive DC approach leads to higher error rates than quantization with LC. Moreover, quantizing the NN-based receiver using LC leads to BLERs close to those achieved by the unquantized NN-based receiver and ML detection.
D. Impact of fixed-point arithmetic
This section investigates the impact on the BLER of implementing the quantized NN-based receiver on a fixed-point arithmetic system. The quantized NNs trained on GPUs with LC for K_I = 5 bits and different values of K_F were implemented on a fixed-point system with the corresponding numbers of bits allocated to the integer and fractional parts. Fixed-point arithmetic was simulated in Python. As all the weights are powers of two, all multiplications were reduced to bit shifts. Implementation of the ReLU activation function is straightforward. The softmax activation of the output layer was not implemented, as hard decoding can be performed based on the pre-activations. Fig. 6 shows the BLERs achieved by the receiver for different values of K_F. It can be seen that using only 2 or 4 bits for the fractional part leads to a significant increase in the error rate, while using 8 bits (or more) leads to no BLER degradation. These results show that it is possible to implement the NN-based receiver in 14-bit fixed-point arithmetic with no BLER degradation, despite the fact that it was trained on a 32-bit floating-point arithmetic system.
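A minimal sketch of how such a fixed-point forward pass could be simulated (our own NumPy illustration; the paper's simulation code is not shown) is given below, with the weights stored as signs and exponents so that every multiplication becomes a bit shift:

```python
import numpy as np

K_I, K_F = 5, 8                      # integer / fractional bits
MAX_Q = (1 << (K_I + K_F)) - 1       # largest representable magnitude

def shift_mul(x_q, exp):
    """Multiply the fixed-point integer x_q by 2**exp with a bit shift."""
    return x_q << exp if exp >= 0 else x_q >> -exp

def dense_fixed_point(x_q, signs, exps, bias_q, relu=True):
    """Fixed-point dense layer with power-of-two weights.
    x_q, bias_q: int64 arrays with an implicit factor 2**-K_F.
    signs[i, j] in {-1, 0, +1} and exps[i, j]: weight = sign * 2**exp."""
    n_in, n_out = signs.shape
    out = np.zeros(n_out, dtype=np.int64)
    for j in range(n_out):
        acc = np.int64(0)
        for i in range(n_in):
            if signs[i, j] != 0:                       # zero weight: skip
                acc += signs[i, j] * shift_mul(x_q[i], int(exps[i, j]))
        acc = np.clip(acc + bias_q[j], -MAX_Q - 1, MAX_Q)  # saturate
        out[j] = max(acc, 0) if relu else acc               # ReLU is free
    return out
```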
E. Complexity evaluation
The previous section showed that an NN-based receiver with weights restricted to powers of two and implemented on a fixed-point arithmetic system can achieve BLERs close to those of ML detection. In this section, we compare the computational complexity of the previously evaluated ML receiver and quantized NN-based receiver.
Regarding the NN-based receiver, only multiplications and additions are required. Moreover, multiplications only involve layer inputs and weights. Because the weights are quantized to powers of two, all the multiplications required by the NN correspond to bit shifts. As the weights are assumed to be fixed after deployment, the bit shifts can be "hardwired" in the hardware implementation, removing the need for storing the weights in memory, as well as for programmable bit shifters.
The ML receiver is assumed to be implemented by measuring the squared Euclidean distance between the received signal and each of the M possibly sent signals, and selecting the closest one. Therefore, it requires squaring operations, i.e., multiplications. On a fixed-point system, each multiplication requires up to K − 1 additions as well as K bit shifts, the latter being assumed to have a negligible complexity compared to additions. Accordingly, only additions are considered to compare the complexities of the implementations. Table I shows the numbers of additions required by the quantized NN-based receiver and by the ML receiver, for which each multiplication was counted as K − 1 additions. Notice that the complexity of an addition depends on how it is implemented and on the number of bits K used in the fixed-point system. As one can see, the quantized NN-based receiver requires approximately 60% fewer additions than the ML receiver with K = 14 bits, without incurring significant BLER degradation as seen in the previous section. This encouraging result illustrates how NN-based approaches have the potential to significantly reduce the complexity of communication systems without significant loss of performance.
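As a rough back-of-envelope check of this comparison (our own estimate, not the paper's Table I; it only counts additions from accumulation, biases, subtractions, and multiplications expanded as K − 1 additions), the order of magnitude can be reproduced as follows:

```python
# Rough addition counts, assuming a K = 14-bit fixed-point system.
K, M, N = 14, 256, 4

# Quantized NN receiver: multiplications are free bit shifts, so a dense
# layer (n_in -> n_out) costs roughly n_in additions per output unit
# (accumulation of n_in shifted terms plus the bias).
layers = [(2 * N, 64), (64, 32), (32, M)]
nn_adds = sum(n_in * n_out for n_in, n_out in layers)

# ML receiver: M squared Euclidean distances in 2N real dimensions; each
# needs 2N subtractions, 2N squarings (K - 1 additions each), and 2N - 1
# additions to sum the squared terms.
ml_adds = M * (2 * N + 2 * N * (K - 1) + 2 * N - 1)

print(nn_adds, ml_adds, 1 - nn_adds / ml_adds)
# roughly 11,000 vs. 30,000 additions -> on the order of 60% fewer
```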
IV. CONCLUSION
We presented in this paper an approach to reduce the implementation complexity of NN-based communication algorithms. Considering an NN-based receiver as an example, complexity reduction was achieved by quantizing the weights so that they take powers of two as values, reducing all multiplications to bit shifts in fixed-point arithmetic. Compared to naive direct compression, this approach incurs almost no BLER increase, while enabling significant gains in computational complexity. Our results show that the compressed NN-based receiver achieves BLERs close to those of ML detection, while enabling a 60% reduction in computational complexity when implemented on a 14-bit fixed-point arithmetic system.
We believe that future work on quantization, compression, and, more broadly, the efficient hardware implementation of NNs for physical layer tasks is required by our community before machine learning-based solutions can make it into commercial products.
