Geometric Constellation Shaping for Fiber-Optic Channels via End-to-End Learning
End-to-end learning has become a popular method for optimizing the constellation
shape of a communication system. When the channel model is differentiable,
end-to-end learning can be applied with the conventional backpropagation algorithm
for optimization of the shape. A variety of optimization algorithms have also
been developed for end-to-end learning over a non-differentiable channel model.
In this paper, we compare a gradient-free optimization method based on the
cubature Kalman filter, model-free optimization and backpropagation for
end-to-end learning on a fiber-optic channel modeled by the split-step Fourier
method. The results indicate that the gradient-free optimization algorithms
provide a decent replacement for backpropagation in terms of performance at the
expense of computational complexity. Furthermore, the quantization problem of
finite bit resolution of the digital-to-analog and analog-to-digital converters
is addressed and its impact on geometrically shaped constellations is analysed.
Here, the results show that when optimizing a constellation with respect to
mutual information, a minimum number of quantization levels is required to
achieve shaping gain. For generalized mutual information, the gain is
maintained throughout all of the considered quantization levels. Also, the
results imply that the autoencoder can adapt the constellation size to the
given channel conditions.
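To make the quantization question in this abstract concrete, the following is a minimal sketch of how finite converter resolution restricts the coordinates a constellation point can take. It assumes a uniform quantizer with clipping applied separately to the in-phase and quadrature components; the abstract does not specify the converter model, and the function names and the `full_scale` parameter are hypothetical.

```python
def quantize(value, bits, full_scale=1.0):
    """Snap a value to one of 2**bits uniform levels on [-full_scale, full_scale].

    Assumption: a uniform quantizer with clipping, chosen for illustration only.
    """
    levels = 2 ** bits
    step = 2 * full_scale / (levels - 1)
    clipped = max(-full_scale, min(full_scale, value))
    index = round((clipped + full_scale) / step)
    return -full_scale + index * step


def quantize_constellation(points, bits, full_scale=1.0):
    """Quantize the I and Q coordinate of every (i, q) constellation point."""
    return [
        (quantize(i, bits, full_scale), quantize(q, bits, full_scale))
        for i, q in points
    ]


if __name__ == "__main__":
    # A hypothetical, irregular 4-point constellation quantized to 2 bits.
    shaped = [(0.30, 0.10), (-0.85, 0.40), (0.55, -0.95), (-0.20, -0.35)]
    print(quantize_constellation(shaped, bits=2))
```

With few bits, distinct shaped points can collapse onto the same grid position, which is one intuitive reason a minimum resolution might be needed before shaping gain appears.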
A Very Brief Introduction to Machine Learning With Applications to Communication Systems
Given the unprecedented availability of data and computing resources, there
is widespread renewed interest in applying data-driven machine learning methods
to problems for which the development of conventional engineering solutions is
challenged by modelling or algorithmic deficiencies. This tutorial-style paper
starts by addressing the questions of why and when such techniques can be
useful. It then provides a high-level introduction to the basics of supervised
and unsupervised learning. For both supervised and unsupervised learning,
exemplifying applications to communication networks are discussed by
distinguishing tasks carried out at the edge and at the cloud segments of the
network at different layers of the protocol stack.
Recent Trends and Considerations for High Speed Data in Chips and System Interconnects
This paper discusses key issues related to the design of large processing volume chip architectures and high speed system interconnects. Design methodologies and techniques are discussed, and recent trends and considerations are highlighted.
Multidimensional Optimized Optical Modulation Formats
This chapter overviews the relatively large body of work (experimental and theoretical) on modulation formats for optical coherent links. It first gives basic definitions and performance metrics for modulation formats that are common in the literature. Then, the chapter discusses optimization of modulation formats in coded systems. It distinguishes between three cases, depending on the type of decoder employed, which pose quite different requirements on the choice of modulation format. The three cases are hard-decision decoding, soft-decision decoding, and iterative decoding, which loosely correspond to weak, medium, and strong coding, respectively. The chapter also discusses the realizations of the transmitter and transmission link properties and the receiver algorithms, including DSP and decoding. It further explains how to simply determine the transmitted symbol from the received 4D vector, without resorting to a full search of the Euclidean distances to all points in the whole constellation.
High-Level Synthesis Based VLSI Architectures for Video Coding
High Efficiency Video Coding (HEVC) is a state-of-the-art video coding standard. Emerging applications such as free-viewpoint video, 360-degree video, augmented reality, and 3D movies require standardized extensions of HEVC. These extensions include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC + Depth (3D-HEVC), and HEVC Screen Content Coding. 3D-HEVC is used for applications such as view synthesis and free-viewpoint video. The depth maps coded and transmitted in 3D-HEVC are used for virtual view synthesis by algorithms such as Depth Image Based Rendering (DIBR). As a first step, we profiled the 3D-HEVC standard and identified the computationally intensive parts that warrant an efficient hardware implementation. One of the computationally intensive parts of 3D-HEVC, HEVC, and H.264/AVC is the interpolation filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools: the Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of digital systems has greatly increased, and High-Level Synthesis is a methodology that offers significant benefits: late architectural or functional changes can be made without time-consuming rewriting of RTL code, algorithms can be tested and evaluated early in the design cycle, and accurate models can be developed against which the final hardware can be verified.
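As a software reference for the interpolation filtering mentioned in this abstract, the sketch below computes half-sample luma values along one row using the standard half-sample filter taps of HEVC (8 taps, normalized by 64) and H.264/AVC (6 taps, normalized by 32). It is a simplified one-dimensional illustration, not the HLS design described above: the edge replication, the 8-bit clipping, and the function name are assumptions, and the intermediate bit-depth handling of the full two-dimensional process is omitted.

```python
# Half-sample luma filter taps defined by the respective standards.
HEVC_HALF_PEL = (-1, 4, -11, 40, 40, -11, 4, -1)  # taps sum to 64
H264_HALF_PEL = (1, -5, 20, 20, -5, 1)            # taps sum to 32


def half_pel_row(row, taps):
    """Return the half-sample value between row[i] and row[i + 1] for each i.

    Assumptions for this sketch: 8-bit samples, edges handled by replicating
    the border sample, and rounding to nearest after a single 1-D pass.
    """
    norm = sum(taps)
    left = len(taps) // 2 - 1  # number of taps to the left of sample i
    out = []
    for i in range(len(row) - 1):
        acc = 0
        for k, coeff in enumerate(taps):
            j = min(max(i - left + k, 0), len(row) - 1)  # replicate edges
            acc += coeff * row[j]
        value = (acc + norm // 2) // norm  # normalize with rounding offset
        out.append(min(max(value, 0), 255))  # clip to the 8-bit range
    return out


if __name__ == "__main__":
    ramp = list(range(0, 100, 10))
    print(half_pel_row(ramp, HEVC_HALF_PEL))
    print(half_pel_row(ramp, H264_HALF_PEL))
```

The inner multiply-accumulate loop has a fixed trip count and constant coefficients, which is the kind of structure that HLS tools can typically unroll and pipeline.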
A neural operator-based surrogate solver for free-form electromagnetic inverse design
Neural operators have emerged as a powerful tool for solving partial
differential equations in the context of scientific machine learning. Here, we
implement and train a modified Fourier neural operator as a surrogate solver
for electromagnetic scattering problems and compare its data efficiency to
existing methods. We further demonstrate its application to the gradient-based
nanophotonic inverse design of free-form, fully three-dimensional
electromagnetic scatterers, an area that has so far eluded the application of
deep learning techniques.
Vertical Optimizations of Convolutional Neural Networks for Embedded Systems
The abstract is in the attachment.
Reconfigurable acceleration of Recurrent Neural Networks
Recurrent Neural Networks (RNNs) have been successful in a wide range of applications involving temporal sequences such as natural language processing, speech recognition and video analysis. However, RNNs often require a significant amount of memory and computational resources. In addition, the recurrent nature and data dependencies in RNN computations can lead to system stall, resulting in low throughput and high latency.
This work describes novel parallel hardware architectures for accelerating RNN inference using Field-Programmable Gate Array (FPGA) technology. The architectures take into account the data dependencies and high computational costs of RNNs.
The first contribution of this thesis is a latency-hiding architecture that utilizes column-wise matrix-vector multiplication instead of the conventional row-wise operation to eliminate data dependencies and improve the throughput of RNN inference designs. This architecture is further enhanced by a configurable checkerboard tiling strategy which allows large dimensions of weight matrices, while supporting element-based parallelism and vector-based parallelism. The presented reconfigurable RNN designs show significant speedup over CPU, GPU, and other FPGA designs.
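The difference between row-wise and column-wise matrix-vector multiplication can be sketched in software as follows. This is only a minimal illustration of the two loop orderings, not the hardware architecture of the thesis; the function names are hypothetical, and the matrix is a plain list of rows.

```python
def matvec_row_wise(W, x):
    """Row-wise product: each output is a full dot product with x.

    No output can be completed until every element of x is available.
    """
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def matvec_column_wise(W, x_stream):
    """Column-wise product: consume x one element at a time.

    Each arriving x_j scales column j of W and is added to all partial sums,
    so work can start before the rest of x has been produced.
    """
    y = [0.0] * len(W)
    for j, x_j in enumerate(x_stream):
        for i in range(len(W)):
            y[i] += W[i][j] * x_j
    return y


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0]]
    x = [5.0, 6.0]
    print(matvec_row_wise(W, x))
    print(matvec_column_wise(W, iter(x)))  # x supplied as a stream
```

In a recurrent layer the input vector of one time step is the output of the previous step, so an ordering that can begin with the first available element is one plausible way to reduce stalls caused by that dependency.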
The second contribution of this thesis is a weight reuse approach for large RNN models with weights stored in off-chip memory, running with a batch size of one. A novel blocking-batching strategy is proposed to optimize the throughput of large RNN designs on FPGAs by reusing the RNN weights. Performance analysis is also introduced to enable FPGA designs to achieve the best trade-off between area, power consumption and performance. A promising improvement in power efficiency is achieved, in addition to a speedup over CPU and GPU designs.
The third contribution of this thesis is a low-latency design for RNNs based on a partially-folded hardware architecture. It also introduces a technique that balances the initiation interval of multi-layer RNN inference to increase hardware efficiency and throughput while reducing latency. The approach is evaluated on a variety of applications, including gravitational wave detection and Bayesian RNN-based ECG anomaly detection.
To facilitate the use of this approach, we open-source an RNN template which enables the generation of low-latency FPGA designs with efficient resource utilization using high-level synthesis tools.