5 research outputs found

    MIPP: a Portable C++ SIMD Wrapper and its use for Error Correction Coding in 5G Standard

    Get PDF
    International audienceError correction code (ECC) processing has so far been performed on dedicated hardware for previous generations of mobile communication standards, to meet latency and bandwidth constraints. As the 5G mobile standard, and its associated channel coding algorithms , are now being specified, modern CPUs are progressing to the point where software channel decoders can viably be contemplated. A key aspect in reaching this transition point is to get the most of CPUs SIMD units on the decoding algorithms being pondered for 5G mobile standards. The nature and diversity of such algorithms requires highly versatile programming tools. This paper demonstrates the virtues and versatility of our MIPP SIMD wrapper in implementing a high performance portfolio of key ECC decoding algorithms

    Toward High-Performance Implementation of 5G SCMA Algorithms

    Get PDF
    International audienceThe recent evolution of mobile communication systems toward a 5G network is associated with the search for new types of non-orthogonal modulations such as Sparse Code Multiple Access (SCMA). Such modulations are proposed in response to demands for increasing the number of connected users. SCMA is a non-orthogonal multiple access technique that offers improved Bit Error Rate (BER) performance and higher spectral efficiency than other comparable techniques, but these improvements come at the cost of complex decoders. There are many challenges in designing near-optimum high throughput SCMA decoders. This paper explores means to enhance the performance of SCMA decoders. To achieve this goal, various improvements to the MPA algorithms are proposed. They notably aim at adapting SCMA decoding to the Single Instruction Multiple Data (SIMD) paradigm. An approximate modeling of noise is performed to reduce the complexity of floating-point calculations. The effects of Forward Error Corrections (FEC) such as polar, turbo and LDPC codes, as well as different ways of accessing memory and improving power efficiency of modified MPAs are investigated. The results show that the throughput of a SCMA decoder can be increased by 3.1 to 21 times when compared to the original MPA on different computing platforms using the suggested improvements

    Contributions to Confidentiality and Integrity Algorithms for 5G

    Get PDF
    The confidentiality and integrity algorithms in cellular networks protect the transmission of user and signaling data over the air between users and the network, e.g., the base stations. There are three standardised cryptographic suites for confidentiality and integrity protection in 4G, which are based on the AES, SNOW 3G, and ZUC primitives, respectively. These primitives are used for providing a 128-bit security level and are usually implemented in hardware, e.g., using IP (intellectual property) cores, thus can be quite efficient. When we come to 5G, the innovative network architecture and high-performance demands pose new challenges to security. For the confidentiality and integrity protection, there are some new requirements on the underlying cryptographic algorithms. Specifically, these algorithms should: 1) provide 256 bits of security to protect against attackers equipped with quantum computing capabilities; and 2) provide at least 20 Gbps (Gigabits per second) speed in pure software environments, which is the downlink peak data rate in 5G. The reason for considering software environments is that the encryption in 5G will likely be moved to the cloud and implemented in software. Therefore, it is crucial to investigate existing algorithms in 4G, checking if they can satisfy the 5G requirements in terms of security and speed, and possibly propose new dedicated algorithms targeting these goals. This is the motivation of this thesis, which focuses on the confidentiality and integrity algorithms for 5G. The results can be summarised as follows.1. We investigate the security of SNOW 3G under 256-bit keys and propose two linear attacks against it with complexities 2172 and 2177, respectively. These cryptanalysis results indicate that SNOW 3G cannot provide the full 256-bit security level. 2. We design some spectral tools for linear cryptanalysis and apply these tools to investigate the security of ZUC-256, the 256-bit version of ZUC. We propose a distinguishing attack against ZUC-256 with complexity 2236, which is 220 faster than exhaustive key search. 3. We design a new stream cipher called SNOW-V in response to the new requirements for 5G confidentiality and integrity protection, in terms of security and speed. SNOW-V can provide a 256-bit security level and achieve a speed as high as 58 Gbps in software based on our extensive evaluation. The cipher is currently under evaluation in ETSI SAGE (Security Algorithms Group of Experts) as a promising candidate for 5G confidentiality and integrity algorithms. 4. We perform deeper cryptanalysis of SNOW-V to ensure that two common cryptanalysis techniques, guess-and-determine attacks and linear cryptanalysis, do not apply to SNOW-V faster than exhaustive key search. 5. We introduce two minor modifications in SNOW-V and propose an extreme performance variant, called SNOW-Vi, in response to the feedback about SNOW-V that some use cases are not fully covered. SNOW-Vi covers more use cases, especially some platforms with less capabilities. The speeds in software are increased by 50% in average over SNOW-V and can be up to 92 Gbps.Besides these works on 5G confidentiality and integrity algorithms, the thesis is also devoted to local pseudorandom generators (PRGs). 6. We investigate the security of local PRGs and propose two attacks against some constructions instantiated on the P5 predicate. The attacks improve existing results with a large gap and narrow down the secure parameter regime. We also extend the attacks to other local PRGs instantiated on general XOR-AND and XOR-MAJ predicates and provide some insight in the choice of safe parameters

    DĂ©codage de codes polaires sur des architectures programmables

    Get PDF
    RÉSUMÉ Les codes polaires constituent une classe de codes correcteurs d’erreurs inventés récemment qui suscite l’intérêt des chercheurs et des industriels, comme en atteste leur sélection pour le codage des canaux de contrôle dans la prochaine génération de téléphonie mobile (5G). Un des enjeux des futurs réseaux mobiles est la virtualisation des traitements numériques du signal, et en particulier les algorithmes de codage et de décodage. Afin d’améliorer la flexibilité du réseau, ces algorithmes doivent être décrits de manière logicielle et être déployés sur des architectures programmables. Une telle infrastructure de réseau permet de mieux répartir l’effort de calcul sur l’ensemble des noeuds et d’améliorer la coopération entre cellules. Ces techniques ont pour but de réduire la consommation d’énergie, d’augmenter le débit et de diminuer la latence des communications. Les travaux présentés dans ce manuscrit portent sur l’implémentation logicielle des algorithmes de décodage de codes polaires et la conception d’architectures programmables spécialisées pour leur exécution. Une des caractéristiques principales d’une chaîne de communication mobile est l’instabilité du canal de communication. Afin de remédier à cette instabilité, des techniques de modulations et de codages adaptatifs sont utilisées dans les normes de communication. Ces techniques impliquent que les décodeurs supportent une vaste gamme de codes : ils doivent être génériques. La première contribution de ces travaux est l’implémentation logicielle de décodeurs génériques des algorithmes de décodage "à Liste" sur des processeurs à usage général. En plus d’être génériques, les décodeurs proposés sont également flexibles. Ils permettent en effet des compromis entre pouvoir de correction, débit et latence de décodage par la paramétrisation fine des algorithmes. En outre, les débits des décodeurs proposés atteignent les performances de l’état de l’art et, dans certains cas, les dépassent. La deuxième contribution de ces travaux est la proposition d’une nouvelle architecture programmable performante spécialisée dans le décodage de codes polaires. Elle fait partie de la famille des processeurs à jeu d’instructions dédiés à l’application. Un processeur de type RISC à faible consommation en constitue la base. Cette base est ensuite configurée, son jeu d’instructions est étendu et des unités matérielles dédiées lui sont ajoutées. Les simulations montrent que cette architecture atteint des débits et des latences proches des implémentations logicielles de l’état de l’art sur des processeurs à usage général. La consommation énergétique est réduite d’un ordre de grandeur.----------ABSTRACT Polar codes are a recently invented class of error-correcting codes that are of interest to both researchers and industry, as evidenced by their selection for the coding of control channels in the next generation of cellular mobile communications (5G). One of the challenges of future mobile networks is the virtualization of digital signal processing, including channel encoding and decoding algorithms. In order to improve network flexibility, these algorithms must be written in software and deployed on programmable architectures. Such a network infrastructure allow dynamic balancing of the computational effort across the network, as well as inter-cell cooperation. These techniques are designed to reduce energy consumption, increase throughput and reduce communication latency. The work presented in this manuscript focuses on the software implementation of polar codes decoding algorithms and the design of programmable architectures specialized in their execution. One of the main characteristics of a mobile communication chain is that the state of communication channel changes over time. In order to address issue, adaptive modulation and coding techniques are used in communication standards. These techniques require the decoders to support a wide range of codes : they must be generic. The first contribution of this work is the software implementation of generic decoders for "List" polar decoding algorithms on general purpose processors. In addition to their genericity, the proposed decoders are also flexible. Trade-offs between correction power, throughput and decoding latency are enabled by fine-tuning the algorithms. In addition, the throughputs of the proposed decoders achieve state-of-the-art performance and, in some cases, exceed it. The second contribution of this work is the proposal of a new high-performance programmable architecture specialized in polar code decoding. It is part of the family of Application Specific Instruction-set Processors (ASIP). The base architecture is a RISC processor. This base architecture is then configured, its instruction set is extended and dedicated hardware units are added. Simulations show that this architecture achieves throughputs and latencies close to state-of-the-art software implementations on general purpose processors. Energy consumption is reduced by an order of magnitude. The energy required per decoded bit is about 10 nJ on general purpose processors compared to 1 nJ on proposed processors when considering the Successive Cancellation (SC) decoding algorithm of a polar code (1024,512). The third contribution of this work is also the design of an ASIP architecture

    Datacenter Design for Future Cloud Radio Access Network.

    Full text link
    Cloud radio access network (C-RAN), an emerging cloud service that combines the traditional radio access network (RAN) with cloud computing technology, has been proposed as a solution to handle the growing energy consumption and cost of the traditional RAN. Through aggregating baseband units (BBUs) in a centralized cloud datacenter, C-RAN reduces energy and cost, and improves wireless throughput and quality of service. However, designing a datacenter for C-RAN has not yet been studied. In this dissertation, I investigate how a datacenter for C-RAN BBUs should be built on commodity servers. I first design WiBench, an open-source benchmark suite containing the key signal processing kernels of many mainstream wireless protocols, and study its characteristics. The characterization study shows that there is abundant data level parallelism (DLP) and thread level parallelism (TLP). Based on this result, I then develop high performance software implementations of C-RAN BBU kernels in C++ and CUDA for both CPUs and GPUs. In addition, I generalize the GPU parallelization techniques of the Turbo decoder to the trellis algorithms, an important family of algorithms that are widely used in data compression and channel coding. Then I evaluate the performance of commodity CPU servers and GPU servers. The study shows that the datacenter with GPU servers can meet the LTE standard throughput with 4× to 16× fewer machines than with CPU servers. A further energy and cost analysis show that GPU servers can save on average 13× more energy and 6× more cost. Thus, I propose the C-RAN datacenter be built using GPUs as a server platform. Next I study resource management techniques to handle the temporal and spatial traffic imbalance in a C-RAN datacenter. I propose a “hill-climbing” power management that combines powering-off GPUs and DVFS to match the temporal C-RAN traffic pattern. Under a practical traffic model, this technique saves 40% of the BBU energy in a GPU-based C-RAN datacenter. For spatial traffic imbalance, I propose three workload distribution techniques to improve load balance and throughput. Among all three techniques, pipelining packets has the most throughput improvement at 10% and 16% for balanced and unbalanced loads, respectively.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/120825/1/qizheng_1.pd
    corecore