
    Datacenter Design for Future Cloud Radio Access Network.

    Cloud radio access network (C-RAN), an emerging cloud service that combines the traditional radio access network (RAN) with cloud computing technology, has been proposed as a solution to the growing energy consumption and cost of the traditional RAN. By aggregating baseband units (BBUs) in a centralized cloud datacenter, C-RAN reduces energy and cost, and improves wireless throughput and quality of service. However, how to design a datacenter for C-RAN has not yet been studied. In this dissertation, I investigate how a datacenter for C-RAN BBUs should be built on commodity servers. I first design WiBench, an open-source benchmark suite containing the key signal processing kernels of many mainstream wireless protocols, and study its characteristics. The characterization study shows that there is abundant data-level parallelism (DLP) and thread-level parallelism (TLP). Based on this result, I then develop high-performance software implementations of C-RAN BBU kernels in C++ and CUDA for both CPUs and GPUs. In addition, I generalize the GPU parallelization techniques of the Turbo decoder to the trellis algorithms, an important family of algorithms that are widely used in data compression and channel coding. Then I evaluate the performance of commodity CPU servers and GPU servers. The study shows that a datacenter with GPU servers can meet the LTE standard throughput with 4× to 16× fewer machines than with CPU servers. A further energy and cost analysis shows that GPU servers save on average 13× more energy and 6× more cost. Thus, I propose that the C-RAN datacenter be built using GPUs as a server platform. Next, I study resource management techniques to handle the temporal and spatial traffic imbalance in a C-RAN datacenter. I propose a “hill-climbing” power management scheme that combines powering off GPUs and DVFS to match the temporal C-RAN traffic pattern. Under a practical traffic model, this technique saves 40% of the BBU energy in a GPU-based C-RAN datacenter. For spatial traffic imbalance, I propose three workload distribution techniques to improve load balance and throughput. Among the three techniques, pipelining packets yields the largest throughput improvement, at 10% and 16% for balanced and unbalanced loads, respectively.
    PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/120825/1/qizheng_1.pd
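
    To make the hill-climbing idea concrete, the following minimal C++ sketch greedily powers off GPUs and lowers the DVFS step while a simple capacity model still sustains the offered load. The capacity model, frequency steps, and numbers are illustrative assumptions, not the dissertation's measured implementation.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Hypothetical capacity model: throughput of one GPU scales linearly
// with its DVFS clock. Values are illustrative, not measured.
double gpuThroughput(double freqGHz) { return 100.0 * freqGHz; }  // Mbps

struct Config { int activeGpus; double freqGHz; };

// Greedy "hill-climbing": starting from full power, keep taking a
// single step (power one GPU off, or lower the clock one notch) that
// still sustains the offered load; stop when no step is feasible.
Config hillClimb(int totalGpus, double offeredLoadMbps) {
    const std::vector<double> freqSteps = {1.4, 1.2, 1.0, 0.8};  // assumed DVFS levels
    Config c{totalGpus, freqSteps.front()};
    bool improved = true;
    while (improved) {
        improved = false;
        // Candidate 1: power off one GPU.
        if (c.activeGpus > 1 &&
            (c.activeGpus - 1) * gpuThroughput(c.freqGHz) >= offeredLoadMbps) {
            --c.activeGpus;
            improved = true;
            continue;
        }
        // Candidate 2: drop to the next lower DVFS step.
        auto it = std::find(freqSteps.begin(), freqSteps.end(), c.freqGHz);
        if (it + 1 != freqSteps.end() &&
            c.activeGpus * gpuThroughput(*(it + 1)) >= offeredLoadMbps) {
            c.freqGHz = *(it + 1);
            improved = true;
        }
    }
    return c;
}

int main() {
    Config c = hillClimb(16, 650.0);  // 16 GPUs, 650 Mbps offered load
    std::printf("active GPUs: %d, clock: %.1f GHz\n", c.activeGpus, c.freqGHz);
}
```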

    6G White Paper on Machine Learning in Wireless Communication Networks

    The focus of this white paper is on machine learning (ML) in wireless communications. 6G wireless communication networks will be the backbone of the digital transformation of societies by providing ubiquitous, reliable, and near-instant wireless connectivity for humans and machines. Recent advances in ML research have enabled a wide range of novel technologies such as self-driving vehicles and voice assistants. Such innovation is possible as a result of the availability of advanced ML models, large datasets, and high computational power. On the other hand, the ever-increasing demand for connectivity will require a lot of innovation in 6G wireless networks, and ML tools will play a major role in solving problems in the wireless domain. In this paper, we provide an overview of our vision of how ML will impact wireless communication systems. We first give an overview of the ML methods that have the highest potential to be used in wireless networks. Then, we discuss the problems that can be solved by using ML in various layers of the network, such as the physical layer, medium access layer, and application layer. Zero-touch optimization of wireless networks using ML is another interesting aspect that is discussed in this paper. Finally, at the end of each section, important research questions that the section aims to answer are presented.

    Implementation of New Multiple Access Technique Encoder for 5G Wireless Telecommunication Networks

    ABSTRACT: The demands of massive mobile connectivity from different devices and diverse applications set the requirements for the next generation of mobile technology (5G). The significant expansion of connectivity and traffic density characterizes the requirements of fifth-generation mobile networks. Therefore, 5G needs much higher connectivity density, higher mobility ranges, much higher throughput, and much lower latency. In pursuance of the requirement of massive connectivity, numerous technologies must be improved: channel coding, multiple access techniques, modulation and diversity, etc. For instance, with 5G, the cost of signaling overhead and latency should be taken into account [1]. Besides, applying wireless access virtualization (WAV) should be considered, and there is also a need for hardware platforms supporting the new standards for the implementation of novel virtual transceivers.
    One of the possible new technologies for 5G is exploiting multiple access techniques to improve throughput. Therefore, instead of the OFDMA used in LTE (4G), this dissertation investigates applying a new multiple access technique called Sparse Code Multiple Access (SCMA). SCMA is a new frequency-domain non-orthogonal multiple access technique proposed to improve the spectral efficiency of wireless radio access [2]. SCMA encoding is one of the simplest algorithms among multiple access techniques, which offers an opportunity to experiment with generic implementation methods. In addition, the new multiple access method is expected to provide higher throughput, so choosing SCMA encoding with low complexity could be an appropriate approach. The target set for this research was to achieve an encoding throughput of more than 1 Gbps for the SCMA encoder. SCMA encoding was implemented both in software and in hardware to allow comparing the two. The software implementations were developed in the C programming language. Among several designs, performance was improved by using different methods to increase parallelism and decrease the computational complexity, and consequently the processing time. The best software implementation achieves a throughput of 3.59 Gbps, which is 3.5 times more than the target. For the hardware implementation, high-level synthesis was experimented with: the C-based functions and testbenches developed for the software implementations were used as inputs to Vivado HLS.
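
    To illustrate why SCMA encoding is considered simple, here is a minimal C++ sketch of one layer's encoder: log2(M) input bits select one of M sparse codewords spread over K subcarriers. The 4-point codebook values below are placeholders, not a standardized SCMA design.

```cpp
#include <array>
#include <complex>
#include <cstdint>
#include <cstdio>

// Toy SCMA layer: M = 4 codewords over K = 4 subcarriers, each
// codeword sparse with 2 nonzero entries. Values are placeholders.
constexpr int K = 4, M = 4;
using Codeword = std::array<std::complex<float>, K>;

const std::array<Codeword, M> kCodebook = {{
    {{{ 0.8f,  0.0f}, {0, 0}, { 0.6f,  0.0f}, {0, 0}}},
    {{{-0.8f,  0.0f}, {0, 0}, { 0.0f,  0.6f}, {0, 0}}},
    {{{ 0.0f,  0.8f}, {0, 0}, {-0.6f,  0.0f}, {0, 0}}},
    {{{ 0.0f, -0.8f}, {0, 0}, { 0.0f, -0.6f}, {0, 0}}},
}};

// Encoding is a table lookup: log2(M) = 2 input bits (each 0 or 1)
// select one sparse codeword, mapped directly onto the K subcarriers.
Codeword scmaEncode(uint8_t b0, uint8_t b1) {
    return kCodebook[(b0 << 1) | b1];
}

int main() {
    Codeword cw = scmaEncode(1, 0);  // bits "10" -> codeword index 2
    for (const auto& s : cw)
        std::printf("(%+.2f, %+.2f) ", s.real(), s.imag());
    std::printf("\n");
}
```

    In a full SCMA system, several such layers overlay their sparse codewords on the same K resources, and the sparsity is what the receiver's message-passing detector exploits.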

    Near Deterministic Signal Processing Using GPU, DPDK, and MKL

    ABSTRACT: In software-defined radio, digital signal processing requires strict real-time processing of data and signals. Specifically, in the development of the Long Term Evolution (LTE) standard, real-time operation and low latency of computation processes are essential to obtain a good user experience. As low-latency computation is critical in real-time processing of LTE, we explore the possibility of using Graphics Processing Units (GPUs) to accelerate its functions. As the first contribution of this thesis, we adopt NVIDIA GPU technology using the Compute Unified Device Architecture (CUDA) programming model in order to reduce the computation times of LTE. Furthermore, we investigate the efficiency of using MATLAB for parallel computing on GPUs. This allows us to evaluate the MATLAB and CUDA programming paradigms and provide a comprehensive comparison between them for parallel computing of LTE processes on GPUs.
    We conclude that CUDA and MATLAB accelerate the processing of structured basic algorithms, but that the acceleration is variable and depends on which algorithm is involved. Intel has proposed its Data Plane Development Kit (DPDK) as a tool to develop high-performance software for processing telecommunication data. As the second contribution of this thesis, we explore the possibility of using DPDK and operating-system isolation to reduce the variability of the computation times of LTE processes. Specifically, we use DPDK along with Intel's Math Kernel Library (MKL) to calculate Fast Fourier Transforms (FFTs) associated with LTE processes and measure their computation times. We evaluate four cases: 1) FFT code on a slave core without CPU isolation, 2) FFT code on a slave core with CPU isolation, 3) FFT code using MKL without DPDK, and 4) baseline FFT code. Our experimental analysis shows that when DPDK and MKL are used together and the processing units are isolated, the resulting processing times of the FFT calculation are reduced and have a near-deterministic characteristic. Explicitly, using DPDK and MKL along with the isolation of processing units reduces the mean and standard deviation of the processing times for FFT calculation by 100 times and 20 times, respectively. Moreover, we conclude that although MKL reduces the computation time of FFTs, it does not offer a scalable solution by itself; combining it with DPDK is a very promising alternative, as DPDK improves performance and memory management and makes MKL scalable.
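
    The measurement methodology can be sketched as follows: time repeated FFT calls and report the mean and standard deviation of the per-call latency. For self-containedness, this C++ sketch uses a plain radix-2 FFT rather than the MKL/DPDK setup of the thesis; the transform size and run count are assumptions.

```cpp
#include <chrono>
#include <cmath>
#include <complex>
#include <cstdio>
#include <vector>

const double kPi = 3.14159265358979323846;

// In-place iterative radix-2 Cooley-Tukey FFT (n must be a power of two).
void fft(std::vector<std::complex<double>>& a) {
    const size_t n = a.size();
    // Bit-reversal permutation.
    for (size_t i = 1, j = 0; i < n; ++i) {
        size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    // Butterfly stages.
    for (size_t len = 2; len <= n; len <<= 1) {
        const double ang = -2.0 * kPi / double(len);
        const std::complex<double> wl(std::cos(ang), std::sin(ang));
        for (size_t i = 0; i < n; i += len) {
            std::complex<double> w(1.0);
            for (size_t k = 0; k < len / 2; ++k, w *= wl) {
                auto u = a[i + k], v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
            }
        }
    }
}

int main() {
    const int runs = 1000, n = 2048;  // 2048-point FFT, as in 20 MHz LTE
    std::vector<double> t(runs);
    for (int r = 0; r < runs; ++r) {
        std::vector<std::complex<double>> x(n, {1.0, 0.0});
        auto t0 = std::chrono::steady_clock::now();
        fft(x);
        auto t1 = std::chrono::steady_clock::now();
        t[r] = std::chrono::duration<double, std::micro>(t1 - t0).count();
    }
    // Mean and standard deviation of per-call latency.
    double mean = 0, var = 0;
    for (double v : t) mean += v;
    mean /= runs;
    for (double v : t) var += (v - mean) * (v - mean);
    std::printf("mean %.2f us, stddev %.2f us\n", mean, std::sqrt(var / runs));
}
```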

    Development of an orchestrator layer for a wireless cloud

    In recent years, mobile devices have gained enormous popularity, and the current mobile infrastructure is becoming obsolete. Cloud-RAN arises as an excellent candidate to overcome the strong limitations of the existing infrastructure, as it uses a centralized administration of resources. However, the implementation of such an infrastructure presents a significant challenge due to the strict real-time requirements of the wireless standards and the need for efficient administration algorithms. In order to provide solutions to these problems, this project focuses on the implementation of a Cloud-RAN infrastructure. Specifically, we have focused on the study of algorithms that allow a better use of the wireless resources. We then carried out a study to analyse the feasibility of the proposed infrastructure and compared it with current infrastructures. The results obtained show that Cloud-RAN allows better management of the wireless resources thanks to its flexibility and the fact that it is totally reconfigurable. Furthermore, its implementation significantly reduces CAPEX/OPEX costs with respect to the current infrastructure.
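
    As a toy illustration of centralized resource administration in a pooled Cloud-RAN, the C++ sketch below assigns per-cell baseband load to pooled servers with first-fit bin packing. The capacity figure and loads are invented for illustration and do not reflect the project's actual orchestration algorithms.

```cpp
#include <cstdio>
#include <vector>

// Toy centralized assignment for a C-RAN pool: each cell's baseband
// load (in abstract "resource units") is placed on the first pooled
// server with spare capacity (first-fit bin packing).
int main() {
    const double capacity = 100.0;                    // per-server capacity (assumed)
    std::vector<double> cellLoad = {40, 70, 30, 55, 20, 65};  // placeholder loads
    std::vector<double> used;                         // one entry per active server

    for (size_t c = 0; c < cellLoad.size(); ++c) {
        size_t s = 0;
        while (s < used.size() && used[s] + cellLoad[c] > capacity) ++s;
        if (s == used.size()) used.push_back(0.0);    // power on a new server
        used[s] += cellLoad[c];
        std::printf("cell %zu -> server %zu\n", c, s);
    }
    std::printf("servers in use: %zu\n", used.size());
}
```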

    Real-Time Localization Using Software Defined Radio

    Service providers make use of cost-effective wireless solutions to identify, localize, and possibly track users through their carried mobile devices (MDs) to support added services, such as geo-advertisement, security, and management. Indoor and outdoor hotspot areas play a significant role for such services. However, GPS does not work in many of these areas. To solve this problem, service providers leverage available indoor radio technologies, such as WiFi, GSM, and LTE, to identify and localize users. We focus our research on passive services provided by third parties, which are responsible for (i) data acquisition and (ii) processing, and on network-based services, where (i) and (ii) are done inside the serving network. For a better understanding of the parameters that affect indoor localization, we investigate several factors that affect indoor signal propagation for both Bluetooth and WiFi technologies. For GSM-based passive services, we first developed a data acquisition module: a GSM receiver that can overhear GSM uplink messages transmitted by MDs while remaining invisible. A set of optimizations was made to the receiver components to support wideband capture of the GSM spectrum while operating in real time. Processing the wide GSM spectrum is made possible by a proposed distributed processing approach over an IP network. Then, to overcome the lack of information about tracked devices’ radio settings, we developed two novel localization algorithms that rely on proximity-based solutions to estimate devices’ locations in real environments. Given the challenges the indoor environment poses to radio signals, such as NLOS reception and multipath propagation, we developed an original algorithm to detect and remove contaminated radio signals before they are fed to the localization algorithm. To improve the localization algorithm, we extended our work with a hybrid approach that uses both WiFi and GSM interfaces to localize users. For network-based services, we used a software implementation of an LTE base station to develop our algorithms, which characterize the indoor environment before applying the localization algorithm. Experiments were conducted without any special hardware, prior knowledge of the indoor layout, or offline calibration of the system.
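
    A minimal sketch of a proximity-style estimator of the kind described: each anchor's RSSI is converted to a distance with a log-distance path-loss model, and the position is taken as the weighted centroid of the anchors. The path-loss exponent, reference power, and anchor layout are assumed values, not the thesis' calibrated parameters.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

struct Anchor { double x, y, rssiDbm; };

int main() {
    const double p0 = -40.0;  // assumed RSSI at 1 m, in dBm
    const double n  = 2.5;    // assumed indoor path-loss exponent
    std::vector<Anchor> anchors = {
        {0, 0, -55}, {10, 0, -62}, {0, 10, -60}, {10, 10, -70}};

    double wx = 0, wy = 0, wsum = 0;
    for (const auto& a : anchors) {
        // Log-distance model: RSSI = p0 - 10 * n * log10(d), solved for d.
        double d = std::pow(10.0, (p0 - a.rssiDbm) / (10.0 * n));  // meters
        double w = 1.0 / (d + 1e-9);  // nearer anchors weigh more
        wx += w * a.x; wy += w * a.y; wsum += w;
    }
    std::printf("estimated position: (%.2f, %.2f)\n", wx / wsum, wy / wsum);
}
```

    In practice, the contaminated-signal filtering mentioned above would run first, so that NLOS-corrupted RSSI readings do not enter the weighted sum.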

    Dependable Embedded Systems

    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. The book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or system level alone, this book deals with reliability challenges across different levels, starting from the physical level all the way up to the system level (cross-layer approaches). The book aims to demonstrate how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, process variation, temperature effects, soft errors, etc. It provides readers with the latest insights into novel, cross-layer methods and models with respect to the dependability of embedded systems; describes cross-layer approaches that can leverage reliability through techniques that are proactively designed with respect to techniques at other layers; and explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.

    Satellite Networks: Architectures, Applications, and Technologies

    Since global satellite networks are moving to the forefront in enhancing the national and global information infrastructures, owing to communication satellites' unique networking characteristics, a workshop was organized to assess the progress made to date and chart the future. The workshop provided the forum to assess the current state of the art, identify key issues, and highlight the emerging trends in next-generation architectures, data protocol development, communication interoperability, and applications. Presentations covering an overview of the field, the state of the art in research, development, deployment, and applications, and future trends in satellite networks are assembled.

    Algorithm-Architecture Co-Design for Digital Front-Ends in Mobile Receivers

    The methodology behind this work has been to use the concept of algorithm-hardware co-design to achieve efficient solutions for the digital front-end in mobile receivers. It has been shown that, by looking at algorithms and hardware architectures together, more efficient solutions can be found, i.e., efficient with respect to some design measure. In this thesis the main focus has been placed on two such parameters: first, reduced-complexity algorithms to lower energy consumption at limited performance degradation; second, handling the increasing number of wireless standards that should preferably run on the same hardware platform. To perform this task it is crucial to understand both sides of the table, i.e., both the algorithms and concepts of wireless communication and the implications arising on the hardware architecture. It is easier to handle the high complexity by separating those disciplines through layered abstraction. However, this representation is imperfect, since many interconnected "details" belonging to different layers are lost in the attempt to handle the complexity. This results in poor implementations, and the design of mobile terminals is no exception. Wireless communication standards are often designed based on mathematical algorithms with theoretical boundaries, with few considerations of actual implementation constraints such as energy consumption, silicon area, etc. This thesis does not try to remove the layer abstraction model, given its undeniable advantages, but rather uses those cross-layer "details" that went missing during the abstraction. This is done in three manners. In the first part, the cross-layer optimization is carried out from the algorithm perspective. Important circuit design parameters, such as quantization, are taken into consideration when designing the algorithm for OFDM symbol timing, CFO, and SNR estimation with a single bit, namely, the Sign-Bit. Proof-of-concept circuits were fabricated and showed high potential for low-end receivers. In the second part, the cross-layer optimization is accomplished from the opposite side, i.e., the hardware-architectural side. An SDR architecture is known for its flexibility and scalability over many applications. In this work a filtering application is mapped into software instructions in the SDR architecture in order to make filtering-specific modules redundant and thus save silicon area. In the third and last part, the optimization is done from an intermediate point within the algorithm-architecture spectrum. Here, a heterogeneous architecture with a combination of highly efficient and highly flexible modules is used to accomplish initial synchronization in at least two concurrent OFDM standards. A demonstrator was built, capable of performing synchronization in any two standards, including LTE, WiFi, and DVB-H.
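
    A minimal sketch of sign-bit OFDM symbol timing of the kind described in the first part: the received samples are quantized to their I/Q signs, and a cyclic-prefix correlation metric is searched for its peak. The FFT size, CP length, and toy signal are assumptions, not the fabricated circuit's parameters.

```cpp
#include <complex>
#include <cstdio>
#include <random>
#include <vector>

// 1-bit quantizer: keep only the sign of a component.
inline int sgn(float v) { return v >= 0 ? 1 : -1; }

// Cyclic-prefix correlation on sign-bit samples: in OFDM the CP repeats
// N samples later, so |sum_n r[d+n] * conj(r[d+n+N])| over an L-sample
// window peaks when d hits a symbol start. N (FFT size) and L (CP
// length) are assumed parameters of this sketch.
int estimateTiming(const std::vector<std::complex<float>>& r, int N, int L) {
    int best = 0;
    long long bestMetric = -1;
    for (int d = 0; d + N + L <= (int)r.size(); ++d) {
        long long re = 0, im = 0;
        for (int n = 0; n < L; ++n) {
            int ai = sgn(r[d + n].real()),     aq = sgn(r[d + n].imag());
            int bi = sgn(r[d + n + N].real()), bq = sgn(r[d + n + N].imag());
            re += ai * bi + aq * bq;  // Re(a * conj(b)) with 1-bit samples
            im += aq * bi - ai * bq;  // Im(a * conj(b))
        }
        long long metric = re * re + im * im;
        if (metric > bestMetric) { bestMetric = metric; best = d; }
    }
    return best;  // estimated start of the cyclic prefix
}

int main() {
    const int N = 64, L = 16;
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> u(-1.0f, 1.0f);
    std::vector<std::complex<float>> r(200);
    for (auto& s : r) s = {u(rng), u(rng)};                 // noise-like toy signal
    for (int n = 0; n < L; ++n) r[30 + n] = r[30 + n + N];  // plant a CP at d = 30
    std::printf("estimated symbol start: %d\n", estimateTiming(r, N, L));
}
```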

    Software Defined Radio Solutions for Wireless Communications Systems

    Wireless technologies have been advancing rapidly, especially in recent years. Design, implementation, and manufacturing of devices supporting the continuously evolving technologies require great effort. Thus, building platforms compatible with different generations of standards and technologies has gained a lot of interest. As a result, software defined radios (SDRs) are investigated to offer more flexibility and scalability, and to reduce the design effort, compared to conventional fixed-function hardware-based solutions.
    This thesis mainly addresses the challenges related to SDR-based implementation of today’s wireless devices. One of the main targets of most wireless standards has been to improve the achievable data rates, which imposes strict requirements on the processing platforms. Realizing real-time processing of high-throughput signal processing algorithms on SDR-based platforms, while maintaining energy consumption close to that of conventional approaches, is a challenging topic addressed in this thesis.
    Firstly, the thesis concentrates on the challenges of a real-time software-based implementation of the very high throughput (VHT) Institute of Electrical and Electronics Engineers (IEEE) 802.11ac amendment from the wireless local area network (WLAN) family, where an SDR-based solution is introduced for the frequency-domain baseband processing of a multiple-input multiple-output (MIMO) transmitter and receiver. The feasibility of the implementation is evaluated with respect to the number of clock cycles and the consumed power. Furthermore, a digital front-end (DFE) concept is developed for the IEEE 802.11ac receiver, where the 80 MHz waveform is divided into two 40 MHz signals. This is carried out through time-domain digital filtering and decimation, which is challenging due to the latency and cyclic prefix (CP) budget of the receiver. Different multi-rate channelization architectures are developed, and the software implementation is presented and evaluated in terms of execution time, number of clock cycles, power, and energy consumption on different multi-core platforms.
    Secondly, the thesis addresses selected advanced techniques developed to realize in-band full-duplex (IBFD) systems, which aim at improving spectral efficiency in today’s congested radio spectrum. IBFD refers to concurrent transmission and reception on the same frequency band, where the main challenge to combat is the strong self-interference (SI). An SDR-based solution is introduced that is capable of real-time mitigation of the SI signal. The implementation results show the possibility of achieving sufficient real-time SI suppression under time-varying environments using low-power, mobile-scale multi-core processing platforms. To investigate the challenges associated with SDR implementations for mobile-scale devices with limited processing and power resources, processing platforms suitable for hand-held devices are selected in this thesis work. On the baseband processing side, a very long instruction word (VLIW) processor, optimized for wireless communication applications, is utilized. Furthermore, in the solutions presented for the DFE processing and the digital SI canceller, commercial off-the-shelf (COTS) multi-core central processing units (CPUs) and graphics processing units (GPUs) are used with the aim of investigating the performance enhancement achieved by utilizing parallel processing.
    Overall, this thesis provides solutions to the challenges of low-power, real-time software-based implementation of computationally intensive signal processing algorithms for current and future communication systems.
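
    The 80-to-2×40 MHz DFE split can be sketched as mix, low-pass filter, decimate by 2. The C++ below is a minimal illustration under assumed design choices (a 31-tap windowed-sinc filter and ±20 MHz subband offsets); it is not the thesis' optimized multi-rate channelizer.

```cpp
#include <cmath>
#include <complex>
#include <cstdio>
#include <vector>

const double kPi = 3.14159265358979323846;
using cfloat = std::complex<float>;

// Extract one 40 MHz subband from an 80 MHz stream: mix the chosen
// half of the spectrum down to baseband, low-pass filter, decimate by 2.
std::vector<cfloat> extractSubband(const std::vector<cfloat>& x,
                                   double fs, double fOffset) {
    // Mix: shift the subband centered at fOffset down to 0 Hz.
    std::vector<cfloat> y(x.size());
    for (size_t n = 0; n < x.size(); ++n) {
        double ph = -2.0 * kPi * fOffset * double(n) / fs;
        y[n] = x[n] * cfloat(float(std::cos(ph)), float(std::sin(ph)));
    }
    // Low-pass FIR (Hamming-windowed sinc, cutoff fs/4), then decimate by 2.
    const int taps = 31, mid = taps / 2;
    std::vector<float> h(taps);
    for (int k = 0; k < taps; ++k) {
        double t = k - mid;
        double sinc = (t == 0) ? 0.5 : std::sin(kPi * t / 2.0) / (kPi * t);
        double win  = 0.54 - 0.46 * std::cos(2.0 * kPi * k / (taps - 1));
        h[k] = float(sinc * win);
    }
    std::vector<cfloat> out;
    for (size_t n = mid; n + mid < y.size(); n += 2) {  // keep every 2nd sample
        cfloat acc(0, 0);
        for (int k = 0; k < taps; ++k) acc += h[k] * y[n + k - mid];
        out.push_back(acc);
    }
    return out;
}

int main() {
    std::vector<cfloat> x(4096, cfloat(1.0f, 0.0f));  // placeholder input
    auto lower = extractSubband(x, 80e6, -20e6);      // lower 40 MHz half
    auto upper = extractSubband(x, 80e6, +20e6);      // upper 40 MHz half
    std::printf("%zu -> %zu samples per subband\n", x.size(), lower.size());
}
```

    In a real DFE the filter length is driven by the latency and CP budget mentioned above, since every tap of the time-domain filter adds delay before the FFT window can be placed.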