994 research outputs found

    Datacenter Design for Future Cloud Radio Access Network.

    Cloud radio access network (C-RAN), an emerging cloud service that combines the traditional radio access network (RAN) with cloud computing technology, has been proposed as a solution to handle the growing energy consumption and cost of the traditional RAN. By aggregating baseband units (BBUs) in a centralized cloud datacenter, C-RAN reduces energy and cost, and improves wireless throughput and quality of service. However, designing a datacenter for C-RAN has not yet been studied. In this dissertation, I investigate how a datacenter for C-RAN BBUs should be built on commodity servers. I first design WiBench, an open-source benchmark suite containing the key signal processing kernels of many mainstream wireless protocols, and study its characteristics. The characterization study shows that there is abundant data-level parallelism (DLP) and thread-level parallelism (TLP). Based on this result, I then develop high-performance software implementations of C-RAN BBU kernels in C++ and CUDA for both CPUs and GPUs. In addition, I generalize the GPU parallelization techniques of the Turbo decoder to the trellis algorithms, an important family of algorithms that are widely used in data compression and channel coding. Then I evaluate the performance of commodity CPU servers and GPU servers. The study shows that the datacenter with GPU servers can meet the LTE standard throughput with 4× to 16× fewer machines than with CPU servers. A further energy and cost analysis shows that GPU servers can save on average 13× more energy and 6× more cost. Thus, I propose that the C-RAN datacenter be built using GPUs as a server platform. Next I study resource management techniques to handle the temporal and spatial traffic imbalance in a C-RAN datacenter. I propose a “hill-climbing” power management scheme that combines powering off GPUs and DVFS to match the temporal C-RAN traffic pattern.
Under a practical traffic model, this technique saves 40% of the BBU energy in a GPU-based C-RAN datacenter. For spatial traffic imbalance, I propose three workload distribution techniques to improve load balance and throughput. Of the three techniques, pipelining packets yields the largest throughput improvement, at 10% and 16% for balanced and unbalanced loads, respectively. PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/120825/1/qizheng_1.pd

    Implementation of a High Throughput Soft MIMO Detector on GPU

    Multiple-input multiple-output (MIMO) significantly increases the throughput of a communication system by employing multiple antennas at the transmitter and the receiver. To extract maximum performance from a MIMO system, a computationally intensive search-based detector is needed. To meet the challenge of MIMO detection, typical suboptimal MIMO detectors are ASIC or FPGA designs. We aim to show that a MIMO detector on a graphics processing unit (GPU), a low-cost parallel programmable co-processor, can achieve high throughput and can serve as an alternative to ASIC/FPGA designs. However, careful architecture-aware software design is needed to leverage the performance offered by the GPU. We propose a novel soft MIMO detection algorithm, multi-pass trellis traversal (MTT), and show that we can achieve ASIC/FPGA-like performance and handle different configurations in software on a GPU. The proposed design can be used to accelerate wireless physical layer simulations and to offload MIMO detection processing in wireless testbed platforms. Sponsors: Nokia, Nokia Siemens Networks (NSN), Texas Instruments, Xilinx, National Science Foundation

    State of the art baseband DSP platforms for Software Defined Radio: A survey

    Software Defined Radio (SDR) is an innovative approach that is becoming an increasingly promising technology for future mobile handsets. Several proposals in the field of embedded systems have been introduced by universities and industry to support SDR applications. This article presents an overview of current platforms and analyzes the related architectural choices, the current issues in SDR, and potential future trends. Peer reviewed

    End-to-End Simulation of 5G mmWave Networks

    Due to its potential for multi-gigabit and low latency wireless links, millimeter wave (mmWave) technology is expected to play a central role in 5th generation cellular systems. While there has been considerable progress in understanding the mmWave physical layer, innovations will be required at all layers of the protocol stack, in both the access and the core network. Discrete-event network simulation is essential for end-to-end, cross-layer research and development. This paper provides a tutorial on a recently developed full-stack mmWave module integrated into the widely used open-source ns-3 simulator. The module includes a number of detailed statistical channel models as well as the ability to incorporate real measurements or ray-tracing data. The Physical (PHY) and Medium Access Control (MAC) layers are modular and highly customizable, making it easy to integrate algorithms or compare Orthogonal Frequency Division Multiplexing (OFDM) numerologies, for example. The module is interfaced with the core network of the ns-3 Long Term Evolution (LTE) module for full-stack simulations of end-to-end connectivity, and advanced architectural features, such as dual-connectivity, are also available. To facilitate the understanding of the module, and verify its correct functioning, we provide several examples that show the performance of the custom mmWave stack as well as custom congestion control algorithms designed specifically for efficient utilization of the mmWave channel. Comment: 25 pages, 16 figures, submitted to IEEE Communications Surveys and Tutorials (revised Jan. 2018)

    Real-Time Waveform Prototyping

    Fifth-generation mobile networks are characterized by diverse requirements and deployment scenarios. Three distinct use cases are particularly relevant: 1) industrial applications demand real-time radio transmission with especially low failure rates; 2) Internet-of-Things applications require connecting a large number of distributed sensors; 3) data rates for applications such as video delivery have risen massively. These partly conflicting expectations lead researchers and engineers to consider new concepts and technologies for future wireless communication systems. The goal is to identify promising candidate technologies among a multitude of new ideas and to decide which are suitable for implementation in future products. However, meeting these requirements is beyond what a single processing layer in a wireless network can offer. Several research domains must therefore exploit research ideas jointly. This thesis therefore describes a platform as a basis for future experimental research on wireless networks under real-world conditions. Three aspects are presented in detail. First, an overview is given of modern prototypes and testbed solutions, which attract great interest and demand as well as funding opportunities. The development effort, however, is considerable and depends strongly on the chosen properties of the platform, and the selection process is complex because of the number of available options and their respective (hidden) implications. A guideline is therefore presented, based on various examples, with the aim of weighing expectations against the effort the prototype requires. Second, a flexible yet real-time-capable signal processor is introduced that runs on a software-defined radio platform. The processor allows important physical-layer parameters to be reconfigured at run time in order to generate a multitude of modern waveforms. Four parameter settings, 'LLC', 'WiFi', 'eMBB' and 'IoT', are presented to reflect the requirements of the different wireless applications. These are then used to evaluate the implementation presented in this thesis. Third, the introduction of a generic test infrastructure makes it possible to involve external partners remotely. The testbed can be flexibly tailored to the requirements of wireless technologies for a wide range of experiments. Using the test infrastructure, the performance of the presented transceiver is evaluated in terms of latency, achievable throughput and packet error rates. A public demonstration of a tactile Internet prototype, using robot arms in a multi-user environment, was carried out successfully and presented on several occasions.
List of figures List of tables Abbreviations Notations 1 Introduction 1.1 Wireless applications 1.2 Motivation 1.3 Software-Defined Radio 1.4 State of the art 1.5 Testbed 1.6 Summary 2 Background 2.1 System Model 2.2 PHY Layer Structure 2.3 Generalized Frequency Division Multiplexing 2.4 Wireless Standards 2.4.1 IEEE 802.15.4 2.4.2 802.11 WLAN 2.4.3 LTE 2.4.4 Low Latency Industrial Wireless Communications 2.4.5 Summary 3 Wireless Prototyping 3.1 Testbed Examples 3.1.1 PHY-focused Testbeds 3.1.2 MAC-focused Testbeds 3.1.3 Network-focused testbeds 3.1.4 Generic testbeds 3.2 Considerations 3.3 Use cases and Scenarios 3.4 Requirements 3.5 Methodology 3.6 Hardware Platform 3.6.1 Host 3.6.2 FPGA 3.6.3 Hybrid 3.6.4 ASIC 3.7 Software Platform 3.7.1 Testbed Management Frameworks 3.7.2 Development Frameworks 3.7.3 Software Implementations 3.8 Deployment 3.9 Discussion 3.10 Conclusion 4 Flexible Transceiver 4.1 Signal Processing Modules 4.1.1 MAC interface 4.1.2 Encoding and Mapping 4.1.3 Modem 4.1.4 Post modem processing 4.1.5 Synchronization 4.1.6 Channel Estimation and Equalization 4.1.7 Demapping 4.1.8 Flexible Configuration 4.2 Analysis 4.2.1 Numerical Precision 4.2.2 Spectral analysis 4.2.3 Latency 4.2.4 Resource Consumption 4.3 Discussion 4.3.1 Extension to MIMO 4.4 Summary 5 Testbed 5.1 Infrastructure 5.2 Automation 5.3 Software Defined Radio Platform 5.4 Radio Frequency Front-end 5.4.1 Sub 6 GHz front-end 5.4.2 26 GHz mmWave front-end 5.5 Performance evaluation 5.6 Summary 6 Experiments 6.1 Single Link 6.1.1 Infrastructure 6.1.2 Single Link Experiments 6.1.3 End-to-End 6.2 Multi-User 6.3 26 GHz mmWave experimentation 6.4 Summary 7 Key lessons 7.1 Limitations Experienced During Development 7.2 Prototyping Future 7.3 Open points 7.4 Workflow 7.5 Summary 8 Conclusions 8.1 Future Work 8.1.1 Prototyping Workflow 8.1.2 Flexible Transceiver Core 8.1.3 Experimental Data-sets 8.1.4 Evolved Access Point Prototype For Industrial Networks 8.1.5 Testbed Standardization A Additional Resources A.1 Fourier Transform Blocks A.2 Resource Consumption A.3 Channel Sounding using Chirp sequences A.3.1 SNR Estimation A.3.2 Channel Estimation A.4 Hardware part list
The demand for higher data rates in the Enhanced Mobile Broadband scenario, together with novel fifth-generation use cases such as Ultra-Reliable Low-Latency and Massive Machine-type Communications, drives researchers and engineers to consider new concepts and technologies for future wireless communication systems. The goal is to identify promising candidate technologies among a vast number of new ideas and to decide which are suitable for implementation in future products. However, meeting those demands is beyond the capabilities a single processing layer in a wireless network can offer. Therefore, several research domains have to collaboratively exploit research ideas. This thesis presents a platform to provide a base for future applied research on wireless networks. It does so first by giving an overview of state-of-the-art prototypes and testbed solutions; second, by introducing a flexible yet real-time physical layer signal processor running on a software defined radio platform, which enables reconfiguring important parameters of the physical layer at run time in order to create a multitude of modern waveforms; and third, by introducing a generic test infrastructure, which can be tailored to prototype diverse wireless technologies and which is remotely accessible in order to invite new ideas from third parties. Using the test infrastructure, the performance of the flexible transceiver is evaluated regarding latency, achievable throughput and packet error rates.

    A Compilation Flow for Parametric Dataflow: Programming Model, Scheduling, and Application to Heterogeneous MPSoC

    International audience. Efficient programming of signal processing applications on embedded systems is a complex problem. High-level models such as Synchronous dataflow (SDF) have been privileged candidates for dealing with this complexity. These models make it possible to express inherent application parallelism and to analyze applications for both verification and optimization. Parametric dataflow models aim at providing sufficient dynamicity to model new applications, while at the same time maintaining the high level of analyzability needed for efficient real-life implementations. This paper presents a new compilation flow that targets parametric dataflows. Built on the LLVM compiler infrastructure, it offers an actor-based C++ programming model to describe parametric graphs, a compilation front-end providing graph analysis features, and a retargetable back-end to map the application onto real hardware. This paper gives an overview of this flow, with a specific focus on scheduling. The crucial gap between dataflow models and real hardware, on which actor firing is not atomic, as well as the consequences for FIFO sizing and execution pipelining, are taken into account. The experimental results illustrate our compilation flow applied to the compilation of 3GPP LTE-Advanced demodulation on a heterogeneous MPSoC with distributed scheduling features. This achieves performance similar to that of time-consuming hand-made optimizations.

    A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

    Recent technological advances have greatly improved the performance and features of embedded systems. With the number of mobile devices alone now nearly equal to the population of Earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing the power consumption of embedded systems. We discuss the need for power management and classify the techniques by several important parameters to highlight their similarities and differences. This paper is intended to help researchers and application developers gain insight into the working of power management techniques and design even more efficient high-performance embedded systems in the future.