
19,078 research outputs found

Study of combining GPU/FPGA accelerators for high-performance computing

Author: Braeken An
Cornelis Jan G
D'Hollander Erik
da Silva Gomes Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: HiPEAC
Publication date: 01/01/2013

This contribution presents the performance modeling of a super desktop with GPU and FPGA accelerators, using OpenCL for the GPU and a high-level synthesis compiler for the FPGAs. The performance model is used to evaluate the different high-level synthesis optimizations, taking into account the resource usage, and to compare the compute power of the FPGA with the GP

Ghent University Academic Bibliography

A Scalable Correlator Architecture Based on Modular FPGA Hardware, Reuseable Gateware, and Data Packetization

Author: Aaron Parsons
Andrew Siemion
Arash Parsa
Blackman R.
Bradley R.
Dan Werthimer
David MacMahon
Demorest P.
Donald Backer
Heiles C.
Henry Chen
Jason Manley
Melvyn Wright
Peter McMahon
Pierre Droz
Terry Filiba
Weinreb S.
Yen J. L.
Publication venue: 'University of Chicago Press'
Publication date: 17/03/2009

A new generation of radio telescopes is achieving unprecedented levels of sensitivity and resolution, as well as increased agility and field-of-view, by employing high-performance digital signal processing hardware to phase and correlate large numbers of antennas. The computational demands of these imaging systems scale in proportion to BMN^2, where B is the signal bandwidth, M is the number of independent beams, and N is the number of antennas. The specifications of many new arrays lead to demands in excess of tens of PetaOps per second. To meet this challenge, we have developed a general purpose correlator architecture using standard 10-Gbit Ethernet switches to pass data between flexible hardware modules containing Field Programmable Gate Array (FPGA) chips. These chips are programmed using open-source signal processing libraries we have developed to be flexible, scalable, and chip-independent. This work reduces the time and cost of implementing a wide range of signal processing systems, with correlators foremost among them, and facilitates upgrading to new generations of processing technology. We present several correlator deployments, including a 16-antenna, 200-MHz bandwidth, 4-bit, full Stokes parameter application deployed on the Precision Array for Probing the Epoch of Reionization. Comment: Accepted to Publications of the Astronomy Society of the Pacific. 31 pages. v2: corrected typo, v3: corrected Fig. 1

arXiv.org e-Print Archive

Crossref

Low-Complexity Sub-band Digital Predistortion for Spurious Emission Suppression in Noncontiguous Spectrum Access

Author: Abdelaziz Mahmoud
Anttila Lauri
Cavallaro Joseph R.
Li Kaipeng
Tarver Chance
Valkama Mikko
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/08/2016

Noncontiguous transmission schemes combined with high power-efficiency requirements pose major challenges for radio transmitter and power amplifier (PA) design and implementation. Due to the nonlinear nature of the PA, severe unwanted emissions can occur, which can potentially interfere with neighboring channel signals or even desensitize the device's own receiver in frequency division duplexing (FDD) transceivers. In this article, to suppress such unwanted emissions, a low-complexity sub-band digital predistortion (DPD) solution, specifically tailored for spectrally noncontiguous transmission schemes in low-cost devices, is proposed. The proposed technique aims at mitigating only the selected spurious intermodulation distortion components at the PA output, hence allowing for substantially reduced processing complexity compared to classical linearization solutions. Furthermore, novel decorrelation-based parameter learning solutions are also proposed and formulated, which offer reduced computing complexity in parameter estimation as well as the ability to track time-varying features adaptively. Comprehensive simulation and RF measurement results are provided, using a commercial LTE-Advanced mobile PA, to evaluate and validate the effectiveness of the proposed solution in real-world scenarios. The obtained results demonstrate that highly efficient spurious component suppression can be obtained using the proposed solutions.

arXiv.org e-Print Archive

DSpace at Rice University

Massive MIMO with Non-Ideal Arbitrary Arrays: Hardware Scaling Laws and Circuit-Aware Design

Author: Björnson Emil
Debbah Mérouane
Matthaiou Michail
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Massive multiple-input multiple-output (MIMO) systems are cellular networks where the base stations (BSs) are equipped with unconventionally many antennas, deployed on co-located or distributed arrays. Huge spatial degrees-of-freedom are achieved by coherent processing over these massive arrays, which provide strong signal gains, resilience to imperfect channel knowledge, and low interference. This comes at the price of more infrastructure; the hardware cost and circuit power consumption scale linearly/affinely with the number of BS antennas $N$. Hence, the key to cost-efficient deployment of large arrays is low-cost antenna branches with low circuit power, in contrast to today's conventional expensive and power-hungry BS antenna branches. Such low-cost transceivers are prone to hardware imperfections, but it has been conjectured that the huge degrees-of-freedom would bring robustness to such imperfections. We prove this claim for a generalized uplink system with multiplicative phase-drifts, additive distortion noise, and noise amplification. Specifically, we derive closed-form expressions for the user rates and a scaling law that shows how fast the hardware imperfections can increase with $N$ while maintaining high rates. The connection between this scaling law and the power consumption of different transceiver circuits is rigorously exemplified. This reveals that one can make the circuit power increase as $\sqrt{N}$, instead of linearly, by careful circuit-aware system design. Comment: Accepted for publication in IEEE Transactions on Wireless Communications, 16 pages, 8 figures. The results can be reproduced using the following Matlab code: https://github.com/emilbjornson/hardware-scaling-law

HAL-CentraleSupelec

Queen's University Belfast Research Portal

Chalmers Research

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Hal-Diderot

arXiv.org e-Print Archive

CiteSeerX

Publikationer från Linköpings universitet

Chalmers Publication Library

HAL-Rennes 1

Millimeter-wave Wireless LAN and its Extension toward 5G Heterogeneous Networks

Author: Kusano Hideyuki
Miyamoto Shinichi
Mizukami Makoto
Mohamed Ehab Mahmoud
Namba Shinobu
Peng Hailan
Rezagah Roya
Sakaguchi Kei
Shirakata Naganori
Takahashi Kazuaki
Takinami Koji
Yamamoto Toshiaki
Publication date: 01/01/2015

Millimeter-wave (mmw) frequency bands, especially the 60 GHz unlicensed band, are considered a promising solution for gigabit short-range wireless communication systems. IEEE standard 802.11ad, also known as WiGig, is standardized for the usage of the 60 GHz unlicensed band for wireless local area networks (WLANs). By using this mmw WLAN, multi-Gbps rates can be achieved to support bandwidth-intensive multimedia applications. Exhaustive search along with beamforming (BF) is usually used to overcome 60 GHz channel propagation loss and accomplish data transmissions in such mmw WLANs. Because of its short-range transmission with a high susceptibility to path blocking, multiple mmw access points (APs) should be used to fully cover a typical target environment for future high-capacity multi-Gbps WLANs. Therefore, coordination among mmw APs is highly needed to overcome packet collisions resulting from uncoordinated exhaustive search BF and to increase the total capacity of mmw WLANs. In this paper, we first give the current status of mmw WLANs with our developed WiGig AP prototype. Then, we highlight the great need for coordinated transmissions among mmw APs as a key enabler for future high-capacity mmw WLANs. Two different types of coordinated mmw WLAN architecture are introduced. One is the distributed antenna type architecture to realize centralized coordination, while the other is an autonomous coordination with the assistance of legacy Wi-Fi signaling. Moreover, two heterogeneous network (HetNet) architectures are also introduced to efficiently extend the coordinated mmw WLANs to be used for future 5th Generation (5G) cellular networks. Comment: 18 pages, 24 figures, accepted, invited paper

arXiv.org e-Print Archive

Crossref

Temporal unpredictability detection of real-time video sequence

Author: Liu Yang
Liu Yang
Publication date: 01/01/2008

Imperial Users onl

Spiral - Imperial College Digital Repository

Neuromorphic Hardware In The Loop: Training a Deep Spiking Network on the BrainScaleS Wafer-Scale System

Author: Bellec Guillaume
Grubl Andreas
Guttler Maurice
Hartel Andreas
Hartmann Stephan
Husmann Dan
Husmann Kai
Jeltsch Sebastian
Karasenko Vitali
Klahn Johann
Kleider Mitja
Koke Christoph
Kononov Alexander
Legenstein Robert
Maass Wolfgang
Mauch Christian
Mayr Christian
Meier Karlheinz
Muller Eric
Muller Paul
Partzsch Johannes
Petrovici Mihai Alexandru
Schemmel Johannes
Schiefer Stefan
Schmitt Sebastian
Scholze Stefan
Schuffny Rene
Thanasoulis Vasilis
Vogginger Bernhard
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Emulating spiking neural networks on analog neuromorphic hardware offers several advantages over simulating them on conventional computers, particularly in terms of speed and energy consumption. However, this usually comes at the cost of reduced control over the dynamics of the emulated networks. In this paper, we demonstrate how iterative training of a hardware-emulated network can compensate for anomalies induced by the analog substrate. We first convert a deep neural network trained in software to a spiking network on the BrainScaleS wafer-scale neuromorphic system, thereby enabling an acceleration factor of 10 000 compared to the biological time domain. This mapping is followed by the in-the-loop training, where in each training step, the network activity is first recorded in hardware and then used to compute the parameter updates in software via backpropagation. An essential finding is that the parameter updates do not have to be precise, but only need to approximately follow the correct gradient, which simplifies the computation of updates. Using this approach, after only several tens of iterations, the spiking network shows an accuracy close to the ideal software-emulated prototype. The presented techniques show that deep spiking networks emulated on analog neuromorphic devices can attain good computational performance despite the inherent variations of the analog substrate. Comment: 8 pages, 10 figures, submitted to IJCNN 201

arXiv.org e-Print Archive

Crossref

Bern Open Repository and Information System (BORIS)