Quantum Monte Carlo for large chemical systems: Implementing efficient strategies for petascale platforms and beyond

Author: Caffarel Michel
Jalby William
Oseret Emmanuel
Scemama Anthony
Publication venue: 'Wiley'
Publication date: 01/10/2012

Various strategies for efficiently implementing QMC simulations of large chemical systems are presented. These include: (i) a novel, efficient algorithm for calculating the computationally expensive Slater matrices, based on the highly localized character of atomic Gaussian basis functions (not of the molecular orbitals, as is usually done); (ii) the possibility of keeping the memory footprint minimal; (iii) the significant enhancement of single-core performance when efficient optimization tools are employed; and (iv) the definition of a universal, dynamic, fault-tolerant, and load-balanced computational framework adapted to all kinds of computational platforms (massively parallel machines, clusters, or distributed grids). These strategies have been implemented in the QMC=Chem code developed at Toulouse and are illustrated with numerical applications on small peptides of increasing size (158, 434, 1056, and 1731 electrons). Using 10k-80k computing cores of the Curie machine (GENCI-TGCC-CEA, France), QMC=Chem has been shown to be capable of running at the petascale level, thus demonstrating that for this machine a large part of the peak performance can be achieved. Implementation of large-scale QMC simulations on future exascale platforms with a comparable level of efficiency is expected to be feasible.

arXiv.org e-Print Archive

HAL-INSA Toulouse

HAL UVSQ

A study of performance and complexity for IEEE 802.11n MIMO-OFDM GIS solutions

Author: Abdul Aziz MK
Fletcher PN
Nix AR
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2004

Explore Bristol Research

Large-Scale MIMO Detection for 3GPP LTE: Algorithms and FPGA Implementations

Author: Cavallaro Joseph R.
Dick Chris
Studer Christoph
Wang Guohui
Wu Michael
Yin Bei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Large-scale (or massive) multiple-input multiple-output (MIMO) is expected to be one of the key technologies in next-generation multi-user cellular systems, based on the upcoming 3GPP LTE Release 12 standard, for example. In this work, we propose what is, to the best of our knowledge, the first VLSI design enabling high-throughput data detection in single-carrier frequency-division multiple access (SC-FDMA)-based large-scale MIMO systems. We propose a new approximate matrix inversion algorithm relying on a Neumann series expansion, which substantially reduces the complexity of linear data detection. We analyze the associated error, and we compare its performance and complexity to those of an exact linear detector. We present corresponding VLSI architectures, which perform exact and approximate soft-output detection for large-scale MIMO systems with various antenna/user configurations. Reference implementation results for a Xilinx Virtex-7 XC7VX980T FPGA show that our designs are able to achieve more than 600 Mb/s for a 128-antenna, 8-user 3GPP LTE-based large-scale MIMO system. We finally provide a performance/complexity trade-off comparison using the presented FPGA designs, which reveals that the detector circuit of choice is determined by the ratio between BS antennas and users, as well as the desired error-rate performance. Comment: To appear in the IEEE Journal of Selected Topics in Signal Processing
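
The abstract above names a Neumann series expansion as the basis of its approximate matrix inversion but does not give the construction. As a rough illustration only, the sketch below assumes one common form: split the matrix as A = D + E (D diagonal, E off-diagonal) and truncate the series A^{-1} = sum_n (-D^{-1} E)^n D^{-1}. The split, the function names, and the term count are assumptions made for this example and are not taken from the paper; the series only converges when the spectral radius of D^{-1} E is below 1, for instance when A is strongly diagonally dominant.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Add two matrices of equal shape element by element."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def neumann_inverse(a, terms):
    """Approximate inverse of `a` from the first `terms` Neumann series terms.

    Assumed split (hypothetical, for illustration): a = D + E with D the
    diagonal part, so that inv(a) = sum_n (-inv(D) E)^n inv(D).
    """
    n = len(a)
    d_inv = [[1.0 / a[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    e = [[0.0 if i == j else a[i][j] for j in range(n)] for i in range(n)]
    m = [[-x for x in row] for row in mat_mul(d_inv, e)]  # m = -inv(D) E
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # m^0
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        acc = mat_add(acc, power)   # accumulate m^0 + m^1 + ...
        power = mat_mul(power, m)   # advance to the next power of m
    return mat_mul(acc, d_inv)


# Usage: a small diagonally dominant matrix, for which the series converges.
A = [[4.0, 1.0], [1.0, 3.0]]
approx = neumann_inverse(A, terms=30)
residual = mat_mul(A, approx)  # should be close to the identity matrix
```

With `terms=1` the function returns only inv(D), the cheapest approximation; each additional term costs one more matrix product, which is the kind of accuracy-versus-complexity trade-off the abstract refers to.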

arXiv.org e-Print Archive

CiteSeerX

Repository for Publications and Research Data

Scaling up MIMO: Opportunities and Challenges with Very Large Arrays

Author: Buon Kiong Lau
Daniel Persson
Erik G. Larsson
Fredrik Rusek
Fredrik Tufvesson
Ove Edfors
Thomas L. Marzetta
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/01/2012

This paper surveys recent advances in the area of very large MIMO systems. By very large MIMO, we mean systems that use antenna arrays with an order of magnitude more elements than systems being built today, say a hundred antennas or more. Very large MIMO entails an unprecedented number of antennas simultaneously serving a much smaller number of terminals. The disparity in number emerges as a desirable operating condition and a practical one as well. The number of terminals that can be simultaneously served is limited, not by the number of antennas, but rather by our inability to acquire channel-state information for an unlimited number of terminals. Larger numbers of terminals can always be accommodated by combining very large MIMO technology with conventional time- and frequency-division multiplexing via OFDM. Very large MIMO arrays are a new research field in communication theory, propagation, and electronics, and represent a paradigm shift in thinking with regard to theory, systems, and implementation. The ultimate vision of very large MIMO systems is that the antenna array would consist of small active antenna units, plugged into an (optical) fieldbus. Comment: Accepted for publication in the IEEE Signal Processing Magazine, October 201

arXiv.org e-Print Archive

CiteSeerX

Publikationer från Linköpings universitet

Lund University Publications

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Synthesis and Optimization of Reversible Circuits - A Survey

Author: Arabzadeh M.
Cheung D.
Cuccaro S. A.
De Vos A.
Douçot B.
Fazel K.
Glück R.
Hirata Y.
Igor L. Markov
Korf R.
Kutin S.
Kutin S. A.
Lee S.
Markov I. L.
Mehdi Saeedi
Miller D.
Mishchenko A.
Patel K. N.
Politi A.
Saeedi M.
Shende V. V.
Shi Z.
Soeken M.
Storme L.
Takahashi Y.
Viamontes G. F.
Wille R.
Yamashita S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/03/2013

Reversible logic circuits have been historically motivated by theoretical research in low-power electronics as well as practical improvement of bit-manipulation transforms in cryptography and computer graphics. Recently, reversible circuits have attracted interest as components of quantum algorithms, as well as in photonic and nano-computing technologies where some switching devices offer no signal gain. Research in generating reversible logic distinguishes between circuit synthesis, post-synthesis optimization, and technology mapping. In this survey, we review algorithmic paradigms (search-based, cycle-based, transformation-based, and BDD-based) as well as specific algorithms for reversible synthesis, both exact and heuristic. We conclude the survey by outlining key open challenges in the synthesis of reversible and quantum logic, as well as the most common misconceptions. Comment: 34 pages, 15 figures, 2 tables

arXiv.org e-Print Archive

Crossref

Digital implementation of the cellular sensor-computers

Author: Benthien
Bolotski
Catrysse
Diamantaras
Domínguez-Castro
Dudek
El Gamal
Espejo
Foldesy
Gielen
Grossberg
Hamamoto
Herbordt
Johansson
Kleinfelder
Linan
Liu
Liñán
Morris
Murray
Nayar
Ohta
Roska
Rudack
Schneider
Serra
Sinno
Tian
Wagner
Wanhammar
Zarándy
Publication venue: 'Wiley'
Publication date: 01/01/2006

Two different kinds of cellular sensor-processor architectures are currently used in various applications. The first is the traditional sensor-processor architecture, where the sensor and the processor arrays are mapped onto each other. The second is the foveal architecture, in which a small active fovea navigates within a large sensor array. This second architecture is introduced and compared with the first here. Both architectures can be implemented with analog or digital processor arrays. The efficiency of the different implementation types is analyzed as a function of the CMOS technology used. It turns out that the finer the technology, the more advantageous a digital implementation becomes relative to an analog one.

Crossref

SZTAKI Publication Repository

Repository of the Academy's Library

A Multi-GPU Programming Library for Real-Time Applications

Author: Schaetz Sebastian
Uecker Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

We present MGPU, a C++ programming library targeted at single-node multi-GPU systems. Such systems combine disproportionate floating point performance with high data locality and are thus well suited to implement real-time algorithms. We describe the library design, programming interface and implementation details in light of this specific problem domain. The core concepts of this work are a novel kind of container abstraction and MPI-like communication methods for intra-system communication. We further demonstrate how MGPU is used as a framework for porting existing GPU libraries to multi-device architectures. Putting our library to the test, we accelerate an iterative non-linear image reconstruction algorithm for real-time magnetic resonance imaging using multiple GPUs. We achieve a speed-up of about 1.7 using 2 GPUs and reach a final speed-up of 2.1 with 4 GPUs. These promising results lead us to conclude that multi-GPU systems are a viable solution for real-time MRI reconstruction as well as signal-processing applications in general. Comment: 15 pages, 10 figures
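
To put the reported speed-ups in context, the sketch below computes parallel efficiency, defined here as speed-up divided by device count. The function name is hypothetical and the efficiency figures are derived arithmetic from the two numbers quoted in the abstract; they are not values reported by the authors.

```python
def parallel_efficiency(speedup, devices):
    """Return speed-up per device (1.0 would mean ideal linear scaling)."""
    if devices <= 0:
        raise ValueError("devices must be a positive integer")
    return speedup / devices


# Speed-ups quoted in the abstract: about 1.7 on 2 GPUs and 2.1 on 4 GPUs.
reported = {2: 1.7, 4: 2.1}
efficiencies = {n: parallel_efficiency(s, n) for n, s in reported.items()}
```

Under this definition the efficiency falls from roughly 0.85 on 2 GPUs to roughly 0.53 on 4 GPUs, so the quoted figures indicate sub-linear scaling as devices are added.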

arXiv.org e-Print Archive

MPG.PuRe

Generic photonic integrated linear operator processor

Author: Chen Minjia
Cheng Qixiang
Penty Richard
Wonfor Adrian
Yao Chunhui
Publication date: 26/05/2023

Photonic integration platforms have been explored extensively for optical computing with the aim of breaking the speed and power efficiency limitations of traditional digital electronic computers. Current technologies typically focus on implementing a single computation iteration optically while leaving the intermediate processing in the electronic domain, and so remain limited by electronic bottlenecks. Few explorations have been made of all-optical recursive architectures for computations on integrated photonic platforms. Here we propose a generic photonic integrated linear operator processor based on an all-optical recursive system that supports linear operations ranging from matrix computations to solving equations. We demonstrate the first all-optical on-chip matrix inversion system and use this to solve integral and differential equations. The absence of electronic processing during multiple iterations indicates the potential for an orders-of-magnitude speed enhancement of this all-optical computing approach compared to electronic computers. We realize matrix inversions and solve Fredholm integral equations of the second kind, second-order ordinary differential equations, and Poisson equations using the generic photonic integrated linear operator processor.
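
The abstract does not describe how the recursive system is configured, so the following is only a numerical analogy in software, not a model of the photonic hardware. It assumes the textbook fixed-point iteration for a Fredholm integral equation of the second kind, f(x) = g(x) + lam * integral over [0, 1] of K(x, t) f(t) dt, discretized with the midpoint rule. The function name, the grid size, and the iteration count are assumptions chosen for illustration; the iteration converges only when the discretized operator lam * K is a contraction.

```python
def solve_fredholm(kernel, g, lam, n=50, iterations=60):
    """Iterate f <- g + lam * K f on a midpoint grid over [0, 1].

    Hypothetical helper for illustration: repeated application of the same
    linear operator stands in for the recursion mentioned in the abstract.
    """
    h = 1.0 / n
    nodes = [(i + 0.5) * h for i in range(n)]  # midpoint quadrature nodes
    f = [g(x) for x in nodes]                  # initial guess: f = g
    for _ in range(iterations):
        f = [g(x) + lam * h * sum(kernel(x, t) * ft for t, ft in zip(nodes, f))
             for x in nodes]
    return nodes, f


# Usage: K(x, t) = 1, g(x) = 1, lam = 0.5 has the constant exact solution f = 2,
# since a constant c must satisfy c = 1 + 0.5 * c.
nodes, f = solve_fredholm(lambda x, t: 1.0, lambda x: 1.0, lam=0.5)
```

Each pass through the loop applies the same linear operator to the previous result, which is why a system that can feed its output back into its input without leaving the optical domain is a natural fit for this class of problems.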

arXiv.org e-Print Archive