203 research outputs found

JANUS: an FPGA-based System for High Performance Scientific Computing

Author: A Cruz
Alfonso Tarancon
Andrea Maiorano
Antonio Gordillo-Guerrero
Antonio Munoz-Sudupe
Daniele Sciretti
David Yllanes
Denis Navarro
Enzo Marinari
Filippo Mantovani
Francesco Belletti
Gianpaolo Zanier
Giorgio Parisi
J Luis Velasco
Juan J Ruiz-Lorenzo
Luis Antonio Fernandez
Marco Guidetti
Maria Cotallo
Mauro Rossi
Raffaele Tripiccione
Sebastiano Fabio Schifano
Sergio Perez-Gaviro
Victor Martin-Mayor
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/04/2008

This paper describes JANUS, a modular, massively parallel and reconfigurable FPGA-based computing system. Each JANUS module has a computational core and a host. The computational core is a 4x4 array of FPGA-based processing elements with nearest-neighbor data links. Processors are also directly connected to an I/O node attached to the JANUS host, a conventional PC. JANUS is tailored for, but not limited to, the requirements of a class of hard scientific applications characterized by regular code structure, unconventional data manipulation instructions and a not too large database size. We discuss the architecture of this configurable machine, and focus on its use for Monte Carlo simulations of statistical mechanics. On this class of applications JANUS achieves impressive performance: in some cases one JANUS processing element outperforms high-end PCs by a factor ~ 1000. We also discuss the role of JANUS in other classes of scientific applications. Comment: 11 pages, 6 figures. Improved version, largely rewritten, submitted to Computing in Science & Engineering

arXiv.org e-Print Archive

Docta Complutense

Crossref

Archivio istituzionale della ricerca - Università di Ferrara

Archivio della ricerca- Università di Roma La Sapienza

Janus II: a new generation application-driven computer for spin-system simulations

This paper describes the architecture, the development and the implementation of Janus II, a new generation application-driven number cruncher optimized for Monte Carlo simulations of spin systems (mainly spin glasses). This domain of computational physics is a recognized grand challenge of high-performance computing: the resources necessary to study in detail theoretical models that can make contact with experimental data are far beyond those available using commodity computer systems. On the other hand, several specific features of the associated algorithms suggest that unconventional computer architectures, which can be implemented with available electronics technologies, may lead to order-of-magnitude increases in performance, reducing to acceptable values on human scales the time needed to carry out simulation campaigns that would take centuries on commercially available machines. Janus II is one such machine, recently developed and commissioned, that builds upon and improves on the successful JANUS machine, which has been used for physics since 2008 and is still in operation today. This paper describes in detail the motivations behind the project, the computational requirements, the architecture and the implementation of this new machine and compares its expected performance with that of currently available commercial systems. Comment: 28 pages, 6 figures

arXiv.org e-Print Archive

Docta Complutense

Crossref

Directory of Open Access Journals

La Colmena

Turismo y patrimonio (E-Journal)

Archivio istituzionale della ricerca - Università di Ferrara

DIALNET

Archivio della ricerca- Università di Roma La Sapienza

Janus: a reconfigurable system for scientific computing

Author: Mantovani Filippo
Publication date: 01/01/2009

EprintsUnife

Archivio istituzionale della ricerca - Università di Ferrara

Weighted p-bits for FPGA implementation of probabilistic circuits

Author: Camsari Kerem Y.
Ghantasala Lakshmi Anirudh
Pervaiz Ahmed Zeeshan
Sutton Brian M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2018

Probabilistic spin logic (PSL) is a recently proposed computing paradigm based on unstable stochastic units called probabilistic bits (p-bits) that can be correlated to form probabilistic circuits (p-circuits). These p-circuits can be used to solve problems of optimization and inference, and also to implement precise Boolean functions in an "inverted" mode, where a given Boolean circuit can operate in reverse to find the input combinations that are consistent with a given output. In this paper we present a scalable FPGA implementation of such invertible p-circuits. We implement a "weighted" p-bit that combines stochastic units with localized memory structures. We also present a generalized tile of weighted p-bits to which a large class of problems beyond invertible Boolean logic can be mapped, and show how invertibility can be applied to interesting problems such as the NP-complete Subset Sum Problem by solving a small instance of this problem in hardware.

arXiv.org e-Print Archive

eScholarship - University of California

Critical Behavior of Three-Dimensional Disordered Potts Models with Many States

Author: A Cruz
A Gordillo-Guerrero
A Maiorano
A Muñoz Sudupe
A Tarancon
Amit D J
B Seoane
Brangian C
D Navarro
D Yllanes
E Marinari
Elderfield D
F Mantovani
G Parisi
J J Ruiz-Lorenzo
J M Gil-Narvion
J Monforte-Garcia
L A Fernandez
M Guidetti
Marinari E
R Alvarez Baños
R Tripiccione
S F Schifano
S Perez-Gaviro
Sokal A D
V Martin-Mayor
Publication venue: 'IOP Publishing'
Publication date: 01/01/2010

We study the 3D Disordered Potts Model with p=5 and p=6. Our numerical simulations (that severely slow down for increasing p) detect a very clear spin glass phase transition. We evaluate the critical exponents and the critical value of the temperature, and we use known results at lower p values to discuss how they evolve for increasing p. We do not find any sign of the presence of a transition to a ferromagnetic regime. Comment: 9 pages and 9 Postscript figures. Final version published in J. Stat. Mech.

arXiv.org e-Print Archive

Docta Complutense

Crossref

Archivio istituzionale della ricerca - Università di Ferrara

Archivio della ricerca- Università di Roma La Sapienza

Nature of the spin-glass phase at experimental length scales

Author: A. Cruz
A. Gordillo-Guerrero
A. Munoz Sudupe
A. Tarancon
A.M. Sudupe
Andrea Maiorano
B. Seoane
D. Navarro
D. Yllanes
E. Marinari
F. Mantovani
Giorgio Parisi
J. Monforte-Garcia
J.J. Ruiz-Lorenzo
J.M. Gil-Narvion
L.A. Fernandez
M. Guidetti
R. Alvarez Banos
R. Tripiccione
R.A. Banos
S. Perez-Gaviro
S.F. Schifano
V. Martin-Mayor
Vincenzo Marinari
Publication venue: 'IOP Publishing'
Publication date: 01/01/2010

We present a massive equilibrium simulation of the three-dimensional Ising spin glass at low temperatures. The Janus special-purpose computer has allowed us to equilibrate, using parallel tempering, L=32 lattices down to T=0.64 Tc. We demonstrate the relevance of equilibrium finite-size simulations to understand experimental non-equilibrium spin glasses in the thermodynamic limit by establishing a time-length dictionary. We conclude that non-equilibrium experiments performed on a time scale of one hour can be matched with equilibrium results on L=110 lattices. A detailed investigation of the probability distribution functions of the spin and link overlap, as well as of their correlation functions, shows that Replica Symmetry Breaking is the appropriate theoretical framework for the physically relevant length scales. We also improve on existing methodologies to ensure equilibration in parallel tempering simulations. Comment: 48 pages, 19 postscript figures, 9 tables. Version accepted for publication in the Journal of Statistical Mechanics
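
The parallel tempering mentioned in this abstract can be illustrated with a minimal Python sketch of the standard replica-exchange acceptance rule, P = min(1, exp[(beta_a - beta_b)(E_a - E_b)]). The function names, the inverse temperatures and the energies below are hypothetical choices for illustration only; this is not the Janus implementation.

```python
import math
import random


def swap_probability(beta_a, beta_b, energy_a, energy_b):
    """Textbook replica-exchange acceptance probability for two replicas."""
    delta = (beta_a - beta_b) * (energy_a - energy_b)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swaps(betas, energies, rng):
    """Sweep over neighbouring temperature slots and try to exchange replicas.

    `energies[i]` is the energy of the replica currently held at inverse
    temperature `betas[i]`; swapping two entries stands in for swapping
    the full spin configurations (an assumption made to keep this short).
    """
    energies = list(energies)
    for i in range(len(betas) - 1):
        p = swap_probability(betas[i], betas[i + 1], energies[i], energies[i + 1])
        if rng.random() < p:
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
    return energies


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical ladder: the coldest slot (largest beta) comes first.
    print(attempt_swaps([1.0, 0.8, 0.6], [-10.0, -12.0, -9.0], rng))
```

In this sketch a swap is always accepted when the colder slot currently holds the higher-energy replica, which is what lets low-energy configurations migrate towards low temperatures.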

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Ferrara

Archivio della ricerca- Università di Roma La Sapienza

Autonomous Probabilistic Coprocessing with Petaflips per Second

Author: Camsari Kerem Y.
Datta Supriyo
Faria Rafatul
Ghantasala Lakshmi A.
Jaiswal Risi
Sutton Brian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

In this paper we present a concrete design for a probabilistic (p-) computer based on a network of p-bits, robust classical entities fluctuating between -1 and +1, with probabilities that are controlled through an input constructed from the outputs of other p-bits. The architecture of this probabilistic computer is similar to a stochastic neural network with the p-bit playing the role of a binary stochastic neuron, but with one key difference: there is no sequencer used to enforce an ordering of p-bit updates, as is typically required. Instead, we explore sequencerless designs where all p-bits are allowed to flip autonomously and demonstrate that such designs can allow ultrafast operation unconstrained by available clock speeds without compromising the solution's fidelity. Based on experimental results from a hardware benchmark of the autonomous design and benchmarked device models, we project that a nanomagnetic implementation can scale to achieve petaflips per second with millions of neurons. A key contribution of this paper is the focus on a hardware metric - flips per second - as a problem- and substrate-independent figure-of-merit for an emerging class of hardware annealers known as Ising Machines. Much like the shrinking feature sizes of transistors that have continually driven Moore's Law, we believe that flips per second can be continually improved in later technology generations of a wide class of probabilistic, domain specific hardware. Comment: 13 pages, 8 figures, 1 table
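
As a rough software illustration of a p-bit fluctuating between -1 and +1 under the control of an input built from other p-bits, the Python sketch below uses the update rule commonly quoted in the p-bit literature, m = sgn(tanh(I) - r) with r uniform in [-1, 1). The rule, the function names, and the weights and biases are assumptions not taken from this abstract, and the loop updates p-bits sequentially, unlike the sequencerless hardware the paper describes.

```python
import math
import random


def pbit_sample(input_value, rng):
    """Return +1 with probability (1 + tanh(input_value)) / 2, else -1."""
    r = rng.uniform(-1.0, 1.0)
    return 1 if math.tanh(input_value) > r else -1


def update_network(state, weights, biases, rng):
    """One sequential sweep: each p-bit reads the current outputs of the others."""
    n = len(state)
    for i in range(n):
        local_input = biases[i] + sum(weights[i][j] * state[j] for j in range(n))
        state[i] = pbit_sample(local_input, rng)
    return state


def count_flips(weights, biases, sweeps, rng):
    """Count sign changes over a run, a software stand-in for a flip count."""
    state = [1] * len(biases)
    flips = 0
    for _ in range(sweeps):
        previous = list(state)
        update_network(state, weights, biases, rng)
        flips += sum(1 for a, b in zip(previous, state) if a != b)
    return flips


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical two-p-bit network with a positive (ferromagnetic) coupling.
    weights = [[0.0, 1.0], [1.0, 0.0]]
    print(count_flips(weights, [0.0, 0.0], sweeps=1000, rng=rng))
```

Dividing such a flip count by elapsed wall-clock time would give a flips-per-second figure for this software model; it says nothing about the hardware numbers projected in the paper.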

arXiv.org e-Print Archive

eScholarship - University of California

Critical parameters of the three-dimensional Ising spin glass

Author: Baity-Jesi M.
Baños R. A.
Cruz A.
Fernandez L. A.
Gil-Narvion J. M.
Gordillo-Guerrero A.
Iñiguez D.
Janus Collaboration
Maiorano A.
Mantovani F.
Marinari E.
Martin-Mayor V.
Monforte-Garcia J.
Navarro D.
Parisi G.
Perez-Gaviro S.
Pivanti M.
Ricci-Tersenghi F.
Ruiz-Lorenzo J. J.
Schifano S. F.
Seoane B.
Sudupe A. Muñoz
Tarancon A.
Tripiccione R.
Yllanes D.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2013

We report a high-precision finite-size scaling study of the critical behavior of the three-dimensional Ising Edwards-Anderson model (the Ising spin glass). We have thermalized lattices up to L=40 using the Janus dedicated computer. Our analysis takes into account leading-order corrections to scaling. We obtain Tc = 1.1019(29) for the critical temperature, \nu = 2.562(42) for the thermal exponent, \eta = -0.3900(36) for the anomalous dimension and \omega = 1.12(10) for the exponent of the leading corrections to scaling. Standard (hyper)scaling relations yield \alpha = -5.69(13), \beta = 0.782(10) and \gamma = 6.13(11). We also compute several universal quantities at Tc. Comment: 9 pages, 5 figures
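
The derived exponents quoted in this abstract can be checked with a short Python sketch, assuming the textbook relations \alpha = 2 - d\nu, \beta = \nu(d - 2 + \eta)/2 and \gamma = \nu(2 - \eta) in d = 3 dimensions. The abstract does not spell these formulas out, so their exact form here is an assumption, and the function name is hypothetical. Error propagation is not attempted.

```python
def derived_exponents(nu, eta, d=3):
    """Apply standard (hyper)scaling relations to the central values of nu and eta."""
    alpha = 2.0 - d * nu                 # hyperscaling (Josephson) relation
    beta = nu * (d - 2.0 + eta) / 2.0    # order-parameter exponent
    gamma = nu * (2.0 - eta)             # Fisher relation
    return alpha, beta, gamma


if __name__ == "__main__":
    # Central values of nu and eta as quoted in the abstract.
    alpha, beta, gamma = derived_exponents(nu=2.562, eta=-0.3900)
    print(f"alpha = {alpha:.3f}, beta = {beta:.4f}, gamma = {gamma:.3f}")
```

With the rounded inputs from the abstract, the results agree with the quoted \alpha, \beta and \gamma to within the stated uncertainties; small differences in the last digit are expected because the inputs here are already rounded.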

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

FPGA based technical solutions for high throughput data processing and encryption for 5G communication: A review

Author: Fazio R. de
Soto Carolina Del-Valle
Velazquez R.
Visconti P.
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/08/2021

Field programmable gate array (FPGA) devices are ideal solutions for high-speed processing applications, given their flexibility, parallel processing capability, and power efficiency. This review paper first presents an overview of the key applications of FPGA-based platforms in 5G networks/systems, exploiting the improved performance offered by such devices. FPGA-based implementations of cloud radio access network (C-RAN) accelerators, network function virtualization (NFV)-based network slicers, cognitive radio systems, and multiple input multiple output (MIMO) channel characterizers are the main applications considered that can benefit from the high processing rate, power efficiency and flexibility of FPGAs. Furthermore, implementations of encryption/decryption algorithms on the Xilinx Zynq UltraScale+ MPSoC ZCU102 FPGA platform are discussed, and then we introduce our high-speed and lightweight implementation of the well-known AES-128 algorithm, developed on the same FPGA platform, and compare it with similar solutions already published in the literature. The comparison results indicate that our AES-128 implementation enables efficient hardware usage for a given data-rate (up to 28.16 Gbit/s), resulting in higher efficiency (8.64 Mbps/slice) than other considered solutions. Finally, the applications of the ZCU102 platform for high-speed processing are explored, such as image and signal processing, visual recognition, and hardware resource management.
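
The two figures quoted in this abstract can be related with a small arithmetic sketch in Python. It assumes that the efficiency is simply throughput divided by the number of occupied slices and that 1 Gbit/s = 1000 Mbit/s; neither assumption is stated in the abstract, and the function name is hypothetical.

```python
def implied_slice_count(throughput_gbit_s, efficiency_mbps_per_slice):
    """Estimate slice usage from throughput and per-slice efficiency.

    Assumes efficiency = throughput / slices and decimal units
    (1 Gbit/s = 1000 Mbit/s); both are assumptions, not source facts.
    """
    throughput_mbps = throughput_gbit_s * 1000.0
    return throughput_mbps / efficiency_mbps_per_slice


if __name__ == "__main__":
    slices = implied_slice_count(28.16, 8.64)
    print(f"implied slice usage: about {slices:.0f} slices")
```

Under those assumptions the two numbers imply a design occupying a few thousand slices; the paper itself should be consulted for the actual resource figures.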

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

A Parallel Hardware Architecture For Quantum Annealing Algorithm Acceleration

Author: Acquaviva Andrea
Forno Evelina
Macii Enrico
Urgese Gianvito
Yuki Kobayashi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Quantum Annealing (QA) is an emerging technique, derived from Simulated Annealing, providing metaheuristics for multivariable optimisation problems. Studies have shown that it can be applied to solve NP-hard problems with faster convergence and better quality of results than other traditional heuristics, with potential applications in a variety of fields, from transport logistics to circuit synthesis and optimisation. In this paper, we present a hardware architecture implementing a QA-based solver for the Multidimensional Knapsack Problem, designed to improve the performance of the algorithm by exploiting parallelised computation. We synthesised the architecture targeting an Altera FPGA board and simulated the execution for solving a set of benchmarks available in the literature. Simulation results show that the proposed implementation is about 100 times faster than a single-thread general-purpose CPU without impact on the accuracy of the solution.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)