214 research outputs found

Essentials of computing systems

Author: Fernandes João M.
Publication venue: 'University of Minho'
Publication date: 22/02/2022

Computers were invented to “compute”, i.e., to solve all sorts of mathematical problems. A computer system contains hardware and systems software that work together to run software applications. The underlying concepts that support the construction of a computer are relatively stable. In fact, (almost) all computer systems have a similar organization, i.e., their hardware and software components are arranged in hierarchical layers (or levels) and perform similar functions. This book is written for programmers and software engineers who want to understand how the components of a computer work and how they affect the correctness and performance of their programs.

Universidade do Minho: RepositoriUM

Directory of Open Access Books (DOAB)

A formally verified compiler back-end

Author: A Dold
A Hobor
A Pnueli
ACJ Fox
AJ Chlipala
AW Appel
BK Rosen
C Lindig
CW Barrett
D Cachera
D Lacey
D Leinenbach
E Eide
F Henderson
G Barthe
G Clemmensen
G Goos
G Klein
G Li
G Morrisett
GA Kildall
GC Necula
GJ Chaitin
GP Huet
H-J Boehm
IBM Corporation
J Chen
J Guttman
J Knoop
J McCarthy
J-B Tristan
JO Blech
JR Ellis
JS Moore
L Beringer
L Chirica
L George
L Rideau
LD Zuck
M Huisman
M Müller-Olm
M Strecker
MA Dave
N Benton
P Letouzey
PH Hartel
PW O’Hearn
Q Huang
R Milner
R Stärk
S Beyer
S Blazy
S Coupet-Grimal
S Gulwani
S Lerner
SL Peyton Jones
SS Muchnick
TC Hales
WM McKeeman
X Feng
X Leroy
X Rival
Xavier Leroy
Y Bertot
Z Shao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

This article describes the development and formal verification (proof of semantic preservation) of a compiler back-end from Cminor (a simple imperative intermediate language) to PowerPC assembly code, using the Coq proof assistant both for programming the compiler and for proving its correctness. Such a verified compiler is useful in the context of formal methods applied to the certification of critical software: the verification of the compiler guarantees that the safety properties proved on the source code hold for the executable compiled code as well.

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

HTA: A Scalable High-Throughput Accelerator for Irregular HPC Workloads

Author: Akella V.
Fariborz M.
Fotouhi P.
Lowe-Power J.
Proietti R.
Yoo S. J. B.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

We propose a new architecture called HTA for high-throughput irregular HPC applications with little data reuse. HTA reduces contention within the memory system with the help of a partitioned memory controller that is amenable to 2.5D implementation using silicon photonics. In terms of scalability, HTA supports 4× as many compute units as state-of-the-art GPU systems. Our simulation-based evaluation on a representative set of HPC benchmarks shows that the proposed design reduces the queuing latency by 10% to 30% and improves the variability in memory access latency by 10% to 60%. Our results show that HTA improves the L1 miss penalty by 2.3× to 5× over GPUs. When compared to a multi-GPU system with the same number of compute units, our simulation results show that HTA can provide up to 2× speedup.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

A computer-aided design for digital filter implementation

Author: Lai P. K. M. J.
Publication venue: Department of Electrical Engineering, Imperial College London
Publication date: 01/01/1979

Imperial Users only

Spiral - Imperial College Digital Repository

A transprecision floating-point cluster for efficient near-sensor data analytics

Author: Benatti Simone
Benini Luca
Garofalo Angelo
Mach Stefan
Montagna Fabio
Ottavi Gianmarco
Rossi Davide
Tagliavini Giuseppe
Publication date: 27/08/2020

Recent applications in the domain of near-sensor computing require the adoption of floating-point arithmetic to reconcile high-precision results with a wide dynamic range. In this paper, we propose a multi-core computing cluster that leverages the fine-grained tunable principles of transprecision computing to provide support to near-sensor applications at a minimum power budget. Our design - based on the open-source RISC-V architecture - combines parallelization and sub-word vectorization with near-threshold operation, leading to a highly scalable and versatile system. We perform an exhaustive exploration of the design space of the transprecision cluster on a cycle-accurate FPGA emulator, with the aim of identifying the most efficient configurations in terms of performance, energy efficiency, and area efficiency. We also provide full-fledged software stack support, including a parallel runtime and a compilation toolchain, to enable the development of end-to-end applications. We perform an experimental assessment of our design on a set of benchmarks representative of the near-sensor processing domain, complementing the timing results with a post place-&-route analysis of the power consumption. Finally, a comparison with the state-of-the-art shows that our solution outperforms the competitors in energy efficiency, reaching a peak of 97 Gflop/s/W on single-precision scalars and 162 Gflop/s/W on half-precision vectors.

arXiv.org e-Print Archive

FPGA Frequency Domain Based Gps Coarse Acquisition Processor using FFT

Author: Sajabi Cyprian D.
Publication venue: CORE Scholar
Publication date: 01/01/2006

The Global Positioning System, or GPS, is a satellite-based technology that has gained widespread use worldwide in civilian and military applications. Direct Sequence Spread Spectrum (DSSS) is the method whereby the data transmitted by the satellite and received by the user is kept secure, low power and relatively noise-immune. The first step required in GPS operation is to perform a lock on the incoming signal, both with respect to time synchronization and frequency resolution. Because of the need for reduced time to lock and also reduced hardware, algorithms based in the frequency domain have been developed. These algorithms take advantage of the time-to-frequency matrix operation known as the fast Fourier transform, or FFT. For this thesis, a Direct Sequence Spread Spectrum Coarse Acquisition code processor based on the FFT was implemented in VHDL and targeted to a Xilinx Virtex-II Pro Field Programmable Gate Array (FPGA). The use of the FFT allows simultaneous lock on coarse acquisition (C/A) code and carrier frequency. Because of hardware limitations, a novel technique of sub-sampling is used in this system to obtain data block sizes that match hardware limitations. In addition, design challenges related to scheduling and timing were addressed, allowing a system with 19 pipeline stages to be built. The system, which fits on a Xilinx Virtex-II Pro XC2VP70 FPGA, uses 10 ms of data to perform the lock with 5.5 ms of processing time at 100 MHz and theoretically can operate on signals 20 dB below the noise floor.

OhioLINK Electronic Thesis and Dissertation Center

CORE

Execution model and optimizing compilation for execution migration

Author: Lebedev Ilia Andreevich
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2013

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 137-141). Although systems with hardware support for fine-grained execution migration are becoming a reality, no concrete execution model or compiler exists for these machines. This limits the complexity of software that can be written for these machines, and therefore also the scope of studies for which these machines can be used. In this thesis, we define a productive programming model for an execution migration platform by exposing migration as a set of interfaces usable with the C programming language via a custom optimizing compiler. We employ hardware-software co-design to describe a stack core architecture with support for partial context migration in order to simplify the compiler problem and improve compiler efficiency. We also consider instruction encoding in abstract terms to establish a baseline comparison of encoded instruction density to an ideal upper bound. The stack-based execution migration platform offers a new and unexplored cost model, which leads us to reevaluate the trade-offs associated with compilation for these architectures, and to explore novel algorithms, or novel applications of existing optimizations. Throughout this work, we attempt to gain a deep understanding of the costs and benefits of execution migration by aggressive design space exploration. We use the insight gained to better inform the problem of compiling to this unorthodox architecture, and design the compiler, a library of optimized parallel primitives, and a set of compiler optimization passes to best reflect and utilize the underlying hardware. by Ilia Andreevich Lebedev. S.M.

DSpace@MIT

Efficient computations in finite fields with cryptographic significance

Author: Wu Huapeng
Publication venue: 'University of Waterloo'
Publication date: 01/01/1999

University of Waterloo's Institutional Repository