7 research outputs found

    Speculative dynamic vectorization to assist static vectorization in a HW/SW co-designed environment

    Optimizing SIMD execution in HW/SW co-designed processors

    SIMD accelerators are ubiquitous in microprocessors across computing domains. Their high compute power and hardware simplicity improve overall performance in an energy-efficient manner. Moreover, their replicated functional units and simple control mechanism make them amenable to scaling to higher vector lengths. However, code generation for these accelerators has been a challenge since their inception. Compilers generate vector code conservatively to ensure correctness; as a result, they miss significant vectorization opportunities and fail to extract the maximum benefit from SIMD accelerators. This thesis proposes to vectorize the program binary at runtime in a speculative manner, in addition to compile-time static vectorization. Several environments provide the runtime profiling and optimization support required for dynamic vectorization, the most prominent being: 1) Dynamic Binary Translators and Optimizers (DBTOs) and 2) Hardware/Software (HW/SW) co-designed processors. A HW/SW co-designed environment provides several advantages over DBTOs, such as transparent incorporation of new hardware features and binary compatibility. Therefore, we use a HW/SW co-designed environment to assess the potential of speculative dynamic vectorization. Furthermore, we analyze vector code generation for wider vector units and find that, even though SIMD accelerators are amenable to scaling from the hardware point of view, vector code generation at higher vector lengths is even more challenging. The two major factors impeding vectorization for wider SIMD units are: 1) reduced dynamic instruction stream coverage for vectorization and 2) a large number of permutation instructions. To solve the first problem, we propose Variable Length Vectorization, which iteratively vectorizes for multiple vector lengths to improve dynamic instruction stream coverage. Secondly, to reduce the number of permutation instructions, we propose Selective Writing, which selectively writes to different parts of a vector register and avoids permutations. Finally, we tackle the problem of leakage energy in SIMD accelerators. Since SIMD accelerators consume a significant amount of chip real estate, they become the principal source of leakage if not utilized judiciously. Power gating is one of the most widely used techniques to reduce the leakage energy of functional units; however, it has its own energy and performance overheads. We propose to selectively devectorize the vector code when the higher SIMD lanes are used intermittently. This selective devectorization keeps the higher SIMD lanes idle and power gated for the maximum duration, reducing overall leakage energy.
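    The coverage benefit of iterating over multiple vector lengths can be sketched in a few lines. This is a minimal sketch in Python; the pack widths and the greedy widest-first packing below are illustrative assumptions, not the thesis's actual Variable Length Vectorization algorithm:

```python
def pack_variable_length(n_ops, widths=(8, 4, 2)):
    """Greedily pack n_ops independent scalar operations into vector
    packs, trying the widest vector length first and falling back to
    narrower ones.  Returns (packs, scalar_leftover)."""
    packs = []
    remaining = n_ops
    for w in widths:                      # widest vector length first
        while remaining >= w:
            packs.append(w)               # one vector op covers w scalar ops
            remaining -= w
    return packs, remaining               # leftover ops stay scalar

def coverage(n_ops, widths):
    """Fraction of scalar operations covered by vector instructions."""
    _, leftover = pack_variable_length(n_ops, widths)
    return (n_ops - leftover) / n_ops

# Fixed 8-wide vectorization covers only 8 of 15 operations, while
# also allowing 4- and 2-wide packs raises coverage to 14 of 15.
print(coverage(15, (8,)))
print(coverage(15, (8, 4, 2)))
```

    The point of the sketch is the trend the abstract describes: a single fixed (wide) vector length strands any group of operations smaller than that width in scalar form, whereas falling back to narrower vector lengths recovers most of that lost coverage.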

    HW/SW mechanisms for instruction fusion, issue and commit in modern u-processors

    In this thesis we have explored the co-designed paradigm to show alternative processor design points. Specifically, we have provided HW/SW mechanisms for instruction fusion, issue, and commit in modern processors. We have implemented a co-designed virtual machine monitor that binary-translates x86 instructions into RISC-like micro-ops. The translations are stored as superblocks, which are traces of basic blocks, and these superblocks are further optimized using speculative and non-speculative optimizations. Hardware mechanisms exist to take corrective action in case of misspeculation. During the course of this PhD we have made the following contributions. Firstly, we have provided a novel Programmable Functional Unit (PFU) to speed up general-purpose applications. The PFU consists of a grid of functional units, similar to the CCA, and a distributed internal register file. The inputs of a macro-op are brought from the physical register file to the internal register file using a set of moves and a set of loads. A macro-op fusion algorithm fuses micro-ops at runtime; it is based on a scheduling step that indicates whether the current fused instruction is beneficial. The micro-ops corresponding to a macro-op are stored as control signals in a configuration, and each macro-op carries a configuration ID that helps locate its configuration. A small configuration cache inside the PFU holds these configurations; on a miss in the configuration cache, configurations are loaded from the I-cache. Moreover, to support bulk commit of atomic superblocks that are larger than the ROB, we have proposed a speculative commit mechanism, introducing a speculative commit register map table that holds the mappings of the speculatively committed instructions. When all the instructions of a superblock have committed, the speculative state is copied to the backend register rename table. Secondly, we proposed a co-designed in-order processor with two kinds of FU-based accelerators, each running a pair of fused instructions. We have considered two kinds of instruction fusion: first, we fuse pairs of independent loads into vector loads and execute them on vector load units; second, we fuse pairs of dependent simple ALU instructions and execute them on Interlock Collapsing ALUs (ICALUs). Moreover, we have evaluated the performance of various code optimizations such as list scheduling, load-store telescoping, and load hoisting, among others, and compared our co-designed processor with small-instruction-window out-of-order processors. Thirdly, we have proposed a co-designed out-of-order processor in which we reduce complexity in two areas. First, we have co-designed the commit mechanism to enable bulk commit of atomic superblocks. In this solution we get rid of the conventional ROB and instead introduce the Superblock Ordering Buffer (SOB), which ensures that program order is maintained at the granularity of the superblock by bulk-committing the program state. The program state consists of the register state and the memory state: the register state is held in a per-superblock register map table, whereas the memory state is held in a gated store buffer and updated in bulk. Furthermore, we have tackled the complexity of out-of-order issue logic by using FIFOs. We have proposed an enhanced steering heuristic that fixes the inefficiencies of the existing dependence-based heuristic, and a mechanism that releases FIFO entries earlier, further improving the performance of the steering heuristic.
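    A minimal sketch of the dependent-pair fusion idea (the kind of pair an interlock-collapsing ALU would execute) might look as follows. The micro-op representation and the greedy single-pass policy are assumptions for illustration, not the thesis's actual fusion algorithm:

```python
from collections import namedtuple

# Hypothetical micro-op encoding: opcode, destination register, sources.
MicroOp = namedtuple("MicroOp", "op dst srcs")

def fuse_dependent_pairs(uops, fusable=("add", "sub", "and", "or")):
    """Greedy single-pass fusion: a simple ALU micro-op whose result
    is consumed by the immediately following simple ALU micro-op is
    merged into one macro-op (the dependent pair an ICALU collapses)."""
    out, i = [], 0
    while i < len(uops):
        a = uops[i]
        if (i + 1 < len(uops)
                and a.op in fusable
                and uops[i + 1].op in fusable
                and a.dst in uops[i + 1].srcs):
            out.append(("macro", a, uops[i + 1]))   # fused dependent pair
            i += 2
        else:
            out.append(("single", a))               # left unfused
            i += 1
    return out

trace = [MicroOp("add", "r1", ("r2", "r3")),
         MicroOp("sub", "r4", ("r1", "r5")),        # consumes r1 -> fuses
         MicroOp("load", "r6", ("r4",))]            # not a simple ALU op
print([kind for kind, *_ in fuse_dependent_pairs(trace)])
```

    A real implementation would also check operand-count limits and, as the abstract notes, run a scheduling step to decide whether each candidate fusion is actually beneficial; the sketch only shows the dependence test that makes a pair collapsible.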

    Performance simulation methodologies for hardware/software co-designed processors

    Recently the community has started looking into Hardware/Software (HW/SW) co-designed processors as a potential path towards less power-hungry and less complex designs. Unlike other solutions, they reduce power and complexity by performing so-called dynamic binary translation and optimization from a guest ISA to an internal custom host ISA. This thesis addresses the question of how to simulate this kind of architecture. For any processor architecture, simulation is the common practice, because it is impossible to build several versions of the hardware in order to try all alternatives. Simulating HW/SW co-designed processors poses a major challenge compared with simulating traditional hardware-only architectures. First of all, no open-source tools exist, so researchers often assume that the overhead of the software layer, which is in charge of dynamic binary translation and optimization, is constant, or they ignore it altogether. In this thesis we show that such an assumption is not valid and can lead to very inaccurate results; including the software layer in the simulation is therefore a must. On the other hand, simulation is very slow compared to native execution, so the community has spent considerable effort on delivering accurate results in a reasonable amount of time. It is therefore common practice for hardware-only processors to simulate only parts of the application stream, called samples. Samples usually correspond to different phases in the application stream and are usually no longer than a few million instructions. To achieve an accurate starting state for each sample, microarchitectural structures are warmed up for a few million instructions prior to the sample's instructions. Unfortunately, such a methodology cannot be directly applied to HW/SW co-designed processors. The warm-up for HW/SW co-designed processors needs to be 3-4 orders of magnitude longer than the warm-up needed for a traditional hardware-only processor, because the warm-up of the software layer takes far longer than the warm-up of the hardware structures. To overcome this problem, in this thesis we propose a novel warm-up technique specialized for HW/SW co-designed processors. Our solution reduces the simulation time by at least 65X with an average error of just 0.75%. This trend holds across different software and hardware configurations. The process used to select simulation samples cannot be directly applied to HW/SW co-designed processors either, because, due to the software layer, samples show more dissimilarity than in the case of hardware-only processors. We therefore propose a novel algorithm that needs 3X fewer samples to achieve an error similar to that of state-of-the-art algorithms. Again, this trend holds across different software and hardware configurations.
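    Why sampled simulation needs warm-up at all can be illustrated with a toy direct-mapped cache. The cache geometry, the trace, and the warm-up lengths below are illustrative only; the thesis's point is that for co-designed processors the software layer's state needs orders of magnitude more warm-up than such hardware structures:

```python
class DirectMappedCache:
    """Toy direct-mapped cache used to show why a simulation sample
    started from a cold microarchitectural state gives wrong numbers."""
    def __init__(self, n_sets=64, line=64):
        self.tags = [None] * n_sets
        self.n_sets, self.line = n_sets, line
        self.hits = self.misses = 0

    def access(self, addr, record=True):
        s = (addr // self.line) % self.n_sets
        tag = addr // (self.line * self.n_sets)
        hit = self.tags[s] == tag
        self.tags[s] = tag
        if record:                        # warm-up accesses are not counted
            self.hits += hit
            self.misses += not hit

def miss_rate(trace, sample, warmup=0):
    """Simulate only the last `warmup + sample` accesses of `trace`:
    warm-up accesses update cache state silently, then statistics are
    collected for the sample -- i.e. sampling with functional warm-up."""
    cache = DirectMappedCache()
    start = len(trace) - sample
    for i, addr in enumerate(trace[start - warmup:]):
        cache.access(addr, record=(i >= warmup))
    return cache.misses / sample

# A loop repeatedly touching a 32-line working set: in steady state
# everything hits, so a cold-started sample wildly overstates misses.
trace = [i * 64 for i in range(32)] * 100
print(miss_rate(trace, sample=32, warmup=0))    # cold-start sample
print(miss_rate(trace, sample=32, warmup=64))   # warmed-up sample
```

    The cold sample reports a 100% miss rate for a loop that in reality hits on every access after its first iteration; two warm-up passes are enough here because the simulated state is tiny. The thesis's observation is that a co-designed processor's software layer (translation caches, optimized superblocks) is vastly larger state, hence the 3-4 orders of magnitude longer warm-up.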

    Software instruction caching

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 185-193). As microprocessor complexities and costs skyrocket, designers are looking for ways to simplify their designs to reduce costs, improve energy efficiency, or squeeze more computational elements on each chip. This is particularly true for the embedded domain, where cost and energy consumption are paramount. Software instruction caches have the potential to provide the required performance while using simpler, more efficient hardware. A software cache consists of a simple array memory (such as a scratchpad) and a software system that is capable of automatically managing that memory as a cache. Software caches have several advantages over traditional hardware caches. Without complex cache-management logic, the processor hardware is cheaper and easier to design, verify, and manufacture. The reduced access energy of simple memories can result in a net energy savings if management overhead is kept low. Software caches can also be customized to each individual program's needs, improving performance or eliminating unpredictable timing for real-time embedded applications. The greatest challenge for a software cache is providing good performance using general-purpose instructions for cache management rather than specially-designed hardware. This thesis designs and implements a working system (Flexicache) on an actual embedded processor and uses it to investigate the strengths and weaknesses of software instruction caches. Although both data and instruction caches can be implemented in software, very different techniques are used to optimize performance; this work focuses exclusively on software instruction caches. The Flexicache system consists of two software components: a static off-line preprocessor to add caching to an application and a dynamic runtime system to manage memory during execution. Key interfaces and optimizations are identified and characterized. The system is evaluated in detail from the standpoints of both performance and energy consumption. The results indicate that software instruction caches can perform comparably to hardware caches in embedded processors. On most benchmarks, the overhead relative to a hardware cache is less than 12% and can be as low as 2.4%. At the same time, the software cache uses up to 6% less energy. This is achieved using a simple, directly-addressed memory and without requiring any complex, specialized hardware structures. by Jason Eric Miller. Ph.D.
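    The miss-handling loop at the heart of a software instruction cache can be sketched as follows. This is a minimal sketch: the FIFO replacement policy, the capacity, and the dict-as-scratchpad representation are illustrative assumptions, not Flexicache's actual design:

```python
from collections import OrderedDict

class SoftwareICache:
    """Minimal sketch of a software-managed instruction cache: a small
    scratchpad holds at most `capacity` code blocks, and a runtime
    miss handler copies missing blocks in from backing memory on
    demand, evicting the oldest resident block when full (FIFO)."""
    def __init__(self, backing_store, capacity=4):
        self.backing = backing_store          # block_id -> code bytes
        self.capacity = capacity
        self.scratchpad = OrderedDict()       # blocks resident in fast memory
        self.misses = 0

    def fetch(self, block_id):
        if block_id not in self.scratchpad:   # software tag check
            self.misses += 1                  # runtime miss handler runs
            if len(self.scratchpad) >= self.capacity:
                self.scratchpad.popitem(last=False)   # evict oldest block
            self.scratchpad[block_id] = self.backing[block_id]
        return self.scratchpad[block_id]

rom = {b: f"code-{b}" for b in range(8)}      # backing instruction memory
icache = SoftwareICache(rom, capacity=4)
for b in [0, 1, 2, 0, 1, 3, 4, 0]:            # block 0 evicted, then refetched
    icache.fetch(b)
print(icache.misses)
```

    In a real system like Flexicache the "tag check" is a few general-purpose instructions inserted by the off-line preprocessor at control transfers, which is exactly the management overhead the thesis works to minimize.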

    Efficient, transparent, and comprehensive runtime code manipulation

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 293-306). This thesis addresses the challenges of building a software system for general-purpose runtime code manipulation. Modern applications, with dynamically-loaded modules and dynamically-generated code, are assembled at runtime. While it was once feasible at compile time to observe and manipulate every instruction--which is critical for program analysis, instrumentation, trace gathering, optimization, and similar tools--it can now only be done at runtime. Existing runtime tools are successful at inserting instrumentation calls, but no general framework has been developed for fine-grained and comprehensive code observation and modification without high overheads. This thesis demonstrates the feasibility of building such a system in software. We present DynamoRIO, a fully-implemented runtime code manipulation system that supports code transformations on any part of a program, while it executes. DynamoRIO uses code caching technology to provide efficient, transparent, and comprehensive manipulation of an unmodified application running on a stock operating system and commodity hardware. DynamoRIO executes large, complex, modern applications with dynamically-loaded, generated, or even modified code. Despite the formidable obstacles inherent in the IA-32 architecture, DynamoRIO provides these capabilities efficiently, with zero to thirty percent time and memory overhead on both Windows and Linux. DynamoRIO exports an interface for building custom runtime code manipulation tools of all types. It has been used by many researchers, with several hundred downloads of our public release, and is being commercialized in a product for protection against remote security exploits, one of numerous applications of runtime code manipulation. by Derek L. Bruening. Ph.D.
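    The code-caching idea behind such systems can be sketched as follows. This is a conceptual sketch only: the block representation, the `translate` hook, and the statistics are hypothetical stand-ins, not DynamoRIO's API:

```python
def make_cached_executor(translate):
    """Sketch of runtime code caching: each basic block is translated
    (and instrumented) exactly once, the result is stored in a code
    cache, and every later execution reuses the cached copy instead of
    paying the translation cost again."""
    code_cache = {}
    stats = {"translations": 0, "executions": 0}

    def execute(block_id, block_src):
        if block_id not in code_cache:
            stats["translations"] += 1
            code_cache[block_id] = translate(block_src)  # one-time cost
        stats["executions"] += 1
        return code_cache[block_id]()                    # run from the cache

    return execute, stats

# Illustrative "translation": compile the block source into a callable
# and wrap it with an inserted instrumentation hook, as a tool might.
executed_blocks = []
def translate(src):
    fn = eval(compile(src, "<block>", "eval"))  # hypothetical IR -> callable
    def instrumented():
        executed_blocks.append(src)             # inserted instrumentation
        return fn()
    return instrumented

execute, stats = make_cached_executor(translate)
for _ in range(3):
    execute("b0", "lambda: 2 + 2")              # translated once, run thrice
print(stats)
```

    Amortizing one translation over many executions is what keeps the overhead low; the hard parts the thesis actually addresses (transparency, indirect branches, self-modifying code on IA-32) are precisely what this toy sketch leaves out.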