

A design methodology for soft-core platforms on FPGA with SMP Linux, OpenMP support, and distributed hardware profiling system

Author: Faccio Marco
Federici Fabio
Ferri Serenella
Muttillo Vittoriano
Pomante Luigi
Tieri Carlo
Valente Giacomo
Publication date: 01/01/2016

Archivio della Ricerca - Università degli Studi di Teramo

Open Access Repository

A design methodology for soft-core platforms on FPGA with SMP Linux, OpenMP support, and distributed hardware profiling system

Author: A Gerstlauer
B Wei
Carlo Tieri
Fabio Federici
G Valente
Giacomo Valente
J Tong
L Diaz
L Shannon
Luigi Pomante
M Aldinucci
Marco Faccio
N Ho
S-D David
Serenella Ferri
Vittoriano Muttillo
Y Shang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Implementation of Asymmetric Multiprocessing Support in a Real-Time Operating System

Author: Elamin Elsheikh Islam Abdalla
Publication venue: 'Whiting & Birch, Ltd.'
Publication date: 01/04/2016

The semiconductor industry can no longer afford to rely on decreasing the size of the die and increasing the frequency of operation to achieve higher performance. An alternative that has been proven to increase performance is multiprocessing. Multiprocessing refers to the concept of running more than one application or task on more than one central processor. Multi-core processors are the main engine of multiprocessing. In asymmetric multiprocessing, each core in a multi-core system is independent and has its own code that determines its execution. These cores must be able to communicate and synchronize access to resources.

UTPedia

From plasma to beefarm: Design experience of an FPGA-based multicore prototype

Author: Arcas Abella Oriol
Cristal Kestelman Adrián
Hur Ibrahim
Sayilar Gokhan
Singh Satnam
Sonmez Nehir
Unsal Osman Sabri
Valero Cortés Mateo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

In this paper, we take a MIPS-based open-source uniprocessor soft core, Plasma, and extend it to obtain the Beefarm infrastructure for FPGA-based multiprocessor emulation, a popular research topic of the last few years in both the FPGA and the computer architecture communities. We discuss various design tradeoffs and we demonstrate superior scalability through experimental results compared to traditional software instruction set simulators. Based on our experience of designing and building a complete FPGA-based multiprocessor emulation system that supports run-time and compiler infrastructure, and on the actual executions of our experiments running Software Transactional Memory (STM) benchmarks, we comment on the pros, cons, and future trends of using hardware-based emulation for research.

CiteSeerX

UPCommons. Portal del coneixement obert de la UPC

Exploiting Hardware Abstraction for Parallel Programming Framework: Platform and Multitasking

Author: Ding Hongyuan
Publication venue: ScholarWorks@UARK
Publication date: 01/01/2017

With the help of the parallelism provided by the fine-grained architecture, hardware accelerators on Field Programmable Gate Arrays (FPGAs) can significantly improve the performance of many applications. However, designers are required to have excellent hardware programming skills and unique optimization techniques to fully explore the potential of FPGA resources. Intermediate frameworks above hardware circuits are proposed to improve either performance or productivity by leveraging parallel programming models beyond the multi-core era. In this work, we propose the PolyPC (Polymorphic Parallel Computing) framework, which targets enhancing productivity without losing performance. It helps designers develop parallelized applications and implement them on FPGAs. The PolyPC framework implements a custom hardware platform, on which programs written in an OpenCL-like programming model can launch. Additionally, the PolyPC framework extends vendor-provided tools to provide a complete development environment, including an intermediate software framework and automatic system builders. Designers' programs can be either synthesized as hardware processing elements (PEs) or compiled to executable files running on software PEs. Benefiting from nontrivial features of re-loadable PEs and independent group-level schedulers, multitasking is enabled for both software and hardware PEs to improve the efficiency of utilizing hardware resources. The PolyPC framework is evaluated regarding performance, area efficiency, and multitasking. The results show a maximum 66 times speedup over a dual-core ARM processor and 1043 times speedup over a high-performance MicroBlaze, with 125 times the area efficiency. It delivers a significant improvement in response time to high-priority tasks with priority-aware scheduling. Overheads of multitasking are evaluated to analyze trade-offs.
With the help of the design flow, the OpenCL application programs are converted into executables through front-end source-to-source transformation and back-end synthesis/compilation to run on PEs, and the framework is generated from users' specifications.

ScholarWorks@UARK

UARK (University of Arkansas)

Satisfying hard real-time constraints using COTS components

Author: Betti Emiliano
Publication venue: Università degli Studi di Roma "Tor Vergata"
Publication date: 02/07/2010

Real-time embedded systems are increasingly being built using Commercial Off-The-Shelf (COTS) components, such as mass-produced peripherals and buses, to reduce costs and time-to-market and to increase performance. Unfortunately, COTS hardware and operating systems are typically designed to optimize average performance rather than determinism, predictability, and reliability; hence, their employment in high-criticality real-time systems is still a daunting task. In this thesis, we address some of the most important sources of unpredictability that must be removed in order to integrate COTS hardware and software into hard real-time systems. We first developed ASMP-Linux, a variant of Linux capable of minimizing both operating system overhead and latency. Next, we designed and implemented a new I/O management system based on real-time bridges, a novel hardware component that provides temporal isolation on the COTS bus and removes the interference among I/O peripherals. A multi-flow real-time bridge has also been developed to address inter-peripheral interference, allowing predictable device sharing. Finally, we propose PREM, a new execution model for real-time systems that eliminates interference between peripherals and the CPU, as well as interference between a critical task and driver interrupts. For each of our solutions, we describe in detail the theoretical aspects, the prototype implementations, and the experimental measurements.

ART

Cycle-accurate modeling of multicore processors on FPGAs

Author: Khan Asif I. (Asif Imtiaz)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2013

Thesis (Ph.D.), Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. We present a novel modeling methodology which enables the generation of a high-performance, cycle-accurate simulator from a cycle-level specification of the target design. We describe Arete, a full-system multicore processor simulator, developed using our modeling methodology. We provide details on Arete's resource-efficient and high-performance implementation on multiple FPGA platforms, and the architectural experiments performed using it. We present clear evidence that the use of simplified models in architectural studies can lead to wrong conclusions. Through two experiments performed using both cycle-accurate and simplified models, we show that in one case there are substantial quantitative and qualitative differences in results, while in the other the results match quite well.

DSpace@MIT

Programming models for many-core architectures: a co-design approach

Author: Rutgers Jochem Hendrik
Publication venue: University of Twente
Publication date: 14/05/2014

Common many-core processors contain tens of cores and distributed memory. Compared to a multicore system, which only has a few tightly coupled cores sharing a single bus and memory, several complex problems arise. Notably, many cores require many parallel tasks to fully utilize the cores, and communication happens in a distributed and decentralized way. Therefore, programming such a processor requires the application to exhibit concurrency. In contrast to a single-core application, a concurrent application has to deal with memory state changes with an observable (non-deterministic) intermediate state. The complexity introduced by these problems makes programming a many-core system with a single-core-based programming approach notoriously hard.

The central concept of this thesis is that abstractions, which are related to (many-core) programming, are structured in a single platform model. A platform is a layered view of the hardware, a memory model, a concurrency model, a model of computation, and compile-time and run-time tooling. Then, a programming model is a specific view on this platform, which is used by a programmer. In this view, some details can be hidden from the programmer's perspective, some details cannot. For example, an operating system presents an infinite number of parallel virtual execution units to the application whilst it hides details regarding scheduling. On the other hand, a programmer usually has to balance the workload among threads by hand.

This thesis presents modifications to different abstraction layers of a many-core architecture, in order to make the system as a whole more efficient, and to reduce the programming complexity. These modifications influence other abstractions in the platform, and especially the programming model. Therefore, this thesis applies co-design on all models. Notably, co-design of the memory model, concurrency model, and model of computation is required for a scalable implementation of lambda-calculus. Moreover, only the combination of requirements of the many-core hardware from one side and the concurrency model from the other leads to a memory model abstraction. Hence, this thesis shows that to cope with the current trends in many-core architectures from a programming perspective, it is essential and feasible to inspect and adapt all abstractions collectively.

University of Twente Research Information