982 research outputs found

Synthesis of application specific processor architectures for ultra-low energy consumption

Author: Kazmierski T J
Leech Charles
Publication date: 12/02/2014

In this paper we suggest that further energy savings can be achieved by a new approach to the synthesis of embedded processor cores, where the architecture is tailored to the algorithms that the core executes. In the context of embedded processor synthesis, both single-core and many-core, the types of algorithms and the demands on execution efficiency are usually known at chip design time. This knowledge can be utilised at the design stage to synthesise architectures optimised for energy consumption. Firstly, we present an overview of both traditional energy-saving techniques and new developments in architectural approaches to energy-efficient processing. Secondly, we propose a picoMIPS architecture that serves as an architectural template for energy-efficient synthesis. As a case study, we show how the picoMIPS architecture can be tailored to an energy-efficient execution of the DCT algorithm.

Southampton (e-Prints Soton)

Real-Time Task Migration for Dynamic Resource Management in Many-Core Systems

Author: Pourmohseni Behnaz
Smirnov Fedor
Wildermann Stefan
Publication venue: OASIcs - OpenAccess Series in Informatics. Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2020)
Publication date: 01/01/2020

Dagstuhl Research Online Publication Server

Actors that Unify Threads and Events

Author: B. Chin
C. Varela
C.E. Hewitt
C.T. Haynes
D. Lea
D.A. Thomas
E. Gamma
G.A. Agha
H.C. Lauer
J. Armstrong
J.H. Nyström
P. Haller
R.P. Draves
T. Harris
T. Lindholm
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

There is an impedance mismatch between message-passing concurrency and virtual machines, such as the JVM. VMs usually map their threads to heavyweight OS processes. Without a lightweight process abstraction, users are often forced to write parts of concurrent applications in an event-driven style, which obscures control flow and increases the burden on the programmer. In this paper we show how thread-based and event-based programming can be unified under a single actor abstraction. Using advanced abstraction mechanisms of the Scala programming language, we implemented our approach on unmodified JVMs. Our programming model integrates well with the threading model of the underlying VM.

Infoscience - École polytechnique fédérale de Lausanne

Crossref

A Framework for Adaptable Operating and Runtime Systems

Author: Sterling Thomas
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 04/03/2014

New classes of HPC systems, in which performance improvement is enabled by Moore’s Law, are emerging in the form of multi-core-based architectures, including specialized GPU structures. Operating systems were originally designed for control of uniprocessor systems. By the 1980s multiprogramming, virtual memory, and network interconnection were integral services incorporated as part of most modern computers. HPC operating systems were primarily derivatives of the Unix model, with Linux dominating the Top-500 list. The use of Linux for commodity clusters was pioneered by the NASA Beowulf Project. However, the rapid increase in the number of cores needed to achieve performance gain through technology advances has exposed the limitations of POSIX general-purpose operating systems in scaling and efficiency. This project was undertaken through the leadership of Sandia National Laboratories and in partnership with the University of New Mexico to investigate the alternative of composable lightweight kernels on scalable HPC architectures to achieve superior performance for a wide range of applications. The use of composable operating systems is intended to provide a minimalist set of services specifically required by a given application, to preclude overheads and operational uncertainties (“OS noise”) that have been demonstrated to degrade efficiency and operational consistency. This project was undertaken as an exploration of possible strategies and methods for composable lightweight kernel operating systems in support of extreme-scale systems.

Crossref

UNT Digital Library

Composable architecture for rack scale big data computing

Author: Abali Bulent
Chang Victor
Franke Hubertus
Kesavan Mukil
Li Chung-Sheng
Parris Colin
Publication venue: 'Elsevier BV'
Publication date: 01/02/2017

The rapid growth of cloud computing, both in terms of the spectrum and volume of cloud workloads, necessitates revisiting the traditional datacenter design based on rack-mountable servers. Next-generation datacenters need to offer enhanced support for: (i) fast-changing system configuration requirements due to workload constraints, (ii) timely adoption of emerging hardware technologies, and (iii) maximal sharing of systems and subsystems in order to lower costs. Disaggregated datacenters, constructed as a collection of individual resources such as CPU, memory, disks etc., and composed into workload execution units on demand, are an interesting new trend that can address the above challenges. In this paper, we demonstrate the feasibility of composable systems by building a rack-scale composable system prototype using a PCIe switch. Through empirical approaches, we develop an assessment of the opportunities and challenges for leveraging the composable architecture for rack-scale cloud datacenters, with a focus on big data and NoSQL workloads. In particular, we compare and contrast the programming models that can be used to access the composable resources, and develop the implications for network and resource provisioning and management for rack-scale architecture.

Southampton (e-Prints Soton)

Crossref

Revisiting Actor Programming in C++

Author: Charousset Dominik
Hiesgen Raphael
Schmidt Thomas C.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

The actor model of computation has gained significant popularity over the last decade. Its high level of abstraction makes it appealing for concurrent applications in parallel and distributed systems. However, designing a real-world actor framework that subsumes full scalability, strong reliability, and high resource efficiency requires many conceptual and algorithmic additives to the original model. In this paper, we report on designing and building CAF, the "C++ Actor Framework". CAF aims to provide a concurrent and distributed native environment for scaling up to very large, high-performance applications, and equally well down to small constrained systems. We present the key specifications and design concepts (in particular a message-transparent architecture, type-safe message interfaces, and pattern matching facilities) that make native actors a viable approach for many robust, elastic, and highly distributed developments. We demonstrate the feasibility of CAF in three scenarios: first for elastic, upscaling environments, second for including heterogeneous hardware like GPGPUs, and third for distributed runtime systems. Extensive performance evaluations indicate ideal runtime behaviour for up to 64 cores at very low memory footprint, or in the presence of GPUs. In these tests, CAF continuously outperforms the competing actor environments Erlang, Charm++, SalsaLite, Scala, ActorFoundry, and even OpenMPI. Comment: 33 pages

arXiv.org e-Print Archive

Crossref

REPOSIT

Continuation-Passing C: compiling threads to events through continuations

Author: A. Adya
A. Dunkels
A. Fischbach
A. Wijngaarden van
A.W. Appel
C. Bruggeman
C. Tismer
C.A.R. Hoare
C.P. Wadsworth
C.T. Haynes
F. Boussinot
G. Kerneis
G. Necula
G.D. Plotkin
Gabriel Kerneis
H. Thielecke
J. Berdine
J. Fischer
J. Reppy
J. Vouillon
J.C. Reynolds
Juliusz Chroboczek
K. Claessen
M. Krohn
M. Wand
M. Welsh
O. Danvy
P. Haller
P. Li
P.J. Landin
R. Behren von
R.K. Dybvig
R.S. Engelschall
S. Srinivasan
S.E. Ganz
T. Harris
T. Johnsson
T. Rompf
V.S. Pai
W.D. Clinger
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2011

In this paper, we introduce Continuation Passing C (CPC), a programming language for concurrent systems in which native and cooperative threads are unified and presented to the programmer as a single abstraction. The CPC compiler uses a compilation technique, based on the CPS transform, that yields efficient code and an extremely lightweight representation for contexts. We provide a proof of the correctness of our compilation scheme. We show in particular that lambda-lifting, a common compilation technique for functional languages, is also correct in an imperative language like C, under some conditions enforced by the CPC compiler. The current CPC compiler is mature enough to write substantial programs such as Hekate, a highly concurrent BitTorrent seeder. Our benchmark results show that CPC is as efficient, while using significantly less space, as the most efficient thread libraries available. Comment: Higher-Order and Symbolic Computation (2012). arXiv admin note: substantial text overlap with arXiv:1202.324

arXiv.org e-Print Archive

Crossref

Hal-Diderot

Programming MPSoC platforms: Road works ahead

Author: Bekooij Marco
Domer Rainer
Leupers Rainer
Nohl Achim
Soonhoi Ha
Vajda Andras
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2009

This paper summarizes a special session on multicore/multi-processor system-on-chip (MPSoC) programming challenges. The current trend towards MPSoC platforms in most computing domains does not only mean a radical change in computer architecture. Even more important from a SW developer's viewpoint, the classical sequential von Neumann programming model needs to be overcome at the same time. Efficient utilization of the MPSoC HW resources demands radically new models and corresponding SW development tools, capable of exploiting the available parallelism and guaranteeing bug-free parallel SW. While several standards are established in the high-performance computing domain (e.g. OpenMP), it is clear that more innovations are required for successful deployment of heterogeneous embedded MPSoC. On the other hand, at least for the coming years, the freedom for disruptive programming technologies is limited by the huge amount of certified sequential code, which demands a more pragmatic, gradual tool and code replacement strategy.

Publikationsserver der RWTH Aachen University

University of Twente Research Information