Reconfigurable resource scheduling

Author: Sun Yu, doctor of computer sciences
Publication date: 01/01/2007

Multi-core and multi-processor environments are increasingly used to support a wide range of applications. These environments host multiple services simultaneously. The set of processors configured to support a particular service depends upon the associated workload; fluctuations in workload require changes in processor allocation. In these systems, reallocating a processor from one service to another tends to incur a non-negligible overhead. Motivated by these applications, this dissertation considers a class of scheduling problems that we refer to as reconfigurable resource scheduling. The salient features of this class are as follows: there are jobs of different categories, and resources can be reconfigured to process jobs of a certain category, where a reconfiguration incurs an overhead in terms of cost or time. In our initial investigation, we study the following subclass of the class of reconfigurable resource scheduling problems. We are given a finite set of resources, each of which has an associated category, and a sequence of requests, each of which is a set of unit jobs. Each job has an associated category, and needs to be executed on a resource of the same category within a specified delay bound of its arrival, or else it is dropped at a specified drop cost. At any time, a resource can be reconfigured to a different category at a specified reconfiguration cost. The goal is to schedule the reconfigurations of the resources, and the executions of the jobs, in a way that minimizes the total cost. We design efficient online algorithms with provably good performance for two main problems in this subclass, one allowing category-specific drop costs, which we refer to as reconfigurable resource scheduling with variable drop costs, and the other allowing category-specific delay bounds, which we refer to as reconfigurable resource scheduling with variable delay bounds.
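
The cost model described above can be made concrete with a small sketch. This is not the dissertation's algorithm: it only evaluates a naive baseline that never reconfigures any resource, so its total cost consists of drop costs alone. The function name, the argument layout, the single uniform drop cost and delay bound, and the convention that a job arriving at step t may run at any step from t to t + delay_bound - 1 are all assumptions made for illustration.

```python
from collections import Counter, deque


def never_reconfigure_cost(resource_categories, requests, delay_bound, drop_cost):
    """Total cost of a baseline that never reconfigures (hypothetical helper).

    resource_categories: one category label per resource.
    requests: list indexed by time step; each entry lists the categories
              of the unit jobs arriving at that step.
    delay_bound: assumed convention, a job arriving at step t may run at
                 any step in [t, t + delay_bound - 1].
    drop_cost: cost charged for every job that misses its window.
    """
    capacity = Counter(resource_categories)  # resources per category, fixed
    pending = {}  # category -> deque of deadlines; FIFO order is EDF order here
    dropped = 0
    horizon = len(requests) + delay_bound  # extra steps to drain the queues
    for t in range(horizon):
        arrivals = requests[t] if t < len(requests) else []
        for category in arrivals:
            pending.setdefault(category, deque()).append(t + delay_bound - 1)
        for category, queue in pending.items():
            # Drop jobs whose last allowed step has already passed.
            while queue and queue[0] < t:
                queue.popleft()
                dropped += 1
            # Each resource of this category executes one unit job per step.
            for _ in range(min(capacity[category], len(queue))):
                queue.popleft()
    return dropped * drop_cost
```

An online algorithm for the problem would trade such drop costs against reconfiguration costs; the baseline only gives a reference point against which a reconfiguring policy could be compared.
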
Reconfigurable resource scheduling with variable drop costs is motivated by certain applications in which some jobs are more important than others. We solve this problem using a layered approach, where in each layer we reduce to a scheduling problem defined over a more constrained set of possible inputs. In the first layer, we reduce to the special case in which all job arrivals are batched. In the second layer, we reduce to the special case in which the job arrival rate is limited. In the third layer, we reduce the rate-limited problem to two cases: large reconfiguration cost, and small reconfiguration cost. We use a traffic reshaping technique to smooth out the job arrivals, and thereby reduce the case with large reconfiguration cost to the special case of unit delay, and reduce the case with small reconfiguration cost to the special case of rate-limited unit delay. In the fourth layer, we reduce unit delay with large reconfiguration cost to a caching problem which we refer to as file caching with remote reads, and reduce rate-limited unit delay with small reconfiguration cost to a variant of the disk paging problem which we refer to as prefix paging. In the fifth layer, we solve the file caching with remote reads problem by generalizing certain existing work in the area of file caching, and we solve prefix paging using a kind of marking algorithm. Reconfigurable resource scheduling with variable delay bounds is motivated by applications in which jobs are required to be processed within category-specific delay guarantees. Once again, we use a layered approach. The first two layers are analogous to the first two layers in our solution for reconfigurable resource scheduling with variable drop costs, respectively, but are more involved due to the variable delay bounds. In the third layer, we solve the rate-limited problem using a novel combination of the EDF and LRU scheduling principles.

Computer Science

Texas ScholarWorks

Coarse-grained reconfigurable array architectures

Author: A Lambrechts
B Bougard
B Mei
B Sutter De
G Venkataramani
H Park
J Lee
JMP Cardoso
JW Waerdt van de
K Berkel van
K Bondalapati
K Sankaralingam
KE Coons
LH Lee
M Ahn
M Gebhart
M Schlansker
M Taylor
M Woh
MD Galanis
MH Lee
S Friedman
SA Mahlke
T Oh
Y Kim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops to execute them more efficiently. This chapter discusses the basic principles of CGRAs, and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual fine-tuning of source code.

Crossref

Ghent University Academic Bibliography

Runtime Scheduling, Allocation, and Execution of Real-Time Hardware Tasks onto Xilinx FPGAs Subject to Fault Occurrence

Author: Arslan Tughrul
Benkrid Khaled
Ebrahim Ali
Hong Chuan
Iturbe Xabier
Martinez Imanol
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2013

This paper describes a novel way to exploit the computation capabilities delivered by modern Field-Programmable Gate Arrays (FPGAs), not only towards higher performance, but also towards improved reliability. Computation-specific pieces of circuitry are dynamically scheduled and allocated to different resources on the chip based on a set of novel algorithms which are described in detail in this article. These algorithms consider most of the technological constraints existing in modern partially reconfigurable FPGAs as well as spontaneously occurring faults and emerging permanent damage in the silicon substrate of the chip. In addition, the algorithms target other important aspects such as communications and synchronization among the different computations that are carried out, either concurrently or at different times. The effectiveness of the proposed algorithms is tested by means of a wide range of synthetic simulations, and, notably, a proof-of-concept implementation of them using real FPGA hardware is outlined.

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer

Implementing Reconfigurable Wireless Sensor Networks: The Embedded Operating System Approach

Author: Eronu E
Misra Sanjay
Publication date: 01/01/2012

IntechOpen

Covenant University Repository

Crossref

Mapping and Scheduling of Directed Acyclic Graphs on An FPFA Tile

Author: Guo Y.
Smit G.J.M.
Publication venue: STW Technology Foundation
Publication date: 01/01/2002

An architecture for a hand-held multimedia device requires components that are energy-efficient, flexible, and provide high performance. In the CHAMELEON [4] project we develop a coarse-grained reconfigurable device for DSP-like algorithms, the so-called Field Programmable Function Array (FPFA). The FPFA devices are reminiscent of FPGAs, but with a matrix of Processing Parts (PP) instead of CLBs. The design of the FPFA focuses on: (1) keeping each PP small to maximize the number of PPs that can fit on a chip; (2) providing sufficient flexibility; (3) low energy consumption; (4) exploiting the maximum amount of parallelism; (5) a strong support tool for FPFA-based applications. The challenge in providing compiler support for the FPFA-based design stems from the flexibility of the FPFA structure. If we do not use the characteristics of the FPFA structure properly, the advantages of an FPFA may become its disadvantages. The GECKO project focuses on this problem. In this paper, we present a mapping and scheduling scheme for applications running on one FPFA tile. Applications are written in C, and the C code is translated to a Directed Acyclic Graph (DAG) [4]. This scheme can map a DAG directly onto the reconfigurable PPs of an FPFA tile. It tries to achieve low power consumption by exploiting locality of reference and high performance by exploiting maximum parallelism.

University of Twente Research Information

Exploiting partial reconfiguration through PCIe for a microphone array network emulator

Author: Braeken An
da Silva Gomes Bruno
Domínguez Federico
Touhafi Abdellah
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2018

The current Microelectromechanical Systems (MEMS) technology enables the deployment of relatively low-cost wireless sensor networks composed of MEMS microphone arrays for accurate sound source localization. However, evaluating and selecting the most accurate and power-efficient network topology is not trivial when considering dynamic MEMS microphone arrays. Although software simulators are usually considered, they consist of computationally intensive tasks, which require hours to days to be completed. In this paper, we present an FPGA-based platform to emulate a network of microphone arrays. Our platform provides a controlled simulated acoustic environment, able to evaluate the impact of different network configurations such as the number of microphones per array, the network’s topology, or the detection method used. Data fusion techniques, combining the data collected by each node, are used in this platform. The platform is designed to exploit the FPGA’s partial reconfiguration feature to increase the flexibility of the network emulator as well as to increase performance thanks to the use of the PCI-express high-bandwidth interface. On the one hand, the network emulator presents a higher flexibility by partially reconfiguring the nodes’ architecture at runtime. On the other hand, a set of strategies and heuristics to properly use partial reconfiguration allows the acceleration of the emulation by exploiting the execution parallelism. Several experiments are presented to demonstrate some of the capabilities of our platform and the benefits of using partial reconfiguration.

Crossref

Ghent University Academic Bibliography

Directory of Open Access Journals

A Comparative Study of Scheduling Techniques for Multimedia Applications on SIMD Pipelines

Author: Arslan Mehmet Ali
Gruian Flavius
Kuchcinski Krzysztof
Publication date: 01/01/2015

Parallel architectures are essential in order to take advantage of the parallelism inherent in streaming applications. One particular branch of these employs hardware SIMD pipelines. In this paper, we analyse several scheduling techniques, namely ad hoc overlapped execution, modulo scheduling and modulo scheduling with unrolling, all of which aim to efficiently utilize the special architecture design. Our investigation focuses on improving throughput while analysing other metrics that are important for streaming applications, such as register pressure, buffer sizes and code size. Through experiments conducted on several media benchmarks, we present and discuss trade-offs involved when selecting any one of these scheduling techniques. Comment: Presented at DATE Friday Workshop on Heterogeneous Architectures and Design Methods for Embedded Image Systems (HIS 2015) (arXiv:1502.07241)

arXiv.org e-Print Archive

Lund University Publications