3,222 research outputs found

    Runtime Scheduling, Allocation, and Execution of Real-Time Hardware Tasks onto Xilinx FPGAs Subject to Fault Occurrence

    Get PDF
    This paper describes a novel way to exploit the computation capabilities delivered by modern Field-Programmable Gate Arrays (FPGAs), not only towards a higher performance, but also towards an improved reliability. Computation-specific pieces of circuitry are dynamically scheduled and allocated to different resources on the chip based on a set of novel algorithms which are described in detail in this article. These algorithms consider most of the technological constraints existing in modern partially reconfigurable FPGAs as well as spontaneously occurring faults and emerging permanent damage in the silicon substrate of the chip. In addition, the algorithms target other important aspects such as communications and synchronization among the different computations that are carried out, either concurrently or at different times. The effectiveness of the proposed algorithms is tested by means of a wide range of synthetic simulations, and, notably, a proof-of-concept implementation of them using real FPGA hardware is outlined

    The Chameleon Architecture for Streaming DSP Applications

    Get PDF
    We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm2^2 in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool

    Dynamic Scheduling, Allocation, and Compaction Scheme for Real-Time Tasks on FPGAs

    Get PDF
    Run-time reconfiguration (RTR) is a method of computing on reconfigurable logic, typically FPGAs, changing hardware configurations from phase to phase of a computation at run-time. Recent research has expanded from a focus on a single application at a time to encompass a view of the reconfigurable logic as a resource shared among multiple applications or users. In real-time system design, task deadlines play an important role. Real-time multi-tasking systems not only need to support sharing of the resources in space, but also need to guarantee execution of the tasks. At the operating system level, sharing logic gates, wires, and I/O pins among multiple tasks needs to be managed. From the high level standpoint, access to the resources needs to be scheduled according to task deadlines. This thesis describes a task allocator for scheduling, placing, and compacting tasks on a shared FPGA under real-time constraints. Our consideration of task deadlines is novel in the setting of handling multiple simultaneous tasks in RTR. Software simulations have been conducted to evaluate the performance of the proposed scheme. The results indicate significant improvement by decreasing the number of tasks rejected

    An Approach to Manage Reconfigurations and Reduce Area Cost in Hard Real-Time Reconfigurable Systems

    Get PDF
    This article presents a methodology to build real-time reconfigurable systems that ensure that all the temporal constraints of a set of applications are met, while optimizing the utilization of the available reconfigurable resources. Starting from a static platform that meets all the real-time deadlines, our approach takes advantage of run-time reconfiguration in order to reduce the area needed while guaranteeing that all the deadlines are still met. This goal is achieved by identifying which tasks must be always ready for execution in order to meet the deadlines, and by means of a methodology that also allows reducing the area requirements

    Task scheduling and placement for reconfigurable devices

    Get PDF
    Partially reconfigurable devices allow the execution of different tasks at the same time, removing tasks when they finish and inserting new tasks when they arrive. This dissertation investigates scheduling and placing real-time tasks (tasks with deadline) on reconfigurable devices. One basic scheduler is the First-Fit scheduler. By allowing the First-Fit scheduler to retry tasks while they can satisfy their deadlines, we found that its performance can be enhanced to be better than other schedulers. We also proposed a placement idea based on partitioning the reconfigurable area into regions of various widths, assigning a task to a region based on its width. This idea has a similar rejection rate to a First-Fit scheduler that retries placing tasks and performs better than the First-Fit that does not retry tasks. Also, this regions-based scheduling method has a better running time. Managing how the space will be shared among tasks is a problems of interest. The main function of the free-space manager is to maintain information about the free space (areas not used by active tasks) after any placement or deletion of a task. Speed and efficiency of the free-space data structure are important as well as its effect on scheduler performance. We introduce the use of maximal horizontal strips and maximal vertical strips to represent free space. This resulted in a faster free space manager compared to what has been used in the area. Most researchers in the area of scheduling on reconfigurable devices assumed a homogeneous FPGA with only CLBs in the reconfigurable area. Most reconfigurable devices offered in the market, however, are not homogeneous but heterogeneous with other components between CLBs. We studied the effect of heterogeneity on the performance of schedulers designed for a homogeneous structure. We found that current schedulers result in worse performance when applied to a heterogeneous structure, but by simple modifications, we can apply them to a heterogeneous structure and achieve good performance. Consequently, the approach of studying homogeneous FPGAs is a valid one, as the scheduling ideas discovered there do carry over to heterogeneous FPGAs

    MURAC: A unified machine model for heterogeneous computers

    Get PDF
    Includes bibliographical referencesHeterogeneous computing enables the performance and energy advantages of multiple distinct processing architectures to be efficiently exploited within a single machine. These systems are capable of delivering large performance increases by matching the applications to architectures that are most suited to them. The Multiple Runtime-reconfigurable Architecture Computer (MURAC) model has been proposed to tackle the problems commonly found in the design and usage of these machines. This model presents a system-level approach that creates a clear separation of concerns between the system implementer and the application developer. The three key concepts that make up the MURAC model are a unified machine model, a unified instruction stream and a unified memory space. A simple programming model built upon these abstractions provides a consistent interface for interacting with the underlying machine to the user application. This programming model simplifies application partitioning between hardware and software and allows the easy integration of different execution models within the single control ow of a mixed-architecture application. The theoretical and practical trade-offs of the proposed model have been explored through the design of several systems. An instruction-accurate system simulator has been developed that supports the simulated execution of mixed-architecture applications. An embedded System-on-Chip implementation has been used to measure the overhead in hardware resources required to support the model, which was found to be minimal. An implementation of the model within an operating system on a tightly-coupled reconfigurable processor platform has been created. This implementation is used to extend the software scheduler to allow for the full support of mixed-architecture applications in a multitasking environment. Different scheduling strategies have been tested using this scheduler for mixed-architecture applications. The design and implementation of these systems has shown that a unified abstraction model for heterogeneous computers provides important usability benefits to system and application designers. These benefits are achieved through a consistent view of the multiple different architectures to the operating system and user applications. This allows them to focus on achieving their performance and efficiency goals by gaining the benefits of different execution models during runtime without the complex implementation details of the system-level synchronisation and coordination

    A Hardware Implementation of a Run-Time Scheduler for Reconfigurable Systems

    Get PDF
    New generation embedded systems demand high performance, efficiency and flexibility. Reconfigurable hardware can provide all these features. However the costly reconfiguration process and the lack of management support have prevented a broader use of these resources. To solve these issues we have developed a scheduler that deals with task-graphs at run-time, steering its execution in the reconfigurable resources while carrying out both prefetch and replacement techniques that cooperate to hide most of the reconfiguration delays. In our scheduling environment task-graphs are analyzed at design-time to extract useful information. This information is used at run-time to obtain near-optimal schedules, escaping from local-optimum decisions, while only carrying out simple computations. Moreover, we have developed a hardware implementation of the scheduler that applies all the optimization techniques while introducing a delay of only a few clock cycles. In the experiments our scheduler clearly outperforms conventional run-time schedulers based on As-Soon-As-Possible techniques. In addition, our replacement policy, specially designed for reconfigurable systems, achieves almost optimal results both regarding reuse and performance

    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Get PDF
    The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
    corecore