213 research outputs found

    Runtime Scheduling, Allocation, and Execution of Real-Time Hardware Tasks onto Xilinx FPGAs Subject to Fault Occurrence

    Get PDF
    This paper describes a novel way to exploit the computation capabilities delivered by modern Field-Programmable Gate Arrays (FPGAs), not only towards a higher performance, but also towards an improved reliability. Computation-specific pieces of circuitry are dynamically scheduled and allocated to different resources on the chip based on a set of novel algorithms which are described in detail in this article. These algorithms consider most of the technological constraints existing in modern partially reconfigurable FPGAs as well as spontaneously occurring faults and emerging permanent damage in the silicon substrate of the chip. In addition, the algorithms target other important aspects such as communications and synchronization among the different computations that are carried out, either concurrently or at different times. The effectiveness of the proposed algorithms is tested by means of a wide range of synthetic simulations, and, notably, a proof-of-concept implementation of them using real FPGA hardware is outlined

    An FPGA Task Placement Algorithm Using Reflected Binary Gray Space Filling Curve

    Get PDF
    With the arrival of partial reconfiguration technology, modern FPGAs support tasks that can be loaded in (removed from) the FPGA individually without interrupting other tasks already running on the same FPGA. Many online task placement algorithms designed for such partially reconfigurable systems have been proposed to provide efficient and fast task placement. A new approach for online placement of modules on reconfigurable devices, by managing the free space using a run-length based representation. This representation allows the algorithm to insert or delete tasks quickly and also to calculate the fragmentation easily. In the proposed FPGA model, the CLBs are numbered according to reflected binary gray space filling curve model. The search algorithm will quickly identify a placement for the incoming task based on first fit mode or a fragmentation aware best fit mode. Simulation experiments indicate that the proposed techniques result in a low ratio of task rejection and high FPGA utilization compared to existing techniques

    Task scheduling and placement for reconfigurable devices

    Get PDF
    Partially reconfigurable devices allow the execution of different tasks at the same time, removing tasks when they finish and inserting new tasks when they arrive. This dissertation investigates scheduling and placing real-time tasks (tasks with deadline) on reconfigurable devices. One basic scheduler is the First-Fit scheduler. By allowing the First-Fit scheduler to retry tasks while they can satisfy their deadlines, we found that its performance can be enhanced to be better than other schedulers. We also proposed a placement idea based on partitioning the reconfigurable area into regions of various widths, assigning a task to a region based on its width. This idea has a similar rejection rate to a First-Fit scheduler that retries placing tasks and performs better than the First-Fit that does not retry tasks. Also, this regions-based scheduling method has a better running time. Managing how the space will be shared among tasks is a problems of interest. The main function of the free-space manager is to maintain information about the free space (areas not used by active tasks) after any placement or deletion of a task. Speed and efficiency of the free-space data structure are important as well as its effect on scheduler performance. We introduce the use of maximal horizontal strips and maximal vertical strips to represent free space. This resulted in a faster free space manager compared to what has been used in the area. Most researchers in the area of scheduling on reconfigurable devices assumed a homogeneous FPGA with only CLBs in the reconfigurable area. Most reconfigurable devices offered in the market, however, are not homogeneous but heterogeneous with other components between CLBs. We studied the effect of heterogeneity on the performance of schedulers designed for a homogeneous structure. We found that current schedulers result in worse performance when applied to a heterogeneous structure, but by simple modifications, we can apply them to a heterogeneous structure and achieve good performance. Consequently, the approach of studying homogeneous FPGAs is a valid one, as the scheduling ideas discovered there do carry over to heterogeneous FPGAs

    Parallel and Distributed Computing

    Get PDF
    The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. Particularly, the topics that are addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peertopeer networks, largescale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing

    Compiler and Architecture Design for Coarse-Grained Programmable Accelerators

    Get PDF
    abstract: The holy grail of computer hardware across all market segments has been to sustain performance improvement at the same pace as silicon technology scales. As the technology scales and the size of transistors shrinks, the power consumption and energy usage per transistor decrease. On the other hand, the transistor density increases significantly by technology scaling. Due to technology factors, the reduction in power consumption per transistor is not sufficient to offset the increase in power consumption per unit area. Therefore, to improve performance, increasing energy-efficiency must be addressed at all design levels from circuit level to application and algorithm levels. At architectural level, one promising approach is to populate the system with hardware accelerators each optimized for a specific task. One drawback of hardware accelerators is that they are not programmable. Therefore, their utilization can be low as they perform one specific function. Using software programmable accelerators is an alternative approach to achieve high energy-efficiency and programmability. Due to intrinsic characteristics of software accelerators, they can exploit both instruction level parallelism and data level parallelism. Coarse-Grained Reconfigurable Architecture (CGRA) is a software programmable accelerator consists of a number of word-level functional units. Motivated by promising characteristics of software programmable accelerators, the potentials of CGRAs in future computing platforms is studied and an end-to-end CGRA research framework is developed. This framework consists of three different aspects: CGRA architectural design, integration in a computing system, and CGRA compiler. First, the design and implementation of a CGRA and its instruction set is presented. This design is then modeled in a cycle accurate system simulator. The simulation platform enables us to investigate several problems associated with a CGRA when it is deployed as an accelerator in a computing system. Next, the problem of mapping a compute intensive region of a program to CGRAs is formulated. From this formulation, several efficient algorithms are developed which effectively utilize CGRA scarce resources very well to minimize the running time of input applications. Finally, these mapping algorithms are integrated in a compiler framework to construct a compiler for CGRADissertation/ThesisDoctoral Dissertation Computer Science 201

    Towards the development of a reliable reconfigurable real-time operating system on FPGAs

    Get PDF
    In the last two decades, Field Programmable Gate Arrays (FPGAs) have been rapidly developed from simple “glue-logic” to a powerful platform capable of implementing a System on Chip (SoC). Modern FPGAs achieve not only the high performance compared with General Purpose Processors (GPPs), thanks to hardware parallelism and dedication, but also better programming flexibility, in comparison to Application Specific Integrated Circuits (ASICs). Moreover, the hardware programming flexibility of FPGAs is further harnessed for both performance and manipulability, which makes Dynamic Partial Reconfiguration (DPR) possible. DPR allows a part or parts of a circuit to be reconfigured at run-time, without interrupting the rest of the chip’s operation. As a result, hardware resources can be more efficiently exploited since the chip resources can be reused by swapping in or out hardware tasks to or from the chip in a time-multiplexed fashion. In addition, DPR improves fault tolerance against transient errors and permanent damage, such as Single Event Upsets (SEUs) can be mitigated by reconfiguring the FPGA to avoid error accumulation. Furthermore, power and heat can be reduced by removing finished or idle tasks from the chip. For all these reasons above, DPR has significantly promoted Reconfigurable Computing (RC) and has become a very hot topic. However, since hardware integration is increasing at an exponential rate, and applications are becoming more complex with the growth of user demands, highlevel application design and low-level hardware implementation are increasingly separated and layered. As a consequence, users can obtain little advantage from DPR without the support of system-level middleware. To bridge the gap between the high-level application and the low-level hardware implementation, this thesis presents the important contributions towards a Reliable, Reconfigurable and Real-Time Operating System (R3TOS), which facilitates the user exploitation of DPR from the application level, by managing the complex hardware in the background. In R3TOS, hardware tasks behave just like software tasks, which can be created, scheduled, and mapped to different computing resources on the fly. The novel contributions of this work are: 1) a novel implementation of an efficient task scheduler and allocator; 2) implementation of a novel real-time scheduling algorithm (FAEDF) and two efficacious allocating algorithms (EAC and EVC), which schedule tasks in real-time and circumvent emerging faults while maintaining more compact empty areas. 3) Design and implementation of a faulttolerant microprocessor by harnessing the existing FPGA resources, such as Error Correction Code (ECC) and configuration primitives. 4) A novel symmetric multiprocessing (SMP)-based architectures that supports shared memory programing interface. 5) Two demonstrations of the integrated system, including a) the K-Nearest Neighbour classifier, which is a non-parametric classification algorithm widely used in various fields of data mining; and b) pairwise sequence alignment, namely the Smith Waterman algorithm, used for identifying similarities between two biological sequences. R3TOS gives considerably higher flexibility to support scalable multi-user, multitasking applications, whereby resources can be dynamically managed in respect of user requirements and hardware availability. Benefiting from this, not only the hardware resources can be more efficiently used, but also the system performance can be significantly increased. Results show that the scheduling and allocating efficiencies have been improved up to 2x, and the overall system performance is further improved by ~2.5x. Future work includes the development of Network on Chip (NoC), which is expected to further increase the communication throughput; as well as the standardization and automation of our system design, which will be carried out in line with the enablement of other high-level synthesis tools, to allow application developers to benefit from the system in a more efficient manner

    Spatial-temporal reasoning applications of computational intelligence in the game of Go and computer networks

    Get PDF
    Spatial-temporal reasoning is the ability to reason with spatial images or information about space over time. In this dissertation, computational intelligence techniques are applied to computer Go and computer network applications. Among four experiments, the first three are related to the game of Go, and the last one concerns the routing problem in computer networks. The first experiment represents the first training of a modified cellular simultaneous recurrent network (CSRN) trained with cellular particle swarm optimization (PSO). Another contribution is the development of a comprehensive theoretical study of a 2x2 Go research platform with a certified 5 dan Go expert. The proposed architecture successfully trains a 2x2 game tree. The contribution of the second experiment is the development of a computational intelligence algorithm calledcollective cooperative learning (CCL). CCL learns the group size of Go stones on a Go board with zero knowledge by communicating only with the immediate neighbors. An analysis determines the lower bound of a design parameter that guarantees a solution. The contribution of the third experiment is the proposal of a unified system architecture for a Go robot. A prototype Go robot is implemented for the first time in the literature. The last experiment tackles a disruption-tolerant routing problem for a network suffering from link disruption. This experiment represents the first time that the disruption-tolerant routing problem has been formulated with a Markov Decision Process. In addition, the packet delivery rate has been improved under a range of link disruption levels via a reinforcement learning approach --Abstract, page iv

    Design and Management of Manufacturing Systems

    Get PDF
    Although the design and management of manufacturing systems have been explored in the literature for many years now, they still remain topical problems in the current scientific research. The changing market trends, globalization, the constant pressure to reduce production costs, and technical and technological progress make it necessary to search for new manufacturing methods and ways of organizing them, and to modify manufacturing system design paradigms. This book presents current research in different areas connected with the design and management of manufacturing systems and covers such subject areas as: methods supporting the design of manufacturing systems, methods of improving maintenance processes in companies, the design and improvement of manufacturing processes, the control of production processes in modern manufacturing systems production methods and techniques used in modern manufacturing systems and environmental aspects of production and their impact on the design and management of manufacturing systems. The wide range of research findings reported in this book confirms that the design of manufacturing systems is a complex problem and that the achievement of goals set for modern manufacturing systems requires interdisciplinary knowledge and the simultaneous design of the product, process and system, as well as the knowledge of modern manufacturing and organizational methods and techniques

    Autonomous Operation of a Reconfigurable Multi-Robot System for Planetary Space Missions

    Get PDF
    Reconfigurable robots can physically merge and form new types of composite systems. This ability leads to additional degrees of freedom for robot operations especially when dynamically composed robotic systems offer capabilities that none of the individual systems have. Research in the area of reconfigurable multi-robot systems has mainly been focused on swarm-based robots and thereby to systems with a high degree of modularity but a heavily restricted set of capabilities. In contrast, this thesis deals with heterogeneous robot teams comprising individually capable robots which are also modular and reconfigurable. In particular, the autonomous application of such reconfigurable multi-robot systems to enhance robotic space exploration missions is investigated. Exploiting the flexibility of a reconfigurable multi-robot system requires an appropriate system model and reasoner. Hence, this thesis introduces a special organisation model. This model accounts for the key characteristics of reconfigurable robots which are constrained by the availability and compatibility of hardware interfaces. A newly introduced mapping function between resource structures and functional properties permits to characterise dynamically created agent compositions. Since a combinatorial challenge lies in the identification of feasible and functionally suitable agents, this thesis further suggests bounding strategies to reason efficiently with composite robotic systems. This thesis proposes a mission planning algorithm which permits to exploit the flexibility of reconfigurable multi-robot systems. The implemented planner builds upon the developed organisation model so that multi-robot missions can be specified by high-level functionality constraints which are resolved to suitable combinations of robots. Furthermore, the planner synchronises robot activities over time and characterises plans according to three objectives: efficacy, efficiency and safety. The plannera s evaluation demonstrates an optimization of an exemplary space mission. This research is based on the parallel development of theoretical concepts and practical solutions while working with three reconfigurable multi-robot teams. The operation of a reconfigurable robotic team comes with practical constraints. Therefore, this thesis composes and evaluates an operational infrastructure which can serve as reference implementation. The identification and combination of applicable state-of-the-art technologies result in a distributed and dynamically extensible communication infrastructure which can maintain the properties of reconfigurable multi-robot systems. Field tests covering semi-autonomous and autonomous operation have been performed to characterise multi-robot missions and validate the autonomous control approach for reconfigurable multi-robot systems. The practical evaluation identified critical constraints and design elements for a successful application of reconfigurable multi-robot systems. Furthermore, the experiments point to improvements for the organisation model. This thesis is a wholistic approach to automate reconfigurable multi-robot systems. It identifies theoretical as well as practical challenges and it suggests effective solutions which permit an exploitation of an increased level of flexibility in future robotics missions

    FieldPlacer - A flexible, fast and unconstrained force-directed placement method for heterogeneous reconfigurable logic architectures

    Get PDF
    The field of placement methods for components of integrated circuits, especially in the domain of reconfigurable chip architectures, is mainly dominated by a handful of concepts. While some of these are easy to apply but difficult to adapt to new situations, others are more flexible but rather complex to realize. This work presents the FieldPlacer framework, a flexible, fast and unconstrained force-directed placement method for heterogeneous reconfigurable logic architectures, in particular for the ever important heterogeneous FPGAs. In contrast to many other force-directed placers, this approach is called ‘unconstrained’ as it does not require a priori fixed logic elements in order to calculate a force equilibrium as the solution to a system of equations. Instead, it is based on a free spring embedder simulation of a graph representation which includes all logic block types of a design simultaneously. The FieldPlacer framework offers a huge amount of flexibility in applying different distance norms (e. g., the Manhattan distance) for the force-directed layout and aims at creating adapted layouts for various objective functions, e. g., highest performance or improved routability. Depending on the individual situation, a runtime-quality trade-off can be considered to either produce a decent placement in a very short time or to generate an exceptionally good placement, which takes longer. An extensive comparison with the latest simulated annealing placement method from the well-known Versatile Place and Route (VPR) framework shows that the FieldPlacer approach can create placements of comparable quality much faster than VPR or, alternatively, generate better placements in the same time. The flexibility in defining arbitrary objective functions and the intuitive adaptability of the method, which, among others, includes different concepts from the field of graph drawing, should facilitate further developments with this framework, e. g., for new upcoming optimization targets like the energy consumption of an implemented design
    corecore