3 research outputs found

    Automatically fused instructions: algorithms for the customization of the instruction-set of a reconfigurable architecture

    In this dissertation, we address the design of algorithms for the automatic identification and selection of complex application-specific instructions used to speed up the execution of applications on reconfigurable architectures. The computationally intensive portions of an application are analyzed and partitioned into segments of code to execute in software and segments of code to execute atomically in hardware as single instructions. These application-specific instructions extend the instruction set of the reconfigurable architecture in use. The main goal of the work presented in this dissertation is the identification of application-specific instructions with multiple inputs and multiple outputs. The instructions are generated in two consecutive steps: first, the application is partitioned into non-overlapping single-output instructions; then, these instructions are further combined into multiple-output instructions following different policies. We propose different algorithms for partitioning an application into both single-output and multiple-output instructions. A number of approaches have been proposed in both academia and industry for extending a given instruction set with application-specific instructions to speed up the execution of applications. These existing solutions usually have a high computational complexity. The algorithms proposed in this dissertation provide quality solutions and have linear computational complexity in all cases but one, in which case the proposed solution is optimal. Additionally, the new application-specific instructions are atomically executable in hardware by construction, whereas existing approaches increase the computational complexity by testing each generated instruction. The proposed algorithms are tested on the Molen reconfigurable architecture. The experimental results on well-known benchmarks show that a considerable speedup can be obtained in the execution of an application by using the application-specific instructions identified by the proposed algorithms.
    Computer Engineering; Electrical Engineering, Mathematics and Computer Science

    A real-time hybrid neuron network for highly parallel cognitive systems

    For a comprehensive understanding of how neurons communicate with each other, new tools need to be developed that can accurately mimic the behaviour of such neurons and neuron networks under 'real-time' constraints. In this paper, we propose an easily customisable, highly pipelined neuron network design, which executes optimally scheduled floating-point operations for the maximum number of biophysically plausible neurons per FPGA family type. To reduce the required amount of resources without adversely affecting the calculation latency, a single exponent instance is used for multiple neuron calculation operations. Experimental results indicate that the proposed network design allows the simulation of up to 1188 neurons on a Virtex7 (XC7VX550T) device in brain real-time, yielding a speed-up of 12.4× compared to the state of the art.
    Education and Research Support; Circuits and Systems

    Evaluation of runtime task mapping using the rSesame framework

    Performing runtime evaluation together with design-time exploration enables a system to be more efficient in terms of various design constraints, such as performance, chip area, and power consumption. rSesame is a generic modeling and simulation framework that can explore and evaluate reconfigurable systems at both design time and runtime. In this paper, we use the rSesame framework to perform a thorough evaluation (at design time and at runtime) of various task mapping heuristics from the state of the art. An extended Motion-JPEG (MJPEG) application is mapped, using the different heuristics, onto a reconfigurable architecture, where different Field Programmable Gate Array (FPGA) resources and various nonfunctional design parameters, such as the execution time, the number of reconfigurations, the area usage, and the reusability efficiency, are taken into consideration. The experimental results suggest that such an extensive evaluation can provide useful insight into both the characteristics of the reconfigurable architecture and the efficiency of the task mapping.
    Software Computer Technology; Electrical Engineering, Mathematics and Computer Science