62 research outputs found

    Towards Dilated Placement of Dynamic NoC Cores

    Instead of mapping application task graphs in a compact manner onto reconfigurable devices using a network-on-chip for interconnecting application cores, we propose dilating the mappings as much as the available latencies on critical connections allow. In a dilated mapping, the unused resources between an application's configured components can be used to provide additional flexibility when the configuration needs to change. We motivate the reasons for dilating application task graphs targeted at reconfigurable devices; derive a simulated annealing approach to dilating the placement of such graphs; and present preliminary results of applying the algorithm to synthetic test cases. The method appears to result in successful and meaningful graph dilation and could be further tuned to satisfy desired power constraints.

    Module Graph Merging and Placement to Reduce Reconfiguration Overheads in Paged FPGA Devices

    Reconfiguration time in dynamically-reconfigurable modular systems can severely limit application run-time compared to the critical path delay. In this paper we present a novel method to reduce reconfiguration time by maximising wire use and minimising wire reconfiguration. This builds upon our previously-presented methodology for creating modular, dynamically-reconfigurable applications targeted to an FPGA. We demonstrate our techniques on an optical flow problem and show that graph merging can reduce reconfiguration delay by 50%.

    A configuration memory architecture for fast run-time reconfiguration of FPGAs

    This paper presents a configuration memory architecture that offers fast FPGA reconfiguration. The underlying principle behind the design is the use of fine-grained partial reconfiguration that allows significant configuration re-use while switching from one circuit to another. The proposed configuration memory works by reading on-chip configuration data into a buffer, modifying them based on the externally supplied data and writing them back to their original registers. A prototype implementation of the proposed design in a 90nm cell library indicates that the new memory adds less than 1% area to a commercially available FPGA implemented using the same library. The proposed design reduces the reconfiguration time for a wide set of benchmark circuits by 63%. However, power consumption during reconfiguration increases by a factor of 2.5 because the read-modify-write strategy results in more switching in the memory array

    A Configuration System Architecture Supporting Bit-Stream Compression for FPGAs

    This paper presents an investigation and design of an enhanced on-chip configuration memory system that can reduce the time to (re)configure an FPGA. The proposed system accepts configuration data in a compressed form and performs decompression internally. The resulting FPGA can be (re)configured in time proportional to the size of the compressed bit-stream. The compression technique exploits the redundancy present in typical configuration data. An analysis of configurations corresponding to a set of benchmark circuits reveals that data controlling the same types of configurable elements have a common byte that occurs at a significantly higher frequency. This common byte is simply broadcast to all instances of that element. This step is followed by byte updates if required. The new configuration system has modest hardware requirements and was observed to reduce reconfiguration time for the benchmark set by two-thirds on average.
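
The broadcast-then-update idea in this abstract can be illustrated in software. The following is a minimal sketch only, not the paper's hardware design: it assumes the bytes for one type of configurable element are available as a flat `bytes` object, and the function names `compress_group` and `decompress_group` are hypothetical.

```python
from collections import Counter


def compress_group(group: bytes):
    """Encode one group of same-type configuration bytes (hypothetical helper).

    Returns the most frequent byte plus a list of (index, value) updates
    for every position that differs from it.
    """
    if not group:
        raise ValueError("group must not be empty")
    # The most frequent byte is the one that would be broadcast.
    common = Counter(group).most_common(1)[0][0]
    # Only positions that differ from the common byte need an explicit update.
    updates = [(i, b) for i, b in enumerate(group) if b != common]
    return common, updates


def decompress_group(common: int, updates, length: int) -> bytes:
    """Rebuild the group: broadcast the common byte, then apply the updates."""
    out = bytearray([common]) * length  # broadcast step
    for index, value in updates:        # byte-update step
        out[index] = value
    return bytes(out)


if __name__ == "__main__":
    group = bytes([0, 0, 0, 5, 0, 0, 7, 0])
    common, updates = compress_group(group)
    restored = decompress_group(common, updates, len(group))
    print(common, updates, restored == group)
```

In this sketch the encoded size grows with the number of bytes that differ from the common byte, which is why a strongly dominant byte value makes the scheme worthwhile.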

    Population based Ant Colony Optimization on FPGA

    We propose to modify a type of ant algorithm called Population based Ant Colony Optimization (P-ACO) to allow implementation on an FPGA architecture. Ant algorithms are adapted from the natural behavior of ants and used to find good solutions to combinatorial optimization problems. The general layout on the FPGA and an algorithmic description are covered. The most notable contributions of this paper are a runtime reduction and the approximation of the heuristic function by a small set of favored decisions that changes over time.

    The CUAVA-1 CubeSat—A Pathfinder Satellite for Remote Sensing and Earth Observation

    In this paper we report a 3U CubeSat named CUAVA-1 designed by the ARC Training Centre for CubeSats, UAVs, and Their Applications (CUAVA). CUAVA, funded by the Australian Research Council, aims to train students, develop new instruments and technology to solve crucial problems, and help develop a world-class Australian industry in CubeSats, UAVs, and related products. The CUAVA-1 project is the Centre’s first CubeSat mission, following on from the two Australian CubeSats INSPIRE-2 and UNSW-EC0, which launched in 2017. The mission is designed to serve as a precursor for a series of Earth observation missions and to demonstrate new technologies developed by our partners. We also intend to use the satellite to give students hands-on experience and to build experience in our engineering, science and industry teams for future, more complex missions.

    Run-Time Compaction of FPGA Designs

    Controllers for partially reconfigurable FPGAs that are capable of supporting multiple independent tasks simultaneously need to be able to place designs at run-time when the sequence of tasks is not known in advance, or the designs are not fixed. As tasks arrive and depart the cells available for placement become fragmented, thereby affecting the controller's ability to place new tasks. The response times of tasks and the utilization of FPGA cells consequently suffer. In this paper, we describe and assess a task migration heuristic to alleviate the problems of external fragmentation. Our task compaction strategy moves the designs placed in a given region of the chip closer together by suspending the tasks and reloading their configuration bit-streams with new offsets. We show by simulation that significant performance improvements are possible, and that for reasonable assumptions about the relative lengths of the configuration delay and the service period of tasks, the penalty for r..

    Ordered partial task compaction on mesh connected computers

    Task compaction has been examined as a means of reducing fragmentation in partitionable machines based on multi-stage, hypercube, mesh and linear array interconnection networks. Ordered partial task compaction involves moving a subset of executing tasks without permuting their relative order to accommodate a request for a large submesh that would otherwise be blocked from entering the system. In this paper we develop the algorithms needed to find good allocation sites and to perform one-dimensional ordered partial compactions simply on mesh of processors and reconfigurable mesh architectures. We find that significant performance gains can be obtained in heavily loaded systems even when link delays are large.