Presynthesis area estimation of reconfigurable streaming accelerators
In this paper, we propose algorithms for presynthesis estimation of the hardware cost of a streaming accelerator. Our estimation method accelerates the design-space-exploration phase by orders of magnitude by eliminating the need to perform logic and physical synthesis in each iteration. We present algorithms for early cost estimation of resources that are specific to a streaming accelerator, and we evaluate our techniques using an industrial tool flow and a set of streaming benchmarks. For the register-queue sizes, our estimates are within 28% to 9% of the actual synthesis results on average, depending on the given resource constraints, while the datapath area estimates are within 14%. A typical estimation takes less than a minute, whereas generating the configuration bitstream of a streaming accelerator can take as long as 30 min in our experiments. Because design-space exploration repeats the synthesis stage several times, our estimation framework yields an order-of-magnitude speedup. © 2008 IEEE
A scheduling algorithm for optimization and early planning in high-level synthesis
The complexity of applications implemented on embedded and programmable systems grows with advances in the capacity and capabilities of these systems. Mapping applications onto them manually is becoming a very tedious task, which draws attention to the use of high-level synthesis within design flows. At the same time, it is essential to provide a flexible formulation of optimization objectives and to plan efficiently for various design objectives early in the design flow. In this work, we address these issues in the context of data flow graph (DFG) scheduling, an essential element of the high-level synthesis flow. We present an algorithm that schedules a chain of operations, with data dependencies between consecutive operations, in a single step. This local problem is solved repeatedly to generate the schedule for the whole DFG. The local problem is formulated as a maximum weight noncrossing bipartite matching, which we solve using a technique from the computational geometry domain. This technique provides a theoretical guarantee on the solution quality for scheduling a single chain of operations. Although still local, scheduling a whole chain at once gives a relatively wider perspective on the global scheduling objectives. In our experiments, we compared the latencies obtained by our algorithm with the optimal latencies given by the exact solution to the integer linear programming (ILP) formulation of the problem. In 9 of the 14 DFGs tested, our algorithm found the optimal solution, and in the remaining five benchmarks it produced latencies comparable to the optimum. The formulation of the objective function in our algorithm provides the flexibility to incorporate different optimization goals. We illustrate this versatility with specific examples of objective functions and with experimental results on how efficiently our algorithm captures these objectives in the final schedules.
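To make the local problem concrete, the following is a minimal sketch of maximum weight noncrossing bipartite matching between an ordered chain of operations and an ordered sequence of time steps. It is not the computational-geometry technique the abstract refers to; it swaps in a simple dynamic program over a dense weight table. The function name, the weight table, and the convention that `None` marks a forbidden operation/step pair are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Weights = List[List[Optional[float]]]


def max_weight_noncrossing_matching(weights: Weights) -> Tuple[float, List[Tuple[int, int]]]:
    """Illustrative dynamic program (hypothetical helper, not the paper's method).

    weights[i][j] is the benefit of placing operation i of the chain in
    time step j, or None if that placement is not allowed. A matching is
    noncrossing when matched pairs (i, j) and (i2, j2) with i < i2 also
    satisfy j < j2, so the chain order is preserved in the schedule.
    """
    n = len(weights)
    m = len(weights[0]) if n else 0

    # best[i][j]: best total weight using the first i operations and first j steps.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Leave operation i-1 unmatched, or leave step j-1 unused.
            best[i][j] = max(best[i - 1][j], best[i][j - 1])
            w = weights[i - 1][j - 1]
            if w is not None:
                # Match operation i-1 with step j-1.
                best[i][j] = max(best[i][j], best[i - 1][j - 1] + w)

    # Trace back through the table to recover one optimal set of pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        w = weights[i - 1][j - 1]
        if w is not None and best[i][j] == best[i - 1][j - 1] + w:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


# Made-up weights: three chained operations, three candidate time steps.
example = [
    [3, 1, None],
    [2, 4, 1],
    [None, 2, 5],
]
total, schedule = max_weight_noncrossing_matching(example)
print(total, schedule)
```

The dynamic program runs in time proportional to the number of operations times the number of time steps. Changing how the weight table is filled in is one way to picture the flexible objective function the abstract describes, although the weights above are invented for the example.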
Speeding-up heuristic allocation, scheduling and binding with SAT-based abstraction/refinement techniques
Hardware synthesis is the process by which system-level, Register Transfer (RT) level, or behavioral descriptions are turned into real implementations in terms of logic gates. Scheduling is one of the most time-consuming steps in the overall design flow, and may become much more complex when performing hardware synthesis from high-level specifications. Exploiting a single scheduling strategy on very large designs is often reductive and potentially inadequate. Furthermore, finding the "best" single candidate among all possible scheduling algorithms is practically infeasible. In this paper we introduce a hybrid scheduling approach, which is a preliminary step towards a comprehensive solution not yet provided by industrial or academic tools. Our method relies on an abstract symbolic representation of data flow nodes (operations) bound to control flow paths: it produces a more realistic lower bound during the pre-scheduling resource estimation step and speeds up slower but accurate heuristic scheduling techniques, thus achieving a globally improved result.