2,798 research outputs found

An Efficient Data Structure for Dynamic Two-Dimensional Reconfiguration

Author: A Bendersky
B Brubach
DS Hirschberg
G Bromley
I Kuon
JA Hinds
K Compton
KC Knowlton
KK Shen
M Koester
MA Bender
MG Gericota
SP Fekete
Publication date: 01/01/2016

In the presence of dynamic insertions and deletions into a partially reconfigurable FPGA, fragmentation is unavoidable. This poses the challenge of developing efficient approaches to dynamic defragmentation and reallocation. One key aspect is to develop efficient algorithms and data structures that exploit the two-dimensional geometry of a chip, instead of just one. We propose a new method for this task, based on the fractal structure of a quadtree, which allows dynamic segmentation of the chip area, along with dynamically adjusting the necessary communication infrastructure. We describe a number of algorithmic aspects, and present different solutions. We also provide a number of basic simulations that indicate that the theoretical worst-case bound may be pessimistic. Comment: 11 pages, 12 figures; full version of extended abstract that appeared in ARCS 201
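To make the quadtree idea in this abstract concrete, here is a minimal sketch of a generic quadtree (buddy-style) allocator. It is not the authors' data structure: the names `Node`, `allocate`, and `free` are hypothetical, the chip is assumed to be a square with a power-of-two side, modules are assumed to be squares with power-of-two sides, and the communication infrastructure is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A square region of the chip with lower-left corner (x, y)."""
    x: int
    y: int
    size: int
    occupied: bool = False
    children: List["Node"] = field(default_factory=list)

    def is_free_leaf(self) -> bool:
        return not self.occupied and not self.children


def allocate(node: Node, size: int) -> Optional[Tuple[int, int]]:
    """Place a size x size module (size assumed to be a power of two).

    Returns the corner of the region used, or None if nothing fits.
    """
    if node.occupied or size > node.size:
        return None
    if size == node.size:
        if node.children:
            return None  # part of this quadrant is already in use
        node.occupied = True
        return (node.x, node.y)
    if not node.children:
        # Split the free region into four equal quadrants.
        half = node.size // 2
        node.children = [Node(node.x + dx, node.y + dy, half)
                         for dy in (0, half) for dx in (0, half)]
    for child in node.children:
        spot = allocate(child, size)
        if spot is not None:
            return spot
    return None


def free(node: Node, x: int, y: int) -> bool:
    """Release the module whose corner is (x, y) and merge empty quadrants."""
    if node.occupied:
        if (node.x, node.y) == (x, y):
            node.occupied = False
            return True
        return False
    for child in node.children:
        inside_x = child.x <= x < child.x + child.size
        inside_y = child.y <= y < child.y + child.size
        if inside_x and inside_y:
            freed = free(child, x, y)
            if freed and all(c.is_free_leaf() for c in node.children):
                node.children = []  # four free quadrants become one region again
            return freed
    return False


chip = Node(0, 0, 8)
first = allocate(chip, 4)    # takes one 4x4 quadrant
second = allocate(chip, 2)   # splits a second quadrant into 2x2 cells
third = allocate(chip, 4)    # must skip the partially used quadrant
```

In this sketch a region that has been split cannot host a full-size module until its cells are freed and merged, which is a simple form of the fragmentation the abstract refers to.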

arXiv.org e-Print Archive

Crossref

Co-Scheduling Algorithms for High-Throughput Workload Execution

Author: Aupy Guillaume
Benoit Anne
Raghavan Padma
Robert Yves
Shantharam Manu
Publication date: 29/04/2013

This paper investigates co-scheduling algorithms for processing a set of parallel applications. Instead of executing each application one by one, using a maximum degree of parallelism for each of them, we aim at scheduling several applications concurrently. We partition the original application set into a series of packs, which are executed one by one. A pack comprises several applications, each of them with an assigned number of processors, with the constraint that the total number of processors assigned within a pack does not exceed the maximum number of available processors. The objective is to determine a partition into packs, and an assignment of processors to applications, that minimize the sum of the execution times of the packs. We thoroughly study the complexity of this optimization problem, and propose several heuristics that exhibit very good performance on a variety of workloads, whose application execution times model profiles of parallel scientific codes. We show that co-scheduling leads to faster workload completion time and to faster response times on average (hence increasing system throughput and saving energy), for significant benefits over traditional scheduling from both the user and system perspectives.
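The objective described in this abstract can be illustrated with a small sketch that evaluates a given partition into packs. The execution-time model used here (a sequential part plus a parallel part divided by the processor count) is an assumption made for illustration and is not taken from the paper; the names `exec_time`, `pack_time`, and `schedule_cost` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed model (not from the paper): an application with sequential part s
# and parallel part w runs in s + w / p time units on p processors.
App = Tuple[float, float]        # (sequential part, parallel part)
Pack = List[Tuple[str, int]]     # (application name, processors assigned)


def exec_time(app: App, procs: int) -> float:
    seq, par = app
    return seq + par / procs


def pack_time(pack: Pack, apps: Dict[str, App]) -> float:
    """A pack finishes when its slowest application finishes."""
    return max(exec_time(apps[name], procs) for name, procs in pack)


def schedule_cost(packs: List[Pack], apps: Dict[str, App], max_procs: int) -> float:
    """Sum of pack execution times; packs run one after another."""
    for pack in packs:
        if sum(procs for _, procs in pack) > max_procs:
            raise ValueError("pack exceeds the available processors")
    return sum(pack_time(pack, apps) for pack in packs)


apps = {"A": (2.0, 4.0), "B": (2.0, 4.0)}
one_by_one = [[("A", 4)], [("B", 4)]]      # each application alone on 4 processors
co_scheduled = [[("A", 2), ("B", 2)]]      # both applications share one pack

sequential_cost = schedule_cost(one_by_one, apps, max_procs=4)
packed_cost = schedule_cost(co_scheduled, apps, max_procs=4)
```

Under this assumed model the single shared pack is cheaper than running the two applications one by one, because the sequential parts overlap. The sketch only scores a given schedule; it does not search for the best partition.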

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Defragmenting the Module Layout of a Partially Reconfigurable Device

Author: Ali Ahmadinia
Christophe Bobda
Frank Hannig
Jan C. Van Der Veen
Jürgen Teich
Mateusz Majer
Sándor P. Fekete
Publication date: 01/01/2005

Modern generations of field-programmable gate arrays (FPGAs) allow for partial reconfiguration. In an online context, where the sequence of modules to be loaded on the FPGA is unknown beforehand, repeated insertion and deletion of modules leads to progressive fragmentation of the available space, making defragmentation an important issue. We address this problem by proposing an online and an offline component for the defragmentation of the available space. We consider defragmenting the module layout on a reconfigurable device. This corresponds to solving a two-dimensional strip packing problem. Problems of this type are NP-hard in the strong sense, and previous algorithmic results are rather limited. Based on a graph-theoretic characterization of feasible packings, we develop a method that can solve two-dimensional defragmentation instances of practical size to optimality. Our approach is validated for a set of benchmark instances. Comment: 10 pages, 11 figures, 1 table, LaTeX, to appear in "Engineering of Reconfigurable Systems and Algorithms" as a "Distinguished Paper"

arXiv.org e-Print Archive

CiteSeerX

Approximating Geometric Knapsack via L-packings

Author: Grandoni Fabrizio
Gálvez Waldo
Heydrich Sandy
Ingala Salvatore
Khan Arindam
Wiese Andreas
Publication date: 01/01/2017

We study the two-dimensional geometric knapsack problem (2DK) in which we are given a set of n axis-aligned rectangular items, each one with an associated profit, and an axis-aligned square knapsack. The goal is to find a (non-overlapping) packing of a maximum profit subset of items inside the knapsack (without rotating items). The best-known polynomial-time approximation factor for this problem (even just in the cardinality case) is (2 + \epsilon) [Jansen and Zhang, SODA 2004]. In this paper, we break the 2 approximation barrier, achieving a polynomial-time (17/9 + \epsilon) < 1.89 approximation, which improves to (558/325 + \epsilon) < 1.72 in the cardinality case. Essentially all prior work on 2DK approximation packs items inside a constant number of rectangular containers, where items inside each container are packed using a simple greedy strategy. We deviate for the first time from this setting: we show that there exists a large profit solution where items are packed inside a constant number of containers plus one L-shaped region at the boundary of the knapsack which contains items that are high and narrow and items that are wide and thin. As a second major result, and the main algorithmic contribution of this paper, we present a PTAS for this case. We believe that this will turn out to be useful in future work in geometric packing problems. We also consider the variant of the problem with rotations (2DKR), where items can be rotated by 90 degrees. In this case too, the best-known polynomial-time approximation factor (even for the cardinality case) is (2 + \epsilon) [Jansen and Zhang, SODA 2004]. Exploiting part of the machinery developed for 2DK plus a few additional ideas, we obtain a polynomial-time (3/2 + \epsilon)-approximation for 2DKR, which improves to (4/3 + \epsilon) in the cardinality case. Comment: 64 pages, full version of FOCS 2017 paper
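The problem statement in this abstract can be made concrete with a small feasibility checker: given items already placed in a square knapsack, it verifies that they lie inside the knapsack and do not overlap, then returns the total profit. This is only a sketch of the 2DK constraints, not the approximation algorithm from the paper; the names `Placed`, `overlaps`, and `packing_profit` are hypothetical.

```python
from itertools import combinations
from typing import List, NamedTuple


class Placed(NamedTuple):
    """An axis-aligned item with lower-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int
    profit: int


def overlaps(a: Placed, b: Placed) -> bool:
    """True if the interiors of a and b intersect (touching edges are allowed)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def packing_profit(items: List[Placed], side: int) -> int:
    """Return the total profit of a feasible packing in a side x side knapsack."""
    for it in items:
        if it.x < 0 or it.y < 0 or it.x + it.w > side or it.y + it.h > side:
            raise ValueError("item sticks out of the knapsack")
    for a, b in combinations(items, 2):
        if overlaps(a, b):
            raise ValueError("items overlap")
    return sum(it.profit for it in items)


# A tall item on the left edge and a wide item on the bottom edge form an
# L-shaped region at the boundary; a third item sits in the remaining area.
packing = [
    Placed(0, 0, 1, 10, 3),   # high and narrow
    Placed(1, 0, 9, 1, 4),    # wide and thin
    Placed(1, 1, 5, 5, 5),    # placed in the remaining area
]
total = packing_profit(packing, side=10)
```

The example layout loosely mirrors the L-shaped boundary region mentioned in the abstract, with coordinates chosen purely for illustration.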

arXiv.org e-Print Archive

Crossref

Repositorio Académico de la Universidad de Chile

MPG.PuRe

Survey on Combinatorial Register Allocation and Instruction Scheduling

Author: Lozano Roberto Castañeda
Schulte Christian
Publication date: 01/01/2018

Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to increase instruction-level parallelism) are essential tasks for generating efficient assembly code in a compiler. In the last three decades, combinatorial optimization has emerged as an alternative to traditional, heuristic algorithms for these two tasks. Combinatorial optimization approaches can deliver optimal solutions according to a model, can precisely capture trade-offs between conflicting decisions, and are more flexible at the expense of increased compilation time. This paper provides an exhaustive literature review and a classification of combinatorial optimization approaches to register allocation and instruction scheduling, with a focus on the techniques that are most applied in this context: integer programming, constraint programming, partitioned Boolean quadratic programming, and enumeration. Researchers in compilers and combinatorial optimization can benefit from identifying developments, trends, and challenges in the area; compiler practitioners may discern opportunities and grasp the potential benefit of applying combinatorial optimization.

arXiv.org e-Print Archive

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Resource-efficient dynamic partial reconfiguration on FPGAs for space instruments

Author: Dörflinger Alexander
Fekete Sandor P.
Fiethe Björn
Keldenich Phillip
Michalik Harald
Scheffer Christian
Publication venue: IEEE
Publication date: 01/01/2017

Field-Programmable Gate Arrays (FPGAs) provide highly flexible platforms to implement sophisticated data processing for scientific space instruments. The dynamic partial reconfiguration (DPR) capability of FPGAs allows hardware (HW) tasks to be scheduled. While this feature adds another dimension of processing power that can be exploited without significantly increasing system complexity and power consumption, there are still several challenges for efficient use of DPR. State-of-the-art concepts concentrate either on resource-efficient implementations at design time or flexible HW task scheduling at runtime. In this paper we propose a balanced algorithm that considers both optimization goals and is well suited for resource-limited space applications.

Digitale Bibliothek Braunschweig

Optimization Modulo Theories with Linear Rational Costs

Author: Sebastiani Roberto
Tomasi Silvia
Publication date: 22/10/2014

In the contexts of automated reasoning (AR) and formal verification (FV), important decision problems are effectively encoded into Satisfiability Modulo Theories (SMT). In the last decade efficient SMT solvers have been developed for several theories of practical interest (e.g., linear arithmetic, arrays, bit-vectors). Surprisingly, little work has been done to extend SMT to deal with optimization problems; in particular, we are not aware of any previous work on SMT solvers able to produce solutions which minimize cost functions over arithmetical variables. This is unfortunate, since some problems of interest require this functionality. In the work described in this paper we start filling this gap. We present and discuss two general procedures for leveraging SMT to handle the minimization of linear rational cost functions, combining SMT with standard minimization techniques. We have implemented the procedures within the MathSAT SMT solver. Due to the absence of competitors in the AR, FV and SMT domains, we have experimentally evaluated our implementation against state-of-the-art tools for the domain of linear generalized disjunctive programming (LGDP), which is closest in spirit to our domain, on sets of problems which have been previously proposed as benchmarks for the latter tools. The results show that our tool is very competitive with, and often outperforms, these tools on these problems, clearly demonstrating the potential of the approach. Comment: Submitted in January 2014 to ACM Transactions on Computational Logic, currently under revision. arXiv admin note: text overlap with arXiv:1202.140

arXiv.org e-Print Archive

CiteSeerX

Timing-Driven Macro Placement

Author: Ochsendorf Philipp
Publication venue: Universitäts- und Landesbibliothek Bonn

Placement is an important step in the process of finding physical layouts for electronic computer chips. The basic task during placement is to arrange the building blocks of the chip, the circuits, disjointly within a given chip area. Furthermore, such positions should result in short circuit interconnections which can be routed easily and which ensure all signals arrive in time. This dissertation mostly focuses on macros, the largest circuits on a chip. In order to optimize timing characteristics during macro placement, we propose a new optimistic timing model based on geometric distance constraints. This model can be computed and evaluated efficiently in order to predict timing traits accurately in practice. Packing rectangles disjointly remains strongly NP-hard under slack maximization in our timing model. Despite this, we develop an exact, linear-time algorithm for special cases. The proposed timing model is incorporated into BonnMacro, the macro placement component of the BonnTools physical design optimization suite developed at the Research Institute for Discrete Mathematics. Using efficient formulations as mixed-integer programs we can legalize macros locally while optimizing timing. This results in the first timing-aware macro placement tool. In addition, we provide multiple enhancements for the partitioning-based standard circuit placement algorithm BonnPlace. We find a model of partitioning as a minimum-cost flow problem that is provably as small as possible, which lets us avoid running-time-intensive instances. Moreover, we propose the new global placement flow Self-Stabilizing BonnPlace. This approach combines BonnPlace with a force-directed placement framework. It provides the flexibility to optimize the two involved objectives, routability and timing, directly during placement. The performance of our placement tools is confirmed on a large variety of academic benchmarks as well as real-world designs provided by our industrial partner IBM.
We reduce the running time of partitioning significantly and demonstrate that Self-Stabilizing BonnPlace finds easily routable placements for challenging designs, even when simultaneously optimizing timing objectives. BonnMacro and Self-Stabilizing BonnPlace can be combined into the first timing-driven mixed-size placement flow. This combination often finds placements with competitive timing traits and even outperforms solutions that have been determined manually by experienced designers.

bonndoc – Der Publikationsserver der Universität Bonn

Approximation algorithms for 2d packing problems

Author: Gerber Olga
Publication date: 01/01/2005

In this thesis we address 2-dimensional packing problems such as strip packing, bin packing and storage packing. These problems play an important role in many application areas, e.g. cutting stock, VLSI design, image processing, and multiprocessor scheduling. The larger part of the work is devoted to the storage packing problem, that is, the problem of packing weighted rectangles into a single rectangle so as to maximize the total weight of the packed rectangles. Despite the practical importance of the problem, there are only a few known results in the literature. The main objective was to fill this gap and also to build bridges to already known algorithmic solutions for strip packing and bin packing problems. This was successfully achieved. Considering natural relaxations of the storage packing problem, we proposed a number of efficient algorithms which are able to find solutions within a factor of (1-\epsilon) of the optimum in polynomial time.

MACAU: Open Access Repository of Kiel University