Approximating Geometric Knapsack via L-packings
We study the two-dimensional geometric knapsack problem (2DK) in which we are
given a set of n axis-aligned rectangular items, each one with an associated
profit, and an axis-aligned square knapsack. The goal is to find a
(non-overlapping) packing of a maximum profit subset of items inside the
knapsack (without rotating items). The best-known polynomial-time approximation
factor for this problem (even just in the cardinality case) is (2 + \epsilon)
[Jansen and Zhang, SODA 2004].
In this paper, we break the 2 approximation barrier, achieving a
polynomial-time (17/9 + \epsilon) < 1.89 approximation, which improves to
(558/325 + \epsilon) < 1.72 in the cardinality case. Essentially all prior work
on 2DK approximation packs items inside a constant number of rectangular
containers, where items inside each container are packed using a simple greedy
strategy. We deviate for the first time from this setting: we show that there
exists a large profit solution where items are packed inside a constant number
of containers plus one L-shaped region at the boundary of the knapsack which
contains items that are high and narrow and items that are wide and thin. As a
second major result, and the main algorithmic contribution of this paper, we present a
PTAS for this case. We believe that this will turn out to be useful in future
work in geometric packing problems.
We also consider the variant of the problem with rotations (2DKR), where
items can be rotated by 90 degrees. Also, in this case, the best-known
polynomial-time approximation factor (even for the cardinality case) is (2 +
\epsilon) [Jansen and Zhang, SODA 2004]. Exploiting part of the machinery
developed for 2DK plus a few additional ideas, we obtain a polynomial-time (3/2
+ \epsilon)-approximation for 2DKR, which improves to (4/3 + \epsilon) in the
cardinality case.
Comment: 64 pages, full version of FOCS 2017 pape
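To make the 2DK problem statement concrete, the following is a minimal sketch of a feasibility check for a candidate packing: every item must lie inside the square knapsack and no two items may overlap. It is not the approximation algorithm from the abstract; the names `PlacedItem`, `is_feasible`, and `total_profit` are hypothetical and chosen only for illustration. The example packing places one tall, narrow item and one wide, thin item along the boundary, loosely mirroring the L-shaped region the abstract describes.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class PlacedItem:
    """An axis-aligned rectangle with its lower-left corner at (x, y)."""
    x: int
    y: int
    w: int
    h: int
    profit: int


def overlaps(a: PlacedItem, b: PlacedItem) -> bool:
    """True if the interiors of a and b intersect (touching edges is allowed)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def is_feasible(items: list[PlacedItem], side: int) -> bool:
    """Check that all items fit in a side x side knapsack without overlapping."""
    inside = all(0 <= it.x and 0 <= it.y and
                 it.x + it.w <= side and it.y + it.h <= side
                 for it in items)
    return inside and not any(overlaps(a, b) for a, b in combinations(items, 2))


def total_profit(items: list[PlacedItem]) -> int:
    """Objective value of a packing: the sum of the packed items' profits."""
    return sum(it.profit for it in items)


# A tall, narrow item on the left edge and a wide, thin item on the bottom edge.
l_packing = [PlacedItem(0, 0, 2, 10, 3), PlacedItem(2, 0, 8, 2, 4)]
print(is_feasible(l_packing, side=10), total_profit(l_packing))
```

In the cardinality case mentioned in the abstract, every profit would be 1, so `total_profit` would simply count the packed items.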
An Efficient Data Structure for Dynamic Two-Dimensional Reconfiguration
In the presence of dynamic insertions and deletions into a partially
reconfigurable FPGA, fragmentation is unavoidable. This poses the challenge of
developing efficient approaches to dynamic defragmentation and reallocation.
One key aspect is to develop efficient algorithms and data structures that
exploit the two-dimensional geometry of a chip, instead of just one dimension. We propose
a new method for this task, based on the fractal structure of a quadtree, which
allows dynamic segmentation of the chip area, along with dynamically adjusting
the necessary communication infrastructure. We describe a number of algorithmic
aspects, and present different solutions. We also provide a number of basic
simulations that indicate that the theoretical worst-case bound may be
pessimistic.
Comment: 11 pages, 12 figures; full version of extended abstract that appeared
in ARCS 201
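The quadtree idea can be illustrated with a small sketch. It assumes a square chip area whose side is a power of two and modules that occupy square cells of power-of-two size; both are simplifying assumptions, and the class `QuadNode` with its methods is hypothetical, not the data structure from the paper. Allocation splits a cell into four quadrants on demand, and freeing a cell merges sibling quadrants again once all four are empty.

```python
class QuadNode:
    """One square cell of the chip area; it may be split into four quadrants."""

    def __init__(self, x: int, y: int, size: int):
        self.x, self.y, self.size = x, y, size
        self.children = None   # list of four QuadNode objects after a split
        self.used = False      # True when a module occupies this whole cell

    def is_free(self) -> bool:
        return not self.used and self.children is None

    def allocate(self, size: int):
        """Reserve a free size x size cell; return its corner or None."""
        if size < 1 or size & (size - 1):
            raise ValueError("size must be a power of two (sketch assumption)")
        if self.used or size > self.size:
            return None
        if size == self.size:
            if self.children is not None:
                return None            # cell is already partly occupied
            self.used = True
            return (self.x, self.y)
        if self.children is None:      # split on demand
            half = self.size // 2
            self.children = [QuadNode(self.x + dx, self.y + dy, half)
                             for dy in (0, half) for dx in (0, half)]
        for child in self.children:
            pos = child.allocate(size)
            if pos is not None:
                return pos
        return None

    def free(self, x: int, y: int, size: int) -> bool:
        """Release the cell at (x, y); merge quadrants that became empty."""
        if self.children is None:
            if self.used and self.size == size and (self.x, self.y) == (x, y):
                self.used = False
                return True
            return False
        half = self.size // 2
        index = (1 if x >= self.x + half else 0) + (2 if y >= self.y + half else 0)
        released = self.children[index].free(x, y, size)
        if released and all(c.is_free() for c in self.children):
            self.children = None       # undo the split
        return released


chip = QuadNode(0, 0, 8)
first = chip.allocate(4)    # takes the lower-left 4 x 4 quadrant
small = chip.allocate(2)    # forces a split of the next quadrant
second = chip.allocate(4)   # must skip the partly used quadrant
```

Freeing the small module with `chip.free(*small, 2)` merges its quadrant back into one 4 x 4 cell, which the next 4 x 4 request can then reuse. This shows how a single small module can fragment a larger cell until it is removed.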
Defragmenting the Module Layout of a Partially Reconfigurable Device
Modern generations of field-programmable gate arrays (FPGAs) allow for
partial reconfiguration. In an online context, where the sequence of modules to
be loaded on the FPGA is unknown beforehand, repeated insertion and deletion of
modules leads to progressive fragmentation of the available space, making
defragmentation an important issue. We address this problem by proposing an
online and an offline component for the defragmentation of the available space.
We consider defragmenting the module layout on a reconfigurable device. This
corresponds to solving a two-dimensional strip packing problem. Problems of
this type are NP-hard in the strong sense, and previous algorithmic results are
rather limited. Based on a graph-theoretic characterization of feasible
packings, we develop a method that can solve two-dimensional defragmentation
instances of practical size to optimality. Our approach is validated for a set
of benchmark instances.
Comment: 10 pages, 11 figures, 1 table, Latex, to appear in "Engineering of
Reconfigurable Systems and Algorithms" as a "Distinguished Paper
Co-Scheduling Algorithms for High-Throughput Workload Execution
This paper investigates co-scheduling algorithms for processing a set of
parallel applications. Instead of executing each application one by one, using
a maximum degree of parallelism for each of them, we aim at scheduling several
applications concurrently. We partition the original application set into a
series of packs, which are executed one by one. A pack comprises several
applications, each of them with an assigned number of processors, with the
constraint that the total number of processors assigned within a pack does not
exceed the maximum number of available processors. The objective is to
determine a partition into packs, and an assignment of processors to
applications, that minimize the sum of the execution times of the packs. We
thoroughly study the complexity of this optimization problem, and propose
several heuristics that exhibit very good performance on a variety of
workloads, whose application execution times model profiles of parallel
scientific codes. We show that co-scheduling leads to faster workload
completion time and to faster response times on average (hence increasing
system throughput and saving energy), for significant benefits over traditional
scheduling from both the user and system perspectives
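The objective described above can be sketched in a few lines. Two assumptions are made here that the abstract does not state: a pack finishes when its slowest application finishes, and execution time follows a simple Amdahl-style model, `work * (seq_fraction + (1 - seq_fraction) / procs)`. The function names and the example workload are hypothetical.

```python
def exec_time(work: float, seq_fraction: float, procs: int) -> float:
    """Assumed Amdahl-style execution time of one application on `procs` processors."""
    return work * (seq_fraction + (1.0 - seq_fraction) / procs)


def pack_cost(pack: dict, apps: dict) -> float:
    """Assumed pack execution time: the longest application in the pack."""
    return max(exec_time(*apps[name], procs) for name, procs in pack.items())


def schedule_cost(packs: list, apps: dict, total_procs: int) -> float:
    """Sum of pack execution times, after checking the processor constraint."""
    for pack in packs:
        if sum(pack.values()) > total_procs:
            raise ValueError("a pack uses more processors than are available")
    return sum(pack_cost(pack, apps) for pack in packs)


# Two identical applications: (work, sequential fraction). Hypothetical values.
apps = {"A": (100.0, 0.5), "B": (100.0, 0.5)}

one_by_one = [{"A": 4}, {"B": 4}]   # each application alone on all 4 processors
co_scheduled = [{"A": 2, "B": 2}]   # both applications share a single pack

print(schedule_cost(one_by_one, apps, total_procs=4))
print(schedule_cost(co_scheduled, apps, total_procs=4))
```

Under this assumed model, applications with a large sequential fraction gain little from extra processors, so sharing a pack gives a lower total cost than running each one alone at maximum parallelism.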
Survey on Combinatorial Register Allocation and Instruction Scheduling
Register allocation (mapping variables to processor registers or memory) and
instruction scheduling (reordering instructions to increase instruction-level
parallelism) are essential tasks for generating efficient assembly code in a
compiler. In the last three decades, combinatorial optimization has emerged as
an alternative to traditional, heuristic algorithms for these two tasks.
Combinatorial optimization approaches can deliver optimal solutions according
to a model, can precisely capture trade-offs between conflicting decisions, and
are more flexible at the expense of increased compilation time.
This paper provides an exhaustive literature review and a classification of
combinatorial optimization approaches to register allocation and instruction
scheduling, with a focus on the techniques that are most applied in this
context: integer programming, constraint programming, partitioned Boolean
quadratic programming, and enumeration. Researchers in compilers and
combinatorial optimization can benefit from identifying developments, trends,
and challenges in the area; compiler practitioners may discern opportunities
and grasp the potential benefit of applying combinatorial optimization
Dagstuhl Reports : Volume 1, Issue 2, February 2011
Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061) : Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer
Self-Repairing Programs (Dagstuhl Seminar 11062) : Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller
Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071) : Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos
Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081) : Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka
Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091) : Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Youn
Matched filtering of gravitational waves from inspiraling compact binaries: Computational cost and template placement
We estimate the number of templates, computational power, and storage
required for a one-step matched filtering search for gravitational waves from
inspiraling compact binaries. These estimates should serve as benchmarks for
the evaluation of more sophisticated strategies such as hierarchical searches.
We use waveform templates based on the second post-Newtonian approximation for
binaries composed of nonspinning compact bodies in circular orbits. We present
estimates for six noise curves: LIGO (three configurations), VIRGO, GEO600, and
TAMA. To search for binaries with components more massive than 0.2M_o while
losing no more than 10% of events due to coarseness of template spacing,
initial LIGO will require about 1*10^11 flops (floating point operations per
second) for data analysis to keep up with data acquisition. This is several
times higher than estimated in previous work by Owen, in part because of the
improved family of templates and in part because we use more realistic (higher)
sampling rates. Enhanced LIGO, GEO600, and TAMA will require computational
power similar to initial LIGO. Advanced LIGO will require 8*10^11 flops, and
VIRGO will require 5*10^12 flops. If the templates are stored rather than
generated as needed, storage requirements range from 1.5*10^11 real numbers for
TAMA to 6*10^14 for VIRGO. We also sketch and discuss an algorithm for placing
the templates in the parameter space.
Comment: 15 pages, 4 figures, submitted to Phys. Rev.
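As a rough illustration of what a single matched-filter evaluation involves, the sketch below slides one template over a data stream and records the normalized inner product at each offset. It assumes white noise and works in the time domain, which is a simplification: it does not use post-Newtonian waveforms, detector noise curves, or the template placement discussed in the abstract. The toy chirp and all names are hypothetical.

```python
import math


def matched_filter(data: list, template: list) -> list:
    """Normalized inner product of the template with each window of the data."""
    norm = math.sqrt(sum(t * t for t in template))
    windows = len(data) - len(template) + 1
    return [sum(data[i + j] * t for j, t in enumerate(template)) / norm
            for i in range(windows)]


def best_offset(data: list, template: list) -> int:
    """Offset at which the filter output has the largest magnitude."""
    scores = matched_filter(data, template)
    return max(range(len(scores)), key=lambda i: abs(scores[i]))


# Toy chirp whose frequency increases with time (illustrative only).
template = [math.sin(0.1 * k * k) for k in range(16)]

# Noise-free data stream with the chirp injected at sample 20.
data = [0.0] * 64
for k, value in enumerate(template):
    data[20 + k] += value

print(best_offset(data, template))
```

A search of the kind the abstract describes would repeat this evaluation for every template in the bank, which is why the number of templates drives the computational cost.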