AFPTAS results for common variants of bin packing: A new method to handle the small items
We consider two well-known natural variants of bin packing, and show that
these packing problems admit asymptotic fully polynomial time approximation
schemes (AFPTAS). In bin packing problems, a set of one-dimensional items of
size at most 1 is to be assigned (packed) to subsets of sum at most 1 (bins).
It has been known for a while that the most basic problem admits an AFPTAS. In
this paper, we develop methods that allow us to extend this result to other
variants of bin packing. Specifically, the problems which we study in this
paper, for which we design asymptotic fully polynomial time approximation
schemes, are the following. The first problem is "Bin packing with cardinality
constraints", where a parameter k is given, such that a bin may contain up to k
items. The goal is to minimize the number of bins used. The second problem is
"Bin packing with rejection", where every item has a rejection penalty
associated with it. An item needs to be either packed to a bin or rejected, and
the goal is to minimize the number of used bins plus the total rejection
penalty of unpacked items. This resolves the complexity of two important
variants of the bin packing problem. Our approximation schemes use a novel
method for packing the small items. This new method is the core of the improved
running times of our schemes over the running times of the previous results,
which are only asymptotic polynomial time approximation schemes (APTAS).
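To make the cardinality-constrained problem concrete, here is a minimal sketch of a greedy first-fit heuristic that respects both the capacity of 1 and the limit of k items per bin. It is not the AFPTAS described in the abstract, only an illustration of the constraint; the function name `first_fit_cardinality` and the tolerance `eps` are hypothetical choices made for this example.

```python
def first_fit_cardinality(sizes, k, capacity=1.0, eps=1e-9):
    """Greedy first-fit for bin packing with at most k items per bin.

    Illustrative heuristic only (an assumption of this sketch), not the
    approximation scheme from the paper.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    bins = []   # bins[i] is the list of item sizes placed in bin i
    loads = []  # loads[i] is the total size currently in bin i
    for s in sizes:
        if s > capacity + eps:
            raise ValueError("item larger than bin capacity")
        for i in range(len(bins)):
            # An item fits only if both the size and the cardinality limits hold.
            if len(bins[i]) < k and loads[i] + s <= capacity + eps:
                bins[i].append(s)
                loads[i] += s
                break
        else:
            # No open bin can take the item, so open a new one.
            bins.append([s])
            loads.append(s)
    return bins


if __name__ == "__main__":
    items = [0.5, 0.5, 0.1, 0.1, 0.1, 0.1]
    print(len(first_fit_cardinality(items, k=2)))
    print(len(first_fit_cardinality(items, k=10)))
```

With a small k, the cardinality limit rather than the size limit can force extra bins, which is why the small items need separate treatment in this variant.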
SLO-aware Colocation of Data Center Tasks Based on Instantaneous Processor Requirements
In a cloud data center, a single physical machine simultaneously executes
dozens of highly heterogeneous tasks. Such colocation results in more efficient
utilization of machines, but, when tasks' requirements exceed available
resources, some of the tasks might be throttled down or preempted. We analyze
version 2.1 of the Google cluster trace that shows short-term (1 second) task
CPU usage. Contrary to the assumptions made in many theoretical studies, we
demonstrate that the empirical distributions do not follow any single
distribution. However, high percentiles of the total processor usage (summed
over at least 10 tasks) can be reasonably estimated by the Gaussian
distribution. We use this result for a probabilistic fit test, called the
Gaussian Percentile Approximation (GPA), for standard bin-packing algorithms.
To check whether a new task will fit into a machine, GPA checks whether the
resulting distribution's percentile corresponding to the requested service
level objective (SLO) is still below the machine's capacity. In our simulation
experiments, GPA resulted in colocations exceeding the machines' capacity with
a frequency similar to the requested SLO.
Comment: Author's version of a paper published in ACM SoCC'1
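The fit test described in this abstract can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes each task is summarized by a (mean, variance) pair of its CPU usage, that task usages are independent so means and variances add, and that the SLO is given as a probability such as 0.99. The function name `gpa_fits` and all parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def gpa_fits(machine_tasks, new_task, capacity, slo):
    """Return True if the new task fits under a Gaussian percentile test.

    machine_tasks: iterable of (mean, variance) pairs already on the machine
    new_task:      (mean, variance) pair of the candidate task
    capacity:      CPU capacity of the machine
    slo:           target probability in (0, 1), e.g. 0.99

    Assumption of this sketch: usages are independent, so the total is
    approximated by a normal distribution with summed means and variances.
    """
    tasks = list(machine_tasks) + [new_task]
    total_mean = sum(mean for mean, _ in tasks)
    total_var = sum(var for _, var in tasks)
    # Standard normal quantile for the requested SLO.
    z = NormalDist().inv_cdf(slo)
    percentile = total_mean + z * sqrt(total_var)
    return percentile <= capacity


if __name__ == "__main__":
    running = [(0.05, 0.0004)] * 10
    candidate = (0.05, 0.0004)
    print(gpa_fits(running, candidate, capacity=1.0, slo=0.99))
    print(gpa_fits(running, candidate, capacity=0.6, slo=0.99))
```

A test of this shape can replace the plain "does the sum of sizes fit" check inside a standard bin-packing heuristic such as first-fit.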
A Parallel Monte Carlo Code for Simulating Collisional N-body Systems
We present a new parallel code for computing the dynamical evolution of
collisional N-body systems with up to N~10^7 particles. Our code is based on
the Hénon Monte Carlo method for solving the Fokker-Planck equation, and
makes assumptions of spherical symmetry and dynamical equilibrium. The
principal algorithmic developments involve optimizing data structures, and the
introduction of a parallel random number generation scheme, as well as a
parallel sorting algorithm, required to find nearest neighbors for interactions
and to compute the gravitational potential. The new algorithms we introduce
along with our choice of decomposition scheme minimize communication costs and
ensure optimal distribution of data and workload among the processing units.
The implementation uses the Message Passing Interface (MPI) library for
communication, which makes it portable to many different supercomputing
architectures. We validate the code by calculating the evolution of clusters
with initial Plummer distribution functions up to core collapse with the number
of stars, N, spanning three orders of magnitude, from 10^5 to 10^7. We find
that our results are in good agreement with self-similar core-collapse
solutions, and the core collapse times generally agree with expectations from
the literature. Also, we observe good total energy conservation, within less
than 0.04% throughout all simulations. We analyze the performance of the code,
and demonstrate near-linear scaling of the runtime with the number of
processors up to 64 processors for N=10^5, 128 for N=10^6 and 256 for N=10^7.
The runtime saturates when more processors are added beyond these limits,
which is characteristic of the parallel sorting algorithm. The
resulting maximum speedups we achieve are approximately 60x, 100x, and 220x,
respectively.
Comment: 53 pages, 13 figures, accepted for publication in ApJ Supplement
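As a rough reading of the scaling figures above, parallel efficiency can be computed as speedup divided by processor count. The sketch below assumes, for illustration only, that each quoted maximum speedup was reached at the corresponding processor limit (64, 128, 256); the abstract does not state this explicitly, and the speedups themselves are approximate.

```python
def parallel_efficiency(speedup, processors):
    """Efficiency = speedup / number of processors (1.0 is ideal linear scaling)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


if __name__ == "__main__":
    # (N, processors, approximate speedup) taken from the abstract; pairing
    # each speedup with its processor limit is an assumption of this sketch.
    runs = [("10^5", 64, 60.0), ("10^6", 128, 100.0), ("10^7", 256, 220.0)]
    for n, procs, speedup in runs:
        eff = parallel_efficiency(speedup, procs)
        print(f"N={n}: {speedup:.0f}x on {procs} processors, efficiency {eff:.2f}")
```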