Learning Scheduling Algorithms for Data Processing Clusters
Efficiently scheduling data processing jobs on distributed compute clusters
requires complex algorithms. Current systems, however, use simple generalized
heuristics and ignore workload characteristics, since developing and tuning a
scheduling policy for each workload is infeasible. In this paper, we show that
modern machine learning techniques can generate highly-efficient policies
automatically. Decima uses reinforcement learning (RL) and neural networks to
learn workload-specific scheduling algorithms without any human instruction
beyond a high-level objective such as minimizing average job completion time.
Off-the-shelf RL techniques, however, cannot handle the complexity and scale of
the scheduling problem. To build Decima, we had to develop new representations
for jobs' dependency graphs, design scalable RL models, and invent RL training
methods for dealing with continuous stochastic job arrivals. Our prototype
integration with Spark on a 25-node cluster shows that Decima improves the
average job completion time over hand-tuned scheduling heuristics by at least
21%, achieving up to 2x improvement during periods of high cluster load.
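To make the objective named in this abstract concrete, the sketch below computes average job completion time for two orderings of the same jobs on a single executor. It is not Decima's method and involves no learning; the job durations and the function name `average_completion_time` are hypothetical, chosen only to show why the scheduling order changes the metric.

```python
from itertools import accumulate

def average_completion_time(durations):
    """Mean completion time when jobs run back to back on one executor."""
    # Each job finishes at the running sum of all durations up to and including it.
    finish_times = list(accumulate(durations))
    return sum(finish_times) / len(finish_times)

jobs = [8, 1, 3]  # hypothetical job durations, arbitrary time units

fifo = average_completion_time(jobs)          # run in arrival order
sjf = average_completion_time(sorted(jobs))   # shortest job first

print(f"FIFO average completion time: {fifo:.2f}")
print(f"SJF average completion time:  {sjf:.2f}")
```

Running the shortest jobs first lowers the average here because fewer jobs wait behind the long one. Real cluster workloads add dependency graphs, parallelism, and continuous arrivals, which this toy setting ignores.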
Robust Estimators are Hard to Compute
In modern statistics, the robust estimation of the parameters of a regression hyperplane is a central problem. Robustness means that the estimate is not affected, or only slightly affected, by outliers in the data. This paper shows that the following robust estimators are hard to compute: LMS, LQS, LTS, LTA, MCD, MVE, Constrained M estimator, Projection Depth (PD) and Stahel-Donoho. In addition, a data set is presented for which the ltsReg procedure of R has probability less than 0.0001 of finding a correct answer. Furthermore, the paper describes how to design new robust estimators. Keywords: computational statistics, complexity theory, robust statistics, algorithms, search heuristics
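As a rough illustration of why estimators such as LTS involve a subset search, the sketch below computes an exact least trimmed squares estimate for the simplest case, a one-dimensional location parameter, by trying every subset of size h. This is an assumption-laden toy, not the paper's construction and not the ltsReg procedure; the data and the function name `lts_location` are hypothetical. In one dimension a scan over the sorted values would suffice, so the exhaustive form is shown only because it mirrors the search that becomes expensive for regression hyperplanes.

```python
from itertools import combinations

def lts_location(values, h):
    """Exact LTS location estimate by exhaustive search over h-subsets.

    For each subset of size h, the best location is the subset mean;
    the estimate is the mean of the subset with the smallest sum of
    squared deviations.
    """
    best_cost, best_mean = float("inf"), None
    for subset in combinations(values, h):
        mean = sum(subset) / h
        cost = sum((x - mean) ** 2 for x in subset)
        if cost < best_cost:
            best_cost, best_mean = cost, mean
    return best_mean

data = [1.0, 1.1, 0.9, 1.2, 50.0]  # hypothetical sample with one outlier

robust = lts_location(data, h=4)   # ignores the single worst-fitting point
plain = sum(data) / len(data)      # ordinary mean, pulled toward the outlier

print(f"LTS location: {robust:.2f}, ordinary mean: {plain:.2f}")
```

The number of subsets examined grows as the binomial coefficient of n and h, which is why practical implementations rely on search heuristics rather than enumeration.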
Heuristic standards for universal design in the face of technological diversity.
CENTRAL PRINCIPLE
Important technologies require validated standards for the design heuristics that are used to design and evaluate them, but not necessarily identical heuristics for every technology.
BACKGROUND
Heuristic standards provide a valuable toolkit with which to evaluate the accessibility of modern information society technologies (IST). But can we apply the same heuristic, generic standards to all types of technological platforms, in the face of their growing diversity e.g. websites, social websites, blogs, virtual reality applications, ambient intelligence etc (Adams, 2007)? Or would it be wiser to expect that different technologies might require different, if overlapping, standards? Can we really expect to design the interface of a modern cell phone on the same basis as for a table computer? Most impartial observers would probably say “no”.
How can we introduce a systematic and thorough approach to the diverse technologies that are seen or predicted to be seen? Work in our laboratory has explored two useful questions. First, how do computer-literate users perceive the different technologies? Second, how can different heuristic standards be developed where needed?
An Atypical Survey of Typical-Case Heuristic Algorithms
Heuristic approaches often do so well that they seem to pretty much always
give the right answer. How close can heuristic algorithms get to always giving
the right answer, without inducing seismic complexity-theoretic consequences?
This article first discusses how a series of results by Berman, Buhrman,
Hartmanis, Homer, Longpré, Ogiwara, Schöning, and Watanabe, from the
early 1970s through the early 1990s, explicitly or implicitly limited how well
heuristic algorithms can do on NP-hard problems. In particular, many desirable
levels of heuristic success cannot be obtained unless severe, highly unlikely
complexity class collapses occur. Second, we survey work initiated by Goldreich
and Wigderson, who showed how under plausible assumptions deterministic
heuristics for randomized computation can achieve a very high frequency of
correctness. Finally, we consider formal ways in which theory can help explain
the effectiveness of heuristics that solve NP-hard problems in practice.
Comment: This article is currently scheduled to appear in the December 2012
issue of SIGACT New
Sequential Parameter Optimization
We provide a comprehensive, effective and very efficient methodology for the design and experimental analysis of algorithms.
We rely on modern statistical techniques for tuning and understanding algorithms
from an experimental perspective. Therefore, we make use of the sequential parameter optimization (SPO) method that has been successfully applied as a tuning procedure to numerous heuristics for practical and theoretical
optimization problems.
Two case studies, which illustrate the applicability of SPO to algorithm tuning
and model selection, are presented.
Ant Colony Heuristic for Mapping and Scheduling Tasks and Communications on Heterogeneous Embedded Systems
To exploit the power of modern heterogeneous multiprocessor embedded platforms on partitioned applications, the designer usually needs to efficiently map and schedule all the tasks and the communications of the application, respecting the constraints imposed by the target architecture. Since the problem is heavily constrained, common methods used to explore such a design space usually fail, obtaining low-quality solutions. In this paper, we propose an ant colony optimization (ACO) heuristic that, given a model of the target architecture and the application, efficiently executes both scheduling and mapping to optimize the application performance. We compare our approach with several other heuristics, including simulated annealing, tabu search, and genetic algorithms, on their ability to reach the optimum value and on their potential to explore the design space. We show that our approach obtains better results than other heuristics by at least 16% on average, despite an overhead in execution time. Finally, we validate the approach by scheduling and mapping a JPEG encoder on a realistic target architecture.