
    Learning Scheduling Algorithms for Data Processing Clusters

    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.
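
    For intuition only, here is a minimal Python sketch of the general idea of learning a scheduler by policy-gradient RL: a linear softmax policy scores the ready job stages from a few hand-crafted features and is updated by REINFORCE. The features, reward signal, and toy episode are assumptions for illustration; Decima's actual graph neural network and training scheme are not reproduced here.

    import numpy as np

    rng = np.random.default_rng(0)
    NUM_FEATURES = 3                      # e.g. remaining work, task count, waiting time (assumed features)
    theta = np.zeros(NUM_FEATURES)        # weights of a linear softmax scheduling policy

    def softmax(scores):
        z = scores - scores.max()
        e = np.exp(z)
        return e / e.sum()

    def choose_stage(features):
        """features: (num_ready_stages, NUM_FEATURES); returns chosen index and grad of log-prob."""
        probs = softmax(features @ theta)
        idx = rng.choice(len(probs), p=probs)
        grad = features[idx] - probs @ features   # gradient of log pi(idx) for a linear softmax policy
        return idx, grad

    def reinforce_update(episode, lr=1e-3):
        """episode: list of (grad_log_prob, reward); reward could be a negative completion-time penalty."""
        global theta
        rewards = np.array([r for _, r in episode], dtype=float)
        returns = np.cumsum(rewards[::-1])[::-1]  # reward-to-go
        baseline = returns.mean()                 # crude variance reduction
        for (grad, _), ret in zip(episode, returns):
            theta += lr * (ret - baseline) * grad

    # Tiny synthetic episode: 4 ready stages per step, random features, assumed rewards.
    episode = []
    for _ in range(10):
        feats = rng.normal(size=(4, NUM_FEATURES))
        _, grad = choose_stage(feats)
        episode.append((grad, -rng.uniform(1.0, 5.0)))
    reinforce_update(episode)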

    Robust Estimators are Hard to Compute

    In modern statistics, the robust estimation of parameters of a regression hyperplane is a central problem. Robustness means that the estimation is not affected, or only slightly affected, by outliers in the data. In this paper, it is shown that the following robust estimators are hard to compute: LMS, LQS, LTS, LTA, MCD, MVE, Constrained M estimator, Projection Depth (PD) and Stahel-Donoho. In addition, a data set is presented such that the ltsReg-procedure of R has probability less than 0.0001 of finding a correct answer. Furthermore, it is described how new robust estimators can be designed. Keywords: computational statistics, complexity theory, robust statistics, algorithms, search heuristics
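
    To see why exact computation of such estimators is expensive, here is a small Python sketch of Least Trimmed Squares (LTS) solved by exhaustive enumeration of h-point subsets; the toy data and variable names are assumptions, and the approach is exponential in the number of observations.

    from itertools import combinations
    import numpy as np

    def exact_lts(X, y, h):
        """Exact LTS by trying every h-subset: minimizes the sum of the h smallest
        squared residuals. The number of subsets grows combinatorially with n,
        so this is feasible only for toy data."""
        n = len(y)
        best_score, best_beta = np.inf, None
        for subset in combinations(range(n), h):
            idx = list(subset)
            beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)  # OLS fit on the subset
            resid_sq = (y - X @ beta) ** 2
            score = np.sort(resid_sq)[:h].sum()                     # trimmed sum of squares
            if score < best_score:
                best_score, best_beta = score, beta
        return best_score, best_beta

    # Toy usage: a straight line with two gross outliers (assumed data).
    X = np.column_stack([np.ones(8), np.arange(8.0)])
    y = 2.0 + 3.0 * np.arange(8.0)
    y[[1, 5]] += 50.0
    score, beta = exact_lts(X, y, h=6)   # beta should recover roughly (2, 3)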

    Heuristic standards for universal design in the face of technological diversity.

    CENTRAL PRINCIPLE Important technologies require validated standards for the design heuristics used to design and evaluate them, but not necessarily identical heuristics for every technology. BACKGROUND Heuristic standards provide a valuable toolkit with which to evaluate the accessibility of modern information society technologies (IST). But can we apply the same generic heuristic standards to all types of technological platforms, in the face of their growing diversity, e.g. websites, social websites, blogs, virtual reality applications, ambient intelligence, etc. (Adams, 2007)? Or would it be wiser to expect that different technologies might require different, if overlapping, standards? Can we really expect to design the interface of a modern cell phone on the same basis as that of a tablet computer? Most impartial observers would probably say “no”. How can we introduce a systematic and thorough approach to the diverse technologies already in use or predicted to appear? Work in our laboratory has explored two useful questions. First, how do computer-literate users perceive the different technologies? Second, how can different heuristic standards be developed where needed?

    An Atypical Survey of Typical-Case Heuristic Algorithms

    Heuristic approaches often do so well that they seem to pretty much always give the right answer. How close can heuristic algorithms get to always giving the right answer, without inducing seismic complexity-theoretic consequences? This article first discusses how a series of results by Berman, Buhrman, Hartmanis, Homer, Longpré, Ogiwara, Schöning, and Watanabe, from the early 1970s through the early 1990s, explicitly or implicitly limited how well heuristic algorithms can do on NP-hard problems. In particular, many desirable levels of heuristic success cannot be obtained unless severe, highly unlikely complexity class collapses occur. Second, we survey work initiated by Goldreich and Wigderson, who showed how under plausible assumptions deterministic heuristics for randomized computation can achieve a very high frequency of correctness. Finally, we consider formal ways in which theory can help explain the effectiveness of heuristics that solve NP-hard problems in practice.
    Comment: This article is currently scheduled to appear in the December 2012 issue of SIGACT News.
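
    As a purely illustrative aside (not from the survey), the notion of a heuristic's "frequency of correctness" can be made concrete empirically: the Python sketch below counts how often a greedy heuristic for maximum independent set, an NP-hard problem, happens to return an exactly optimal answer on tiny random graphs. Graph size, edge probability, and trial count are arbitrary assumptions.

    from itertools import combinations
    import random

    def greedy_independent_set(n, edges):
        """Pick lowest-degree vertices first, skipping neighbours of already chosen ones."""
        adj = {v: set() for v in range(n)}
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
        chosen, blocked = set(), set()
        for v in sorted(range(n), key=lambda v: len(adj[v])):
            if v not in blocked:
                chosen.add(v)
                blocked |= adj[v] | {v}
        return len(chosen)

    def optimal_independent_set(n, edges):
        """Exact answer by brute force; feasible only because n is tiny."""
        edge_set = set(map(frozenset, edges))
        for size in range(n, 0, -1):
            for subset in combinations(range(n), size):
                if not any(frozenset(p) in edge_set for p in combinations(subset, 2)):
                    return size
        return 0

    random.seed(0)
    n, trials, correct = 10, 200, 0
    for _ in range(trials):
        edges = [(u, v) for u, v in combinations(range(n), 2) if random.random() < 0.3]
        if greedy_independent_set(n, edges) == optimal_independent_set(n, edges):
            correct += 1
    print(f"greedy exactly optimal on {correct}/{trials} random instances")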

    Sequential Parameter Optimization

    We provide a comprehensive, effective, and very efficient methodology for the design and experimental analysis of algorithms. We rely on modern statistical techniques for tuning and understanding algorithms from an experimental perspective. To this end, we make use of the sequential parameter optimization (SPO) method, which has been successfully applied as a tuning procedure to numerous heuristics for practical and theoretical optimization problems. Two case studies, which illustrate the applicability of SPO to algorithm tuning and model selection, are presented.
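
    For a concrete picture of sequential model-based tuning in the spirit of SPO, here is a minimal Python sketch: evaluate an initial design, fit a surrogate model, then repeatedly evaluate the candidate the model rates best and re-fit. The target function, parameter range, surrogate choice, and acquisition rule are assumptions for illustration, not the SPO procedure as specified by the authors.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    rng = np.random.default_rng(1)

    def run_heuristic(param):
        """Stand-in for an expensive, noisy algorithm run (e.g. mean cost over a few seeds)."""
        return (param - 0.3) ** 2 + rng.normal(scale=0.01)

    X = rng.uniform(0, 1, size=(5, 1))                 # initial design of parameter settings
    y = np.array([run_heuristic(x[0]) for x in X])

    for _ in range(20):
        model = GaussianProcessRegressor().fit(X, y)   # surrogate of cost vs. parameter
        cand = rng.uniform(0, 1, size=(200, 1))        # random candidate settings
        mu, sigma = model.predict(cand, return_std=True)
        pick = cand[np.argmin(mu - sigma)]             # optimistic choice for minimization
        X = np.vstack([X, pick])
        y = np.append(y, run_heuristic(pick[0]))

    best = X[np.argmin(y)]                             # best observed parameter setting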

    Ant Colony Heuristic for Mapping and Scheduling Tasks and Communications on Heterogeneous Embedded Systems

    To exploit the power of modern heterogeneous multiprocessor embedded platforms on partitioned applications, the designer usually needs to efficiently map and schedule all the tasks and the communications of the application, respecting the constraints imposed by the target architecture. Since the problem is heavily constrained, common methods used to explore such a design space usually fail, obtaining low-quality solutions. In this paper, we propose an ant colony optimization (ACO) heuristic that, given a model of the target architecture and the application, efficiently performs both scheduling and mapping to optimize the application performance. We compare our approach with several other heuristics, including simulated annealing, tabu search, and genetic algorithms, in terms of both how quickly they reach the optimum value and how well they explore the design space. We show that our approach obtains results at least 16% better on average than the other heuristics, despite an overhead in execution time. Finally, we validate the approach by scheduling and mapping a JPEG encoder on a realistic target architecture.
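
    To make the mechanics concrete, here is a minimal Python sketch of an ant colony heuristic for mapping tasks to processors; it ignores communications, precedence constraints, and the paper's architecture model, and the cost matrix and parameters below are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    NUM_TASKS, NUM_PROCS = 8, 3
    exec_time = rng.uniform(1, 10, size=(NUM_TASKS, NUM_PROCS))   # assumed task-on-processor costs
    tau = np.ones((NUM_TASKS, NUM_PROCS))                         # pheromone trails
    ALPHA, BETA, RHO, ANTS, ITERS = 1.0, 2.0, 0.1, 10, 50

    def makespan(mapping):
        """Simplified objective: load of the most loaded processor."""
        loads = np.zeros(NUM_PROCS)
        for t, p in enumerate(mapping):
            loads[p] += exec_time[t, p]
        return loads.max()

    best_map, best_cost = None, np.inf
    for _ in range(ITERS):
        for _ in range(ANTS):
            mapping = []
            for t in range(NUM_TASKS):
                desirability = tau[t] ** ALPHA * (1.0 / exec_time[t]) ** BETA
                probs = desirability / desirability.sum()
                mapping.append(rng.choice(NUM_PROCS, p=probs))
            cost = makespan(mapping)
            if cost < best_cost:
                best_map, best_cost = mapping, cost
        tau *= (1.0 - RHO)                     # evaporation
        for t, p in enumerate(best_map):       # reinforce the best-so-far mapping
            tau[t, p] += 1.0 / best_cost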