Search CORE

2 research outputs found

An adaptive prefix-assignment technique for symmetry reduction

Author: A Itzhakov
A Kerber
A Kerber
BD McKay
BD McKay
BD McKay
G Audemard
G Chu
J Devriendt
J Håstad
JD Dixon
JD Laderman
JE Hopcroft
JF Humphreys
JS Leon
M Bläser
M Bläser
M Codish
M Hall Jr
MJH Heule
P Kaski
RC Read
S Winograd
V Strassen
VB Alekseev
VB Alekseev
VB Alekseyev
Á Seress
Publication venue
Publication date: 08/09/2018
Field of study

This paper presents a technique for symmetry reduction that adaptively assigns a prefix of variables in a system of constraints so that the generated prefix-assignments are pairwise nonisomorphic under the action of the symmetry group of the system. The technique is based on McKay's canonical extension framework [J.~Algorithms 26 (1998), no.~2, 306--324]. Among key features of the technique are (i) adaptability---the prefix sequence can be user-prescribed and truncated for compatibility with the group of symmetries; (ii) parallelizability---prefix-assignments can be processed in parallel independently of each other; (iii) versatility---the method is applicable whenever the group of symmetries can be concisely represented as the automorphism group of a vertex-colored graph; and (iv) implementability---the method can be implemented relying on a canonical labeling map for vertex-colored graphs as the only nontrivial subroutine. To demonstrate the practical applicability of our technique, we have prepared an experimental open-source implementation of the technique and carry out a set of experiments that demonstrate ability to reduce symmetry on hard instances. Furthermore, we demonstrate that the implementation effectively parallelizes to compute clusters with multiple nodes via a message-passing interface.Comment: Updated manuscript submitted for revie

arXiv.org e-Print Archive

Crossref

Hierarchical Scheduling for Multicores with Multilevel Cache Hierarchies

Author: Agrawal Kunal
Sukha Jim
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2011
Field of study

Cache-locality is an important consideration for the performance in multicore systems. In modern and future multicore systems with multilevel cache hierarchies, caches may be arranged in a tree of caches, where a level k cache is shared between Pk processors, called a processor group, and Pk increases with k. In order to get good performance, as much as possible, subcomputations that share more data should execute on processors which share a lower-level cache. Therefore, the number of cache misses in these systems depends on the scheduling decisions, and a scheduler is responsible for not just achieving good load-balance and low overheads, but also good cache complexity. However, these can be competing criteria. In this paper, we explore the tension between these criteria for online hierarchical schedulers. Formally, we consider a system with P processors, arranged in a multilevel hierarchy according to a hierarchy tree, where each of the P processors forms a leaf of the tree, and an internal node at level-k corresponds corresponds to a processor group. In addition, we assume that computations have locality regions, that represent parallel subcomputations that share data. Each locality region has a particular level, and the scheduler must ensure that a level-k locality region is executed by processors in the same level-k processor group, since they share a level k cache. Thus locality regions can improve cache performance. However, they may also impair load-balance and increase scheduling overheads since the scheduler must obey the restrictions posed by locality regions. In this paper, we present a framework of hierarchical computations, that is, computations with locality regions at multiple levels of nesting. We describe the hierarchical greedy scheduler, where each locality region is scheduled using a greedy scheduler which attempts to use as many processors as possible while obeying the restrictions posed by the locality regions. We derive a recurrence for the time complexity for a region in terms of its nested regions. We also describe how a more realistic hierarchical work-stealing scheduler can get the same bounds apart from constant factors for an important subclass of computations called homogenous computations. Finally, we also analyze the cache complexity of the hierarchical work-stealing scheduler for a system with a multilevel cache hierarchy

Washington University St. Louis: Open Scholarship