69,603 research outputs found

    Dynamic load balancing in parallel KD-tree k-means

    Get PDF
    One among the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques allow to reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set and a third solution incorporates a dynamic load balancing policy

    Multi-level algorithms for modularity clustering

    Full text link
    Modularity is one of the most widely used quality measures for graph clusterings. Maximizing modularity is NP-hard, and the runtime of exact algorithms is prohibitive for large graphs. A simple and effective class of heuristics coarsens the graph by iteratively merging clusters (starting from singletons), and optionally refines the resulting clustering by iteratively moving individual vertices between clusters. Several heuristics of this type have been proposed in the literature, but little is known about their relative performance. This paper experimentally compares existing and new coarsening- and refinement-based heuristics with respect to their effectiveness (achieved modularity) and efficiency (runtime). Concerning coarsening, it turns out that the most widely used criterion for merging clusters (modularity increase) is outperformed by other simple criteria, and that a recent algorithm by Schuetz and Caflisch is no improvement over simple greedy coarsening for these criteria. Concerning refinement, a new multi-level algorithm is shown to produce significantly better clusterings than conventional single-level algorithms. A comparison with published benchmark results and algorithm implementations shows that combinations of coarsening and multi-level refinement are competitive with the best algorithms in the literature.Comment: 12 pages, 10 figures, see http://www.informatik.tu-cottbus.de/~rrotta/ for downloading the graph clustering softwar

    Magic-State Functional Units: Mapping and Scheduling Multi-Level Distillation Circuits for Fault-Tolerant Quantum Architectures

    Full text link
    Quantum computers have recently made great strides and are on a long-term path towards useful fault-tolerant computation. A dominant overhead in fault-tolerant quantum computation is the production of high-fidelity encoded qubits, called magic states, which enable reliable error-corrected computation. We present the first detailed designs of hardware functional units that implement space-time optimized magic-state factories for surface code error-corrected machines. Interactions among distant qubits require surface code braids (physical pathways on chip) which must be routed. Magic-state factories are circuits comprised of a complex set of braids that is more difficult to route than quantum circuits considered in previous work [1]. This paper explores the impact of scheduling techniques, such as gate reordering and qubit renaming, and we propose two novel mapping techniques: braid repulsion and dipole moment braid rotation. We combine these techniques with graph partitioning and community detection algorithms, and further introduce a stitching algorithm for mapping subgraphs onto a physical machine. Our results show a factor of 5.64 reduction in space-time volume compared to the best-known previous designs for magic-state factories.Comment: 13 pages, 10 figure

    Parallel symbolic state-space exploration is difficult, but what is the alternative?

    Full text link
    State-space exploration is an essential step in many modeling and analysis problems. Its goal is to find the states reachable from the initial state of a discrete-state model described. The state space can used to answer important questions, e.g., "Is there a dead state?" and "Can N become negative?", or as a starting point for sophisticated investigations expressed in temporal logic. Unfortunately, the state space is often so large that ordinary explicit data structures and sequential algorithms cannot cope, prompting the exploration of (1) parallel approaches using multiple processors, from simple workstation networks to shared-memory supercomputers, to satisfy large memory and runtime requirements and (2) symbolic approaches using decision diagrams to encode the large structured sets and relations manipulated during state-space generation. Both approaches have merits and limitations. Parallel explicit state-space generation is challenging, but almost linear speedup can be achieved; however, the analysis is ultimately limited by the memory and processors available. Symbolic methods are a heuristic that can efficiently encode many, but not all, functions over a structured and exponentially large domain; here the pitfalls are subtler: their performance varies widely depending on the class of decision diagram chosen, the state variable order, and obscure algorithmic parameters. As symbolic approaches are often much more efficient than explicit ones for many practical models, we argue for the need to parallelize symbolic state-space generation algorithms, so that we can realize the advantage of both approaches. This is a challenging endeavor, as the most efficient symbolic algorithm, Saturation, is inherently sequential. We conclude by discussing challenges, efforts, and promising directions toward this goal
    • …
    corecore