    Mapping of portable parallel programs

    An efficient parallel program designed for a parallel architecture includes a detailed outline of accurate assignments of concurrent computations onto processors, and data transfers onto communication links, such that the overall execution time is minimized. This process may be complex depending on the application task and the target multiprocessor architecture. Furthermore, this process must be repeated for every different architecture even though the application task may be the same. Consequently, this has a major impact on the ever-increasing cost of software development for multiprocessor systems. A remedy for this problem would be to design portable parallel programs which can be mapped efficiently onto any computer system. In this dissertation, we present a portable programming tool called Cluster-M. The three components of Cluster-M are the Specification Module, the Representation Module, and the Mapping Module. In the Specification Module, for a given problem, a machine-independent program is generated and represented in the form of a clustered task graph called a Spec graph. Similarly, in the Representation Module, for a given architecture or heterogeneous suite of computers, a clustered system graph called a Rep graph is generated. The Mapping Module is responsible for efficient mapping of Spec graphs onto Rep graphs. As part of this module, we present the first algorithm which produces a near-optimal mapping of an arbitrary non-uniform machine-independent task graph with M modules onto an arbitrary non-uniform task-independent system graph having N processors, in O(MP) time, where P = max(M, N). Our experimental results indicate that Cluster-M produces better or similar mapping results compared to other leading techniques which work only for restricted task or system graphs.

    Termination Detection of Local Computations

    Contrary to the sequential world, the processes involved in a distributed system do not necessarily know when a computation is globally finished. This paper investigates the problem of the detection of the termination of local computations. We define four types of termination detection: no detection, detection of the local termination, detection by a distributed observer, and detection of the global termination. We give a complete characterisation (except in the local termination detection case, where a partial one is given) for each of these types of termination detection and show that they define a strict hierarchy. These results emphasise the difference between computability of a distributed task and termination detection. Furthermore, these characterisations encompass all standard criteria that are usually formulated: topological restriction (trees, rings, or triangulated networks, ...), topological knowledge (size, diameter, ...), and local knowledge to distinguish nodes (identities, sense of direction). These results are now presented as corollaries of generalising theorems. As a very special and important case, the techniques are also applied to the election problem. Though given in the model of local computations, these results can give qualitative insight for similar results in other standard models. The necessary conditions involve graph coverings and quasi-coverings; the sufficient conditions (constructive local computations) are based upon an enumeration algorithm of Mazurkiewicz and a stable properties detection algorithm of Szymanski, Shi and Prywes.

    Theory and design of portable parallel programs for heterogeneous computing systems and networks

    A recurring problem with high-performance computing is that advanced architectures generally achieve only a small fraction of their peak performance on many portions of real application sets. The Amdahl's law corollary of this is that such architectures often spend most of their time on tasks (codes/algorithms and the data sets upon which they operate) for which they are unsuited. Heterogeneous Computing (HC) is needed in the mid-90's and beyond due to ever-increasing super-speed requirements and the number of projects with these requirements. HC is defined as a special form of parallel and distributed computing that performs computations using a single autonomous computer operating in both SIMD and MIMD modes, or using a number of connected autonomous computers. Physical implementation of a heterogeneous network or system is currently possible due to the existing technological advances in networking and supercomputing. Unfortunately, software solutions for heterogeneous computing are still in their infancy. Theoretical models, software tools, and intelligent resource-management schemes need to be developed to support heterogeneous computing efficiently. In this thesis, we present a heterogeneous model of computation which encapsulates all the essential parameters for designing efficient software and hardware for HC. We also study a portable parallel programming tool, called Cluster-M, which implements this model. Furthermore, we study and analyze the hardware and software requirements of HC and show that Cluster-M satisfies the requirements of HC environments.

    Arbitrary Dimensional Majorana Dualities and Network Architectures for Topological Matter

    Motivated by the prospect of attaining Majorana modes at the ends of nanowires, we analyze interacting Majorana systems on general networks and lattices in an arbitrary number of dimensions, and derive various universal spin duals. Such general complex Majorana architectures (other than those of simple square or other crystalline arrangements) might be of empirical relevance. As these systems display low-dimensional symmetries, they are candidates for realizing topological quantum order. We prove that (a) these Majorana systems, (b) quantum Ising gauge theories, and (c) transverse-field Ising models with annealed bimodal disorder are all dual to one another on general graphs. As any Dirac fermion (including electronic) operator can be expressed as a linear combination of two Majorana fermion operators, our results further lead to dualities between interacting Dirac fermionic systems. The spin duals allow us to predict the feasibility of various standard transitions as well as spin-glass type behavior in interacting Majorana fermion or electronic systems. Several new systems that can be simulated by arrays of Majorana wires are further introduced and investigated: (1) the XXZ honeycomb compass model (intermediate between the classical Ising model on the honeycomb lattice and Kitaev's honeycomb model), (2) a checkerboard lattice realization of the model of Xu and Moore for superconducting (p+ip) arrays, and (3) a compass-type two-flavor Hubbard model with both pairing and hopping terms. By the use of dualities, we show that all of these systems lie in the 3D Ising universality class. We discuss how the existence of topological orders and bounds on autocorrelation times can be inferred by the use of symmetries and also propose to engineer quantum simulators out of these Majorana networks. Comment: v3, 19 pages, 18 figures, submitted to Physical Review B. 11 new figures, new section on simulating the Hubbard model with nanowire systems, and two new appendices

    Learning by stochastic serializations

    Complex structures are typical in machine learning. Tailoring learning algorithms for every structure requires an effort that may be saved by defining a generic learning procedure adaptive to any complex structure. In this paper, we propose to map any complex structure onto a generic form, called serialization, over which we can apply any sequence-based density estimator. We then show how to transfer the learned density back onto the space of original structures. To expose the learning procedure to the structural particularities of the original structures, we take care that the serializations accurately reflect the structures' properties. Enumerating all serializations is infeasible. We propose an effective way to sample representative serializations from the complete set of serializations which preserves the statistics of the complete set. Our method is competitive with or better than state-of-the-art learning algorithms that have been specifically designed for given structures. In addition, since the serialization involves sampling from a combinatorial process, it provides considerable protection from overfitting, which we clearly demonstrate on a number of experiments. Comment: Submission to NeurIPS 201

    Construction of and efficient sampling from the simplicial configuration model

    Simplicial complexes are now a popular alternative to networks when it comes to describing the structure of complex systems, primarily because they encode multi-node interactions explicitly. With this new description comes the need for principled null models that allow for easy comparison with empirical data. We propose a natural candidate, the simplicial configuration model. The core of our contribution is an efficient and uniform Markov chain Monte Carlo sampler for this model. We demonstrate its usefulness in a short case study by investigating the topology of three real systems and their randomized counterparts (using their Betti numbers). For two out of three systems, the model allows us to reject the hypothesis that there is no organization beyond the local scale. Comment: 6 pages, 4 figures