66 research outputs found

    MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface

    Full text link
    Application development for distributed computing "Grids" can benefit from tools that variously hide or enable application-level management of critical aspects of the heterogeneous environment. As part of an investigation of these issues, we have developed MPICH-G2, a Grid-enabled implementation of the Message Passing Interface (MPI) that allows a user to run MPI programs across multiple computers, at the same or different sites, using the same commands that would be used on a parallel computer. This library extends the Argonne MPICH implementation of MPI to use services provided by the Globus Toolkit for authentication, authorization, resource allocation, executable staging, and I/O, as well as for process creation, monitoring, and control. Various performance-critical operations, including startup and collective operations, are configured to exploit network topology information. The library also exploits MPI constructs for performance management; for example, the MPI communicator construct is used for application-level discovery of, and adaptation to, both network topology and network quality-of-service mechanisms. We describe the MPICH-G2 design and implementation, present performance results, and review application experiences, including record-setting distributed simulations.Comment: 20 pages, 8 figure

    Resolution strategies for serverless computing in information centric networking

    Get PDF
    Named Function Networking (NFN) offers to compute and deliver results of computations in the context of Information Centric Networking (ICN). While ICN offers data delivery without specifying the location where these data are stored, NFN offers the production of results without specifying where the actual computation is executed. In NFN, computation workflows are encoded in (ICN style) Interest Messages using the lambda calculus and based on these workflows, the network will distribute computations and find execution locations. Depending on the use case of the actual network, the decision where to execute a compuation can be different: A resolution strategy running on each node decides if a computation should be forwarded, split into subcomputations or executed locally. This work focuses on the design of resolution strategies for selected scenarios and the online derivation of "execution plans" based on network status and history. Starting with a simple resolution strategy suitable for data centers, we focus on improving load distribution within the data center or even between multiple data centers. We have designed resolution strategies that consider the size of input data and the load on nodes, leading to priced execution plans from which one can select the ones with the least costs. Moreover, we use these plans to create execution templates: Templates can be used to create a resolution strategy by simulating the execution using the planning system, tailored to the specific use case at hand. Finally we designed a resolution strategy for edge computing which is able to handle mobile scenarios typical for vehicular networking. This “mobile edge computing resolution strategy” handles the problem of frequent handovers to a sequence of road-side units without creating additional overhead for the non-mobile use case. All these resolution strategies were evaluated using a simulation system and were compared to the state of the art behavior of data center execution environments and/or cloud configurations. In the case of the vehicular networking strategy, we enhanced existing road-side units and implemented our NFN-based system and plan derivation such that we were able to run and validate our solution in real world tests for mobile edge computing

    An Answer to the Bose-Nelson Sorting Problem for 11 and 12 Channels

    Full text link
    We show that 11-channel sorting networks have at least 35 comparators and that 12-channel sorting networks have at least 39 comparators. This positively settles the optimality of the corresponding sorting networks given in The Art of Computer Programming vol. 3 and closes the two smallest open instances of the Bose-Nelson sorting problem. We obtain these bounds by generalizing a result of Van Voorhis from sorting networks to a more general class of comparator networks. From this we derive a dynamic programming algorithm that computes the optimal size for a sorting network with a given number of channels. From an execution of this algorithm we construct a certificate containing a derivation of the corresponding lower size bound, which we check using a program formally verified using the Isabelle/HOL proof assistant.Comment: Revised attribution of previous results in the introductio

    Cilk : efficient multithreaded computing

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998.Includes bibliographical references (p. 170-179).by Keith H. Randall.Ph.D

    The Cilk system for parallel multithreaded computing

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.Includes bibliographical references (p. 187-199).by Christopher F. Joerg.Ph.D

    Portable high-performance programs

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.Includes bibliographical references (p. 159-169).by Matteo Frigo.Ph.D

    Provably good data-race detector that runs in parallel

    Get PDF
    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.Includes bibliographical references (leaves 47-49).This thesis describes the implementation of a provably good data-race detector, called the Nondeterminator-3, which runs efficiently in parallel. A data race occurs in a multithreaded program when two logically parallel threads access the same location while holding no common locks and at least one of the accesses is a write. The Nondeterminator-3 checks for data races in programs coded in Cilk [3, 10], a shared-memory multithreaded programming language. A key capability of data-race detectors is in determining the series-parallel (SP) relationship between two threads. The Nondeterminator-3 is based on a provably good parallel SP-maintenance algorithm known as SP-hybrid [2]. For a program with n threads, T1 work, and critical-path length To, the SP-hybrid algorithm runs in O((T1/P + PTO) lg n) expected time when executed on P processors. A data-race detector must also maintain an access-history, which consists of, for each shared memory location, a representative subset of memory accesses to that location. The Nondeterminator-3 uses an extension of the ALL-SETS [4] access-history algorithm used by its serially running predecessor, the Nondeterminator-2. First, the ALL-SETS algorithm was extended to correctly support the inlet feature of Cilk.(cont.) This extension increases the memory-access cost by only a constant factor. Then, this extended ALL-SETS algorithm was parallelized, so that it can be combined with the SP-hybrid algorithm to obtain a data-race detector. Assuming that the cost of locking the access-history can be ignored, this parallelization also inflates the memory-access cost by only a constant factor. I tested the Nondeterminator-3 on several programs to verify the accuracy of the implementation. I have also observed that the Nondeterminator-3 achieves good speed-up when run on a multiprocessor machine.by Tushara C. Karunaratna.M.Eng

    Scaling up algorithmic debugging with virtual execution trees

    Full text link
    Declarative debugging is a powerful debugging technique that has been adapted to practically all programming languages. However, the technique suffers from important scalability problems in both time and memory. With realistic programs the huge size of the execution tree handled makes the debugging session impractical and too slow to be productive. In this work, we present a new architecture for declarative debuggers in which we adapt the technique to work with incomplete execution trees. This allows us to avoid the problem of loading the whole execution tree in main memory and solve the memory scalability problems. We also provide the technique with the ability to debug execution trees that are only partially generated. This allows the programmer to start the debugging session even before the execution tree is computed. This solves the time scalability problems. We have implemented the technique and show its practicality with several experiments conducted with real applications.Insa Cabrera, D.; Silva Galiana, JF. (2011). Scaling up algorithmic debugging with virtual execution trees. En Logic-Based Program Synthesis and Transformation. Springer Verlag (Germany). 6564:149-163. doi:10.1007/978-3-642-20551-4_10S1491636564Av-Ron, E.: Top-Down Diagnosis of Prolog Programs. PhD thesis, Weizmanm Institute (1984)Binks, D.: Declarative Debugging in Gödel. PhD thesis, University of Bristol (1995)Caballero, R.: A Declarative Debugger of Incorrect Answers for Constraint Functional-Logic Programs. In: Proc. of the 2005 ACM SIGPLAN Workshop on Curry and Functional Logic Programming (WCFLP 2005), pp. 8–13. ACM Press, New York (2005)Caballero, R.: Algorithmic Debugging of Java Programs. In: Proc. of the 2006 Workshop on Functional Logic Programming (WFLP 2006). Electronic Notes in Theoretical Computer Science, pp. 63–76 (2006)Caballero, R., Martí-Oliet, N., Riesco, A., Verdejo, A.: A declarative debugger for maude functional modules. Electronic Notes Theoretical Computer Science 238(3), 63–81 (2009)Davie, T., Chitil, O.: Hat-delta: One Right Does Make a Wrong. In: Seventh Symposium on Trends in Functional Programming, TFP 2006 (April 2006)Girgis, H., Jayaraman, B.: JavaDD: a Declarative Debugger for Java. Technical Report 2006-07, University at Buffalo (March 2006)Kokai, G., Nilson, J., Niss, C.: GIDTS: A Graphical Programming Environment for Prolog. In: Workshop on Program Analysis For Software Tools and Engineering (PASTE 1999), pp. 95–104. ACM Press, New York (1999)MacLarty, I.: Practical Declarative Debugging of Mercury Programs. PhD thesis, Department of Computer Science and Software Engineering, The University of Melbourne (2005)Sun Microsystems. Java Platform Debugger Architecture - JPDA (2010), http://java.sun.com/javase/technologies/core/toolsapis/jpda/Nilsson, H., Fritzson, P.: Algorithmic Debugging for Lazy Functional Languages. Journal of Functional Programming 4(3), 337–370 (1994)Shapiro, E.Y.: Algorithmic Program Debugging. MIT Press, Cambridge (1982)Silva, J.: An Empirical Evaluation of Algorithmic Debugging Strategies. Technical Report DSIC-II/10/09, UPV (2009), http://www.dsic.upv.es/~jsilva/research.htm#techsSilva, J.: Algorithmic debugging strategies. In: Proc. of International Symposium on Logic-based Program Synthesis and Transformation (LOPSTR 2006), pp. 134–140 (2006)Silva, J.: A Comparative Study of Algorithmic Debugging Strategies. In: Puebla, G. (ed.) LOPSTR 2006. LNCS, vol. 4407, pp. 143–159. Springer, Heidelberg (2007
    • …
    corecore