174 research outputs found

    A countable widely connected Hausdorff space

    We construct a countable connected Hausdorff space in which every connected subset containing more than one point is dense. We prove that every regularly open-maximal topology of such a space also has this property and, in addition, admits no decomposition into two disjoint connected proper subsets, each containing more than one point.

    Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming

    This work focuses on compiler and run-time techniques for improving the productivity and the performance portability of general-purpose parallel programming. More specifically, we focus on shared-memory task-parallel languages, where the programmer explicitly exposes parallelism in the form of short tasks that may outnumber the cores by orders of magnitude. The compiler, the run-time, and the platform (henceforth the system) are responsible for harnessing this unpredictable amount of parallelism, which can vary from none to excessive, towards efficient execution. The challenge arises from the aspiration to support fine-grained irregular computations and nested parallelism. This work is more ambitious still, as it also aspires to lay the foundations for efficiently supporting declarative code, where the programmer exposes all available parallelism using high-level language constructs such as parallel loops, reducers, or futures. The appeal of declarative code is twofold for general-purpose programming: it is often easier for the programmer, who does not have to worry about the granularity of the exposed parallelism, and it achieves better performance portability by avoiding overfitting to the small range of platforms and inputs for which the programmer coarsens. Furthermore, PRAM algorithms, an important class of parallel algorithms, naturally lend themselves to declarative programming, so supporting it is a necessary condition for capitalizing on the wealth of PRAM theory. Unfortunately, declarative codes often expose such an overwhelming number of fine-grained tasks that existing systems fail to deliver performance.

Our contributions can be partitioned into three components. First, we tackle the issue of coarsening, which declarative code leaves to the system. We identify two goals of coarsening and advocate tackling them separately, using static compiler transformations for one and dynamic run-time approaches for the other.
Additionally, we present evidence that the current practice of burdening the programmer with coarsening either leads to codes with poor performance portability or to a significantly increased programming effort. This is a "show-stopper" for general-purpose programming. To compare the performance portability of the approaches, we define an experimental framework and two metrics, and we demonstrate that our approaches are preferable. We close the chapter on coarsening by presenting compiler transformations that automatically coarsen some types of very fine-grained codes.

Second, we propose Lazy Scheduling, an innovative run-time scheduling technique that infers the platform load at run-time, using information the run-time already maintains. Based on the inferred load, Lazy Scheduling adapts the amount of available parallelism it exposes for parallel execution and thus saves parallelism overheads that existing approaches pay. We implement Lazy Scheduling and present experimental results on four different platforms. The results show that Lazy Scheduling is vastly superior for declarative codes and competitive, if not better, for coarsened codes. Moreover, Lazy Scheduling is also superior in terms of performance portability, supporting our thesis that it is possible to achieve reasonable efficiency and performance portability with declarative codes.

Finally, we also implement Lazy Scheduling on XMT, an experimental manycore platform developed at the University of Maryland, which was designed to support codes derived from PRAM algorithms. On XMT, we manage to harness the existing hardware support for scheduling flat parallelism and compose it with Lazy Scheduling, which supports nested parallelism. In the resulting hybrid scheduler, the hardware and software work in synergy to overcome each other's weaknesses. We show the performance composability of the hardware and software schedulers, both in an abstract cost model and experimentally, as the hybrid always performs better than the software scheduler alone. Furthermore, the cost model is validated by using it to predict whether it is preferable to execute a code sequentially, with outer parallelism, or with nested parallelism, depending on the input, the available hardware parallelism, and the calling context of the parallel code.
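The load-inference idea behind Lazy Scheduling can be sketched as follows. This is a minimal single-threaded illustration under assumptions of ours, not the thesis's implementation: the class name, the deque-length proxy for load, and the threshold are all hypothetical. At each spawn point, the scheduler consults bookkeeping it already keeps (here, the length of its pending-task deque) to decide whether to expose the task for parallel execution or execute it inline and skip the scheduling overhead.

```python
from collections import deque

class LazyScheduler:
    """Toy sketch of load-adaptive spawning (hypothetical names).

    The deque length stands in for information a real run-time already
    maintains; when it suggests the platform is saturated, new tasks
    run inline instead of paying parallelism overheads.
    """

    def __init__(self, threshold=4):
        self.pending = deque()        # tasks exposed for parallel execution
        self.threshold = threshold    # proxy for "enough parallelism exposed"

    def spawn(self, task, *args):
        if len(self.pending) >= self.threshold:
            return task(*args)        # inferred high load: execute inline
        self.pending.append((task, args))   # expose for parallel execution
        return None

    def drain(self):
        """Stand-in for worker threads consuming the exposed tasks."""
        results = []
        while self.pending:
            task, args = self.pending.popleft()
            results.append(task(*args))
        return results
```

The key property the sketch captures is that the spawn-versus-inline decision is made lazily, per spawn, from already-available state, rather than being fixed by programmer-side coarsening.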

    Maximum Entropy Reconstruction Of Moment Coded Images

    The maximum entropy principle (MEP) is applied to the problem of reconstructing an image from knowledge of a finite set of its moments. This new approach is compared to the existing method-of-moments approach and is shown to have a clear edge in performance in all of the applications attempted. Compression ratios more than twice as high as those previously achieved are possible with the new MEP method.
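The general MEP mechanism the abstract relies on can be illustrated in one dimension (a minimal sketch under our own assumptions, not the paper's image-reconstruction method; the function name and the plain gradient-descent solver are illustrative). Maximizing entropy subject to power-moment constraints yields a distribution of exponential form, p_i ∝ exp(Σ_k λ_k x_i^k), and the multipliers λ can be fit by descending the convex dual:

```python
import math

def maxent_from_moments(xs, moments, iters=5000, lr=0.5):
    """Find the maximum-entropy distribution p over grid points xs whose
    power moments E[x^k], k = 1..K, match `moments` (toy 1-D version).

    The MEP solution is p_i proportional to exp(sum_k lam[k] * xs[i]**(k+1));
    we fit lam by gradient descent on the dual, whose gradient is simply
    the moment mismatch E_p[x^k] - moments[k].
    """
    K = len(moments)
    lam = [0.0] * K
    for _ in range(iters):
        w = [math.exp(sum(lam[k] * x ** (k + 1) for k in range(K))) for x in xs]
        z = sum(w)                          # partition function
        p = [wi / z for wi in w]
        grad = [sum(pi * x ** (k + 1) for pi, x in zip(p, xs)) - moments[k]
                for k in range(K)]
        lam = [l - lr * g for l, g in zip(lam, grad)]
    return p
```

For example, matching a mean of 1.5 on the grid [0, 1, 2] produces a distribution skewed toward 2; an image-domain method would constrain 2-D moments in the same way.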

    Two countable Hausdorff almost regular spaces every continuous map of which into every Urysohn space is constant

    We construct two countable, Hausdorff, almost regular spaces I(S), I(T) having the following properties: (1) Every continuous map of I(S) (resp. I(T)) into every Urysohn space is constant (hence both spaces are connected). (2) For every point of I(S) (resp. of I(T)) and for every open neighbourhood U of this point, there exists an open neighbourhood V of it such that V⫅U and every continuous map of V into every Urysohn space is constant (hence both spaces are locally connected). (3) The space I(S) is first countable and the space I(T) is nowhere first countable. A consequence of the above is the construction of two countable, (connected) Hausdorff, almost regular spaces with a dispersion point and similar properties. Unfortunately, none of these spaces is Urysohn.

    Two Moore spaces on which every continuous real-valued function is constant

    We construct two Moore spaces on which every continuous real-valued function is constant: the first is Moore and screenable, the second Moore and separable. As corollaries we obtain two more Moore spaces on which every continuous real-valued function is constant (one Moore and separable, one Moore and screenable), each having a dispersion point.

    On countable connected Hausdorff spaces in which the intersection of every pair of connected subsets is connected

    We prove that a countable connected Hausdorff space in which the intersection of every pair of connected subsets is connected cannot be locally connected, and also that every continuous function from a countable connected, locally connected Hausdorff space to a countable connected Hausdorff space in which the intersection of every pair of connected subsets is connected, is constant.

    The compiler for the XMTC parallel language: Lessons for compiler developers and in-depth description

    In this technical report, we present information on the XMTC compiler and language. We start by presenting the XMTC Memory Model and the issues we encountered when using GCC, the popular GNU compiler for C and other sequential languages, as the basis for a compiler for XMTC, a parallel language. These topics, along with some information on XMT-specific optimizations, were presented in [10]. Then, we proceed to give more details on how outer spawn statements (i.e., parallel loops) are compiled to take advantage of XMT’s unique hardware primitives for scheduling flat parallelism, and how we extended this basic compiler to support nested parallelism.

    Technical Report: Region and Effect Inference for Safe Parallelism

    In this paper, we present the first full regions-and-effects inference algorithm for explicitly parallel fork-join programs. We infer annotations inspired by Deterministic Parallel Java (DPJ) for a type-safe subset of C++. We chose the DPJ annotations because they give the strongest safety guarantees of any existing concurrency-checking approach we know of, static or dynamic, and DPJ is also the most expressive static checking system we know of that gives strong safety guarantees. This expressiveness, however, makes manual annotation difficult and tedious, which motivates the need for automatic inference, but it also makes the inference problem very challenging: the code may use region polymorphism, imperative updates with complex aliasing, arbitrary recursion, hierarchical region specifications, and wildcard elements to describe potentially infinite sets of regions. We express the inference as a constraint satisfaction problem and develop, implement, and evaluate an algorithm for solving it. The region and effect annotations inferred by the algorithm constitute a checkable proof of safe parallelism, and they can be recorded both for documentation and for fast and modular safety checking.
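The kind of guarantee such annotations make checkable can be shown with a toy noninterference test. This is a deliberately simplified model of ours, not the paper's algorithm or DPJ's actual type system: effect summaries are flat dicts from region names to an access mode, whereas real DPJ effects involve region paths and wildcards. Two fork-join branches may run in parallel only if no region is written by one branch and accessed by the other:

```python
def effects_commute(e1, e2):
    """Toy DPJ-style noninterference check (illustrative only).

    e1, e2: dicts mapping region names to 'r' (reads region) or
    'w' (writes region), summarizing the effects of two branches.
    Returns True iff the branches have no read-write or write-write
    conflict on any shared region, i.e. they may run in parallel.
    """
    for region, mode in e1.items():
        other = e2.get(region)
        if other is None:
            continue                  # region untouched by the other branch
        if mode == 'w' or other == 'w':
            return False              # read-write or write-write conflict
    return True
```

The point of inference is that summaries like these are derived automatically and then re-checked cheaply and modularly, rather than written by hand.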