1,449 research outputs found

    Using the High Productivity Language Chapel to Target GPGPU Architectures

    It has been widely shown that GPGPU architectures offer large performance gains compared to their traditional CPU counterparts for many applications. The downside to these architectures is that the current programming models present numerous challenges to the programmer: lower-level languages, explicit data movement, loss of portability, and challenges in performance optimization. In this paper, we present novel methods and compiler transformations that increase productivity by enabling users to easily program GPGPU architectures using the high productivity programming language Chapel. Rather than resorting to different parallel libraries or annotations for a given parallel platform, we leverage a language that has been designed from first principles to address the challenge of programming for parallelism and locality. This also has the advantage of being portable across distinct classes of parallel architectures, including desktop multicores, distributed memory clusters, large-scale shared memory, and now CPU-GPU hybrids. We present experimental results from the Parboil benchmark suite which demonstrate that codes written in Chapel achieve performance comparable to the original versions implemented in CUDA. NSF CCF 0702260; Cray Inc. Cray-SRA-2010-01696; 2010-2011 Nvidia Research Fellowship. Unpublished; not peer reviewed.

    Parallel machine architecture and compiler design facilities

    The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of the Delta project, whose objective is to provide a facility for rapid prototyping of parallelizing compilers that can target different machine architectures, is summarized. Included are surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.

    Towards an Achievable Performance for the Loop Nests

    Numerous code optimization techniques, including loop nest optimizations, have been developed over the last four decades. Loop optimization techniques transform loop nests to improve the performance of the code on a target architecture, including exposing parallelism. Finding and evaluating an optimal, semantics-preserving sequence of transformations is a complex problem. The sequence is guided using heuristics and/or analytical models, and there is no way of knowing how close it gets to optimal performance or whether there is any headroom for improvement. This paper makes two contributions. First, it uses a comparative analysis of loop optimizations/transformations across multiple compilers to determine how much headroom may exist for each compiler. Second, it presents an approach to characterize the loop nests based on their hardware performance counter values and a Machine Learning approach that predicts which compiler will generate the fastest code for a loop nest. The prediction is made for both auto-vectorized, serial compilation and for auto-parallelization. The results show that the headroom for state-of-the-art compilers ranges from 1.10x to 1.42x for the serial code and from 1.30x to 1.71x for the auto-parallelized code. These results are based on the Machine Learning predictions. Comment: Accepted at the 31st International Workshop on Languages and Compilers for Parallel Computing (LCPC 2018).

    Almond orchard management using multi-temporal UAV data: a proof of concept

    In the last decade, Unmanned Aerial Systems (UAS) have become a reference tool for agricultural applications. The integration of multispectral sensors that can capture near infrared (NIR) and red edge spectral reflectance allows the creation of vegetation indices, which are fundamental to the crop monitoring process. In this study, we propose a methodology to analyze the vegetative state of almond crops using multi-temporal data acquired by a multispectral sensor coupled to an Unmanned Aerial Vehicle (UAV). The methodology allowed the extraction of individual tree parameters, such as number of trees, tree height, and tree crown area. It also allowed the acquisition of Normalized Difference Vegetation Index (NDVI) information for each tree. The multi-temporal data showed significant variations in the vegetative state of almond crops. The author acknowledges the financial support provided by the FCT-Portuguese Foundation for Science and Technology (UI/BD/150727/2020), under the Doctoral Programme “Agricultural Production Chains – from fork to farm” (PD/00122/2012) and under the project UIDB/04033/2020. info:eu-repo/semantics/publishedVersion
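
    The per-tree NDVI mentioned in this abstract can be illustrated with the standard definition NDVI = (NIR - Red) / (NIR + Red). The sketch below is not the authors' pipeline: the function names, the flat-list band layout, the boolean crown mask, and the choice to return 0.0 for pixels with no signal are all assumptions made for illustration.

```python
def ndvi(nir, red):
    """Standard NDVI for one pair of reflectance values."""
    total = nir + red
    if total == 0:
        # Assumption: a pixel with no signal in either band is reported as 0.0.
        return 0.0
    return (nir - red) / total


def mean_ndvi(nir_band, red_band, crown_mask):
    """Average NDVI over the pixels flagged as belonging to one tree crown.

    nir_band, red_band: equal-length sequences of reflectance values
    (hypothetical layout). crown_mask: sequence of booleans marking
    crown pixels. Returns None when the mask selects no pixels.
    """
    values = [
        ndvi(n, r)
        for n, r, inside in zip(nir_band, red_band, crown_mask)
        if inside
    ]
    if not values:
        return None
    return sum(values) / len(values)
```

    Calling `mean_ndvi` once per tree and per acquisition date would give the kind of multi-temporal, per-tree series the abstract describes; how the crown masks are obtained is outside this sketch.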

    Glutathione-s-transferase pi expression in leukaemia: a comparative analysis with mdr-1 data.

    Drug resistance in haemopoietic cells may be partly related to the expression of the glutathione-s-transferase (GST) pi and mdr-1 genes. We have used RNA slot blotting techniques to investigate the expression of GST pi in peripheral blood and bone marrow of eleven normal subjects, nine patients with myelodysplastic syndrome (MDS), eighteen patients with acute myeloblastic leukaemia (AML), and thirty-two patients with chronic lymphocytic leukaemia (CLL). We found increased expression of GST pi in 8 of 9 MDS (7 peripheral blood, 1 bone marrow), 12 of 18 AML (5 peripheral blood, 7 bone marrow; 4 of 5 untreated, 1 of 5 secondary, 7 of 11 relapse or refractory), and in the peripheral blood of 24 of 32 CLL (3 of 7 untreated, 21 of 25 treated) relative to normal controls. Increased expression of GST pi can occur at any stage of disease and shows no clear relation to mdr-1 expression except, possibly, in CLL. In 3 AML patients, GST pi transcript levels were the same or lower on relapse compared to presentation. Upregulation of the GST pi gene could not be demonstrated in 2 CLL patients in response to treatment with intermittent chlorambucil.