142 research outputs found

    Characterization of message-passing overhead on the AP3000 multicomputer

    This is a post-peer-review, pre-copyedit version. The final authenticated version is available online at: http://dx.doi.org/10.1109/ICPP.2001.952077
    [Abstract] The performance of the communication primitives of parallel computers is critical for the overall system performance. The characterization of the communication overhead is very important to estimate the global performance of parallel applications and to detect possible bottlenecks. In this paper, we evaluate, model and compare the performance of the message-passing libraries provided by the Fujitsu AP3000 multicomputer: MPI/AP, PVM/AP and APlib. Our aim is to fairly characterize the communication primitives using general models and performance metrics.
    Ministerio de Ciencia y Tecnología; 1FD97-0118-C02
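As a rough illustration of what characterizing point-to-point overhead with a general model can look like, the sketch below fits the widely used linear cost model T(n) = t0 + n / bandwidth by least squares. The choice of this model, the function name `fit_linear_cost` and the sample timings are assumptions for illustration only; the timings are synthetic and are not measurements from the AP3000 paper.

```python
def fit_linear_cost(sizes, times):
    """Least-squares fit of T(n) = t0 + n / bandwidth.

    sizes: message sizes in bytes; times: transfer times in seconds.
    Returns (t0, bandwidth) as startup latency and asymptotic bandwidth.
    """
    count = len(sizes)
    mean_x = sum(sizes) / count
    mean_y = sum(times) / count
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx              # seconds per byte
    t0 = mean_y - slope * mean_x   # intercept: startup latency
    return t0, 1.0 / slope


# Synthetic timings generated from an assumed 50 microsecond startup
# latency and 100 MB/s bandwidth (hypothetical values, not measured data).
sizes = [0, 1024, 65536, 1048576]
times = [50e-6 + s / 100e6 for s in sizes]
t0, bandwidth = fit_linear_cost(sizes, times)
```

With real measurements, the residuals of such a fit indicate how well a single straight line describes the library; protocol changes at certain message sizes typically call for a piecewise model instead.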

    A PVM Based Library for Sparse Matrix Factorizations

    This is a post-peer-review, pre-copyedit version of an article published in Lecture Notes in Computer Science. The final authenticated version is available online at: https://doi.org/10.1007/BFb0056589
    [Abstract] We present 3LM, a C Linked List Management Library for parallel sparse factorizations in a PVM environment, which takes into account fill-in, an important drawback of sparse computations. It is restricted to a mesh topology and is based on the SPMD paradigm. Our goal is to facilitate programming in such environments by means of a set of list- and vector-oriented operations. The result is a pseudo-sequential code in which the interprocessor communications and the sparse data structures are hidden from the programmer.
    Ministerio de Educación; CICYT TIC96-1125-C03
    Xunta de Galicia; XUGA20605B9

    Big Data Geospatial Processing for Massive Aerial LiDAR Datasets

    [Abstract] For years, Light Detection and Ranging (LiDAR) technology has been considered a challenge when it comes to developing efficient software to handle the extremely large volumes of data this surveying method is able to collect. In contexts such as this, big data technologies have been providing powerful solutions for distributed storage and computing. In this work, a big data approach to geospatial processing for massive aerial LiDAR point clouds is presented. By using Cassandra and Spark, our proposal is intended to support the execution of any kind of heavy, time-consuming process; nonetheless, as an initial case study, we have focused on quickly obtaining ground-only rasters to generate digital terrain models (DTMs) from massive LiDAR datasets. Filtered clouds obtained from the isolated processing of adjacent zones may exhibit errors located on the boundaries of the zones in the form of misclassified points. Usually, this type of error is corrected through manual or semi-automatic procedures. In this work, we also present an automated strategy for correcting errors of this type, improving the quality of the classification process and the DTMs obtained while minimizing user intervention. The autonomous nature of all computing stages, along with the low processing times achieved, opens the possibility of considering the system as a highly scalable service-oriented solution for on-demand DTM generation or any other geospatial process. Such a solution would be a highly useful and unique service for many users in the LiDAR field, and one which could approach real-time processing with appropriate computational resources.
    Xunta de Galicia; ED431C 2017/04
    Consolidation Programme of Competitive Research Units; R2016/037
    Xunta de Galicia; ED431G/01
    Ministerio de Economía y Competitividad; TIN2016-75845-

    Supporting multi-resolution out-of-core rendering of massive LiDAR point clouds through non-redundant data structures

    This is an Accepted Manuscript of an article published by Taylor & Francis in INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE on 28 Nov 2018, available at: https://doi.org/10.1080/13658816.2018.1549734
    [Abstract] In recent years, the evolution and improvement of LiDAR (Light Detection and Ranging) hardware has increased the quality and quantity of the gathered data, making the storage, processing and management thereof particularly challenging. In this work we present a novel, multi-resolution, out-of-core technique, used for web-based visualization and implemented through a non-redundant data point organization method, which we call Hierarchically Layered Tiles (HLT), and a tree-like structure called Tile Grid Partitioning Tree (TGPT). The design of these elements is mainly focused on attaining very low levels of memory consumption, disk storage usage and network traffic on both the client and server side, while delivering high-performance interactive visualization of massive LiDAR point clouds (up to 28 billion points) on multiplatform environments (mobile devices or desktop computers). HLT and TGPT were incorporated and tested in ViLMA (Visualization for LiDAR data using a Multi-resolution Approach), our own web-based visualization software specially designed to work with massive LiDAR point clouds.
    This research was supported by Xunta de Galicia under the Consolidation Programme of Competitive Reference Groups, co-funded by ERDF funds from the EU [Ref. ED431C 2017/04]; Consolidation Programme of Competitive Research Units, co-funded by ERDF funds from the EU [Ref. R2016/037]; Xunta de Galicia (Centro Singular de Investigación de Galicia accreditation 2016/2019) and the European Union (European Regional Development Fund, ERDF) under Grant [Ref. ED431G/01]; and the Ministry of Economy and Competitiveness of Spain and ERDF funds from the EU [TIN2016-75845-P].
    Xunta de Galicia; ED431C 2017/04
    Xunta de Galicia; R2016/037
    Xunta de Galicia; ED431G/0

    Sparse Householder QR factorization on a mesh

    This is a post-peer-review, pre-copyedit version of an article published in Proceedings of 4th Euromicro Workshop on Parallel and Distributed Processing. The final authenticated version is available online at: http://dx.doi.org/10.1109/EMPDP.1996.500566
    [Abstract] We analyze the parallelization of QR factorization by means of Householder transformations. This parallelization is carried out on a machine with a mesh topology (a 2-D torus to be more precise). We use a cyclic distribution of the elements of the sparse matrix M we want to decompose over the processors. Each processor represents the nonzero elements of its part of the matrix by a one-dimensional doubly linked list data structure. Then, we describe the different procedures that constitute the parallel algorithm. As an application of QR factorization, we concentrate on the least squares problem and finally we present an evaluation of the efficiency of this algorithm for a set of test matrices from the Harwell-Boeing sparse matrix collection.
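The cyclic distribution described in this abstract can be sketched as follows. This is a minimal illustration, assuming the common 2-D cyclic mapping in which entry (i, j) belongs to the processor at mesh position (i mod rows, j mod cols); the function names, the column-major ordering of each local list and the sample matrix are hypothetical, and Python's `deque` merely stands in for the doubly linked list mentioned in the text.

```python
from collections import deque


def owner(i, j, mesh_rows, mesh_cols):
    """Mesh coordinates of the processor owning entry (i, j)."""
    return (i % mesh_rows, j % mesh_cols)


def distribute(nonzeros, mesh_rows, mesh_cols):
    """Give each processor a deque of its local nonzeros.

    nonzeros maps (row, col) to a value. Entries are appended in
    column-major order (an assumed ordering, chosen for illustration).
    """
    local = {
        (r, c): deque()
        for r in range(mesh_rows)
        for c in range(mesh_cols)
    }
    by_column = sorted(nonzeros.items(), key=lambda kv: (kv[0][1], kv[0][0]))
    for (i, j), value in by_column:
        local[owner(i, j, mesh_rows, mesh_cols)].append((i, j, value))
    return local


# Hypothetical 4x4 sparse matrix distributed over a 2x2 mesh.
matrix = {(0, 0): 4.0, (1, 0): 1.0, (2, 1): 3.0, (3, 3): 2.0, (0, 2): 5.0}
parts = distribute(matrix, 2, 2)
```

A cyclic mapping spreads neighbouring rows and columns over different processors, which tends to keep the work balanced as the factorization shrinks the active part of the matrix.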

    F-MPJ: scalable Java message-passing communications on parallel systems

    This is a post-peer-review, pre-copyedit version of an article published in The Journal of Supercomputing. The final authenticated version is available online at: https://doi.org/10.1007/s11227-009-0270-0
    [Abstract] This paper presents F-MPJ (Fast MPJ), a scalable and efficient Message-Passing in Java (MPJ) communication middleware for parallel computing. The increasing interest in Java as the programming language of the multi-core era demands scalable performance on hybrid architectures (with both shared and distributed memory spaces). However, current Java communication middleware lacks efficient communication support. F-MPJ improves this situation by: (1) providing efficient non-blocking communication, which allows communication overlapping and thus scalable performance; (2) taking advantage of shared memory systems and high-performance networks through the use of our high-performance Java sockets implementation (named JFS, Java Fast Sockets); (3) avoiding the use of communication buffers; and (4) optimizing MPJ collective primitives. Thus, F-MPJ significantly improves the scalability of current MPJ implementations. A performance evaluation on an InfiniBand multi-core cluster has shown that F-MPJ communication primitives outperform representative MPJ libraries by up to 60 times. Furthermore, the use of F-MPJ in communication-intensive MPJ codes has increased their performance by up to seven times.
    Ministerio de Educación y Ciencia; TIN2004-07797-C02
    Ministerio de Educación y Ciencia; TIN2007-67537-C03-2
    Xunta de Galicia; PGIDIT06PXIB105228P

    Techniques for Autotuning Algorithms on Heterogenous Platforms

    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016) Timisoara, Romania. February 8-11, 2016.
    Current GPUs (Graphics Processing Units) can obtain high computational performance in scientific applications. Nevertheless, programmers have to use suitable parallel algorithms for these architectures and have to consider optimization techniques in the implementation in order to achieve that performance. This thesis is focused on designing and implementing parallel prefix algorithms on GPU architectures with little effort. To that end, we have developed a highly optimized library called BPLG (Tuning Butterfly Processing Library for GPUs), based on a set of building blocks that make it easy to design well-known algorithms such as FFT, tridiagonal system solvers, the scan operator, sorting or signal processing. This library is designed under a tuning methodology based on two stages, identified as GPU resource analysis and operator string manipulation. Specifically, this strategy is focused on a set of parallel prefix algorithms that can be represented according to a set of common permutations of the digits of each of their element indices [4], denoted as Index-Digit (ID) algorithms. So far, the proposed methodology has obtained very good results with respect to state-of-the-art libraries such as CUFFT, CUSPARSE, CUDPP or ModernGPU.
    European Cooperation in Science and Technology. COS
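For readers unfamiliar with the scan operator named among these parallel prefix algorithms, the sketch below shows a step-synchronous inclusive scan in the Hillis-Steele style, simulated sequentially. It is a generic textbook formulation and not BPLG's implementation; the function name `inclusive_scan` is an assumption made for illustration.

```python
import operator


def inclusive_scan(values, op=operator.add):
    """Inclusive prefix scan in about log2(n) synchronous rounds.

    op must be associative. In each round, position k combines its
    value with the one `stride` positions to its left.
    """
    data = list(values)
    stride = 1
    while stride < len(data):
        # Every new element depends only on the previous round's data,
        # so on a GPU all positions could be updated concurrently.
        data = [
            op(data[k - stride], data[k]) if k >= stride else data[k]
            for k in range(len(data))
        ]
        stride *= 2
    return data


prefix_sums = inclusive_scan([1, 2, 3, 4, 5])
running_max = inclusive_scan([3, 1, 4, 1, 5], max)
```

This variant performs more total operations than a sequential scan but needs only a logarithmic number of rounds, which is the usual trade-off when mapping prefix computations onto wide parallel hardware.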

    Tree Partitioning Reduction: A New Parallel Partition Method for Solving Tridiagonal Systems

    © 2019 Copyright held by the owner/author(s). This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Mathematical Software, https://doi.org/10.1145/3328731
    This work was cofunded by the Government of Galicia and ERDF funds from the EU, under the Consolidation Programme of Competitive Reference Groups [ED431C 2017/04], by the Ministry of Economy and Competitiveness of Spain and ERDF funds [TIN2016-75845-P], and by the Ministry of Education of Spain (FPU14/02801). Additionally, it has also been supported by the Xunta de Galicia (Centro Singular de Investigación de Galicia accreditation 2016-2019) and ERDF funds [ED7431G/01].
    Xunta de Galicia; ED431C 2017/04
    Xunta de Galicia; ED7431G/0

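    Innovación docente en el EEES de cara a la práctica profesional a través del aprendizaje basado en proyectos [Teaching innovation in the EHEA oriented towards professional practice through project-based learning]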

    This article describes our experience teaching Computer Architecture and Engineering in the Master's in Computer Science at the Universidade da Coruña, where two circumstances coincided: a newly introduced European Higher Education Area (EHEA) degree and a small number of students. The professional orientation of the master's programme motivated us to explore teaching innovation aimed at professional practice, mainly through project-based learning methodologies combined with the following actions: (1) replacing lectures with academically supervised assignments; (2) delivering professional seminars; (3) using role-playing techniques; and (4) developing communication skills. Our overall assessment is that this methodology and its associated actions have proved extremely positive for teaching the subject.
    Peer Reviewed