88 research outputs found

    Processing Radio Astronomical Data Using the PROCESS Software Ecosystem

    In this paper we discuss our efforts in "unlocking" the Long Term Archive (LTA) of the LOFAR radio telescope using the software ecosystem developed in the PROCESS project. The LTA is a large (>50 PB) archive that grows by about 7 PB per year through the ingestion of new observations. It consists of coarsely calibrated "visibilities", i.e. correlations between signals from LOFAR stations. Converting these observations into sky maps (images), which are needed for astronomy research, can be challenging due to the size of the observations and the complexity and compute requirements of the software involved. Using the PROCESS software environment and testbed, we enable a simple point-and-click reduction of LOFAR observations into sky maps for users of this archive. This work was performed as part of the PROCESS project, which aims to provide generalizable open-source solutions for user-friendly exascale data processing.
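
    As a rough illustration of the imaging step described above (not the actual PROCESS pipeline, which wraps standard LOFAR tooling), the Python sketch below grids a handful of synthetic visibilities onto a uv-plane and Fourier-transforms them into a "dirty" sky image; all names and parameters are illustrative.

```python
# Minimal, illustrative sketch of the visibilities-to-image step: grid
# synthetic visibilities onto a uv-plane and inverse-FFT to a dirty image.
# This is NOT the PROCESS pipeline, which wraps standard LOFAR tooling.
import numpy as np

def dirty_image(u, v, vis, npix=256, uv_max=1000.0):
    """Nearest-neighbour gridding followed by an inverse 2-D FFT."""
    grid = np.zeros((npix, npix), dtype=complex)
    # Map (u, v) baseline coordinates (in wavelengths) to grid cells.
    iu = np.clip(((u / uv_max + 0.5) * npix).astype(int), 0, npix - 1)
    iv = np.clip(((v / uv_max + 0.5) * npix).astype(int), 0, npix - 1)
    np.add.at(grid, (iv, iu), vis)          # accumulate visibilities per cell
    # The dirty image is the inverse Fourier transform of the gridded data.
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid))).real

rng = np.random.default_rng(0)
u = rng.uniform(-800, 800, 500)
v = rng.uniform(-800, 800, 500)
vis = np.ones(500, dtype=complex)           # point source at the phase centre
print(dirty_image(u, v, vis).shape)
```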

    Rocket: Efficient and Scalable All-Pairs Computations on Heterogeneous Platforms

    All-pairs compute problems apply a user-defined function to each combination of two items of a given data set. Although these problems present an abundance of parallelism, data reuse must be exploited to achieve good performance. Several researchers have considered this problem, resorting either to partial replication with static work distribution or to dynamic scheduling with full replication. In contrast, we present a solution that relies on hierarchical, multi-level, software-based caches to maximize data reuse at each level in the distributed memory hierarchy, combined with a divide-and-conquer approach to exploit data locality, hierarchical work-stealing to dynamically balance the workload, and asynchronous processing to maximize resource utilization. We evaluate our solution using three real-world applications (from digital forensics, localization microscopy, and bioinformatics) on different platforms (from a desktop machine to a supercomputer). Results show excellent efficiency and scalability when scaling to 96 GPUs, even obtaining super-linear speedups thanks to the distributed cache.
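
    A minimal, single-node sketch of the all-pairs pattern with a software cache and a tiled (divide-and-conquer) traversal, under the assumption of a toy `load` cost and a user-defined `compare` function; this is an analogy to Rocket's hierarchical caches, not its actual API.

```python
# Toy, single-node analogue of the all-pairs pattern with a software cache
# and a tiled (divide-and-conquer) traversal. The names `load` and `compare`
# are invented for illustration; this is not Rocket's actual API.
from functools import lru_cache

items = list(range(8))                      # stand-ins for large data items
TILE = 2                                    # tile size chosen to fit the cache

@lru_cache(maxsize=4)                       # software cache: avoids reloading
def load(i):
    return i * i                            # in practice: a costly transfer

def compare(a, b):                          # user-defined pairwise function
    return abs(a - b)

# Tiled traversal keeps recently loaded items hot in the cache instead of
# streaming the full cross product; each unordered pair is visited once.
results = {}
for bi in range(0, len(items), TILE):
    for bj in range(bi, len(items), TILE):
        for i in items[bi:bi + TILE]:
            for j in items[bj:bj + TILE]:
                if i < j:
                    results[(i, j)] = compare(load(i), load(j))

print(len(results), "pairs computed; cache stats:", load.cache_info())
```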

    Reference Exascale Architecture (Extended Version)

    While political commitments for building exascale systems have been made, turning these systems into platforms for a wide range of exascale applications faces several technical, organisational, and skills-related challenges. The key technical challenges relate to the availability of data. While the first exascale machines are likely to be built within a single site, the input data is in many cases impossible to store within a single site. Alongside handling extremely large amounts of data, an exascale system has to process data from different sources, support accelerated computing, handle a high volume of requests per day, minimize the size of data flows, and be extensible with respect to both continuously growing data volumes and increasing numbers of parallel requests. These technical challenges are addressed by the general reference exascale architecture. It is divided into three main blocks: a virtualization layer, a distributed virtual file system, and a manager of computing resources. Its main property is modularity, which is achieved by containerization at two levels: 1) application containers - containerization of scientific workflows, and 2) micro-infrastructure - containerization of an extreme-scale, service-oriented data infrastructure. The paper also presents an instantiation of the reference architecture - the architecture of the PROCESS project (PROviding Computing solutions for ExaScale ChallengeS) - and discusses its relation to the reference exascale architecture. The PROCESS architecture has been used as an exascale platform within various exascale pilot applications. The paper also presents performance modelling of the exascale platform, together with its validation.
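
    A loose sketch, in plain Python, of how the three architectural blocks might compose; the class and method names below are invented shorthand for exposition, not the PROCESS interfaces.

```python
# Loose sketch of the three architectural blocks as plain Python objects.
# All class and method names are invented shorthand, not PROCESS interfaces.
from dataclasses import dataclass, field

@dataclass
class AppContainer:               # level 1: containerized scientific workflow
    name: str
    image: str

@dataclass
class MicroInfrastructure:        # level 2: containerized data services
    services: list = field(default_factory=lambda: ["distributed-vfs"])

@dataclass
class ResourceManager:            # manager of computing resources
    infra: MicroInfrastructure

    def deploy(self, app: AppContainer) -> str:
        # A real manager would schedule onto HPC/cloud resources; here we
        # only show how the blocks compose.
        return f"{app.name} ({app.image}) deployed with {self.infra.services}"

mgr = ResourceManager(MicroInfrastructure())
print(mgr.deploy(AppContainer("lofar-imaging", "example/workflow:latest")))
```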

    A many-analysts approach to the relation between religiosity and well-being

    The relation between religiosity and well-being is one of the most researched topics in the psychology of religion, yet the directionality and robustness of the effect remain debated. Here, we adopted a many-analysts approach to assess the robustness of this relation based on a new cross-cultural dataset (N=10,535 participants from 24 countries). We recruited 120 analysis teams to investigate (1) whether religious people self-report higher well-being, and (2) whether the relation between religiosity and self-reported well-being depends on perceived cultural norms of religion (i.e., whether it is considered normal and desirable to be religious in a given country). In a two-stage procedure, the teams first created an analysis plan and then executed their planned analysis on the data. For the first research question, all but 3 teams reported positive effect sizes with credible/confidence intervals excluding zero (median reported β=0.120). For the second research question, this was the case for 65% of the teams (median reported β=0.039). While most teams applied (multilevel) linear regression models, there was considerable variability in the choice of items used to construct the independent variables, the dependent variable, and the included covariates.
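
    As a hedged illustration of the kind of multilevel model most teams applied, the sketch below fits a random-intercept linear regression on synthetic data with statsmodels; the data, variable names, and effect sizes are invented, not the study's.

```python
# Synthetic illustration of the kind of multilevel model many teams fitted:
# a random-intercept regression of well-being on religiosity, clustered by
# country. Data, variable names, and effect sizes are invented, not the
# study's dataset.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n, n_countries = 2400, 24
country = rng.integers(0, n_countries, n)
religiosity = rng.normal(size=n)
country_effect = rng.normal(0.0, 0.2, n_countries)   # country-level noise
well_being = 0.12 * religiosity + country_effect[country] + rng.normal(size=n)

df = pd.DataFrame({"country": country,
                   "religiosity": religiosity,
                   "well_being": well_being})

# Mixed (multilevel) linear model with a random intercept per country.
model = smf.mixedlm("well_being ~ religiosity", df, groups=df["country"])
print(model.fit().summary())
```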

    NLeSC/eSalsa-MPI: First EYRg release of eSalsa-MPI

    This is a release of the eSalsa-MPI wrapper used in the EYRg project. For more information on EYRg, see our wiki at: https://github.com/jmaassen/EYRg-wiki/wiki For this project, this library is used in combination with CESM, which can be found at: http://www2.cesm.ucar.edu/ Note that this library is highly experimental; although we have used it successfully in a number of experiments, we do not claim that it is production-ready. Use at your own risk! We will continue to develop this library as part of the eSalsa project and hope to release a more stable version at a later date.

    Method Invocation Based Communication Models for Parallel Programming in Java (PhD thesis, Vrije Universiteit Amsterdam)

    For the award of the degree of Doctor at the Vrije Universiteit Amsterdam, by authority of the rector magnificus prof.dr. T. Sminia.

    The Research Software Directory - a brief introduction for Cite.Software

    This presentation was used to introduce the Research Software Directory at the Cite.Software workshop. Research Software Funders Workshop. Date: 18 September 2023, 13:00-16:00 EDT. Location: Palais des congrès de Montréal, Montréal, Canada, and virtual.

    Massive Semantic Web data compression with MapReduce

    The Semantic Web consists of many billions of statements made of terms that are either URIs or literals. Since these terms usually consist of long sequences of characters, an effective compression technique must be used to reduce the data size and increase application performance. One of the best-known techniques for data compression is dictionary encoding. In this paper we propose a MapReduce algorithm that efficiently compresses and decompresses large amounts of Semantic Web data. We have implemented a prototype using the Hadoop framework and report an evaluation of its performance. The evaluation shows that our approach is able to efficiently compress a large amount of data and that it scales linearly with both the input size and the number of nodes.
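
    A toy, in-memory rendition of the dictionary-encoding idea, with the map and reduce roles marked in comments; the paper's actual Hadoop implementation distributes these phases across nodes and handles far larger term sets.

```python
# Toy, in-memory rendition of dictionary encoding for RDF terms, with the
# map and reduce roles marked. The paper's Hadoop implementation distributes
# this across nodes; here everything runs in one process for clarity.
triples = [
    ("<http://example.org/alice>", "<http://xmlns.com/foaf/0.1/knows>",
     "<http://example.org/bob>"),
    ("<http://example.org/bob>", "<http://xmlns.com/foaf/0.1/name>", '"Bob"'),
]

# "Map" phase: emit every term occurring in any triple.
def map_phase(triples):
    for triple in triples:
        for term in triple:
            yield term

# "Reduce" phase: assign one numeric ID per unique term.
dictionary = {}
for term in map_phase(triples):
    dictionary.setdefault(term, len(dictionary))

# Compression replaces long terms by small integers; decompression inverts it.
encoded = [tuple(dictionary[t] for t in triple) for triple in triples]
decode = {i: t for t, i in dictionary.items()}
assert [tuple(decode[i] for i in triple) for triple in encoded] == triples
print(encoded)
```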