445 research outputs found

    Coarse-grain Load Distribution in Heterogeneous Computing

    Heterogeneous HPC clusters are composed of different types of machines (from various component manufacturers and with varying computational capacities) and different hardware accelerators. The most common type of data distribution is the equal division of the data across all the nodes. A more sophisticated data distribution policy is needed to exploit the computational capacity of the entire system.
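
To make the contrast between an equal division and a capacity-aware division concrete, the following is a minimal sketch and not the policy proposed in the paper. It assumes each node can be summarized by a single positive capacity score (for example, a measured throughput); the function names `proportional_split` and `equal_split` are hypothetical.

```python
def proportional_split(total_items, capacities):
    """Split total_items among nodes in proportion to their capacity scores.

    Assumption (not from the paper): each node is described by one positive
    number, e.g. a benchmark throughput.
    """
    if total_items < 0 or not capacities or any(c <= 0 for c in capacities):
        raise ValueError("need a non-negative item count and positive capacities")
    total_capacity = sum(capacities)
    # Ideal fractional share of each node.
    ideal = [total_items * c / total_capacity for c in capacities]
    # Round every share down first...
    shares = [int(x) for x in ideal]
    # ...then give the leftover items to the largest fractional remainders.
    leftover = total_items - sum(shares)
    by_remainder = sorted(
        range(len(capacities)),
        key=lambda i: ideal[i] - shares[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return shares


def equal_split(total_items, num_nodes):
    """Equal division: the special case where every node has the same capacity."""
    return proportional_split(total_items, [1] * num_nodes)


if __name__ == "__main__":
    # Two slow nodes and one node that is twice as fast.
    print(proportional_split(100, [1, 1, 2]))  # [25, 25, 50]
```

With an equal division the fastest node would finish early and sit idle; weighting by capacity is one simple way to even out the finishing times.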

    TuCCompi: a multi-layer model for distributed heterogeneous computing with tuning capabilities

    During the last decade, parallel processing architectures have become a powerful tool to deal with massively-parallel problems that require High Performance Computing (HPC). The latest trend in HPC is the use of heterogeneous environments that combine different computational processing devices, such as CPU cores and GPUs (Graphics Processing Units). Maximizing the performance of any GPU parallel implementation of an algorithm requires in-depth knowledge of the underlying GPU architecture, which makes it a tedious manual effort suited only to experienced programmers. In this paper, we present TuCCompi, a multi-layer abstract model that simplifies programming on heterogeneous systems that include hardware accelerators by hiding the details of synchronization, deployment, and tuning. TuCCompi chooses optimal values for their configuration parameters using a kernel characterization provided by the programmer. This model is very useful for tackling problems characterized by independent tasks with a high computational load, such as embarrassingly-parallel problems. We have evaluated TuCCompi in different real-world heterogeneous environments using the All-Pair Shortest-Path problem as a case study. Ministerio de Economía y Competitividad and ERDF program of the European Union: CAPAP-H5 network (TIN2014-53522-REDT), MOGECOPP project (TIN2011-25639); Junta de Castilla y León (Spain): ATLAS project (VA172A12-2); and the COST Program Action IC1305: NESUS

    UVaFTLE: Lagrangian finite time Lyapunov exponent extraction for fluid dynamic applications

    The determination of Lagrangian Coherent Structures (LCS) is becoming very important in several disciplines, including cardiovascular engineering, aerodynamics, and geophysical fluid dynamics. From the computational point of view, the extraction of LCS consists of two main steps: the flowmap computation and the resolution of Finite Time Lyapunov Exponents (FTLE). In this work, we focus on the design, implementation, and parallelization of the FTLE resolution. We offer an in-depth analysis of this procedure, as well as an open source C implementation (UVaFTLE) parallelized using OpenMP directives to attain a fair parallel efficiency in shared-memory environments. We have also implemented CUDA kernels that allow UVaFTLE to leverage as many NVIDIA GPU devices as desired in order to reach the best parallel efficiency. For the sake of reproducibility and in order to contribute to open science, our code is publicly available through GitHub. Moreover, we also provide Docker containers to ease its usage. Ministerio de Economía, Industria y Competitividad, Consejo Asesor de Educación de Castilla y León, and programs of the Fondo de Desarrollo (FEDER): PCAS project (TIN2017-88614-R) and PROPHET-2 project (VA226P20). Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación, and "European Union NextGenerationEU/PRTR": (MCIN/AEI/10.13039/501100011033), grant TED2021-130367B-I00. Junta de Castilla y León (project VA182P20). Red Española de Supercomputación (RES) (projects IM-2022-2-0015 and IM-2022-3-0021). Open-access publication funded by the Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), under Operational Programme 2014ES16RFOP009 FEDER 2014-2020 of Castilla y León, Action 20007-CL - Apoyo Consorcio BUCL
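
For readers unfamiliar with the second step, the FTLE field is commonly defined from the flowmap as follows. The notation is an assumption for illustration ($\Phi_{t_0}^{t_0+T}$ for the flowmap, $T$ for the integration time, $\mathbf{x}_0$ for the initial position) and is not taken from the abstract; UVaFTLE may use an equivalent but differently written form.

```latex
\[
C_{t_0}^{t_0+T}(\mathbf{x}_0)
  = \left(\nabla \Phi_{t_0}^{t_0+T}(\mathbf{x}_0)\right)^{\top}
    \nabla \Phi_{t_0}^{t_0+T}(\mathbf{x}_0),
\qquad
\sigma_{t_0}^{T}(\mathbf{x}_0)
  = \frac{1}{|T|}\,
    \ln \sqrt{\lambda_{\max}\!\left(C_{t_0}^{t_0+T}(\mathbf{x}_0)\right)}
\]
```

Here $C$ is the Cauchy-Green deformation tensor and $\lambda_{\max}$ its largest eigenvalue. Because $\sigma$ at each point depends only on the flowmap gradient at that point, the computation is independent across grid points, which is the property that makes OpenMP and CUDA parallelization natural.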

    Supporting efficient overlapping of host-device operations for heterogeneous programming with CtrlEvents

    Heterogeneous systems with several kinds of devices, such as multi-core CPUs, GPUs, and FPGAs, among others, are now commonplace. Exploiting all these devices with device-oriented programming models, such as CUDA or OpenCL, requires expertise and knowledge about the underlying hardware to tailor the application to each specific device, thus degrading performance portability. Higher-level proposals simplify the programming of these devices, but their current implementations lack efficient support for problems that include frequent bursts of computation and communication, or input/output operations. In this work we present CtrlEvents, a new heterogeneous runtime solution that automatically overlaps computation and communication whenever possible, simplifying and improving the efficiency of data-dependency analysis and the coordination of both device computations and host tasks that include generic I/O operations. Our solution outperforms other state-of-the-art implementations in most situations, presenting a good balance between portability, programmability, and efficiency. Ministerio de Ciencia e Innovación - FEDER (TIN2017-88614-R). Junta de Castilla y León (VA226P20). Ministerio de Ciencia e Innovación - AEI and European Union NextGenerationEU/PRTR (TED2021-130367B-I00 and MCIN/AEI/10.13039/501100011033

    EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs

    Iterative stencil computations are widely used in numerical simulations. They present a high degree of parallelism, high locality, and mostly-coalesced memory access patterns. Therefore, GPUs are good candidates to speed up their computation. However, the development of stencil programs that can work with huge grids in distributed systems with multiple GPUs is not straightforward, since it requires solving problems related to the partition of the grid across nodes and devices, and the synchronization and data movement across remote GPUs. In this work, we present EPSILOD, a high-productivity parallel programming skeleton for iterative stencil computations on distributed multi-GPU systems, with GPUs of the same or different vendors, that supports any type of n-dimensional geometric stencil of any order. It uses an abstract specification of the stencil pattern (neighbors and weights) to internally derive the data partition, synchronizations, and communications. Computation is split to better overlap with communications. This paper describes the underlying architecture of EPSILOD and its main components, and presents an experimental evaluation to show the benefits of our approach, including a comparison with another state-of-the-art solution.
The experimental results show that EPSILOD is faster and shows good strong and weak scalability for platforms with both homogeneous and heterogeneous types of GPU. Junta de Castilla y León, Ministerio de Economía, Industria y Competitividad, and Fondo Europeo de Desarrollo Regional (FEDER): PCAS project (TIN2017-88614-R) and PROPHET-2 project (VA226P20). Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación, and "European Union NextGenerationEU/PRTR": (MCIN/AEI/10.13039/501100011033), grant TED2021-130367B-I00. CTE-POWER and Minotauro and the technical support provided by Barcelona Supercomputing Center (RES-IM-2021-2-0005, RES-IM-2021-3-0024, RES-IM-2022-1-0014). Open-access publication funded by the Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), under Operational Programme 2014ES16RFOP009 FEDER 2014-2020 of Castilla y León, Action 20007-CL - Apoyo Consorcio BUCL
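
The idea of describing a stencil abstractly by its neighbors and weights can be illustrated with a small single-process sketch. This is not EPSILOD's API: the function names and the dictionary representation are hypothetical, and the sketch does no partitioning, communication, or GPU execution. It only shows how one description can drive both the update and the border (halo) width that a partitioner would need.

```python
def stencil_radius(stencil):
    """Largest offset in any dimension; a partitioner would use it as halo width."""
    return max(max(abs(di), abs(dj)) for di, dj in stencil)


def stencil_step(grid, stencil):
    """Apply one iteration of a 2D weighted stencil to the interior cells.

    stencil maps a neighbor offset (di, dj) to its weight. Cells closer to
    the edge than the stencil radius are treated as fixed boundary values.
    """
    rows, cols = len(grid), len(grid[0])
    radius = stencil_radius(stencil)
    new_grid = [row[:] for row in grid]  # copy, so boundary cells are kept
    for i in range(radius, rows - radius):
        for j in range(radius, cols - radius):
            new_grid[i][j] = sum(
                weight * grid[i + di][j + dj]
                for (di, dj), weight in stencil.items()
            )
    return new_grid


def iterate(grid, stencil, steps):
    """Run the stencil for a fixed number of iterations."""
    for _ in range(steps):
        grid = stencil_step(grid, stencil)
    return grid


# Classic 4-point Jacobi average, written as neighbors and weights.
JACOBI_4 = {(-1, 0): 0.25, (1, 0): 0.25, (0, -1): 0.25, (0, 1): 0.25}

if __name__ == "__main__":
    grid = [[1.0, 1.0, 1.0],
            [1.0, 0.0, 1.0],
            [1.0, 1.0, 1.0]]
    print(iterate(grid, JACOBI_4, 1)[1][1])  # 1.0
```

Changing the dictionary (for example, adding offsets at distance 2) changes both the computation and the derived radius without touching the loop, which is the kind of separation the abstract describes.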

    Electropolymerized polypyrrole silver nanocomposite coatings on porous Ti substrates with enhanced corrosion and antibacterial behavior for biomedical applications

    This work proposes an innovative strategy that combines two well-known, simple, and economical preparation techniques: the powder-metallurgy-based space-holder (SH) technique, which provides porous biocompatible Ti substrates with balanced biomechanical behavior (for cortical bone tissue substitution) that promote bone ingrowth and biocoating infiltration; and electropolymerization, which coats these substrates with polypyrrole–silver nanoparticle (PPy-AgNPs) composite conductive polymers, improving their corrosion resistance and biocompatibility and enhancing their antibacterial activity. The deposited PPy-based coatings present a cauliflower-like structure that adheres well to the porous substrates. The macroporosity and rough inner pore surface of the Ti SH substrates are responsible for the superior adhesion of the conductive polymer compared to typical denser substrates obtained by powder metallurgy or forging. The corrosion protection properties of the coatings were investigated by open circuit potential and anodic polarization in PBS media to simulate possible implant conditions, revealing improved corrosion resistance for the composite coatings. The bioactivity of the coatings was evaluated by immersion tests, revealing the formation of hydroxyapatite after 90-day immersion in PBS. In both PPy and PPy-AgNPs composite coatings, a displacement of the polarization curves to more noble potentials and a decrease in the current density indicated that the coating's protective character is maintained after 90-day immersion in PBS. The antibacterial activity was assessed using the Kirby–Bauer disk-diffusion method against Staphylococcus aureus (ATCC 25923). The inhibition halo increased from 5.5 ± 0.4 mm for the bare substrate to 8.2 ± 0.6 mm for the PPy-coated substrate and to 12.5 ± 0.7 mm for the PPy-AgNPs-coated porous Ti.
This feature, together with the improved corrosion protection and biocompatibility, would contribute significantly to the successful use of porous Ti implants produced by the SH technique for replacing small regions of damaged bone tissue, for example in tumors. Ministerio de Ciencia e Innovación (RTI2018-097990-B-I00 and PID2019-109371 GB-I00). Junta de Andalucía–FEDER (US-1259771). Junta de Castilla y León (VA275P18 and VA044G19

    Application of competitive and collaborative gamification in basic computer architecture courses (Aplicación de gamificación competitiva y colaborativa en asignaturas básicas de arquitectura de computadoras)

    Competitive gamification has been successfully used to improve the learning experience in programming courses. Since the competitiveness of contests affects the usual collaboration between students, some works have proposed combining competitive and collaborative gamification. This work describes an experience of applying this combination in courses where programming is simply a tool, specifically computer architecture courses in a Computer Science degree. We present a modification of both tools and methodology to conduct machine-code programming contests, based on parameters and criteria more appropriate for these courses. We present an experimental study showing that the students involved in the gamified exercises improved their learning results compared to the control groups.
The results of this work open the possibility of extending the use of these strategies to other types of Computer Science exercises or courses. This work was carried out with the support of the Universidad de Valladolid, Teaching Innovation Projects PID1819-67 and PID1920-59

    Comprehensive evaluation of a New GPU-based approach to the shortest-path problem

    The Single-Source Shortest Path (SSSP) problem arises in many different fields. In this paper, we present a GPU SSSP algorithm implementation. Our work significantly speeds up the computation of the SSSP, not only with respect to a CPU-based version, but also with respect to other state-of-the-art GPU implementations based on Dijkstra. Both GPU implementations have been evaluated using the latest NVIDIA architectures. The graphs chosen as input sets vary in nature, size, and fan-out degree, in order to evaluate the behavior of the algorithms for different data classes. Additionally, we have enhanced our GPU algorithm implementation using two optimization techniques: a proper choice of threadblock size, and the modification of the GPU L1 cache memory state of NVIDIA devices. These optimizations lead to performance improvements of up to 23% with respect to the non-optimized versions. In addition, we have made a platform comparison of several NVIDIA boards in order to distinguish which one is better for each class of graphs, depending on their features. Finally, we compare our results with an optimized sequential implementation of Dijkstra's algorithm included in the reference Boost library, obtaining an improvement ratio of up to 19x for some graph families, using less memory space. Ministerio de Economía y Competitividad (Spain) and ERDF program of the European Union: CAPAP-H5 network (TIN2014-53522-REDT), MOGECOPP project (TIN2011-25639); Junta de Castilla y León (Spain): ATLAS project (VA172A12-2); and the COST Program Action IC1305: NESUS
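
As a point of reference for the sequential baseline mentioned above, here is a minimal sketch of Dijkstra's algorithm with a binary heap. It is neither the paper's GPU algorithm nor the Boost implementation; the adjacency-dictionary representation and the function name `dijkstra` are assumptions made for illustration, and edge weights are assumed to be non-negative.

```python
import heapq


def dijkstra(adjacency, source):
    """Return the shortest distance from source to every reachable vertex.

    adjacency maps a vertex to a list of (neighbor, weight) pairs.
    Assumes non-negative edge weights.
    """
    dist = {source: 0}
    heap = [(0, source)]  # (tentative distance, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale entry: a shorter path to u was already settled
        for v, weight in adjacency.get(u, []):
            candidate = d + weight
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


if __name__ == "__main__":
    graph = {
        "a": [("b", 4), ("c", 1)],
        "c": [("b", 2)],
        "b": [("d", 5)],
    }
    print(dijkstra(graph, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 8}
```

The heap settles one vertex at a time, which is inherently sequential; GPU approaches typically relax many edges per step instead, so the fan-out degree of the input graph has a direct effect on how much parallel work is available.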

    Antimicrobial resistance among migrants in Europe: a systematic review and meta-analysis

    BACKGROUND: Rates of antimicrobial resistance (AMR) are rising globally and there is concern that increased migration is contributing to the burden of antibiotic resistance in Europe. However, the effect of migration on the burden of AMR in Europe has not yet been comprehensively examined. Therefore, we did a systematic review and meta-analysis to identify and synthesise data for AMR carriage or infection in migrants to Europe to examine differences in patterns of AMR across migrant groups and in different settings. METHODS: For this systematic review and meta-analysis, we searched MEDLINE, Embase, PubMed, and Scopus with no language restrictions from Jan 1, 2000, to Jan 18, 2017, for primary data from observational studies reporting antibacterial resistance in common bacterial pathogens among migrants to 21 European Union-15 and European Economic Area countries. To be eligible for inclusion, studies had to report data on carriage or infection with laboratory-confirmed antibiotic-resistant organisms in migrant populations. We extracted data from eligible studies and assessed quality using piloted, standardised forms. We did not examine drug resistance in tuberculosis and excluded articles solely reporting on this parameter. We also excluded articles in which migrant status was determined by ethnicity, country of birth of participants' parents, or was not defined, and articles in which data were not disaggregated by migrant status. Outcomes were carriage of or infection with antibiotic-resistant organisms. We used random-effects models to calculate the pooled prevalence of each outcome. The study protocol is registered with PROSPERO, number CRD42016043681. FINDINGS: We identified 2274 articles, of which 23 observational studies reporting on antibiotic resistance in 2319 migrants were included. 
The pooled prevalence of any AMR carriage or AMR infection in migrants was 25·4% (95% CI 19·1-31·8; I² = 98%), including meticillin-resistant Staphylococcus aureus (7·8%, 4·8-10·7; I² = 92%) and antibiotic-resistant Gram-negative bacteria (27·2%, 17·6-36·8; I² = 94%). The pooled prevalence of any AMR carriage or infection was higher in refugees and asylum seekers (33·0%, 18·3-47·6; I² = 98%) than in other migrant groups (6·6%, 1·8-11·3; I² = 92%). The pooled prevalence of antibiotic-resistant organisms was slightly higher in high-migrant community settings (33·1%, 11·1-55·1; I² = 96%) than in migrants in hospitals (24·3%, 16·1-32·6; I² = 98%). We did not find evidence of high rates of transmission of AMR from migrant to host populations. INTERPRETATION: Migrants are exposed to conditions favouring the emergence of drug resistance during transit and in host countries in Europe. Increased antibiotic resistance among refugees and asylum seekers and in high-migrant community settings (such as refugee camps and detention facilities) highlights the need for improved living conditions, access to health care, and initiatives to facilitate detection of and appropriate high-quality treatment for antibiotic-resistant infections during transit and in host countries. Protocols for the prevention and control of infection and for antibiotic surveillance need to be integrated in all aspects of health care, which should be accessible for all migrant groups, and should target determinants of AMR before, during, and after migration. FUNDING: UK National Institute for Health Research Imperial Biomedical Research Centre, Imperial College Healthcare Charity, the Wellcome Trust, and UK National Institute for Health Research Health Protection Research Unit in Healthcare-associated Infections and Antimicrobial Resistance at Imperial College London

    Measurement of nuclear modification factors of ϒ(1S), ϒ(2S), and ϒ(3S) mesons in PbPb collisions at √sNN = 5.02 TeV

    The cross sections for ϒ(1S), ϒ(2S), and ϒ(3S) production in lead-lead (PbPb) and proton-proton (pp) collisions at √sNN = 5.02 TeV have been measured using the CMS detector at the LHC. The nuclear modification factors, RAA, derived from the PbPb-to-pp ratio of yields for each state, are studied as functions of meson rapidity and transverse momentum, as well as PbPb collision centrality. The yields of all three states are found to be significantly suppressed, and compatible with a sequential ordering of the suppression, RAA(ϒ(1S)) > RAA(ϒ(2S)) > RAA(ϒ(3S)). The suppression of ϒ(1S) is larger than that seen at √sNN = 2.76 TeV, although the two are compatible within uncertainties. The upper limit on the RAA of ϒ(3S) integrated over pT, rapidity and centrality is 0.096 at 95% confidence level, which is the strongest suppression observed for a quarkonium state in heavy ion collisions to date. © 2019 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). Funded by SCOAP3. Peer reviewed
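
For orientation, the nuclear modification factor is conventionally written as a normalized ratio of the PbPb yield to the pp cross section. The form below is the common textbook convention and is an assumption here, since the abstract does not spell out the normalization; $\langle T_{AA} \rangle$ denotes the average nuclear overlap function for the centrality class considered.

```latex
\[
R_{AA}(p_T, y)
  = \frac{1}{\langle T_{AA} \rangle}\,
    \frac{\mathrm{d}^2 N_{AA} / \mathrm{d}p_T\,\mathrm{d}y}
         {\mathrm{d}^2 \sigma_{pp} / \mathrm{d}p_T\,\mathrm{d}y}
\]
```

Under this convention $R_{AA} = 1$ means that a PbPb collision behaves like a superposition of independent nucleon-nucleon collisions, and $R_{AA} < 1$ indicates suppression, so an upper limit of 0.096 for ϒ(3S) corresponds to a yield below one tenth of the scaled pp expectation.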