80 research outputs found

    Practical considerations for acoustic source localization in the IoT era: Platforms, energy efficiency, and performance

    The rapid development of the Internet of Things (IoT) has brought important changes in the way emerging acoustic signal processing applications are conceived. While traditional acoustic processing applications have been developed with high-throughput computing platforms equipped with expensive multichannel audio interfaces in mind, the IoT paradigm demands more flexible and energy-efficient systems. In this context, algorithms for source localization and ranging in wireless acoustic sensor networks can be considered an enabling technology for many IoT-based environments, including security, industrial, and health-care applications. This paper evaluates important aspects of the practical deployment of IoT systems for acoustic source localization. Recent systems-on-chip composed of low-power multicore processors, combined with a small graphics accelerator (or GPU), notably increase the computational capacity needed by intensive signal processing algorithms while partially retaining the appealing low power consumption of embedded systems. Different algorithms and implementations on several state-of-the-art platforms are discussed, analyzing important aspects such as the tradeoffs between performance, energy efficiency, and exploitation of parallelism under real-time constraints. This work was supported in part by the Post-Doctoral Fellowship from Generalitat Valenciana under Grant APOSTD/2016/069, in part by the Spanish Government under Grant TIN2014-53495-R, Grant TIN2015-65277-R, and Grant BIA2016-76957-C3-1-R, and in part by the Universidad Jaume I under Project UJI-B2016-20.

    Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures

    Achieving maximum parallel performance on multi-core CPUs and many-core GPUs is a challenging task that depends on multiple factors. These include, for example, the number and granularity of the computations and the use of the memories of the devices. In this paper, we assess those factors by evaluating and comparing different parallelizations of the same problem on a multiprocessor containing a CPU with 40 cores and four P100 GPUs with Pascal architecture. As a case study, we use the convolutional operation behind a non-standard finite element mesh truncation technique in the context of open region electromagnetic wave propagation problems. A total of six parallel algorithms implemented using OpenMP and CUDA have been used to carry out the comparison by leveraging the same levels of parallelism on both types of platforms. Three of the algorithms are presented for the first time in this paper, including a multi-GPU method, and two others are improved versions of algorithms previously developed by some of the authors. This paper presents a thorough experimental evaluation of the parallel algorithms on a radar cross-section prediction problem. Results show that the performance obtained on the GPU clearly exceeds that obtained on the CPU, much more so if multiple GPUs are used to distribute both data and computations. Speedups close to 30 have been obtained on the CPU, while the multi-GPU version has achieved speedups larger than 250. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work has been supported by the Spanish Government through PID2020-113656RB-C21 and PID2019-106455GB-C21, by the Valencian Regional Government through PROMETEO/2019/109, and by the Regional Government of Madrid through the project MIMACUHSPACE-CM-UC3M.

    GPU Acceleration of a Non-Standard Finite Element Mesh Truncation Technique for Electromagnetics

    The emergence of General Purpose Graphics Processing Units (GPGPUs) provides new opportunities to accelerate applications involving a large number of regular computations. However, properly leveraging the computational resources of graphics processors is a very challenging task. In this paper, we use this kind of device to parallelize FE-IIEE (Finite Element-Iterative Integral Equation Evaluation), a non-standard finite element mesh truncation technique introduced by two of the authors. This application is computationally very demanding due to the amount, size and complexity of the data involved in the procedure. Moreover, an efficient implementation becomes even more difficult if the parallelization has to maintain the complex workflow of the original code. The proposed CUDA implementation applies different optimization techniques to improve performance. These include leveraging the fastest memories of the GPU and increasing the granularity of the computations to reduce the impact of memory access. We have applied our parallel algorithm to two real radiation and scattering problems, demonstrating speedups higher than 140 on a state-of-the-art GPU. This work was supported in part by the Spanish Government under Grant TEC2016-80386-P, Grant TIN2017-82972-R, and Grant ESP2015-68245-C4-1-P, and in part by the Valencian Regional Government under Grant PROMETEO/2019/109.

    Comparison of parallel implementation strategies in GPU-accelerated System-on-Chip under proton irradiation

    Commercial off-the-shelf (COTS) systems-on-chip (SoCs) are becoming widespread in embedded systems. Many of them include a multicore central processing unit (CPU) and a high-end graphics processing unit (GPU). They combine high computational performance with low power consumption and flexible multilevel parallelism. This kind of device is also being considered for radiation environments where large amounts of data must be processed or compute-intensive applications must be executed. In this article, we compare three different strategies to perform matrix multiplication on the GPU of a Tegra TK1 SoC. Our aim is to analyze how the different uses of the GPU resources influence not only the computational performance of the algorithm, but also its radiation sensitivity. Radiation experiments with protons were performed to compare the behavior of the three strategies. Experimental results show that most of the errors force a reboot of the platform. The number of errors is directly related to how the algorithms use the internal memories of the GPU and increases with the matrix size. It is also related to the number of transactions with the global memory, which in our experiments is not affected by the radiation. Results show that the smallest cross section is obtained with the fastest algorithm, even though it uses the cores of the GPU more intensively. This work was supported in part by the Valencian Regional Government under Grant PROMETEO/2019/109, in part by Jaume I University under Project UJIB2019-36, and in part by the Spanish Ministry of Science and Innovation under Project PID2019-106455GB-C21 and Project PID2020-113656RB-C21.

    Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication

    Proceedings of: 31st European Symposium on Reliability of Electron Devices, Failure Physics and Analysis (ESREF 2020), Athens, Greece, 4th to 8th October 2020 (Virtual conference). System-on-Chip (SoC) devices can be composed of low-power multicore processors combined with a small graphics accelerator (or GPU), which offers a trade-off between computational capacity and low power consumption. In this work we use the LLFI-GPU fault injection tool on one of these devices to compare the sensitivity to soft errors of two different CUDA versions of a matrix multiplication benchmark. Specifically, we perform fault injection campaigns on a Jetson TK1 development kit, a board equipped with a SoC including an NVIDIA "Kepler" Graphics Processing Unit (GPU). We evaluate the effect of modifying the problem size and the thread-block size on the behaviour of the algorithms. Our results show that the block version of the matrix multiplication benchmark, which leverages the shared memory of the GPU, is not only faster than the element-wise version but also much more resilient to soft errors. We also use the cuda-gdb debugger to analyze the main causes of the crashes in the code due to soft errors. Our experiments show that most of the errors are due to accesses to invalid positions of the different memories of the GPU, which causes the block version to suffer a higher percentage of this kind of error. This work has been supported by the Spanish Government through TIN2017-82972-R and ESP2015-68245-C4-1-P, and by the Valencian Regional Government through PROMETEO/2019/109.
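
    The abstract above contrasts an element-wise and a block version of matrix multiplication. As a rough illustration only, the following plain-Python sketch shows the two access patterns on the CPU. It is not the authors' CUDA code: the function names and the `tile` parameter are hypothetical, and on a GPU each tile would typically be staged in shared memory by a thread block rather than visited by nested loops.

```python
def matmul_elementwise(a, b):
    """Compute each output element independently (one GPU thread per element, conceptually)."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


def matmul_blocked(a, b, tile=2):
    """Accumulate the product tile by tile (hypothetical tile size; assumption, not from the paper)."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # tile row of the output
        for j0 in range(0, m, tile):      # tile column of the output
            for p0 in range(0, k, tile):  # matching tiles along the shared dimension
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_elementwise(a, b))
    print(matmul_blocked(a, b, tile=2))
```

    Both functions return the same product; the blocked variant only changes the order in which operands are visited, which is the property that lets a GPU implementation reuse data held in fast on-chip memory.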

    On the performance of a GPU-based SoC in a distributed spatial audio system

    Many current system-on-chip (SoC) devices are composed of low-power multicore processors combined with a small graphics accelerator (or GPU), offering a trade-off between computational capacity and low power consumption. In this context, spatial audio methods such as wave field synthesis (WFS) can benefit from a distributed system composed of several SoCs that collaborate to tackle the high computational cost of rendering virtual sound sources. This paper evaluates important aspects of a distributed WFS implementation that runs over a network of Jetson Nano boards built around embedded GPU-based SoCs: computational performance, energy efficiency, and synchronization issues. Our results show that the maximum efficiency is obtained when the WFS system operates the GPU at 691.2 MHz, achieving 11 sources per watt. Synchronization experiments using the NTP protocol show that the maximum initial delay of 10 ms between nodes does not prevent us from achieving high spatial sound quality. This work has been supported by the Spanish Government through TIN2017-82972-R and ESP2015-68245-C4-1-P, the Valencian Regional Government through PROMETEO/2019/109, and the Universitat Jaume I through UJI-B2019-36. Belloch, JA.; Badía, JM.; Larios, DF.; Personal, E.; Ferrer Contreras, M.; Fuster Criado, L.; Lupoiu, M.... (2021). On the performance of a GPU-based SoC in a distributed spatial audio system. The Journal of Supercomputing (Online). 77(7):6920-6935. https://doi.org/10.1007/s11227-020-03577-4

    Drones and digitalization for site-specific management in maize (Drones y digitalización para el manejo localizado en maíz)

    The digital transformation of agriculture relies on technologies such as drone-based remote sensing, which makes it possible to monitor crop status at an unprecedented level of detail. This article describes fundamental aspects of this technology through a case study in maize, in which we evaluate the relationships between weed abundance and crop vigor and yield, with the aim of facilitating the application of precision agriculture strategies in the early stages of crop development. This work was funded by projects AGL2017-83325-C4-1R and AGL2017-83325-C4-2R (Agencia Estatal de Investigación and FEDER funds, EU). Clara Orno Badía's research was supported by a grant from the Sociedad Española de Malherbología (SEMh) for her Master's thesis.

    Financial Mathematics: Self-assessment and academic performance (Matemática Financiera: Autoevaluación y rendimiento académico)

    During the 2006-2007 academic year, a team of lecturers from the Department of Economic, Financial and Actuarial Mathematics at the Universidad de Barcelona, all involved in the Financial Mathematics course, saw the need to adapt materials and create new ways of improving learning, taking advantage of the Bologna plan. In our faculty the number of students has always been very high, and it is one of the main variables to take into account. In that year, the number of students taking courses related to Financial Mathematics rose to 3,328. Using Moodle, we have developed learning and self-assessment material consisting of a bank of 218 questions. Based on complete data from three academic years, from 2008-2009 to 2010-2011, this paper presents the results of the experience, which can be described as encouraging.

    Executive summary of the Consensus Document of the Spanish Society of Infectious Diseases and Clinical Microbiology (SEIMC) and of the Spanish Association of Surgeons (AEC) in antibiotic prophylaxis in surgery

    Antibiotic prophylaxis in surgery is one of the most effective measures for preventing surgical site infection, although its use is frequently inadequate and may even increase the risk of infection, toxicities and antimicrobial resistance. As a result of advances in surgical techniques and the emergence of multidrug-resistant organisms, the current guidelines for prophylaxis need to be revised. The Sociedad Española de Enfermedades Infecciosas (Spanish Society of Infectious Diseases and Clinical Microbiology) (SEIMC), together with the Asociación Española de Cirujanos (Spanish Association of Surgeons) (AEC), has revised and updated the recommendations for antibiotic prophylaxis in surgery to adapt them to each type of surgical intervention and to current epidemiology. This document gathers together the recommendations on antimicrobial prophylaxis in the various procedures, with doses, duration, prophylaxis in special patient groups, and in epidemiological settings of multidrug resistance, to facilitate standardized management and the safe, effective and rational use of antibiotics in elective surgery.

    And the students, what do they think? (Y los estudiantes, ¿qué opinan?)

    Students learn from and with us, but we can also learn from and with them. As teachers concerned about our students' learning, we sometimes ask ourselves questions that we do not dare to put to them, that they do not dare to answer, or that we never find the right moment to raise. In this article we gather students' opinions on different aspects of how our teaching is organized and delivered. The questions focus mainly on teaching methodologies, the assessment system, the use of study materials, timetables, career prospects, and the reasons for attending or not attending class. The information provided by the students has allowed us to reflect with them and can undoubtedly help us improve the organization of our courses, the content they cover, student motivation and, ultimately, the learning process. Work funded by the Unidad de Soporte Educativo of the Universidad Jaume I de Castellón.