    On the use of many-core machines for the acceleration of a mesh truncation technique for FEM

    The finite element method (FEM) has been used for years for radiation problems in electromagnetism. Problems of this kind require mesh truncation techniques, which may demand substantial computational resources. In fact, electrically large radiation problems can only be tackled using massively parallel computational resources. Different types of multi-core machines are commonly employed in diverse fields of science to accelerate a number of applications. However, properly managing their computational resources is a very challenging task. On the one hand, we present a hybrid message passing interface (MPI) + OpenMP-based acceleration of a mesh truncation technique included in a FEM code for electromagnetism, run on a high-performance computing cluster equipped with 140 compute nodes. Results show that we obtain about 85% of the theoretical maximum speedup of the machine. On the other hand, a graphics processing unit has been used to accelerate one of the parts that presents high fine-grain parallelism. This work has been financially supported by the TEC2016-80386-P, TIN2017-82972-R and CAM S2013/ICE-3004 projects and by “Ayudas para contratos predoctorales de Formación del Profesorado Universitario FPU”.

    Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures

    Achieving maximum parallel performance on multi-core CPUs and many-core GPUs is a challenging task that depends on multiple factors. These include, for example, the number and granularity of the computations and the use of the device memories. In this paper, we assess those factors by evaluating and comparing different parallelizations of the same problem on a multiprocessor containing a CPU with 40 cores and four P100 GPUs with Pascal architecture. As a case study, we use the convolutional operation behind a non-standard finite element mesh truncation technique in the context of open-region electromagnetic wave propagation problems. A total of six parallel algorithms implemented using OpenMP and CUDA have been used to carry out the comparison by leveraging the same levels of parallelism on both types of platforms. Three of the algorithms are presented for the first time in this paper, including a multi-GPU method, and two others are improved versions of algorithms previously developed by some of the authors. This paper presents a thorough experimental evaluation of the parallel algorithms on a radar cross-section prediction problem. Results show that the performance obtained on the GPU clearly exceeds that obtained on the CPU, much more so if multiple GPUs are used to distribute both data and computations. Speedups close to 30 have been obtained on the CPU, while speedups larger than 250 have been achieved with the multi-GPU version. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work has been supported by the Spanish Government through PID2020-113656RB-C21 and PID2019-106455GB-C21, by the Valencian Regional Government through PROMETEO/2019/109, and by the Regional Government of Madrid through the project MIMACUHSPACE-CM-UC3M.

    Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures

    Achieving maximum parallel performance on multi-core CPUs and many-core GPUs is a challenging task that depends on multiple factors. These include, for example, the number and granularity of the computations and the use of the device memories. In this paper, we assess those factors by evaluating and comparing different parallelizations of the same problem on a multiprocessor containing a CPU with 40 cores and four P100 GPUs with Pascal architecture. As a case study, we use the convolutional operation behind a non-standard finite element mesh truncation technique in the context of open-region electromagnetic wave propagation problems. A total of six parallel algorithms implemented using OpenMP and CUDA have been used to carry out the comparison by leveraging the same levels of parallelism on both types of platforms. Three of the algorithms are presented for the first time in this paper, including a multi-GPU method, and two others are improved versions of algorithms previously developed by some of the authors. This paper presents a thorough experimental evaluation of the parallel algorithms on a radar cross-section prediction problem. Results show that the performance obtained on the GPU clearly exceeds that obtained on the CPU, much more so if multiple GPUs are used to distribute both data and computations. Speedups close to 30 have been obtained on the CPU, while speedups larger than 250 have been achieved with the multi-GPU version. Funding for open access charge: CRUE-Universitat Jaume

    Efficient Numerical Solution of Large Scale Algebraic Matrix Equations in PDE Control and Model Order Reduction

    Matrix Lyapunov and Riccati equations are an important tool in mathematical systems theory. They are the key ingredients in balancing-based model order reduction techniques and linear quadratic regulator problems. For small and moderately sized problems, these equations are solved by techniques with at least cubic complexity, which prohibits their use in large-scale applications. Around the year 2000, solvers for large-scale problems were introduced. The basic idea there is to compute a low-rank decomposition of the square, dense solution matrix and thereby reduce the memory and computational complexity of the algorithms. In this thesis, efficiency-enhancing techniques for the low-rank alternating directions implicit (ADI) iteration-based solution of large-scale matrix equations are introduced and discussed. The applicability in the context of real-world systems is also demonstrated. The thesis is structured in seven central chapters. After the introduction, chapter 2 introduces the basic concepts and notation needed as fundamental tools for the remainder of the thesis. The next chapter then introduces a collection of test examples, ranging from easily scalable academic test systems to badly conditioned technical applications, which are used to demonstrate the features of the solvers. Chapters four and five describe the basic solvers and the modifications made to make them applicable to an even larger class of problems. The following two chapters treat the application of the solvers in the context of model order reduction and linear quadratic optimal control of PDEs. The final chapter then presents the extensive numerical testing undertaken with the solvers proposed in the prior chapters. Some conclusions and an appendix complete the thesis.
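
To make the low-rank idea in the abstract above concrete, here is a minimal sketch of a textbook low-rank ADI iteration for the Lyapunov equation A X + X Aᵀ = −B Bᵀ. It is not the thesis's solver: the function name `lr_adi_lyapunov`, the use of dense NumPy solves, and the restriction to real negative shifts reused cyclically are all assumptions made for illustration. The point is that only a thin factor Z with X ≈ Z Zᵀ is built, never the dense n × n solution.

```python
import numpy as np


def lr_adi_lyapunov(A, B, shifts, n_iter):
    """Low-rank ADI sketch for A X + X A^T = -B B^T (A stable).

    Returns a factor Z such that X is approximated by Z @ Z.T.
    Assumes all shifts are real and negative (illustrative choice).
    """
    n = A.shape[0]
    identity = np.eye(n)

    # First block: V_1 = sqrt(-2 p_1) (A + p_1 I)^{-1} B
    p_prev = shifts[0]
    V = np.sqrt(-2.0 * p_prev) * np.linalg.solve(A + p_prev * identity, B)
    blocks = [V]

    for j in range(1, n_iter):
        p = shifts[j % len(shifts)]  # reuse the shifts cyclically
        # V_j = sqrt(p_j / p_{j-1}) (I - (p_j + p_{j-1}) (A + p_j I)^{-1}) V_{j-1}
        V = np.sqrt(p / p_prev) * (
            V - (p + p_prev) * np.linalg.solve(A + p * identity, V)
        )
        blocks.append(V)
        p_prev = p

    # Each iteration appends as many columns as B has, so Z stays thin.
    return np.hstack(blocks)
```

For a small symmetric test matrix with eigenvalues between −5 and −1, a handful of logarithmically spaced shifts in that interval is typically enough to drive the residual of the Lyapunov equation close to machine precision. In a large-scale setting, the dense solves would be replaced by sparse factorizations, and the shift selection is a research topic of its own.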

    Dynamic modeling of wind turbines: experimental tuning of a multibody model

    This work is part of a research project funded by the Italian Ministry of the University and Research (MIUR), under the call for "National Interest Research Projects 2015 (PRIN 2015)", titled "Smart Optimized Fault Tolerant WIND turbines (SOFTWIND)". Within this project, the research unit of the University of Perugia (UniPG) aims to develop methodologies for dynamic modeling and simulation, and for fatigue behavior evaluation, of wind turbines as a whole. These methodologies are intended to predict the life of generic wind turbines and to provide fundamental parameters for optimizing their control, with the aim of reducing the failures of these machines. In the present paper, a small turbine developed at the Department of Engineering of the University of Perugia is analyzed. The multibody modeling technique adopted and the experimental activity conducted in the UniPG wind tunnel, needed for tuning the model, are described. The analysis of both model behavior and experimental data has allowed the definition of a robust multibody modeling technique that adopts a freeware code (NREL - FAST), widely considered a reference in this field. The quality of the model supports the capability of the simulation environment to analyze the real load scenario and the fatigue behavior of this kind of device.

    Development and applications of the Finite Point Method to compressible aerodynamics problems

    This work deals with the development and application of the Finite Point Method (FPM) to compressible aerodynamics problems. The research focuses mainly on investigating the capabilities of the meshless technique to address practical problems, one of the most outstanding issues in meshless methods. The FPM spatial approximation is studied first, with emphasis on aspects of the methodology that can be improved to increase its robustness and accuracy. Suitable ranges for setting the relevant approximation parameters and the performance likely to be attained in practice are determined. An automatic procedure to adjust the approximation parameters is also proposed to simplify the application of the method, reducing problem- and user-dependence without affecting the flexibility of the meshless technique. The discretization of the flow equations is carried out following well-established approaches, but drawing on the meshless character of the methodology. In order to meet the requirements of practical applications, the procedures are designed and implemented with emphasis on robustness and efficiency (a simplification of the basic FPM technique is proposed to this end). The flow solver is based on an upwind spatial discretization of the convective fluxes (using the approximate Riemann solver of Roe) and an explicit time integration scheme. Two additional artificial diffusion schemes are also proposed to suit those cases of study in which computational cost is a major concern. The performance of the flow solver is evaluated in order to determine the potential of the meshless approach. The accuracy, computational cost and parallel scalability of the method are studied in comparison with a conventional FEM-based technique. Finally, practical applications and extensions of the flow solution scheme are presented. The examples provided are intended not only to show the capabilities of the FPM, but also to exploit meshless advantages.
Automatic h-adaptive procedures, moving domain and fluid-structure interaction problems, as well as a preliminary approach to solving high-Reynolds viscous flows, are a sample of the topics explored. All in all, the results obtained are satisfactorily accurate and competitive in terms of computational cost (compared with a similar mesh-based implementation). This indicates that meshless advantages can be exploited efficiently and constitutes a good starting point towards more challenging applications.

    Extended analytical charge modeling for permanent-magnet based devices : practical application to the interactions in a vibration isolation system

    This thesis researches the analytical surface charge modeling technique, which provides a fast, mesh-free and accurate description of complex unbounded electromagnetic problems. To date, it has scarcely been used to design passive and active permanent-magnet devices, since ready-to-use equations were still limited to a few domain areas. Although publications available in the literature have demonstrated the potential of surface-charge modeling, they have only scratched the surface of its application domain. The research presented in this thesis proposes novel, ready-to-use analytical equations for force, stiffness and torque. The analytical force equations for cuboidal permanent magnets are now applicable to any magnetization vector combination and any relative position. Symbolically derived stiffness equations directly provide the analytical 3 × 3 stiffness matrix solution. Furthermore, analytical torque equations are introduced that allow for an arbitrary reference point, hence a direct torque calculation on any assembly of cuboidal permanent magnets. Some topics, such as the analytical calculation of the force and torque for rotated magnets and extensions to the field description of unconventionally shaped magnets, are outside the scope of this thesis and are recommended for further research. A worldwide first permanent-magnet-based, high-force and low-stiffness vibration isolation system has been researched and developed using this advanced modeling technique. This one-of-a-kind 6-DoF vibration isolation system consumes a minimal amount of energy (about 1 W or less) and exploits its electromagnetic nature by maximizing the isolation bandwidth (> 700 Hz). The resulting system has its resonance > 1 Hz with a -2 dB per decade acceleration slope. It behaves near-linearly throughout its entire 6-DoF working range, which allows for uncomplicated control structures. Its position accuracy is around 4 µm, which is close to the sensor’s theoretical noise level of 1 µm.
The extensively researched passive (no energy consumption) permanent-magnet-based gravity compensator forms the magnetic heart of this vibration isolation system. It combines a 7.1 kN vertical force with < 10 kN/m stiffness in all six degrees of freedom. These contradictory requirements are extremely challenging and require the extensive research into gravity compensator topologies that is presented in this thesis. The resulting cross-shaped topology with vertical airgaps has been filed as a European patent. Experiments have illustrated the influence of the ambient temperature on the magnetic behavior, 1.7‰/K or 12 N/K, respectively. The gravity compensator has two integrated voice coil actuators that are designed to exhibit a high force and low power consumption (a steepness of 625 N²/W and a force constant of 31 N/A) within the given current and voltage constraints. Three of these vibration isolators, each with a passive 6-DoF gravity compensator and integrated 2-DoF actuation, are able to stabilize the six degrees of freedom. The experimental results demonstrate the feasibility of passive magnet-based gravity compensation for an advanced, high-force vibration isolation system. Its modular topology enables easy force and stiffness scaling. Overall, the research presented in this thesis shows the high potential of this new class of electromagnetic devices for vibration isolation purposes or other applications that are demanding in terms of force, stiffness and energy consumption. As for any new class of devices, there are still some topics that require further study before this design can be implemented in the next generation of vibration isolation systems. Examples of these topics are the tunability of the gravity compensator’s force and a reduction of magnetic flux leakage.
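
As a quick consistency check on the two temperature figures quoted in the abstract above, and taking the relative coefficient to be 1.7 per mille per kelvin (an assumption about the unit), multiplying it by the 7.1 kN vertical force gives roughly the absolute sensitivity of 12 N/K:

```latex
\[
  7.1\,\mathrm{kN} \times 1.7 \times 10^{-3}\,\mathrm{K^{-1}}
  = 7100\,\mathrm{N} \times 0.0017\,\mathrm{K^{-1}}
  \approx 12.1\,\mathrm{N/K}
\]
```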

    Development and applications of the finite point method to compressible aerodynamics problems

    This work deals with the development and application of the Finite Point Method (FPM) to compressible aerodynamics problems. The research focuses mainly on investigating the capabilities of the meshless technique to address practical problems, one of the most outstanding issues in meshless methods. The FPM spatial approximation is studied first, with emphasis on aspects of the methodology that can be improved to increase its robustness and accuracy. Suitable ranges for setting the relevant approximation parameters and the performance likely to be attained in practice are determined. An automatic procedure to adjust the approximation parameters is also proposed to simplify the application of the method, reducing problem- and user-dependence without affecting the flexibility of the meshless technique. The discretization of the flow equations is carried out following well-established approaches, but drawing on the meshless character of the methodology. In order to meet the requirements of practical applications, the procedures are designed and implemented with emphasis on robustness and efficiency (a simplification of the basic FPM technique is proposed to this end). The flow solver is based on an upwind spatial discretization of the convective fluxes (using the approximate Riemann solver of Roe) and an explicit time integration scheme. Two additional artificial diffusion schemes are also proposed to suit those cases of study in which computational cost is a major concern. The performance of the flow solver is evaluated in order to determine the potential of the meshless approach. The accuracy, computational cost and parallel scalability of the method are studied in comparison with a conventional FEM-based technique. Finally, practical applications and extensions of the flow solution scheme are presented. The examples provided are intended not only to show the capabilities of the FPM, but also to exploit meshless advantages.
Automatic h-adaptive procedures, moving domain and fluid-structure interaction problems, as well as a preliminary approach to solving high-Reynolds viscous flows, are a sample of the topics explored. All in all, the results obtained are satisfactorily accurate and competitive in terms of computational cost (compared with a similar mesh-based implementation). This indicates that meshless advantages can be exploited efficiently and constitutes a good starting point towards more challenging applications.

    Numerical solutions of differential equations on FPGA-enhanced computers

    Conventionally, to speed up scientific or engineering (S&E) computation programs on general-purpose computers, one may elect to use faster CPUs, more memory, systems with more efficient (though complicated) architecture, better software compilers, or even coding in assembly language. With the emergence of Field Programmable Gate Array (FPGA) based Reconfigurable Computing (RC) technology, numerical scientists and engineers now have another option: using FPGA devices as core components to address their computational problems. The hardware-programmable, low-cost, but powerful “FPGA-enhanced computer” has now become an attractive approach for many S&E applications. A new computer architecture model for FPGA-enhanced computer systems and its detailed hardware implementation are proposed for accelerating the solution of computationally demanding and data-intensive numerical PDE problems. New FPGA-optimized algorithms and methods for rapid execution of representative numerical methods such as Finite Difference Methods (FDM) and Finite Element Methods (FEM) are designed, analyzed, and implemented on it. Linear wave equations based on seismic data processing applications are adopted as the target PDE problems to demonstrate the effectiveness of this new computer model. Their sustained computational performance is compared with that of pure software programs running on commodity CPU-based general-purpose computers. Quantitative analysis is performed from a hierarchical set of aspects: customized or extraordinary computer arithmetic or function units, a compact but flexible system architecture and memory hierarchy, and hardware-optimized numerical algorithms or methods that may be inappropriate for conventional general-purpose computers. The desirable property of in-system hardware reconfigurability of the new system is emphasized, with the aim of effectively accelerating the execution of complex multi-stage numerical applications.
Methodologies for accelerating the target PDE problems, as well as other numerical PDE problems such as heat equations and Laplace equations, using programmable hardware resources are presented in conclusion, which suggests that the proposed FPGA-enhanced computers have broad applicability.
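
For readers unfamiliar with the kind of kernel the abstract above refers to, the following is a minimal software sketch of an explicit finite difference scheme for the 1D linear wave equation u_tt = c² u_xx with fixed ends. It is a plain NumPy illustration, not the FPGA implementation described in the work: the function name `solve_wave_1d`, the sine initial condition, the zero initial velocity and the default grid parameters are all assumptions chosen so the result can be compared with the known standing-wave solution.

```python
import numpy as np


def solve_wave_1d(nx=101, courant=0.5, c=1.0, t_end=1.0):
    """Leapfrog finite differences for u_tt = c^2 u_xx on [0, 1].

    Boundary values are held at zero, the initial shape is sin(pi x),
    and the initial velocity is zero (illustrative assumptions).
    Returns the grid, the final solution and the final time reached.
    """
    x = np.linspace(0.0, 1.0, nx)
    dx = x[1] - x[0]
    dt = courant * dx / c  # stable for courant <= 1
    n_steps = int(round(t_end / dt))
    c2 = courant ** 2

    u_prev = np.sin(np.pi * x)

    # First step from a Taylor expansion with zero initial velocity.
    u = u_prev.copy()
    u[1:-1] = u_prev[1:-1] + 0.5 * c2 * (
        u_prev[2:] - 2.0 * u_prev[1:-1] + u_prev[:-2]
    )

    # Remaining steps: three-point stencil in space, two levels in time.
    for _ in range(n_steps - 1):
        u_next = np.zeros_like(u)  # boundaries stay at zero
        u_next[1:-1] = (
            2.0 * u[1:-1] - u_prev[1:-1]
            + c2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])
        )
        u_prev, u = u, u_next

    return x, u, n_steps * dt
```

Each interior update touches only three neighbouring values from the current time level and one from the previous level. This regular, local data access pattern is what makes stencil computations of this kind natural candidates for hardware pipelines.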