357 research outputs found

    On the equivalence between the Scheduled Relaxation Jacobi method and Richardson's non-stationary method

    The Scheduled Relaxation Jacobi (SRJ) method is an extension of the classical Jacobi iterative method to solve linear systems of equations (Au=b) associated with elliptic problems. It inherits the robustness of the Jacobi method and accelerates its convergence rate by computing a set of P relaxation factors that result from a minimization problem. In a typical SRJ scheme, this set of factors is employed in cycles of M consecutive iterations until a prescribed tolerance is reached. We present the analytic form of the optimal set of relaxation factors for the case in which all of them are strictly different, and find that the resulting algorithm is equivalent to a non-stationary generalized Richardson's method where the matrix of the system of equations is preconditioned by multiplying it by D=diag(A). Our method to estimate the weights has the advantage that the explicit computation of the maximum and minimum eigenvalues of the matrix A (or the corresponding iteration matrix of the underlying weighted Jacobi scheme) is replaced by the (much easier) calculation of the maximum and minimum frequencies derived from a von Neumann analysis of the continuous elliptic operator. This set of weights is also the optimal one for the general problem, resulting in the fastest convergence of all possible SRJ schemes for a given grid structure. The amplification factor of the method can be found analytically and allows for the exact estimation of the number of iterations needed to achieve a desired tolerance. We also show that, with the set of weights computed for the optimal SRJ scheme for a fixed cycle size, it is possible in some cases to estimate numerically the optimal value of the parameter ω in the Successive Overrelaxation (SOR) method. Finally, we demonstrate with practical examples that our method also works very well for Poisson-like problems in which a high-order discretization of the Laplacian operator is employed (e.g., a 9- or 17-point discretization).
This is of interest since these discretizations do not yield consistently ordered matrices A and, hence, the theory of Young cannot be used to predict the optimal value of the SOR parameter. Furthermore, the optimal SRJ schemes deduced here are advantageous over existing SOR implementations for high-order discretizations of the Laplacian operator inasmuch as they do not need to resort to multi-coloring schemes for their parallel implementation.
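
The cycle of relaxation factors described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes the 1D Poisson model problem with homogeneous Dirichlet boundaries, takes the extreme eigenvalues of D^{-1}A from their known closed form for that model problem (rather than from a von Neumann analysis), and uses the standard Chebyshev-root weights of non-stationary Richardson iteration. The names `chebyshev_weights` and `scheduled_jacobi`, the grid size N=31 and the cycle length M=8 are all illustrative choices.

```python
import numpy as np

def chebyshev_weights(lam_min, lam_max, M):
    """Relaxation factors from the roots of the degree-M Chebyshev polynomial
    mapped onto [lam_min, lam_max], the assumed spectrum of D^{-1} A."""
    i = np.arange(1, M + 1)
    center = 0.5 * (lam_max + lam_min)
    half_width = 0.5 * (lam_max - lam_min)
    return 1.0 / (center - half_width * np.cos(np.pi * (2 * i - 1) / (2 * M)))

def scheduled_jacobi(A, b, weights, tol=1e-8, max_iter=100_000):
    """Weighted Jacobi sweeps u <- u + w * D^{-1} (b - A u), cycling through
    `weights`, until the relative residual drops below `tol`."""
    d = np.diag(A)
    u = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    for it in range(max_iter):
        r = b - A @ u
        if np.linalg.norm(r) <= tol * b_norm:
            return u, it
        u = u + weights[it % len(weights)] * r / d
    return u, max_iter

# Model problem (assumption): -u'' = 1 on (0, 1), u(0) = u(1) = 0, N interior points.
N = 31
h = 1.0 / (N + 1)
A = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
b = h**2 * np.ones(N)

# Extreme eigenvalues of D^{-1} A for this tridiagonal matrix (closed form).
theta = np.pi / (2 * (N + 1))
lam_min, lam_max = 2 * np.sin(theta) ** 2, 2 * np.cos(theta) ** 2

u_jac, it_jac = scheduled_jacobi(A, b, np.array([1.0]))  # plain Jacobi
u_cj, it_cj = scheduled_jacobi(A, b, chebyshev_weights(lam_min, lam_max, M=8))
print("Jacobi iterations:", it_jac, "| scheduled (M=8) iterations:", it_cj)
```

The weights are applied here in their natural order, which is adequate for a short cycle; for long cycles, non-stationary Richardson schemes are known to be sensitive to the ordering of the weights in finite-precision arithmetic, so a production code would likely reorder them.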

    Scheduled Relaxation Jacobi method: Improvements and applications

    Elliptic partial differential equations (ePDEs) appear in a wide variety of areas of mathematics, physics and engineering. Typically, ePDEs must be solved numerically, which sets an ever-growing demand for efficient and highly parallel algorithms to tackle their computational solution. The Scheduled Relaxation Jacobi (SRJ) is a promising class of methods, atypical for combining simplicity and efficiency, that has been recently introduced for solving linear Poisson-like ePDEs. The SRJ methodology relies on computing the appropriate parameters of a multilevel approach with the goal of minimizing the number of iterations needed to cut down the residuals below specified tolerances. The efficiency in the reduction of the residual increases with the number of levels employed in the algorithm. Applying the original methodology to compute the algorithm parameters with more than 5 levels notably hinders obtaining optimal SRJ schemes, as the mixed (non-linear) algebraic-differential system of equations from which they result becomes notably stiff. Here we present a new methodology for obtaining the parameters of SRJ schemes that overcomes the limitations of the original algorithm and provides parameters for SRJ schemes with up to 15 levels and resolutions of up to 2^15 points per dimension, allowing for acceleration factors of several hundred with respect to the Jacobi method for typical resolutions and, in some high-resolution cases, close to 1000. Most of the success in finding optimal SRJ schemes with more than 10 levels is based on an analytic reduction of the complexity of the previously mentioned system of equations. Furthermore, we extend the original algorithm to apply it to certain systems of non-linear ePDEs.

    Efficient GPU implementation of a Boltzmann‑Schrödinger‑Poisson solver for the simulation of nanoscale DG MOSFETs

    81–102, 2019) describes an efficient and accurate solver for nanoscale DG MOSFETs through a deterministic Boltzmann-Schrödinger-Poisson model with seven electron–phonon scattering mechanisms on a hybrid parallel CPU/GPU platform. The transport computational phase, i.e. the time integration of the Boltzmann equations, was ported to the GPU using CUDA extensions, but the computation of the system's eigenstates, i.e. the solution of the Schrödinger-Poisson block, was parallelized only using OpenMP due to its complexity. This work fills the gap by describing a GPU port of the solver for the Schrödinger-Poisson block. The new proposal implements on GPU a Scheduled Relaxation Jacobi method to solve the sparse linear systems that arise from the 2D Poisson equation. The 1D Schrödinger equation is solved on GPU by adapting a multi-section iteration and the Newton-Raphson algorithm to approximate the energy levels, and the Inverse Power Iterative Method is used to approximate the wave vectors. We want to stress that this solver for the Schrödinger-Poisson block can be thought of as a module independent of the transport phase (Boltzmann) and can be used in solvers that employ different levels of description for the electrons; it is therefore of particular interest because it can be adapted to other macroscopic, hence faster, solvers for confined devices exploited at industrial level. Project PID2020-117846GB-I00 funded by the Spanish Ministerio de Ciencia e Innovación. Project A-TIC-344-UGR20 funded by European Regional Development Fund.

    Improved Numerical Methods for Elliptic Problems in Astrophysics

    Elliptic partial differential equations appear in a wide variety of areas of mathematics, physics and engineering. They are of particular interest in Astrophysics, where they appear, e.g., when computing the gravitational potential, in the solution of the Grad-Shafranov equation for force-free magnetospheres, when imposing divergence-free constraints in the numerical integration of the MHD (magnetohydrodynamics) equations, or when solving the constraint equations in General Relativity. Typically, elliptic equations must be solved numerically, which sets an ever-growing demand for efficient and highly parallel algorithms to tackle their computational solution. The Scheduled Relaxation Jacobi (SRJ) is a promising class of methods, atypical for combining simplicity and efficiency, that has been recently introduced for solving linear Poisson-like elliptic equations. It is an extension of the classical Jacobi iterative method to solve linear systems of equations (Au=b), and it inherits the robustness of that method. Its methodology relies on computing the appropriate parameters of a multilevel approach with the goal of minimizing the number of iterations needed to cut down the residuals below specified tolerances. The efficiency in the reduction of the residual increases with the number of levels employed in the algorithm. Applying the original methodology to compute the algorithm parameters with more than 5 levels notably hinders obtaining optimal schemes, as the mixed (non-linear) algebraic-differential system of equations from which they result becomes notably stiff.
On the one hand, we present a new methodology for obtaining the parameters of Scheduled Relaxation Jacobi schemes that overcomes the limitations of the original algorithm and provides parameters for these schemes with up to 15 levels and resolutions of up to 2^15 points per dimension, allowing for acceleration factors of several hundred with respect to the Jacobi method for typical resolutions and, in some high-resolution cases, close to 1000. Most of the success in finding these optimal schemes with more than 10 levels is based on an analytic reduction of the complexity of the previously mentioned system of equations. Furthermore, we extend the original algorithm to apply it to certain systems of non-linear elliptic equations. On the other hand, in a typical Scheduled Relaxation Jacobi scheme, the set of relaxation factors is employed in cycles of M consecutive iterations until a prescribed tolerance is reached. We present the analytic form of the optimal set of relaxation factors for the case in which all of them are strictly different, and find that the resulting algorithm is equivalent to a non-stationary generalized Richardson's method where the matrix of the system of equations is preconditioned by multiplying it by D=diag(A). Our method to estimate the weights has the advantage that the explicit computation of the maximum and minimum eigenvalues of the matrix A (or the corresponding iteration matrix of the underlying weighted Jacobi scheme) is replaced by the (much easier) calculation of the maximum and minimum frequencies derived from a von Neumann analysis of the continuous elliptic operator. This set of weights is also the optimal one for the general problem, resulting in the fastest convergence of all possible Scheduled Relaxation Jacobi schemes for a given grid structure. We refer to it as the Chebyshev-Jacobi method.
The amplification factor of the method can be found analytically and allows for the exact estimation of the number of iterations needed to achieve a desired tolerance. We also show that, with the set of weights computed for the optimal SRJ scheme for a fixed cycle size, it is possible in some cases to estimate numerically the optimal value of the relaxation factor in the successive overrelaxation (SOR) method. We demonstrate with practical examples, with application in Astrophysics, that our method also works very well for Poisson-like problems in which a high-order discretization of the Laplacian operator is employed (e.g., a 9- or 17-point discretization). This is of interest since these discretizations do not yield consistently ordered matrices A and, hence, the theory of Young cannot be used to predict the optimal value of the SOR parameter. Furthermore, the optimal SRJ schemes deduced here are advantageous over existing SOR implementations for high-order discretizations of the Laplacian operator inasmuch as they do not need to resort to multi-coloring schemes for their parallel implementation. We present implementations of the Chebyshev-Jacobi method using pure MPI and a hybrid OpenMP/MPI approach, on both shared-memory and distributed-memory machines. Both show ideal speed-up. We also show how to reach a remarkable speed-up when solving elliptic partial differential equations with finite differences thanks to the joint use of the Chebyshev-Jacobi method with high-order discretizations and its parallel implementation on GPUs. Finally, we have tried to apply our methods beyond the realm of Astrophysics, though with limited success. In particular, we have addressed the problem of finding the normal modes of vibration of the human eye. This problem can be solved with an improved variant of the methodology presented here.
The improvement consists in extending the calculation of the optimal set of parameters to matrices that are not positive definite. Our ideas on how to proceed in this field are sketched in the outlook of this thesis.

    Spontaneous formation of fluid escape pipes from subsurface reservoirs

    Ubiquitous observations of channelised fluid flow in the form of pipes or chimney-like features in sedimentary sequences provide strong evidence for significant transient permeability-generation in the subsurface. Understanding the mechanisms and dynamics for spontaneous flow localisation into fluid conductive chimneys is vital for natural fluid migration and anthropogenic fluid and gas operations, and in waste sequestration. Yet no model exists that can predict how, when, or where these conduits form. Here we propose a physical mechanism and show that pipes and chimneys can form spontaneously through hydro-mechanical coupling between fluid flow and solid deformation. By resolving both fluid flow and shear deformation of the matrix in three dimensions, we predict fluid flux and matrix stress distribution over time. The pipes constitute efficient fluid pathways with permeability enhancement exceeding three orders of magnitude. We find that in essentially impermeable shale, vertical fluid migration rates in the high-permeability pipes or chimneys approach rates expected in permeable sandstones. This previously unidentified fluid focusing mechanism bridges the gap between observations and established conceptual models for overcoming and destroying assumed impermeable barriers. This mechanism therefore has a profound impact on assessing the evolution of leakage pathways in natural gas emissions, for reliable risk assessment for long-term subsurface waste storage, or CO2 sequestration

    Numerical relativity initial data for neutron star binaries and the hyperbolic relaxation method

    A new class of relaxation schemes for the solution of systems of elliptic partial differential equations is derived from a modification of the parabolic Jacobi relaxation scheme. The new scheme can be viewed as a formal embedding of the elliptic partial differential equations into a system of hyperbolic partial differential equations with damping. The hyperbolicity of the hyperbolic relaxation equations is examined and the implementation of the method in the hyperbolic relaxation code "bamps" is discussed. Furthermore, the convergence properties are investigated for simple model problems in terms of an analytical mode analysis and an experimental comparison with other numerical solution methods for elliptic partial differential equations. Subsequently, the hyperbolic relaxation scheme is applied to the initial data problem in numerical relativity for the case of massless scalar fields, single neutron stars and binary neutron stars. For binary neutron stars, a scheme that avoids adapted coordinates by formally extending the elliptic partial differential equations to regions outside the neutron stars is investigated. To improve the physical accuracy of initial data for neutron star binaries with spin, an attempt is made to take into account terms that are neglected in current state-of-the-art formalisms. An investigation of additional constraints originating from the neglected terms shows that the current formalism would actually impose unphysical requirements on the solution. Consequently, some inconsistencies in the current formalism are pointed out and a modified formalism fixing these inconsistencies is discussed.
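
The idea of embedding an elliptic equation into a damped hyperbolic one can be sketched on a toy problem. The Python example below is only an illustration under stated assumptions, not the scheme of the thesis or of the "bamps" code: it takes the 1D Poisson equation u'' = rho with zero Dirichlet boundaries, evolves the damped wave equation u_tt + eta u_t = u'' - rho with a simple semi-implicit Euler step until the elliptic residual is small, and picks the damping eta = 2π (roughly critical damping of the lowest mode on the unit interval) and the time step dt = h/2 by hand. The function name `hyperbolic_relaxation` and all parameter values are hypothetical.

```python
import numpy as np

def hyperbolic_relaxation(rho, h, eta, dt, tol=1e-8, max_steps=100_000):
    """Relax u'' = rho (zero Dirichlet boundaries) by evolving the damped
    wave equation u_tt + eta * u_t = u'' - rho to its stationary state."""
    u = np.zeros_like(rho)
    v = np.zeros_like(rho)          # v approximates the time derivative u_t
    rho_norm = np.linalg.norm(rho)
    for step in range(max_steps):
        padded = np.pad(u, 1)       # zero boundary values on both sides
        residual = (padded[:-2] - 2.0 * u + padded[2:]) / h**2 - rho
        if np.linalg.norm(residual) <= tol * rho_norm:
            return u, step
        v = v + dt * (residual - eta * v)   # update the velocity first
        u = u + dt * v                      # then advance u with the new velocity
    return u, max_steps

# Toy problem (assumption): u'' = -1 on (0, 1) with u(0) = u(1) = 0.
N = 31
h = 1.0 / (N + 1)
rho = -np.ones(N)
u_hyp, steps = hyperbolic_relaxation(rho, h, eta=2.0 * np.pi, dt=0.5 * h)

# Reference solution from a direct solve of the same finite-difference system.
L = (-2.0 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)) / h**2
u_ref = np.linalg.solve(L, rho)
print("steps:", steps, "| max deviation from direct solve:", np.abs(u_hyp - u_ref).max())
```

Once the velocity has decayed to zero, the stationary state of the evolution satisfies the discrete elliptic equation, which is why the loop monitors the elliptic residual rather than elapsed time.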