1,305 research outputs found

    USRA/RIACS

    The Research Institute for Advanced Computer Science (RIACS) was established by the Universities Space Research Association (USRA) at the NASA Ames Research Center (ARC) on 6 June 1983. RIACS is privately operated by USRA, a consortium of universities with research programs in the aerospace sciences, under a cooperative agreement with NASA. The primary mission of RIACS is to provide research and expertise in computer science and scientific computing to support the scientific missions of NASA ARC. The research carried out at RIACS must change its emphasis from year to year in response to NASA ARC's changing needs and technological opportunities. A flexible scientific staff is provided through a university faculty visitor program, a postdoctoral program, and a student visitor program. This arrangement provides appropriate expertise and also introduces scientists outside NASA to NASA problems. A small group of core RIACS staff provides continuity and interacts with an ARC technical monitor and scientific advisory group to determine the RIACS mission. RIACS activities are reviewed and monitored by a USRA advisory council and the ARC technical monitor. Research at RIACS is currently being done in the following areas: Parallel Computing; Advanced Methods for Scientific Computing; Learning Systems; High Performance Networks and Technology; Graphics, Visualization, and Virtual Environments.

    Numerical solution of 3-D electromagnetic problems in exploration geophysics and its implementation on massively parallel computers

    The growing significance, technical development and use of electromagnetic (EM) methods in exploration geophysics have led to an increasing need for reliable and fast techniques for interpreting 3-D EM data sets acquired in complex geological environments. The first and most important step in creating an inversion method is the development of a solver for the forward problem. In order to create an efficient, reliable and practical 3-D EM inversion, it is necessary to have a 3-D EM modelling code that is highly accurate, robust and very fast. This thesis focuses precisely on this crucial and very demanding step in building a 3-D EM interpretation method. The thesis presents as its main contribution a highly accurate, robust, very fast and extremely scalable numerical method for 3-D EM modelling in geophysics that is based on finite elements (FE) and designed to run on massively parallel computing platforms. Because the FE approach supports completely unstructured tetrahedral meshes as well as local mesh refinements, the presented solver is able to represent complex geometries of subsurface structures very precisely and thus improve the solution accuracy and avoid misleading artefacts in images. Consequently, it can be successfully used in geological environments of arbitrary geometrical complexity. The parallel implementation of the method, which is based on domain decomposition and a hybrid MPI-OpenMP scheme, has proved to be highly scalable: the achieved speed-up is close to linear for more than a thousand processors. As a result, the code is able to deal very efficiently with extremely large problems, which may have hundreds of millions of degrees of freedom.
The importance of having this forward-problem solver lies in the fact that it is now possible to create a 3-D EM inversion that can deal with data obtained in extremely complex geological environments in a way that is realistic for practical use in industry. So far, no such imaging tool has been proposed, owing to a lack of efficient, parallel FE solutions as well as the limitations of efficient solvers based on finite differences. In addition, the thesis discusses physical, mathematical and numerical aspects and challenges of 3-D EM modelling, which have been studied during my research in order to properly design the presented software for EM field simulations on 3-D areas of the Earth. Through this work, a physical problem formulation based on the secondary Coulomb-gauged EM potentials has been validated, proving that it can be successfully used with the standard nodal FE method to give highly accurate numerical solutions. Also, this work has shown that Krylov subspace iterative methods are the best choice for solving the linear systems that arise after FE discretisation of the problem under consideration. More precisely, it has been found empirically that the best iterative method for this kind of problem is the biconjugate gradient stabilised method with an elaborate preconditioner. Since the most commonly used preconditioners proved to be either unable to improve the convergence of the implemented solvers to the desired extent, or impractical in the parallel context, I have proposed a preconditioning technique for Krylov methods that is based on algebraic multigrid. Tests for various problems with different conductivity structures and characteristics have shown that the new preconditioner greatly improves the convergence of different Krylov subspace methods, which significantly reduces the total execution time of the program and improves the solution quality. Furthermore, the preconditioner is very practical for parallel implementation.
Finally, it has been concluded that there are no restrictions on employing the classical parallel programming models, MPI and OpenMP, for parallelisation of the presented FE solver. Moreover, they have proved to be sufficient to give it excellent scalability.
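
As a rough illustration of the solver structure named in this abstract (a preconditioned biconjugate gradient stabilised iteration), the following minimal Python sketch may help. It is not the thesis code: the 3x3 matrix, the function names and the tolerance are hypothetical, and a simple Jacobi (diagonal) preconditioner is swapped in for the algebraic multigrid preconditioner the thesis proposes.

```python
# Minimal sketch of preconditioned BiCGStab on a small dense system.
# Assumption: a Jacobi (diagonal) preconditioner stands in for the
# algebraic multigrid preconditioner described in the abstract.

def mat_vec(A, x):
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def norm(u):
    return dot(u, u) ** 0.5

def jacobi_preconditioner(A):
    """Return a function that applies the inverse of diag(A)."""
    inv_diag = [1.0 / A[i][i] for i in range(len(A))]
    return lambda r: [d * r_i for d, r_i in zip(inv_diag, r)]

def bicgstab(A, b, precond, tol=1e-10, max_iter=200):
    """Solve A x = b; returns (x, iterations). Starts from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)            # residual b - A x for the zero initial guess
    r_hat = list(r)        # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    target = tol * norm(b)
    for iteration in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [r_i + beta * (p_i - omega * v_i) for r_i, p_i, v_i in zip(r, p, v)]
        y = precond(p)                       # preconditioned search direction
        v = mat_vec(A, y)
        alpha = rho_new / dot(r_hat, v)
        s = [r_i - alpha * v_i for r_i, v_i in zip(r, v)]
        if norm(s) <= target:                # converged after the half step
            return [x_i + alpha * y_i for x_i, y_i in zip(x, y)], iteration
        z = precond(s)
        t = mat_vec(A, z)
        omega = dot(t, s) / dot(t, t)        # stabilising step length
        x = [x_i + alpha * y_i + omega * z_i for x_i, y_i, z_i in zip(x, y, z)]
        r = [s_i - omega * t_i for s_i, t_i in zip(s, t)]
        if norm(r) <= target:
            return x, iteration
        rho = rho_new
    raise RuntimeError("BiCGStab did not converge")

# Hypothetical non-symmetric test system with known solution [1, 2, 3].
A = [[4.0, 1.0, 0.0],
     [2.0, 5.0, 1.0],
     [0.0, 1.0, 3.0]]
b = [6.0, 15.0, 11.0]
solution, iterations = bicgstab(A, b, jacobi_preconditioner(A))
```

In a production setting the preconditioner callable would wrap a multigrid cycle and the matrix-vector product would be distributed across processes; only the `precond` argument would need to change in this sketch.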

    The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

    The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Projects Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large-scale radar signature predictions. The EMCC/ARPA project consisted of three parts

    Discrete sensitivity derivatives of the Navier-Stokes equations with a parallel Krylov solver

    This paper solves an 'incremental' form of the sensitivity equations derived by differentiating the discretized thin-layer Navier-Stokes equations with respect to certain design variables of interest. The equations are solved with a parallel, preconditioned Generalized Minimal RESidual (GMRES) solver on a distributed-memory architecture. The 'serial' sensitivity analysis code is parallelized by using the Single Program Multiple Data (SPMD) programming model, domain decomposition techniques, and message-passing tools. Sensitivity derivatives are computed for low and high Reynolds number flows over a NACA 1406 airfoil on a 32-processor Intel Hypercube, and found to be identical to those computed on a single-processor Cray Y-MP. It is estimated that the parallel sensitivity analysis code has to be run on 40-50 processors of the Intel Hypercube in order to match the single-processor processing time of a Cray Y-MP.
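
To make the idea of an 'incremental' sensitivity solve concrete, here is a minimal Python sketch under stated assumptions. It treats the discrete sensitivity equation as (dR/dQ) Q' + dR/db = 0 and repeatedly corrects Q' using an approximate operator M in place of the exact Jacobian. The 2x2 Jacobian, the right-hand side and the choice of M as the Jacobian diagonal are all hypothetical; the paper itself uses a preconditioned GMRES solver, which this sketch does not reproduce.

```python
# Sketch of an incremental iteration for the sensitivity equation
#     J * dQ_db + dR_db = 0,
# where J = dR/dQ. Each step solves M * delta = -(J * dQ_db + dR_db)
# with a cheap approximate operator M (here: the diagonal of J, an
# assumption made for illustration only) and updates dQ_db += delta.

def incremental_sensitivity(J, dR_db, tol=1e-12, max_iter=500):
    n = len(dR_db)
    m_diag = [J[i][i] for i in range(n)]   # approximate operator M
    dQ_db = [0.0] * n
    for iteration in range(1, max_iter + 1):
        # Residual of the exact sensitivity equation at the current iterate.
        residual = [
            sum(J[i][j] * dQ_db[j] for j in range(n)) + dR_db[i]
            for i in range(n)
        ]
        if max(abs(r_i) for r_i in residual) <= tol:
            return dQ_db, iteration
        # Incremental correction computed with the approximate operator.
        dQ_db = [q_i - r_i / m_i for q_i, r_i, m_i in zip(dQ_db, residual, m_diag)]
    raise RuntimeError("incremental iteration did not converge")

# Hypothetical data: the exact answer is J^{-1} [1, 2] = [1/11, 7/11].
J = [[4.0, 1.0],
     [1.0, 3.0]]
dR_db = [-1.0, -2.0]
sensitivity, steps = incremental_sensitivity(J, dR_db)
```

Because the residual is always evaluated with the exact Jacobian, the converged derivatives do not depend on how crude M is; M only affects how many steps are needed.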

    Three-Dimensional Aerodynamic Design Optimization Using Discrete Sensitivity Analysis and Parallel Computing

    A hybrid automatic differentiation/incremental iterative method was implemented in the general purpose advanced computational fluid dynamics code (CFL3D Version 4.1) to yield a new code (CFL3D.ADII) that is capable of computing consistently discrete first order sensitivity derivatives for complex geometries. With the exception of unsteady problems, the new code retains all the useful features and capabilities of the original CFL3D flow analysis code. The superiority of the new code over a carefully applied method of finite-differences is demonstrated. A coarse grain, scalable, distributed-memory, parallel version of CFL3D.ADII was developed based on derivative stripmining. In this data-parallel approach, an identical copy of CFL3D.ADII is executed on each processor with different derivative input files. The effect of communication overhead on the overall parallel computational efficiency is negligible. However, the fraction of CFL3D.ADII duplicated on all processors has a significant impact on the computational efficiency. To reduce the large execution time associated with the sequential 1-D line search in gradient-based aerodynamic optimization, an alternative parallel approach was developed. The execution time of the new approach was reduced effectively to that of one flow analysis, regardless of the number of function evaluations in the 1-D search. The new approach was found to yield design results that are essentially identical to those obtained from the traditional sequential approach but in much less execution time. The parallel CFL3D.ADII and the parallel 1-D line search are demonstrated in shape improvement studies of a realistic High Speed Civil Transport (HSCT) wing/body configuration represented by over 100 design variables and 200,000 grid points in inviscid supersonic flow on the 16 node IBM SP2 parallel computer at the Numerical Aerospace Simulation (NAS) facility, NASA Ames Research Center.
In addition to making the handling of such a large problem possible, the use of parallel computation significantly reduced the overall execution time and turnaround time.
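
One plausible reading of the parallel 1-D line search is that all candidate step lengths are evaluated concurrently, so the wall-clock cost approaches that of a single function evaluation. The Python sketch below illustrates that reading only. The quadratic `objective` is a hypothetical stand-in for a flow analysis, the step list is arbitrary, and threads stand in for the separate processors of a distributed-memory machine.

```python
from concurrent.futures import ThreadPoolExecutor

def objective(x):
    # Hypothetical stand-in for one expensive flow analysis.
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def parallel_line_search(f, x, direction, steps, max_workers=4):
    """Evaluate f at x + a * direction for every a in steps concurrently
    and return the (step, value) pair with the lowest objective."""
    candidates = [
        [x_i + a * d_i for x_i, d_i in zip(x, direction)] for a in steps
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        values = list(pool.map(f, candidates))   # one evaluation per worker
    best = min(range(len(steps)), key=values.__getitem__)
    return steps[best], values[best]

start = [0.0, 0.0]
search_direction = [1.0, -0.5]
trial_steps = [0.25, 0.5, 1.0, 1.5]
best_step, best_value = parallel_line_search(
    objective, start, search_direction, trial_steps
)
```

With one worker per trial step, the elapsed time is governed by the slowest single evaluation rather than by the number of evaluations, which is the behaviour the abstract reports.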

    A parallel algorithm for deformable contact problems

    In the field of nonlinear computational solid mechanics, contact problems deal with the deformation of separate bodies which interact when they come into contact. Usually, these problems are formulated as constrained minimization problems which may be solved using optimization techniques such as the penalty method, Lagrange multipliers, the Augmented Lagrangian method, etc. This classical approach is based on node connectivities between the contacting bodies. These connectivities are created through the construction of contact elements introduced for the discretization of the contact interface, which incorporate the contact constraints in the global weak form. These methods are well known and widely used in the resolution of contact problems in engineering and science. As parallel computing platforms are nowadays widely available, solving large engineering problems on high performance computers is a real possibility for any engineer or researcher. Because of the memory and compute power that contact problems require, they are good candidates for parallel computation. Realistic industrial and scientific contact problems involve different physical domains and a large number of degrees of freedom, so algorithms designed to run efficiently on high performance computers are needed. Nevertheless, the parallelization of the numerical solution methods that arise from the classical optimization techniques and discretization approaches presents some drawbacks which must be considered. Mainly, for general contact cases where sliding occurs, the introduction of contact elements requires the mesh graph to be updated every certain number of time steps. From the point of view of the domain decomposition method for parallel resolution of numerical problems this is a major drawback because of its computational cost, since dynamic repartitioning must be done to redistribute the updated mesh graph to the different processors.
On the other hand, some of the optimization techniques dynamically modify the number of degrees of freedom in the problem by introducing Lagrange multipliers as unknowns. In this work we introduce a Dirichlet-Neumann type parallel algorithm for the numerical solution of nonlinear frictional contact problems, with a strong focus on its computational implementation. Among its main characteristics, there is no need to update the mesh graph during the simulation, as no contact elements are used. Also, no additional degrees of freedom are introduced into the system, since no Lagrange multipliers are required. In this algorithm the bodies in contact are treated separately, in a segregated way. The coupling between the contacting bodies is performed by transferring boundary conditions at the contact zone. From a computational point of view, this feature allows the use of a multi-code approach. Furthermore, the algorithm can be interpreted as a black-box method, as it solves each body separately, even with different computational codes. In addition, the contact algorithm proposed in this thesis can also be formulated as a general fixed-point solver for interface problems. This generalization gives us the theoretical basis to carry over and implement numerical techniques that were already developed and widely tested in the field of fluid-structure interaction (FSI) problems, especially those that ensure and accelerate convergence. We describe the parallel implementation of the proposed algorithm and analyze its parallel behaviour and performance in both validation and realistic test cases executed on HPC machines using several processors.
Postprint (published version)
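
The segregated, fixed-point structure described in this abstract can be sketched on a toy problem. In the Python example below, two one-dimensional elastic bodies (plain linear springs, a deliberate simplification with no friction and no unilateral contact condition) exchange boundary data at a shared interface: one body receives a displacement and returns a force, the other receives that force and returns a displacement. Aitken dynamic relaxation, a technique commonly used in FSI coupling, is used here as an assumed example of convergence acceleration; the abstract does not say which technique the thesis adopts. All names and stiffness values are hypothetical.

```python
def solve_dirichlet_body(u_interface, stiffness=2.0, u_push=1.0):
    """Body A: interface displacement is prescribed (Dirichlet data).
    Returns the force it transmits through the interface."""
    return stiffness * (u_push - u_interface)

def solve_neumann_body(force, stiffness=1.0):
    """Body B: interface force is prescribed (Neumann data).
    Returns the resulting interface displacement."""
    return force / stiffness

def dirichlet_neumann(u_start=0.0, omega_start=0.5, tol=1e-12, max_iter=50):
    """Fixed-point iteration on the interface displacement with
    Aitken dynamic relaxation. Returns (displacement, iterations)."""
    u, omega = u_start, omega_start
    previous_residual = None
    for iteration in range(1, max_iter + 1):
        force = solve_dirichlet_body(u)          # each body solved on its own
        u_tilde = solve_neumann_body(force)
        residual = u_tilde - u
        if abs(residual) <= tol:
            return u, iteration
        if previous_residual is not None:
            # Scalar Aitken update of the relaxation factor.
            omega = -omega * previous_residual / (residual - previous_residual)
        u = u + omega * residual
        previous_residual = residual
    raise RuntimeError("interface iteration did not converge")

interface_displacement, coupling_iterations = dirichlet_neumann()
```

With these stiffness values the unrelaxed iteration would diverge, since body A is stiffer than body B, while the relaxed iteration reaches the exact interface displacement of 2/3. Each body solver is called as a black box, which is what makes a multi-code arrangement possible.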