37 research outputs found

    Phase-space iterative solvers

    I introduce a new iterative method to solve problems in small-strain non-linear elasticity. The method is inspired by recent work in data-driven computational mechanics, which reformulated the classic boundary value problem of continuum mechanics using the concept of "phase space". The latter is an abstract metric space, whose coordinates are indexed by strain and stress components, where each possible state of the discretized body corresponds to a point. Since the phase space is associated with the discretized body, it is finite dimensional. Two subsets are then defined: an affine space termed the "physically-admissible set", made up of those points that satisfy equilibrium, and a "materially-admissible set" containing points that satisfy the constitutive law. Solving the boundary-value problem amounts to finding the intersection between these two subsets. In the linear-elastic setting, this can be achieved through the solution of a set of linear equations; when material non-linearity enters the picture, this is no longer the case and iterative solution approaches are necessary. Our iterative method consists of projecting points alternately from one set to the other, until convergence. The method is similar in spirit to the "method of alternating projections" and to the "method of projections onto convex sets", for which there is a solid mathematical foundation that furnishes conditions for existence and uniqueness of solutions, upon which we rely to uphold our new method's performance. We present two examples to illustrate the applicability of the method, and to showcase its strengths when compared to the classic Newton-Raphson method, the usual tool of choice in non-linear continuum mechanics. Comment: 22 pages, 7 tables, 6 figures
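The alternating-projection idea mentioned in the abstract above can be illustrated with a minimal sketch. This is the classical method of alternating projections between two convex sets in the plane (an affine line and a closed disk), not the paper's phase-space solver; the sets, the function names, and the tolerance are assumptions chosen purely for illustration.

```python
import math


def project_onto_line(p, a, b):
    """Orthogonal projection of point p onto the affine set {x : a . x = b}."""
    scale = (a[0] * p[0] + a[1] * p[1] - b) / (a[0] ** 2 + a[1] ** 2)
    return (p[0] - scale * a[0], p[1] - scale * a[1])


def project_onto_disk(p, center, radius):
    """Nearest point to p inside the closed disk with the given center and radius."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    dist = math.hypot(dx, dy)
    if dist <= radius:
        return p  # already inside the disk
    factor = radius / dist
    return (center[0] + factor * dx, center[1] + factor * dy)


def alternating_projections(p0, a, b, center, radius, tol=1e-10, max_iter=10_000):
    """Project alternately onto the line and the disk until the two iterates coincide.

    Returns a point of the disk that lies within tol of the line, together
    with the number of iterations used.
    """
    p = p0
    for iteration in range(1, max_iter + 1):
        on_line = project_onto_line(p, a, b)
        in_disk = project_onto_disk(on_line, center, radius)
        # When the gap between the two projections vanishes, the point lies
        # (numerically) in the intersection of both sets.
        if math.dist(on_line, in_disk) < tol:
            return in_disk, iteration
        p = in_disk
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    # Hypothetical data: the line x + y = 1 and the unit disk centred at the origin.
    point, iterations = alternating_projections((5.0, 0.0), (1.0, 1.0), 1.0, (0.0, 0.0), 1.0)
    print(point, iterations)
```

Both sets here are convex and intersect, which is the setting in which the classical convergence guarantees apply; a non-linear constitutive law generally yields a non-convex materially-admissible set, so this sketch should not be read as evidence about the paper's method.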

    Advances in Time-Domain Electromagnetic Simulation Capabilities Through the Use of Overset Grids and Massively Parallel Computing

    A new methodology is presented for conducting numerical simulations of electromagnetic scattering and wave propagation phenomena. Technologies from several scientific disciplines, including computational fluid dynamics, computational electromagnetics, and parallel computing, are uniquely combined to form a simulation capability that is both versatile and practical. In the process of creating this capability, work is accomplished to conduct the first study designed to quantify the effects of domain decomposition on the performance of a class of explicit hyperbolic partial differential equation solvers; to develop a new method of partitioning computational domains composed of overset grids; and to provide the first detailed assessment of the applicability of overset grids to the field of computational electromagnetics. Furthermore, the first Finite Volume Time Domain (FVTD) algorithm capable of utilizing overset grids on massively parallel computing platforms is developed and implemented. Results are presented for a number of scattering and wave propagation simulations conducted using this algorithm, including two spheres in close proximity and a finned missile.

    Schedules for Dynamic Bidirectional Simulations on Parallel Computers

    For adjoint calculations, parameter estimation, and similar purposes one may need to reverse the execution of a computer program. The simplest option is to record a complete execution log and then to read it backwards. This requires massive amounts of storage. Instead one may generate the execution log piecewise by restarting the "forward" calculation repeatedly from suitably placed checkpoints. This thesis extends the theoretical results on parallel reversal schedules. First, an algorithm was constructed which carries out the "forward" calculation and distributes checkpoints in such a way that the reversal calculation can be started at any time. This approach provides adaptive parallel reversal schedules for simulations where the number of time steps is not known a priori. The number of checkpoints and processors used is optimal at any time. Further, an algorithm was described which makes it possible to restart the initial computer program during the program reversal. Again, this can be done without any additional computation at any time. Hence, optimal parallel reversal schedules for the bidirectional simulation are provided by this thesis.
    For the computation of adjoints, for debugging, and for similar applications, one can use the reversal of the corresponding program evaluation. The simplest approach, namely creating a complete log of the forward computation which is then read backwards, causes an enormous memory requirement. As an alternative, the log can be generated piecewise by repeatedly restarting the program evaluation from suitably chosen checkpoints. This thesis extends the theory of optimal parallel reversal schedules. First, adaptive parallel reversal schedules are constructed. To this end, an algorithm is described which, by using several processes, distributes checkpoints so that the reversal of the program can begin at any time without loss of time. The number of checkpoints and processes used stays within the known optimality bounds. Second, an algorithm was developed for the adaptive parallel reversal schedules that allows a restart of the actual program evaluation based on the running program reversal. This restart can again take place at any time without loss of time, and the resulting checkpoint distributions again satisfy both the optimality and the adaptivity criteria. In summary, this thesis constructs schedules that enable bidirectional simulations.
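The basic trade-off described in the abstract above, recomputing from checkpoints instead of storing a complete execution log, can be sketched as follows. This is a serial version with uniformly spaced checkpoints, written only to illustrate the idea; it is not one of the optimal parallel schedules developed in the thesis, and the function name, the `stride` parameter, and the example step function are hypothetical.

```python
def reverse_with_checkpoints(step, initial_state, num_steps, stride):
    """Yield (i, state_i) for i = num_steps - 1 down to 0.

    Only every `stride`-th state is stored during the forward sweep. During
    the reverse sweep each segment is recomputed from its checkpoint, so at
    most `stride` intermediate states are held in memory at once. States are
    assumed to be immutable values (or `step` must return fresh objects).
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")

    # Forward sweep: advance the state and keep sparse checkpoints.
    checkpoints = {}
    state = initial_state
    for i in range(num_steps):
        if i % stride == 0:
            checkpoints[i] = state
        state = step(state)

    # Reverse sweep: rebuild one segment at a time, latest segment first.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + stride, num_steps)
        segment = [checkpoints[start]]
        for _ in range(start + 1, end):
            segment.append(step(segment[-1]))
        for offset in range(len(segment) - 1, -1, -1):
            yield start + offset, segment[offset]


if __name__ == "__main__":
    # Hypothetical forward step: x -> 3x + 1, ten steps, checkpoint every 4 steps.
    for index, value in reverse_with_checkpoints(lambda x: 3 * x + 1, 0, 10, 4):
        print(index, value)
```

With uniform spacing, every forward step is executed at most twice (once in the forward sweep, once when its segment is rebuilt), while storage drops from `num_steps` states to roughly `num_steps / stride + stride`.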

    Aeronautical Engineering: A Continuing Bibliography with Indexes

    This supplemental issue of Aeronautical Engineering, A Continuing Bibliography with Indexes (NASA/SP-1999-7037) lists reports, articles, and other documents recently announced in the NASA STI Database. The coverage includes documents on the engineering and theoretical aspects of design, construction, evaluation, testing, operation, and performance of aircraft (including aircraft engines) and associated components, equipment, and systems. It also includes research and development in aerodynamics, aeronautics, and ground support equipment for aeronautical vehicles. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract. Two indexes, subject and author, are included after the abstract section.

    Volunteer computing

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (p. 205-216). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. This thesis presents the idea of volunteer computing, which allows high-performance parallel computing networks to be formed easily, quickly, and inexpensively by enabling ordinary Internet users to share their computers' idle processing power without needing expert help. In recent years, projects such as SETI@home have demonstrated the great potential power of volunteer computing. In this thesis, we identify volunteer computing's further potential, and show how it can be achieved. We present the Bayanihan system for web-based volunteer computing. Using Java applets, Bayanihan enables users to volunteer their computers by simply visiting a web page. This makes it possible to set up parallel computing networks in a matter of minutes, compared to the hours, days, or weeks required by traditional NOW and metacomputing systems. At the same time, Bayanihan provides a flexible object-oriented software framework that makes it easy for programmers to write various applications, and for researchers to address issues such as adaptive parallelism, fault-tolerance, and scalability. Using Bayanihan, we develop a general-purpose runtime system and APIs, and show how volunteer computing's usefulness extends beyond solving esoteric mathematical problems to other, more practical, master-worker applications such as image rendering, distributed web-crawling, genetic algorithms, parametric analysis, and Monte Carlo simulations. 
By presenting a new API using the bulk synchronous parallel (BSP) model, we further show that, contrary to popular belief and practice, volunteer computing need not be limited to master-worker applications, but can be used for coarse-grain message-passing programs as well. Finally, we address the new problem of maintaining reliability in the presence of malicious volunteers. We present and analyze traditional techniques such as voting, and new ones such as spot-checking, encrypted computation, and periodic obfuscation. Then, we show how these can be integrated in a new idea called credibility-based fault-tolerance, which uses probability estimates to limit and direct the use of redundancy. We validate this new idea with parallel Monte Carlo simulations, and show how it can achieve error rates several orders of magnitude smaller than traditional voting for the same slowdown. By Luis F.G. Sarmenta. Ph.D.
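The traditional voting technique mentioned in the abstract above can be sketched in a few lines. This is a plain majority vote over redundant results plus a textbook binomial estimate of its error rate; it is not Bayanihan code and not the credibility-based scheme of the thesis. The function names are hypothetical, and the error model assumes that each replica is wrong independently with probability `fault_rate` and that all wrong replicas agree on the same wrong answer (a worst case).

```python
from collections import Counter
from math import comb


def majority_vote(results):
    """Return the value reported by a strict majority of replicas, or None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if 2 * count > len(results) else None


def majority_error_rate(fault_rate, replicas):
    """Probability that a majority of an odd number of replicas is wrong.

    Assumes independent faults with probability fault_rate and colluding
    wrong answers, so the vote fails exactly when more than half of the
    replicas are faulty.
    """
    if replicas % 2 == 0:
        raise ValueError("use an odd number of replicas to avoid ties")
    needed = replicas // 2 + 1
    return sum(
        comb(replicas, k) * fault_rate**k * (1 - fault_rate) ** (replicas - k)
        for k in range(needed, replicas + 1)
    )


if __name__ == "__main__":
    print(majority_vote([42, 42, 7]))      # two of three replicas agree on 42
    print(majority_error_rate(0.1, 3))     # error rate with triple redundancy
    print(majority_error_rate(0.1, 5))     # more redundancy, lower error rate
```

Under this model the error rate falls as replicas are added, but every work unit is computed several times, which is the slowdown that the abstract's comparison with credibility-based fault-tolerance refers to.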

    Aeronautical Engineering: A Continuing Bibliography with Indexes

    This report lists reports, articles and other documents recently announced in the NASA STI Database. The coverage includes documents on the engineering and theoretical aspects of design, construction, evaluation, testing, operation, and performance of aircraft (including aircraft engines) and associated components, equipment, and systems. It also includes research and development in aerodynamics, aeronautics, and ground support equipment for aeronautical vehicles. Each entry in the publication consists of a standard bibliographic citation accompanied, in most cases, by an abstract

    Mathematical and numerical modelling of dispersive water waves

    Thesis defence date: 4 December 2018. This doctoral thesis first presents an overview of dispersive wave modelling for the simulation of tsunamigenic processes. A new two-layer system with improved dispersion properties and a new hyperbolic system are derived. Their respective dispersive properties, spectral structure, and certain analytical solutions are also studied. In addition, a new, simple viscosity model has been designed for simulating the physical phenomena related to wave breaking near the coast. The theoretical results required for designing high-order, well-balanced finite volume and discontinuous Galerkin numerical schemes for non-conservative hyperbolic systems in one and two dimensions are established. The numerical schemes proposed for the non-hydrostatic pressure systems introduced are then described. Different approaches and strategies can be highlighted. On the one hand, implicit finite volume schemes of projection-correction type are designed on staggered and non-staggered grids. On the other hand, an explicit discontinuous Galerkin numerical scheme is proposed for the new hyperbolic PDE system. To enable real-time simulations, an efficient GPU implementation of the methods is carried out and some guidelines on its implementation are given. The aforementioned numerical schemes have been applied to academic benchmark tests and to more challenging physical situations, such as the simulation of real tsunamis and the comparison with field data. Finally, a last chapter is devoted to measuring the influence of dispersive effects in the simulation of sediment transport and bedload. To this end, a new two-layer shallow-water system is derived, a numerical scheme is designed, and some academic and validation tests are shown, which give promising results.

    Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms

    This thesis presents design and implementation approaches for the parallel algorithms of computer algebra. We use algorithmic skeletons as well as further approaches, such as data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and some special parallel loops, which we call ‘repeated computation with a possibility of premature termination’. We introduce in this thesis a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms; for these algorithms our arithmetic provides a generic parallelisation approach. The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language. Moreover, it allows us to refrain from using two different languages (one for the implementation and one for the interface) for our implementation of computer algebra algorithms. Further, this thesis presents methods for evaluation and estimation of parallel execution times. We partition the parallel execution time into two components. One of them accounts for the quality of the parallelisation; we call it the ‘parallel penalty’. The other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This enables very confident estimations while using drastically fewer measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis. We have also used existing estimation methods. We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen’s matrix multiplication algorithm and the fast Fourier transform. The latter was used to implement polynomial convolution, which leads to a further fast multiplication algorithm. 
Especially for our implementation of the Strassen algorithm we have designed and implemented a divide and conquer skeleton based on actors. We have implemented the parallel fast Fourier transform, and not only did we use new divide and conquer skeletons, but we also developed a map-and-transpose skeleton. It enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows very good performance. We have analysed the parallel penalty of our programs and compared it to the serial fraction, an approach known from the literature. We also performed execution time estimations of our divide and conquer programs. This thesis presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, like parMap, farm, workpool, with a premature termination property. We use this to implement the so-called ‘parallel repeated computation’, a special form of a speculative parallel loop. We have implemented two probabilistic primality tests: the Rabin–Miller test and the Jacobi sum test. We parallelised both with our approach. We analysed the task distribution and identified suitable configurations of the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel. Subsequently, we parallelised it, analysed the load balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements. The parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation are our original contributions. The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handled the common factors of the numerator or denominator of the fraction with the modulus in a novel manner. 
This is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we have parallelised the determinant computation using Gaussian elimination. As always, we have performed task distribution analysis and estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables parallelisation of entire classes of computer algebra algorithms. Summarising, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms.
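The divide and conquer structure of the Karatsuba algorithm named in the abstract above can be sketched as follows. This is a sequential Python version for non-negative integers, not the thesis's Eden skeleton implementation; the function name and the `cutoff_bits` threshold are assumptions made for illustration. The three recursive calls are mutually independent, which is the property a divide and conquer skeleton could exploit to evaluate them in parallel.

```python
def karatsuba(x, y, cutoff_bits=64):
    """Multiply two non-negative integers using Karatsuba's three-product scheme."""
    if x < 0 or y < 0:
        raise ValueError("this sketch only handles non-negative integers")

    # Base case: small operands are multiplied directly.
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y

    # Divide: split both operands at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_high, x_low = x >> half, x & mask
    y_high, y_low = y >> half, y & mask

    # Conquer: three independent sub-products instead of four.
    low = karatsuba(x_low, y_low, cutoff_bits)
    high = karatsuba(x_high, y_high, cutoff_bits)
    mixed = karatsuba(x_low + x_high, y_low + y_high, cutoff_bits)

    # Combine: the cross term is recovered by subtraction.
    cross = mixed - low - high
    return (high << (2 * half)) + (cross << half) + low


if __name__ == "__main__":
    a = 3**400 + 12345
    b = 7**300 + 67890
    print(karatsuba(a, b) == a * b)
```

Replacing four half-size products by three gives the recurrence T(n) = 3 T(n/2) + O(n), hence a running time of O(n^log2(3)), roughly O(n^1.585), instead of the quadratic cost of schoolbook multiplication.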

    The scalability of reliable computation in Erlang

    With the advent of many-core architectures, scalability is a key property for programming languages. Actor-based frameworks like Erlang are fundamentally scalable, but in practice they have some scalability limitations. The RELEASE project aims to scale Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on emergent commodity architectures with 10,000 cores. The RELEASE consortium works to scale Erlang at the virtual machine, language, and infrastructure levels, and to supply profiling and refactoring tools. This research contributes to the RELEASE project at the language level. Firstly, we study the provision of scalable persistent storage options for Erlang. We articulate the requirements for scalable and available persistent storage, and evaluate four popular Erlang DBMSs against these requirements. We investigate the scalability limits of the Riak NoSQL DBMS using Basho Bench on up to 100 nodes on the Kalkyl cluster and establish scientifically, for the first time, the scalability limit of Riak as 60 nodes, thereby confirming developer folklore. We design and implement DE-Bench, a scalable fault-tolerant peer-to-peer benchmarking tool that measures the throughput and latency of distributed Erlang commands on a cluster of Erlang nodes. We employ DE-Bench to investigate the scalability limits of distributed Erlang on up to 150 nodes and 1200 cores. Our results demonstrate that the frequency of global commands limits the scalability of distributed Erlang. We also show that distributed Erlang scales linearly up to 150 nodes and 1200 cores with relatively heavy data and computation loads when no global commands are used. As part of the RELEASE project, the Glasgow University team has developed Scalable Distributed Erlang (SD Erlang) to address the scalability limits of distributed Erlang. 
We evaluate SD Erlang by designing and implementing the first ever demonstrators for SD Erlang, i.e. DE-Bench, Orbit and Ant Colony Optimisation (ACO). We employ DE-Bench to evaluate the performance and scalability of group operations in SD Erlang on up to 100 nodes. Our results show that the alternatives SD Erlang offers for global commands (i.e. group commands) scale linearly up to 100 nodes. We also develop and evaluate an SD Erlang implementation of Orbit, a symbolic computing kernel and a generalization of a transitive closure computation. Our evaluation results show that SD Erlang Orbit outperforms the distributed Erlang Orbit on 160 nodes and 1280 cores. Moreover, we develop a reliable distributed version of ACO and show that the reliability of ACO limits its scalability in traditional distributed Erlang. We use SD Erlang to improve the scalability of the reliable ACO by eliminating global commands and avoiding full mesh connectivity between nodes. We show that SD Erlang effectively reduces the network traffic between nodes in an Erlang cluster.