
    Three-Dimensional Aerodynamic Design Optimization Using Discrete Sensitivity Analysis and Parallel Computing

    A hybrid automatic differentiation/incremental iterative method was implemented in the general-purpose advanced computational fluid dynamics code CFL3D (Version 4.1) to yield a new code, CFL3D.ADII, that is capable of computing consistent, discrete, first-order sensitivity derivatives for complex geometries. With the exception of unsteady problems, the new code retains all the useful features and capabilities of the original CFL3D flow analysis code. The superiority of the new code over a carefully applied method of finite differences is demonstrated. A coarse-grained, scalable, distributed-memory, parallel version of CFL3D.ADII was developed based on derivative strip-mining. In this data-parallel approach, an identical copy of CFL3D.ADII is executed on each processor with a different derivative input file. The effect of communication overhead on the overall parallel computational efficiency is negligible; however, the fraction of CFL3D.ADII duplicated on all processors has a significant impact on the computational efficiency. To reduce the large execution time associated with the sequential 1-D line search in gradient-based aerodynamic optimization, an alternative parallel approach was developed. The execution time of the new approach was reduced effectively to that of one flow analysis, regardless of the number of function evaluations in the 1-D search. The new approach was found to yield design results essentially identical to those obtained from the traditional sequential approach, but at a much smaller execution time. The parallel CFL3D.ADII and the parallel 1-D line search are demonstrated in shape improvement studies of a realistic High Speed Civil Transport (HSCT) wing/body configuration, represented by over 100 design variables and 200,000 grid points in inviscid supersonic flow, on the 16-node IBM SP2 parallel computer at the Numerical Aerospace Simulation (NAS) facility, NASA Ames Research Center. In addition to making the handling of such a large problem possible, the use of parallel computation provided significantly reduced overall execution and turnaround times.
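
    The derivative strip-mining idea lends itself to a simple illustration: each processor runs an identical copy of the analysis but is handed a different subset of design-variable derivatives, and the resulting gradient strips are gathered afterwards. The sketch below, using mpi4py and a placeholder flow_and_sensitivities routine (not the actual CFL3D.ADII interface), is only a minimal illustration of that pattern; the parallel 1-D line search works analogously, with each processor evaluating a different trial step length at the same time.

        # Minimal sketch of derivative "strip-mining" on an MPI machine, using a
        # hypothetical flow/sensitivity routine; not the actual CFL3D.ADII interface.
        from mpi4py import MPI
        import numpy as np

        def flow_and_sensitivities(design_vars, derivative_subset):
            """Hypothetical stand-in for one analysis run: returns the objective
            and its derivatives with respect to the requested design variables."""
            f = float(np.sum(design_vars ** 2))          # placeholder objective
            df = 2.0 * design_vars[derivative_subset]    # placeholder derivatives
            return f, df

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        n_design = 100                                   # ~100 design variables, as in the HSCT study
        x = np.ones(n_design)

        # Each processor runs an identical copy of the analysis, but is handed a
        # different "derivative input file": here, a different slice of design variables.
        my_subset = np.array_split(np.arange(n_design), size)[rank]
        _, my_grad = flow_and_sensitivities(x, my_subset)

        # Gather the derivative strips back into one full gradient on rank 0.
        strips = comm.gather((my_subset, my_grad), root=0)
        if rank == 0:
            grad = np.zeros(n_design)
            for idx, vals in strips:
                grad[idx] = vals
            print("assembled gradient norm:", np.linalg.norm(grad))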

    Cloud computing resource scheduling and a survey of its evolutionary approaches

    A disruptive technology fundamentally transforming the way computing services are delivered, cloud computing offers information and communication technology users a new dimension of convenience: resources as services via the Internet. Because the cloud provides a finite pool of virtualized, on-demand resources, scheduling them optimally has become an essential and rewarding topic, and a trend of applying Evolutionary Computation (EC) algorithms to it is emerging rapidly. By analyzing the cloud computing architecture, this survey first presents a taxonomy of cloud resource scheduling at two levels. It then paints a landscape of the scheduling problem and its solutions. Following the taxonomy, a comprehensive survey of state-of-the-art approaches is presented systematically. Looking forward, challenges and potential future research directions are investigated, including real-time scheduling, adaptive dynamic scheduling, large-scale scheduling, multiobjective scheduling, and distributed and parallel scheduling. At the dawn of Industry 4.0, cloud computing scheduling for cyber-physical integration in the presence of big data is also discussed. Research in this area is only in its infancy, but with the rapid fusion of information and data technology, more exciting and agenda-setting topics are likely to emerge on the horizon.
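
    As a concrete, if deliberately tiny, example of the kind of EC approach such a survey covers, the sketch below evolves an assignment of tasks to virtual machines with a plain genetic algorithm that minimizes makespan. The task lengths, VM speeds, and GA parameters are invented for illustration and are not taken from the paper.

        # Illustrative sketch only: a minimal genetic algorithm that assigns tasks to
        # virtual machines so as to minimize makespan.
        import random

        task_len = [4, 7, 2, 9, 5, 3, 8, 6]      # task workloads (arbitrary units)
        vm_speed = [1.0, 2.0, 1.5]               # processing speeds of three VMs

        def makespan(assign):
            """Completion time of the busiest VM under a task->VM assignment."""
            load = [0.0] * len(vm_speed)
            for t, vm in zip(task_len, assign):
                load[vm] += t / vm_speed[vm]
            return max(load)

        def random_assign():
            return [random.randrange(len(vm_speed)) for _ in task_len]

        def crossover(a, b):
            cut = random.randrange(1, len(a))
            return a[:cut] + b[cut:]

        def mutate(assign, rate=0.1):
            return [random.randrange(len(vm_speed)) if random.random() < rate else vm
                    for vm in assign]

        population = [random_assign() for _ in range(30)]
        for _ in range(100):                       # evolve for a fixed number of generations
            population.sort(key=makespan)
            parents = population[:10]              # simple truncation selection
            children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                        for _ in range(20)]
            population = parents + children

        best = min(population, key=makespan)
        print("best assignment:", best, "makespan:", round(makespan(best), 2))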

    Learning Parallel Computations with ParaLab

    In this paper, we present the ParaLab teachware system, which can be used for learning parallel computation methods. ParaLab provides tools for simulating multiprocessor computational systems with various network topologies, for carrying out computational experiments in simulation mode, and for evaluating the efficiency of parallel computation methods. The visual presentation of the parallel computations taking place during the computational experiments is the key feature of the system. ParaLab can be used for laboratory training within various courses in the field of parallel, distributed, and supercomputer computations.

    Polynomial Response Surface Approximations for the Multidisciplinary Design Optimization of a High Speed Civil Transport

    Surrogate functions have become an important tool in multidisciplinary design optimization (MDO) for dealing with noisy functions, high computational cost, and the practical difficulty of integrating legacy disciplinary computer codes. A combination of mathematical, statistical, and engineering techniques, well known in other contexts, has made polynomial surrogate functions viable for MDO. Despite the obvious limitations imposed by sparse high-fidelity data in high dimensions and the locality of low-order polynomial approximations, the success of the panoply of techniques based on polynomial response surface approximations for MDO shows that the implementation details are more important than the underlying approximation method (polynomial, spline, DACE, kernel regression, etc.). This paper surveys some of the ancillary techniques, including statistics, global search, parallel computing, and variable-complexity modeling, that augment the construction and use of polynomial surrogates.
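
    A minimal sketch of the core ingredient, a low-order polynomial response surface, is given below: a full quadratic in two design variables is fit by ordinary least squares to a handful of invented sample points and then queried as a cheap surrogate. The data and coefficients are placeholders, not results from the paper.

        # Quadratic response surface in two design variables, fit by least squares.
        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.uniform(-1.0, 1.0, size=(20, 2))            # 20 sampled design points
        y = (1.0 + 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] * X[:, 1]
             + 0.05 * rng.normal(size=20))                   # "noisy" responses

        # Basis for a full quadratic in (x1, x2): 1, x1, x2, x1^2, x1*x2, x2^2
        def basis(X):
            x1, x2 = X[:, 0], X[:, 1]
            return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])

        coef, *_ = np.linalg.lstsq(basis(X), y, rcond=None)  # least-squares fit

        # The surrogate can now stand in for the expensive analysis during optimization.
        x_new = np.array([[0.3, -0.2]])
        print("surrogate prediction:", (basis(x_new) @ coef).item())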

    From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

    Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the use of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms. Comment: 18 pages, 4 figures, accepted for publication in Scientific Programming.
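
    The sketch below illustrates only the discretization ingredient named in the abstract, a higher-order finite difference, on a periodic 1-D grid; the actual Chemora-generated kernels target multi-block 3-D domains, GPUs, and far more complex equation systems.

        # Fourth-order central finite difference for the first derivative on a
        # periodic 1-D grid, checked against the exact derivative of sin(x).
        import numpy as np

        n = 64
        x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
        h = x[1] - x[0]
        u = np.sin(x)

        # d/dx u  ~  (-u[i+2] + 8 u[i+1] - 8 u[i-1] + u[i-2]) / (12 h)
        du = (-np.roll(u, -2) + 8.0 * np.roll(u, -1)
              - 8.0 * np.roll(u, 1) + np.roll(u, 2)) / (12.0 * h)

        print("max error vs cos(x):", np.max(np.abs(du - np.cos(x))))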

    Optimal Reconfiguration of Formation Flying Spacecraft--a Decentralized Approach

    This paper introduces a hierarchical, decentralized, and parallelizable method for optimization problems involving many agents. It is theoretically based on a hierarchical optimization theorem that establishes the equivalence of two forms of the problem, and this idea is implemented using DMOC (Discrete Mechanics and Optimal Control). The result is a method that is scalable to certain optimization problems for large numbers of agents, whereas the usual "monolithic" approach can only deal with systems with a rather small number of degrees of freedom. The method is illustrated with the example of deployment of spacecraft, motivated by the Darwin (ESA) and Terrestrial Planet Finder (NASA) missions.

    Group Leaders Optimization Algorithm

    We present a new global optimization algorithm in which the influence of leaders in social groups is used as the inspiration for an evolutionary technique designed around a group architecture. To demonstrate the efficiency of the method, a standard suite of single- and multi-dimensional optimization functions, along with the energies and geometric structures of Lennard-Jones clusters, are used as test cases, and the algorithm is also applied to quantum circuit design problems. We show that, as an improvement over previous methods, the algorithm scales as N^2.5 for Lennard-Jones clusters of N particles. In addition, an efficient circuit design is shown for the two-qubit Grover search algorithm, a quantum algorithm providing a quadratic speed-up over its classical counterpart.
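
    The abstract does not spell out the update rule, so the following is only a guessed-at illustration of the general idea of leader influence in a grouped population, minimizing the sphere function; the mixing weights, group sizes, and the absence of inter-group transfer are assumptions for illustration, not the published algorithm.

        # Rough sketch of leader-guided evolution: the population is split into
        # groups, each group's best member acts as its leader, and new candidates
        # are pulled partly toward that leader.
        import random

        def sphere(x):
            return sum(v * v for v in x)

        dim, n_groups, group_size = 5, 4, 10
        groups = [[[random.uniform(-5, 5) for _ in range(dim)] for _ in range(group_size)]
                  for _ in range(n_groups)]

        for _ in range(200):
            for g in groups:
                leader = min(g, key=sphere)                   # best member leads the group
                for i, member in enumerate(g):
                    # mix the member with its leader plus a little random exploration
                    child = [0.7 * m + 0.2 * l + 0.1 * random.uniform(-5, 5)
                             for m, l in zip(member, leader)]
                    if sphere(child) < sphere(member):        # keep only improvements
                        g[i] = child

        best = min((m for g in groups for m in g), key=sphere)
        print("best value found:", round(sphere(best), 6))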