128,458 research outputs found

    A Parallel Dual Fast Gradient Method for MPC Applications

    We propose a parallel adaptive constraint-tightening approach to solve a linear model predictive control problem for discrete-time systems, based on inexact numerical optimization algorithms and operator splitting methods. The underlying algorithm first splits the original problem into as many independent subproblems as the length of the prediction horizon. Then, our algorithm computes a solution for these subproblems in parallel by exploiting auxiliary tightened subproblems in order to certify the control law in terms of suboptimality and recursive feasibility, along with closed-loop stability of the controlled system. Compared to prior approaches based on constraint tightening, our algorithm computes the tightening parameter for each subproblem to handle the propagation of errors introduced by the parallelization of the original problem. Our simulations show the computational benefits of the parallelization with positive impacts on performance and numerical conditioning when compared with a recent nonparallel adaptive tightening scheme. Comment: This technical report is an extended version of the paper "A Parallel Dual Fast Gradient Method for MPC Applications" by the same authors submitted to the 54th IEEE Conference on Decision and Control.

    Parametric micro-level performance models for parallel computing and parallel implementation of hydrostatic MM5

    This dissertation presents parametric micro-level performance models and a parallel implementation of the hydrostatic version of MM5. Parametric micro-level (PM) performance models are introduced to address the important issue of how to realistically model parallel performance. These models can be used to predict execution times and identify performance bottlenecks. The accurate prediction and analysis of execution times is achieved by incorporating precise details of interprocessor communication, memory operations, auxiliary instructions, and effects of communication and computation schedules. The parameters provide the flexibility to study various algorithmic and architectural issues. The development and verification process, the parameters, and the scope of applicability of these models are discussed. A coherent view of performance is obtained from the execution profiles generated by PM models. The models are targeted at a large class of numerical algorithms commonly implemented on both SIMD and MIMD machines. Specific models are presented for matrix multiplication, LU decomposition, and FFT on a 2-D processor array with distributed memory. A case study compares parallel machines and parallel algorithms. In the comparison of parallel machines, PM models are used to analyze execution times so as to relate the performance to architectural attributes of a machine. In the comparison of parallel algorithms, PM models are used to study the performance of two LU decomposition algorithms: non-blocked and blocked. The two algorithms are compared to identify the tradeoffs between them. This analysis is useful for determining an optimum block size for the blocked algorithm. The case study is done on MasPar MP-1 and MP-2 machines. The dissertation also describes the parallel implementation of the hydrostatic version of MM5 (the fifth generation of the Mesoscale Model), which has been widely used for climate studies. 
The model was parallelized in a machine-independent manner using the Runtime System Library (RSL), a runtime library for handling message passing and index transformation. The dissertation discusses validation of the parallel implementation of MM5 using field data and presents performance results. The parallel model was tested on the IBM SP1, a distributed-memory parallel computer.

    Defining Asymptotic Parallel Time Complexity of Data-dependent Algorithms

    The scientific research community has reached a stage of maturity where its strong need for high-performance computing has also spread into everyday engineering and industrial algorithms. In efforts to satisfy this need, parallel computers provide an efficient and economical way to solve large-scale and/or time-constrained problems. As a consequence, the end-users of these systems have a vested interest in defining the asymptotic time complexity of parallel algorithms to predict their performance on a particular parallel computer. The asymptotic parallel time complexity of data-dependent algorithms depends on the number of processors, data size, and other parameters. Discovering the most important of these other parameters is a challenging problem and the key to obtaining a good estimate of performance order. Good examples of such applications are sorting algorithms, searching algorithms and solvers of the traveling salesman problem (TSP). This article covers all the knowledge discovery aspects of the problem of defining the asymptotic parallel time complexity of data-dependent algorithms. The knowledge discovery methodology begins by designing a considerable number of experiments and measuring their execution times. Then, an interactive and iterative process explores the data in search of patterns and/or relationships, detecting some parameters that affect performance. Once the key parameters that characterise time complexity are known, it becomes possible to formulate a hypothesis, restart the process, and produce an improved time complexity model. Finally, the methodology predicts the performance order for new data sets on a particular parallel computer by replacing a numerical identification. 
As a case study, a global pruning traveling salesman problem implementation (GP-TSP) has been chosen to analyze the influence of indeterminism on performance prediction of data-dependent parallel algorithms, and also to show the usefulness of the defined knowledge discovery methodology. The successive hypotheses generated to define the asymptotic parallel time complexity of the TSP were corroborated one by one. The experimental results confirm the expected capability of the proposed methodology; the predictions of performance time order were rather good compared with real execution time (on the order of 85%).

    Integration of a big data emerging on large sparse simulation and its application on green computing platform

    Analyzing large data and verifying a big data set are challenges for understanding the fundamental concepts behind them. Many big data analysis techniques suffer from the poor scalability, variation inequality, instability, slow convergence, and weak accuracy of large-scale numerical algorithms. These limitations have created a wider opportunity for numerical analysts to develop efficient and novel parallel algorithms. Big data analytics plays an important role in science and engineering for extracting patterns, trends, and actionable information from large sets of data and for improving decision-making strategies. A large data set may consist of large-scale data collected via sensor networks, signals transformed into digital images, high-resolution sensing system output, industry forecasts, and existing customer records used to predict trends and prepare for new demand. This paper proposes three types of big data analytics in accordance with the analytics requirements of large-scale numerical simulation and mathematical modeling for solving complex problems. The first is big data analytics for the theory and fundamentals of nanotechnology numerical simulation. The second is big data analytics for enhancing digital images in 3D visualization and for performance analysis of embedded systems based on the large sparse data sets generated by the device. The last is extraction of patterns from electroencephalogram (EEG) data sets for detecting horizontal-vertical eye movements. Thus, examining big data analytics means investigating the behavior of hidden patterns and unknown correlations, identifying anomalies, discovering structure inside unstructured data, and extracting the essence, along with trend prediction, multi-dimensional visualization, and real-time observation using the mathematical model. 
Parallel algorithms, mesh generation, domain-function decomposition approaches, inter-node communication design, subdomain mapping, numerical analysis, and parallel performance evaluations (PPE) are the processes of the big data analytics implementation. The superiority of parallel numerical methods such as AGE, Brian and IADE was demonstrated for solving a large sparse model on a green computing platform that utilizes obsolete computers, old-generation servers and outdated hardware, distributed virtual memory, and multi-processors. The integration of low-cost message-passing communication software with the green computing platform is capable of increasing the PPE by up to 60% when compared to the limited memory of a single processor. In conclusion, large-scale numerical algorithms with strong scalability, equality, stability, convergence, and accuracy are important in analyzing big data simulations.

    Parallel performance prediction for numerical codes in a multi-cluster environment

    We propose a model for describing and predicting the performance of parallel numerical software on distributed memory architectures within a multi-cluster environment. The goal of the model is to allow reliable predictions to be made as to the execution time of a given code on a large number of processors of a given parallel system, and on a combination of systems, by only benchmarking the code on small numbers of processors. This has potential applications for the scheduling of jobs in a Grid computing environment where informed decisions about which resources to use in order to maximize the performance and/or minimize the cost of a job will be valuable. The methodology is built and tested for a particular class of numerical code, based upon the multilevel solution of discretized partial differential equations, and despite its simplicity it is demonstrated to be extremely accurate and robust with respect to both the processor and communications architectures considered. Furthermore, results are also presented which demonstrate that excellent predictions may also be obtained for numerical algorithms that are more general than the pure multigrid solver used to motivate the methodology. These are based upon the use of a practical parallel engineering code that is briefly described. The potential significance of this work is illustrated via two scenarios which consider a Grid user who wishes to use the available resources either (i) to obtain a particular result as quickly as possible, or (ii) to obtain results to different levels of accuracy. Index Terms: Parallel Distributed Algorithms; Grid Computing; Cluster Computing; Performance Evaluation and Prediction; Meta-Scheduling.
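    The idea of benchmarking on a few processors and extrapolating to many can be illustrated with a minimal sketch. The abstract does not state the form of the authors' model, so the two-term model T(p) = a + b / p used here (a fixed cost plus perfectly divisible work) is an assumption chosen for simplicity, not the paper's method. The function names `fit_time_model` and `predict_time` are hypothetical, and the timings are synthetic values generated from a = 2 and b = 100.

```python
def fit_time_model(procs, times):
    """Least-squares fit of the assumed model T(p) = a + b / p.

    The model is linear in x = 1 / p, so ordinary simple linear
    regression of the times against x gives a (intercept) and b (slope).
    """
    xs = [1.0 / p for p in procs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times))
    b = sxt / sxx
    a = mean_t - b * mean_x
    return a, b


def predict_time(a, b, p):
    """Predicted execution time on p processors under the assumed model."""
    return a + b / p


# Synthetic benchmark timings (seconds) on small processor counts.
# These are illustrative numbers, not measurements from the paper.
small_runs = {1: 102.0, 2: 52.0, 4: 27.0, 8: 14.5}

a, b = fit_time_model(list(small_runs), list(small_runs.values()))
predicted_64 = predict_time(a, b, 64)
print(f"a = {a:.3f}, b = {b:.3f}, predicted T(64) = {predicted_64:.4f} s")
```

    Because the synthetic timings follow the assumed model exactly, the fit recovers a and b up to rounding. Real codes would need extra terms, for example for communication costs that grow with the number of processors.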

    Adaptive mesh refinement computation of acoustic radiation from an engine intake

    A block-structured adaptive mesh refinement (AMR) method was applied to the computational problem of acoustic radiation from an aeroengine intake. The aim is to improve the computational and storage efficiency in aeroengine noise prediction through reduction of computational cells. A parallel implementation of the adaptive mesh refinement algorithm was achieved using the Message Passing Interface. It combined a range of 2nd- and 4th-order spatial stencils, a 4th-order low-dissipation and low-dispersion Runge-Kutta scheme for time integration and several different interpolation methods. Both the parallel AMR algorithms and numerical issues are introduced briefly in this work. To solve the problem of acoustic radiation from an aeroengine intake, the code was extended to support body-fitted grid structures. The problem of acoustic radiation was solved with linearised Euler equations. The AMR results were compared with previous results computed on a uniformly fine mesh to demonstrate the accuracy and the efficiency of the current AMR strategy. Because the computational load of the whole adaptively refined mesh has to be balanced between nodes on-line, the parallel performance of the existing code deteriorates as the number of processors increases, owing to expensive inter-node memory communication costs. A potential solution is suggested at the end.

    Efficient Multigrid Preconditioners for Atmospheric Flow Simulations at High Aspect Ratio

    Many problems in fluid modelling require the efficient solution of highly anisotropic elliptic partial differential equations (PDEs) in "flat" domains. For example, in numerical weather- and climate-prediction an elliptic PDE for the pressure correction has to be solved at every time step in a thin spherical shell representing the global atmosphere. This elliptic solve can be one of the computationally most demanding components in semi-implicit semi-Lagrangian time stepping methods which are very popular as they allow for larger model time steps and better overall performance. With increasing model resolution, algorithmically efficient and scalable algorithms are essential to run the code under tight operational time constraints. We discuss the theory and practical application of bespoke geometric multigrid preconditioners for equations of this type. The algorithms deal with the strong anisotropy in the vertical direction by using the tensor-product approach originally analysed by Börm and Hiptmair [Numer. Algorithms, 26/3 (2001), pp. 219-234]. We extend the analysis to three dimensions under slightly weakened assumptions, and numerically demonstrate its efficiency for the solution of the elliptic PDE for the global pressure correction in atmospheric forecast models. For this we compare the performance of different multigrid preconditioners on a tensor-product grid with a semi-structured and quasi-uniform horizontal mesh and a one-dimensional vertical grid. The code is implemented in the Distributed and Unified Numerics Environment (DUNE), which provides an easy-to-use and scalable environment for algorithms operating on tensor-product grids. Parallel scalability of our solvers on up to 20,480 cores is demonstrated on the HECToR supercomputer. Comment: 22 pages, 6 Figures, 2 Tables.