13 research outputs found

    Generalizing Amdahl’s Law for Power and Energy

    Extending Amdahl's law to identify optimal power-performance configurations requires considering the interactive effects of power, performance, and parallel overhead.

    The Limits Of DIS: Final Report

    Two reports on the computational capabilities of formal DIS networks in simulating real physical events and the demands placed on DIS in the future as projected by the DIS user community. Contents: Purpose -- Background -- Phase 1 -- Phase 2 -- Appendices -- Appendix A: On the limitation of DIS, by Thomas L. Clarke -- Appendix B: Useful communications in distributed interactive simulation, by Scott H. Smith -- Appendix C: Useful communications in distributed interactive simulation (viewgraphs), by Scott H. Smith

    Coupling of adaptive refinement with variational multiscale element free Galerkin method for high gradient problems

    In this thesis, a new adaptive refinement scheme coupled with the variational multiscale element free Galerkin method (EFGM) is developed for solving high gradient problems. The aim is to propose a new framework for the moving least squares (MLS) approximation with a coupling method based on the variational multiscale concept. New nodes are inserted automatically in high gradient regions by an adaptive algorithm based on refinement criteria. An enrichment function is embedded in the MLS approximation for the fine scale part of the problem. In addition, the technique is parallelized using OpenMP, which is based on a shared memory architecture. The proposed approach is first applied to a two-dimensional large localized gradient problem, a transient heat conduction problem, and Burgers' equation in order to analyze its accuracy, and is validated against available analytic solutions. The numerical results show very good agreement with the analytic solutions and are more accurate than those of the standard EFGM: the average relative error of the new method is reduced by between 15% and 70%. The method is also extended to solve two-dimensional sine-Gordon solitons, and the results show good agreement with published results. Moreover, parallelizing the adaptive variational multiscale EFGM improves computational efficiency by reducing the execution time without loss of accuracy. The new method therefore has the potential to be applied to more complicated problems, producing higher precision solutions in shorter computational time.

    Optimal seismic retrofitting of existing RC frames through soft-computing approaches

    2016 - 2017. This Ph.D. thesis proposes a Soft-Computing approach that supports the engineer's judgement in selecting and designing the cheapest solution for seismic retrofitting of existing RC framed structures. Chapter 1 points out the need to strengthen existing buildings as one of the main ways of reducing the economic losses and loss of life that are direct consequences of earthquake disasters. It also gives a wide, though not exhaustive, list of the most frequently observed deficiencies contributing to the vulnerability of concrete buildings. Chapter 2 reviews the state of practice in seismic analysis methods for assessing the safety of existing buildings within the framework of performance-based design. The most common approaches for modeling material plasticity in non-linear frame analysis are also reviewed. Chapter 3 presents a broad state of practice on retrofitting strategies, understood as preventive measures aimed at mitigating the effect of a future earthquake by a) decreasing the seismic hazard demands; b) improving the dynamic characteristics supplied to the existing building. The chapter also presents a list of retrofitting systems, understood as technical interventions commonly classified into local interventions (also known as “member-level” techniques) and global interventions (also called “structure-level” techniques), which may be used in synergistic combination to achieve the adopted strategy. In particular, the available approaches and common criteria for selecting an optimum retrofit strategy and an optimal system, respectively, are discussed. Chapter 4 highlights the usefulness of Soft-Computing methods as efficient tools for providing “objective” answers in reasonable time for complex situations governed by approximation and imprecision. 
In particular, Chapter 4 collects the applications found in the scientific literature for Fuzzy Logic, Artificial Neural Networks and Evolutionary Computing in the fields of structural and earthquake engineering, with a taxonomic classification of the problems in modeling, simulation and optimization. Chapter 5 “translates” the search for the cheapest retrofitting system into a constrained optimization problem. To this end, the chapter formulates a novel procedure that embeds a numerical model for seismic assessment of framed structures within a Soft-Computing-driven optimization algorithm capable of minimizing the objective function, defined as the total initial cost of intervention. The main components required to assemble the procedure are described in the chapter: the optimization algorithm (Genetic Algorithm); the simulation framework (OpenSees); and the software environment (Matlab). Chapter 6 describes the flow-chart of the proposed procedure step by step and focuses on the main implementation aspects and working details, ranging from a clever initialization of the population of candidate solutions to a proposed tuning procedure for the genetic parameters. Chapter 7 discusses numerical examples in which the Soft-Computing procedure is applied to models of multi-storey RC frames obtained through simulated design. A total of fifteen “scenarios” are studied in order to assess its “robustness” to changes in input data. Finally, Chapter 8, on the basis of the outcomes observed, summarizes the capabilities of the proposed procedure while highlighting its “limitations” at the current state of development. Some possible modifications to enhance its efficiency and completeness are discussed. [edited by author] XVI n.s.

    High performance bioinformatics and computational biology on general-purpose graphics processing units

    Bioinformatics and Computational Biology (BCB) is a relatively new multidisciplinary field which brings together many aspects of the fields of biology, computer science, statistics, and engineering. Bioinformatics extracts useful information from biological data and makes these more intuitive and understandable by applying principles of information sciences, while computational biology harnesses computational approaches and technologies to answer biological questions conveniently. Recent years have seen an explosion of the size of biological data at a rate which outpaces the rate of increases in the computational power of mainstream computer technologies, namely general purpose processors (GPPs). The aim of this thesis is to explore the use of off-the-shelf Graphics Processing Unit (GPU) technology in the high performance and efficient implementation of BCB applications in order to meet the demands of biological data increases at affordable cost. The thesis presents detailed design and implementations of GPU solutions for a number of BCB algorithms in two widely used BCB applications, namely biological sequence alignment and phylogenetic analysis. Biological sequence alignment can be used to determine the potential information about a newly discovered biological sequence from other well-known sequences through similarity comparison. On the other hand, phylogenetic analysis is concerned with the investigation of the evolution and relationships among organisms, and has many uses in the fields of system biology and comparative genomics. In molecular-based phylogenetic analysis, the relationship between species is estimated by inferring the common history of their genes and then phylogenetic trees are constructed to illustrate evolutionary relationships among genes and organisms. 
However, both biological sequence alignment and phylogenetic analysis are computationally expensive applications, as their computing and memory requirements grow polynomially or even worse with the size of sequence databases. The thesis first presents a multi-threaded parallel design of the Smith-Waterman (SW) algorithm alongside an implementation on NVIDIA GPUs. A novel technique is put forward to remove the restriction on the length of the query sequence found in previous GPU-based implementations of the SW algorithm. Based on this implementation, the difference between the two main task parallelization approaches (inter-task and intra-task parallelization) is presented. The resulting GPU implementation matches the speed of existing GPU implementations while providing more flexibility, i.e. flexible sequence lengths in real world applications. It also outperforms an equivalent GPP-based implementation by 15x-20x. After this, the thesis presents the first reported multi-threaded design and GPU implementation of the Gapped BLAST with Two-Hit method algorithm, which is widely used for aligning biological sequences heuristically. This achieved up to a 3x speed-up compared to the most optimised GPP implementations. The thesis then presents a multi-threaded design and GPU implementation of a Neighbor-Joining (NJ)-based method for phylogenetic tree construction and multiple sequence alignment (MSA). This achieves an 8x-20x speed-up compared to an equivalent GPP implementation based on the widely used ClustalW software. The NJ method, however, gives only one possible tree, which strongly depends on the evolutionary model used. A more advanced method uses maximum likelihood (ML) for scoring phylogenies with Markov Chain Monte Carlo (MCMC)-based Bayesian inference. 
The latter was the subject of another multi-threaded design and GPU implementation presented in this thesis, which achieved a 4x-8x speed-up compared to an equivalent GPP implementation based on the widely used MrBayes software. Finally, the thesis presents a general evaluation of the designs and implementations achieved in this work as a step towards the evaluation of GPU technology in BCB computing, in the context of other computer technologies including GPPs and Field Programmable Gate Array (FPGA) technology.

    Understanding Quantum Technologies 2022

    Understanding Quantum Technologies 2022 is a creative-commons ebook that provides a unique 360-degree overview of quantum technologies, from science and technology to geopolitical and societal issues. It covers quantum physics history, quantum physics 101, gate-based quantum computing, quantum computing engineering (including quantum error correction and quantum computing energetics), quantum computing hardware (all qubit types, including quantum annealing and quantum simulation paradigms, history, science, research, implementation and vendors), quantum enabling technologies (cryogenics, control electronics, photonics, components fabs, raw materials), quantum computing algorithms, software development tools and use cases, unconventional computing (potential alternatives to quantum and classical computing), quantum telecommunications and cryptography, quantum sensing, quantum technologies around the world, the societal impact of quantum technologies and even quantum fake sciences. The main audience is computer science engineers, developers and IT specialists, as well as quantum scientists and students, who want to acquire a global view of how quantum technologies work, and particularly quantum computing. This version is an extensive update to the 2021 edition published in October 2021. Comment: 1132 pages, 920 figures, Letter format

    Lifetime reliability of multi-core systems: modeling and applications.

    Huang, Lin. Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. Includes bibliographical references (leaves 218-232). Abstracts in English and Chinese.
    Contents: Abstract -- Acknowledgement
    Chapter 1. Introduction: 1.1 Preface -- 1.2 Background -- 1.3 Contributions (1.3.1 Lifetime Reliability Modeling; 1.3.2 Simulation Framework; 1.3.3 Applications) -- 1.4 Thesis Outline
    I. Modeling
    Chapter 2. Lifetime Reliability Modeling: 2.1 Notation -- 2.2 Assumption -- 2.3 Introduction -- 2.4 Related Work -- 2.5 System Model (2.5.1 Reliability of A Surviving Component; 2.5.2 Reliability of a Hybrid k-out-of-n:G System) -- 2.6 Special Cases (2.6.1 Case I: Gracefully Degrading System; 2.6.2 Case II: Standby Redundant System; 2.6.3 Case III: l-out-of-3:G System with) -- 2.7 Numerical Results (2.7.1 Experimental Setup; 2.7.2 Experimental Results and Discussion) -- 2.8 Conclusion -- 2.9 Appendix
    II. Simulation Framework
    Chapter 3. AgeSim: A Simulation Framework: 3.1 Introduction -- 3.2 Preliminaries and Motivation (3.2.1 Prior Work on Lifetime Reliability Analysis of Processor-Based Systems; 3.2.2 Motivation of This Work) -- 3.3 The Proposed Framework -- 3.4 Aging Rate Calculation (3.4.1 Lifetime Reliability Calculation; 3.4.2 Aging Rate Extraction; 3.4.3 Discussion on Representative Workload; 3.4.4 Numerical Validation; 3.4.5 Miscellaneous) -- 3.5 Lifetime Reliability Model for MPSoCs with Redundancy -- 3.6 Case Studies (3.6.1 Dynamic Voltage and Frequency Scaling; 3.6.2 Burst Task Arrival; 3.6.3 Task Allocation on Multi-Core Processors; 3.6.4 Timeout Policy on Multi-Core Processors with Gracefully Degrading Redundancy) -- 3.7 Conclusion
    Chapter 4. Evaluating Redundancy Schemes: 4.1 Introduction -- 4.2 Preliminaries and Motivation (4.2.1 Failure Mechanisms; 4.2.2 Related Work and Motivation) -- 4.3 Proposed Analytical Model for the Lifetime Reliability of Processor Cores (4.3.1 Impact of Temperature, Voltage, and Frequency; 4.3.2 Impact of Workloads) -- 4.4 Lifetime Reliability Analysis for Multi-core Processors with Various Redundancy Schemes (4.4.1 Gracefully Degrading System (GDS); 4.4.2 Processor Rotation System (PRS); 4.4.3 Standby Redundant System (SRS); 4.4.4 Extension to Heterogeneous System) -- 4.5 Experimental Methodology (4.5.1 Workload Description; 4.5.2 Temperature Distribution Extraction; 4.5.3 Reliability Factors) -- 4.6 Results and Discussions (4.6.1 Wear-out Rate Computation; 4.6.2 Comparison on Lifetime Reliability; 4.6.3 Comparison on Performance; 4.6.4 Comparison on Expected Computation Amount) -- 4.7 Conclusion
    III. Applications
    Chapter 5. Task Allocation and Scheduling for MPSoCs: 5.1 Introduction -- 5.2 Prior Work and Motivation (5.2.1 IC Lifetime Reliability; 5.2.2 Task Allocation and Scheduling for MPSoC Designs) -- 5.3 Proposed Task Allocation and Scheduling Strategy (5.3.1 Problem Definition; 5.3.2 Solution Representation; 5.3.3 Cost Function; 5.3.4 Simulated Annealing Process) -- 5.4 Lifetime Reliability Computation for MPSoC Embedded Systems -- 5.5 Efficient MPSoC Lifetime Approximation (5.5.1 Speedup Technique I - Multiple Periods; 5.5.2 Speedup Technique II - Steady Temperature; 5.5.3 Speedup Technique III - Temperature Pre-calculation; 5.5.4 Speedup Technique IV - Time Slot Quantity Control) -- 5.6 Experimental Results (5.6.1 Experimental Setup; 5.6.2 Results and Discussion) -- 5.7 Conclusion and Future Work
    Chapter 6. Energy-Efficient Task Allocation and Scheduling: 6.1 Introduction -- 6.2 Preliminaries and Problem Formulation (6.2.1 Related Work; 6.2.2 Problem Formulation) -- 6.3 Analytical Models (6.3.1 Performance and Energy Models for DVS-Enabled Processors; 6.3.2 Lifetime Reliability Model) -- 6.4 Proposed Algorithm for Single-Mode Embedded Systems (6.4.1 Task Allocation and Scheduling; 6.4.2 Voltage Assignment for DVS-Enabled Processors) -- 6.5 Proposed Algorithm for Multi-Mode Embedded Systems (6.5.1 Feasible Solution Set; 6.5.2 Searching Procedure for a Single Mode; 6.5.3 Feasible Solution Set Identification; 6.5.4 Multi-Mode Combination) -- 6.6 Experimental Results (6.6.1 Experimental Setup; 6.6.2 Case Study; 6.6.3 Sensitivity Analysis; 6.6.4 Extensive Results) -- 6.7 Conclusion
    Chapter 7. Customer-Aware Task Allocation and Scheduling: 7.1 Introduction -- 7.2 Prior Work and Problem Formulation (7.2.1 Related Work and Motivation; 7.2.2 Problem Formulation) -- 7.3 Proposed Design-Stage Task Allocation and Scheduling (7.3.1 Solution Representation and Moves; 7.3.2 Cost Function; 7.3.3 Impact of DVFS) -- 7.4 Proposed Algorithm for Online Adjustment (7.4.1 Reliability Requirement for Online Adjustment; 7.4.2 Analytical Model; 7.4.3 Overall Flow) -- 7.5 Experimental Results (7.5.1 Experimental Setup; 7.5.2 Results and Discussion) -- 7.6 Conclusion -- 7.7 Appendix
    Chapter 8. Conclusion and Future Work: 8.1 Conclusion -- 8.2 Future Work
    Bibliography