A fast parallel algorithm for special linear systems of equations using processor arrays with reconfigurable bus systems
A parallel algorithm using Processor Arrays with Reconfigurable Bus Systems (PARBS)
has been designed to solve dense Symmetric Positive Definite (SPD) systems of
equations Ax = b. The key content of this report is the parallelisation of the
algorithm by Delosme & Ipson [8]. In order to design a parallel algorithm for
PARBS, many procedures involved in [8] are handled in a slightly different
way. The parallel time and processor complexity of each step of the
algorithm are calculated. The parallel time complexity is O(n) using 2n × 2n ×
5n processing elements.
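For orientation, the problem class in this abstract (a dense SPD system Ax = b) can be illustrated with a short sequential sketch. It uses a textbook Cholesky factorization followed by two triangular solves. This is a swapped-in baseline chosen for brevity, not the algorithm cited as [8] and not the PARBS parallelisation the report describes. The function names are hypothetical.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a, for an SPD matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def solve_spd(a, b):
    """Solve a x = b for SPD a (sequential baseline, O(n^3) work)."""
    n = len(a)
    low = cholesky(a)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Back substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x
```

The sequential version performs O(n^3) arithmetic operations, which is the cost the report's O(n) parallel time bound spreads across the processing elements.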
A Finite Domain Constraint Approach for Placement and Routing of Coarse-Grained Reconfigurable Architectures
Scheduling, placement, and routing are important steps in Very Large Scale Integration (VLSI) design. Researchers have developed numerous techniques to solve placement and routing problems. As the complexity of Application Specific Integrated Circuits (ASICs) increased over the past decades, so did the demand for improved place and route techniques. The primary objective of these place and route approaches has typically been wirelength minimization due to its impact on signal delay and design performance. With the advent of Field Programmable Gate Arrays (FPGAs), the same place and route techniques were applied to FPGA-based design. However, traditional place and route techniques may not work for Coarse-Grained Reconfigurable Architectures (CGRAs), which are reconfigurable devices offering wider path widths than FPGAs and more flexibility than ASICs, due to the differences in architecture and routing network. Further, the routing network of several types of CGRAs, including the Field Programmable Object Array (FPOA), has deterministic timing as compared to the routing fabric of most ASICs and FPGAs reported in the literature. This necessitates a fresh look at alternative approaches to place and route designs. This dissertation presents a finite domain constraint-based, delay-aware placement and routing methodology targeting an FPOA. The proposed methodology takes advantage of the deterministic routing network of CGRAs to perform a delay aware placement
Applications of Broyden-based input space mapping to modeling and design optimization in high-tech companies in Mexico
One of the most powerful and computationally efficient optimization approaches in RF and microwave engineering is the space mapping (SM) approach to design. SM optimization methods belong to the general class of surrogate-based optimization algorithms. They specialize in the efficient optimization of computationally expensive models. This paper reviews the Broyden-based input SM algorithm, better known as aggressive space mapping (ASM), which is perhaps the SM variation with the most industrial applications. The two main characteristics that explain its popularity in industry and academia are emphasized in this paper: simplicity and efficiency. The fundamentals behind the Broyden-based input SM algorithm are described, highlighting key steps for its successful implementation, as well as situations where it may fail. Recent applications of the Broyden-based input space mapping algorithm in high-tech industries located in Mexico are briefly described, including application areas such as signal integrity and high-speed interconnect design, as well as post-silicon validation of high-performance computer platforms, among others. Emerging new applications in multi-physics interconnect design and power-integrity design optimization are also mentioned. ITESO, A.C.
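The Broyden-based iteration mentioned in this abstract can be sketched on a toy problem. In ASM, the goal is a fine-model design x_f whose extracted coarse-model parameters P(x_f) match the coarse-model optimum x_c*. The residual f = P(x_f) - x_c* is driven to zero with quasi-Newton steps, and a Broyden rank-one update refines the mapping matrix B. The sketch below is two-dimensional and assumes an analytic stand-in, `toy_extract`, for the parameter-extraction step. In practice that step is itself an optimization against expensive simulations. All names and numbers are hypothetical.

```python
def solve2(b, rhs):
    """Solve the 2x2 linear system b * h = rhs by Cramer's rule."""
    det = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    return [(rhs[0] * b[1][1] - b[0][1] * rhs[1]) / det,
            (b[0][0] * rhs[1] - b[1][0] * rhs[0]) / det]


def toy_extract(x_f):
    """Hypothetical stand-in for parameter extraction P(x_f)."""
    return [1.1 * x_f[0] + 0.2 * x_f[1] - 0.3,
            -0.1 * x_f[0] + 0.9 * x_f[1] + 0.05 * x_f[0] ** 2 + 0.2]


def aggressive_space_mapping(extract, x_c_star, tol=1e-10, max_iter=50):
    """Find x_f with extract(x_f) == x_c_star using Broyden updates."""
    x_f = list(x_c_star)              # start at the coarse optimum
    b = [[1.0, 0.0], [0.0, 1.0]]      # initial mapping estimate: identity
    f = [p - t for p, t in zip(extract(x_f), x_c_star)]
    for iteration in range(max_iter):
        if max(abs(v) for v in f) < tol:
            return x_f, iteration
        h = solve2(b, [-f[0], -f[1]])             # quasi-Newton step
        x_f = [x_f[0] + h[0], x_f[1] + h[1]]
        f = [p - t for p, t in zip(extract(x_f), x_c_star)]
        hh = h[0] * h[0] + h[1] * h[1]
        # Broyden rank-one update; since b * h = -f_old, it reduces to
        # b += f_new * h^T / (h^T h).
        for r in range(2):
            for c in range(2):
                b[r][c] += f[r] * h[c] / hh
    return x_f, max_iter


x_sm, n_iter = aggressive_space_mapping(toy_extract, [1.0, 2.0])
```

Each iteration costs one call to the extraction step, which is why the method suits expensive fine models. The sketch also hints at a failure mode: if extraction is non-unique or badly behaved, the residual need not decrease.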
A protocol reconfiguration and optimization system for MPI
Modern high performance computing (HPC) applications, for example adaptive mesh refinement and multi-physics codes, have dynamic communication characteristics which result in poor performance on current Message Passing Interface (MPI) implementations. The degraded application performance can be attributed to a mismatch between changing application requirements and static communication library functionality. To improve the performance of these applications, MPI libraries should change their protocol functionality in response to changing application requirements, and tailor their functionality to take advantage of hardware capabilities. This dissertation describes the Protocol Reconfiguration and Optimization system for MPI (PRO-MPI), a framework for constructing profile-driven reconfigurable MPI libraries; these libraries use past application characteristics (profiles) to dynamically change their functionality to match the changing application requirements. The framework addresses the challenges of designing and implementing the reconfigurable MPI libraries, which include collecting and reasoning about application characteristics to drive the protocol reconfiguration and defining abstractions required for implementing these reconfigurations. Two prototype reconfigurable MPI implementations based on the framework (Open PRO-MPI and Cactus PRO-MPI) are also presented to demonstrate the utility of the framework. To demonstrate the effectiveness of reconfigurable MPI libraries, this dissertation presents experimental results showing the impact of these libraries on application performance. The results show that PRO-MPI improves the performance of important HPC applications and benchmarks. They also show that HyperCLaw performance improves by approximately 22% when exact profiles are available, and by approximately 18% when only approximate profiles are available.
Reconfigurable Model Execution in the OpenMDAO Framework
NASA's OpenMDAO framework facilitates constructing complex models and computing their derivatives for multidisciplinary design optimization. Decomposing a model into components that follow a prescribed interface enables OpenMDAO, through the MAUD architecture, to assemble multidisciplinary derivatives from the component derivatives using what amounts to the adjoint method, direct method, chain rule, global sensitivity equations, or any combination thereof. OpenMDAO also handles the distribution of processors among the disciplines by hierarchically grouping the components, and it automates the data transfer between components that are on different processors. These features have made OpenMDAO useful for applications in aircraft design, satellite design, wind turbine design, and aircraft engine design, among others. This paper presents new algorithms for OpenMDAO that enable reconfigurable model execution. This concept refers to dynamically changing, during execution, one or more of: the variable sizes, solution algorithm, parallel load balancing, or set of variables (i.e., adding and removing components, perhaps to switch to a higher-fidelity sub-model). Any component can reconfigure at any point, even when running in parallel with other components, and the reconfiguration algorithm presented here performs the synchronized updates to all other components that are affected. A reconfigurable software framework for multidisciplinary design optimization enables new adaptive solvers, adaptive parallelization, and new applications such as gradient-based optimization with overset flow solvers and adaptive mesh refinement. Benchmarking results demonstrate the time savings for reconfiguration compared to setting up the model again from scratch, which can be significant in large-scale problems.
Additionally, the new reconfigurability feature is applied to a mission profile optimization problem for commercial aircraft where both the parametrization of the mission profile and the time discretization are adaptively refined, resulting in computational savings of roughly 10% and the elimination of oscillations in the optimized altitude profile.
Center for Aeronautics and Space Information Sciences
This report summarizes the research done during 1991/92 under the Center for Aeronautics and Space Information Science (CASIS) program. The topics covered are computer architecture, networking, and neural nets
Fast, Scalable, and Interactive Software for Landau-de Gennes Numerical Modeling of Nematic Topological Defects
Numerical modeling of nematic liquid crystals using the tensorial Landau-de
Gennes (LdG) theory provides detailed insights into the structure and
energetics of the enormous variety of possible topological defect
configurations that may arise when the liquid crystal is in contact with
colloidal inclusions or structured boundaries. However, these methods can be
computationally expensive, making it challenging to predict (meta)stable
configurations involving several colloidal particles, and they are often
restricted to system sizes well below the experimental scale. Here we present
an open-source software package that exploits the embarrassingly parallel
structure of the lattice discretization of the LdG approach. Our
implementation, combining CUDA/C++ and OpenMPI, allows users to accelerate
simulations using both CPU and GPU resources in either single- or multiple-core
configurations. We make use of an efficient minimization algorithm, the Fast
Inertial Relaxation Engine (FIRE) method, that is well-suited to large-scale
parallelization, requiring little additional memory or computational cost while
offering performance competitive with other commonly used methods. In
multi-core operation we are able to scale simulations up to supra-micron length
scales of experimental relevance, and in single-core operation the simulation
package includes a user-friendly GUI environment for rapid prototyping of
interfacial features and the multifarious defect states they can promote. To
demonstrate this software package, we examine in detail the competition between
curvilinear disclinations and point-like hedgehog defects as size scale,
material properties, and geometric features are varied. We also study the
effects of an interface patterned with an array of topological point-defects.
Comment: 16 pages, 6 figures, 1 YouTube link. The full catastroph
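The FIRE minimizer named in this abstract can be sketched in a few lines. The version below follows the commonly described scheme: take a molecular-dynamics-style step, steer the velocity toward the force direction while the power F·v is positive, and reset the velocity and shrink the time step when it is not. It is a minimal sketch on a toy quadratic energy, not the package's CUDA/C++ implementation. The parameter defaults, the integrator ordering, and all names are assumptions.

```python
import math


def fire_minimize(grad, x0, dt=0.05, dt_max=0.3, n_min=5, f_inc=1.1,
                  f_dec=0.5, alpha_start=0.1, f_alpha=0.99,
                  f_tol=1e-8, max_steps=10000):
    """Minimize an energy given its gradient, using a FIRE-style scheme."""
    x = list(x0)
    v = [0.0] * len(x)
    alpha = alpha_start
    steps_since_reset = 0
    for step in range(max_steps):
        force = [-g for g in grad(x)]
        f_norm = math.sqrt(sum(f * f for f in force))
        if f_norm < f_tol:
            return x, step
        # Velocity update with unit mass (semi-implicit Euler).
        v = [vi + dt * fi for vi, fi in zip(v, force)]
        power = sum(fi * vi for fi, vi in zip(force, v))
        if power > 0.0:
            # Moving downhill: mix the velocity toward the force direction.
            v_norm = math.sqrt(sum(vi * vi for vi in v))
            v = [(1.0 - alpha) * vi + alpha * v_norm * fi / f_norm
                 for vi, fi in zip(v, force)]
            steps_since_reset += 1
            if steps_since_reset > n_min:
                dt = min(dt * f_inc, dt_max)
                alpha *= f_alpha
        else:
            # Moving uphill: stop, shrink the time step, restore alpha.
            v = [0.0] * len(x)
            dt *= f_dec
            alpha = alpha_start
            steps_since_reset = 0
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x, max_steps


def toy_gradient(p):
    """Gradient of the hypothetical energy 0.5 * (x^2 + 4 y^2)."""
    return [p[0], 4.0 * p[1]]


x_min, n_steps = fire_minimize(toy_gradient, [1.0, -1.0])
```

Beyond the positions, the scheme stores only a velocity per degree of freedom and a few scalars, consistent with the abstract's remark that the method needs little additional memory.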