1,220 research outputs found
MATEX: A Distributed Framework for Transient Simulation of Power Distribution Networks
We proposed MATEX, a distributed framework for transient simulation of power
distribution networks (PDNs). MATEX utilizes matrix exponential kernel with
Krylov subspace approximations to solve differential equations of linear
circuit. First, the whole simulation task is divided into subtasks based on
decompositions of current sources, in order to reduce the computational
overheads. Then these subtasks are distributed to different computing nodes and
processed in parallel. Within each node, after the matrix factorization at the
beginning of simulation, the adaptive time stepping solver is performed without
extra matrix re-factorizations. MATEX overcomes the stiff-ness hinder of
previous matrix exponential-based circuit simulator by rational Krylov subspace
method, which leads to larger step sizes with smaller dimensions of Krylov
subspace bases and highly accelerates the whole computation. MATEX outperforms
both traditional fixed and adaptive time stepping methods, e.g., achieving
around 13X over the trapezoidal framework with fixed time step for the IBM
power grid benchmarks.Comment: ACM/IEEE DAC 2014. arXiv admin note: substantial text overlap with
arXiv:1505.0669
Progress and status of APEmille
We report on the progress and status of the APEmille project: a SIMD parallel
computer with a peak performance in the TeraFlops range which is now in an
advanced development phase. We discuss the hardware and software architecture,
and present some performance estimates for Lattice Gauge Theory (LGT)
applications.Comment: Talk presented at LATTICE97, 3 pages, Late
A Switch Architecture for Real-Time Multimedia Communications
In this paper we present a switch that can be used to transfer multimedia type of trafJic. The switch provides a guaranteed throughput and a bounded latency. We focus on the design of a prototype Switching Element using the new technology opportunities being offered today. The architecture meets the multimedia requirements but still has a low complexity and needs a minimum amount of hardware. A main item of this paper will be the background of the architectural design decisions made. These include the interconnection topology, buffer organization, routing and scheduling. The implementation of the switching fabric with FPGAs, allows us to experiment with switching mode, routing strategy and scheduling policy in a multimedia environment. The witching elements are interconnected in a Kautz topology. Kautz graphs have interesting properties such as: a small diametec the degree is independent of the network size, the network is fault-tolerant and has a simple routing algorithm
Circuit simulation using distributed waveform relaxation techniques
Simulation plays an important role in the design of integrated circuits. Due to high costs and large delays involved in their fabrication, simulation is commonly used to verify functionality and to predict performance before fabrication. This thesis describes analysis, implementation and performance evaluation of a distributed memory parallel waveform relaxation technique for the electrical circuit simulation of MOS VLSI circuits. The waveform relaxation technique exhibits inherent parallelism due to the partitioning of a circuit into a number of sub-circuits. These subcircuits can be concurrently simulated on parallel processors. Different forms of parallelism in the direct method and the waveform relaxation technique are studied. An analysis of single queue and distributed queue approaches to implement parallel waveform relaxation on distributed memory machines is performed and their performance implications are studied. The distributed queue approach selected for exploiting the coarse grain parallelism across sub-circuits is described. Parallel waveform relaxation programs based on Gauss-Seidel and Gauss-Jacobi techniques are implemented using a network of eight Transputers. Static and dynamic load balancing strategies are studied. A dynamic load balancing algorithm is developed and implemented. Results of parallel implementation are analyzed to identify sources of bottlenecks. This thesis has demonstrated the applicability of a low cost distributed memory multi-computer system for simulation of MOS VLSI circuits. Speed-up measurements prove that a five times improvement in the speed of calculations can be achieved using a full window parallel Gauss-Jacobi waveform relaxation algorithm. Analysis of overheads shows that load imbalance is the major source of overhead and that the fraction of the computation which must be performed sequentially is very low. Communication overhead depends on the nature of the parallel architecture and the design of communication mechanisms. The run-time environment (parallel processing framework) developed in this research exploits features of the Transputer architecture to reduce the effect of the communication overhead by effectively overlapping computation with communications, and running communications processes at a higher priority. This research will contribute to the development of low cost, high performance workstations for computer-aided design and analysis of VLSI circuits
Using Rapid Prototyping in Computer Architecture Design Laboratories
This paper describes the undergraduate computer architecture courses and laboratories introduced at Georgia Tech during the past two years. A core sequence of six required courses for computer engineering students has been developed. In this paper, emphasis is placed upon the new core laboratories which utilize commercial CAD tools, FPGAs, hardware emulators, and a VHDL based rapid prototyping approach to simulate, synthesize, and implement prototype computer hardware
Agent-based resource management for grid computing
A computational grid is a hardware and software infrastructure that provides
dependable, consistent, pervasive, and inexpensive access to high-end
computational capability. An ideal grid environment should provide access to the
available resources in a seamless manner. Resource management is an important
infrastructural component of a grid computing environment. The overall aim of
resource management is to efficiently schedule applications that need to utilise the
available resources in the grid environment. Such goals within the high
performance community will rely on accurate performance prediction capabilities.
An existing toolkit, known as PACE (Performance Analysis and Characterisation
Environment), is used to provide quantitative data concerning the performance of
sophisticated applications running on high performance resources. In this thesis an
ASCI (Accelerated Strategic Computing Initiative) kernel application, Sweep3D,
is used to illustrate the PACE performance prediction capabilities. The validation
results show that a reasonable accuracy can be obtained, cross-platform
comparisons can be easily undertaken, and the process benefits from a rapid
evaluation time. While extremely well-suited for managing a locally distributed
multi-computer, the PACE functions do not map well onto a wide-area
environment, where heterogeneity, multiple administrative domains, and communication irregularities dramatically complicate the job of resource
management. Scalability and adaptability are two key challenges that must be
addressed.
In this thesis, an A4 (Agile Architecture and Autonomous Agents) methodology is
introduced for the development of large-scale distributed software systems with
highly dynamic behaviours. An agent is considered to be both a service provider
and a service requestor. Agents are organised into a hierarchy with service
advertisement and discovery capabilities. There are four main performance
metrics for an A4 system: service discovery speed, agent system efficiency,
workload balancing, and discovery success rate.
Coupling the A4 methodology with PACE functions, results in an Agent-based
Resource Management System (ARMS), which is implemented for grid
computing. The PACE functions supply accurate performance information (e. g.
execution time) as input to a local resource scheduler on the fly. At a meta-level,
agents advertise their service information and cooperate with each other to
discover available resources for grid-enabled applications. A Performance
Monitor and Advisor (PMA) is also developed in ARMS to optimise the
performance of the agent behaviours.
The PMA is capable of performance modelling and simulation about the agents in
ARMS and can be used to improve overall system performance. The PMA can
monitor agent behaviours in ARMS and reconfigure them with optimised
strategies, which include the use of ACTs (Agent Capability Tables), limited
service lifetime, limited scope for service advertisement and discovery, agent
mobility and service distribution, etc.
The main contribution of this work is that it provides a methodology and
prototype implementation of a grid Resource Management System (RMS). The
system includes a number of original features that cannot be found in existing
research solutions
- âŠ