Lanczos eigensolution method for high-performance computers
The theory, computational analysis, and applications of a Lanczos algorithm on high-performance computers are presented. The computationally intensive steps of the algorithm are identified as the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplications. These computational steps are optimized to exploit the vector and parallel capabilities of high-performance computers. The savings in computational time from applying optimization techniques such as variable-band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large-scale structural analysis applications are described: the buckling of a composite blade-stiffened panel with a cutout, and the vibration analysis of a high-speed civil transport. The sequential computational time of 181.6 seconds for the panel problem executed on a CONVEX computer was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem, with 17,000 degrees of freedom, was achieved on the Cray Y-MP using an average of 3.63 processors.
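The matrix-vector multiply that the abstract identifies as a computationally intensive step sits at the heart of each Lanczos iteration. A minimal plain-NumPy sketch of the basic iteration is shown below; it is not the authors' optimized variable-band/sparse implementation, omits reorthogonalization, and the function name and small test matrix are illustrative assumptions.

```python
import numpy as np

def lanczos_eigvals(A, k, seed=0):
    """Basic Lanczos tridiagonalization of a symmetric matrix A.

    Builds a k x k tridiagonal matrix T whose eigenvalues approximate
    extremal eigenvalues of A. Sketch only: no reorthogonalization,
    dense storage, no factorization/solve step.
    """
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    q_prev = np.zeros(n)
    alpha, beta = [], []
    b = 0.0
    for _ in range(k):
        w = A @ q              # the matrix-vector multiply: the dominant cost
        a = q @ w
        w -= a * q + b * q_prev
        b = np.linalg.norm(w)
        alpha.append(a)
        beta.append(b)
        if b == 0.0:           # invariant subspace found; stop early
            break
        q_prev, q = q, w / b
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return np.linalg.eigvalsh(T)
```

For buckling and vibration problems, the extremal eigenvalues of `T` converge first, which is why relatively few iterations suffice even for large models.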
Advanced Architectures for Astrophysical Supercomputing
Astronomers have come to rely on the increasing performance of computers to
reduce, analyze, simulate and visualize their data. In this environment, faster
computation can mean more science outcomes or the opening up of new parameter
spaces for investigation. If we are to avoid major issues when implementing
codes on advanced architectures, it is important that we have a solid
understanding of our algorithms. A recent addition to the high-performance
computing scene that highlights this point is the graphics processing unit
(GPU). The hardware originally designed for speeding-up graphics rendering in
video games is now achieving significant speed-ups in general-purpose
computation -- performance that cannot be ignored. We are using a generalized
approach, based on the analysis of astronomy algorithms, to identify the
optimal problem-types and techniques for taking advantage of both current GPU
hardware and future developments in computing architectures.
Comment: 4 pages, 1 figure, to appear in the proceedings of ADASS XIX, Oct 4-8,
2009, Sapporo, Japan (ASP Conf. Series)
Preliminary Evaluation of MapReduce for High-Performance Climate Data Analysis
MapReduce is an approach to high-performance analytics that may be useful to data-intensive problems in climate research. It offers an analysis paradigm that uses clusters of computers and combines distributed storage of large data sets with parallel computation. We are particularly interested in the potential of MapReduce to speed up basic operations common to a wide range of analyses. In order to evaluate this potential, we are prototyping a series of canonical MapReduce operations over a test suite of observational and climate simulation datasets. Our initial focus has been on averaging operations over arbitrary spatial and temporal extents within Modern Era Retrospective-Analysis for Research and Applications (MERRA) data. Preliminary results suggest this approach can improve efficiencies within data-intensive analytic workflows.
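The canonical averaging operation described above can be sketched as a map phase emitting partial sums and a reduce phase combining them. This is a minimal single-process illustration of the pattern, not the MERRA prototype itself; the cell keys and values are hypothetical stand-ins for spatially keyed climate fields.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit (cell, (value, 1)) for each observation.

    `records` are (cell_id, value) pairs, e.g. temperatures keyed by a
    spatial grid cell (a hypothetical stand-in for MERRA fields).
    """
    for cell, value in records:
        yield cell, (value, 1)

def reduce_phase(pairs):
    """Reduce: sum partial (total, count) pairs per key, then divide.

    Summing (total, count) instead of raw values lets the reduction run
    incrementally across distributed partitions (combiner-friendly).
    """
    acc = defaultdict(lambda: [0.0, 0])
    for cell, (v, c) in pairs:
        acc[cell][0] += v
        acc[cell][1] += c
    return {cell: total / count for cell, (total, count) in acc.items()}

obs = [("A", 10.0), ("A", 20.0), ("B", 5.0)]
means = reduce_phase(map_phase(obs))  # {"A": 15.0, "B": 5.0}
```

Carrying `(total, count)` pairs rather than running averages is what makes the operation associative, so it parallelizes cleanly over arbitrary spatial and temporal partitions of the data.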
Grid enabled data analysis on handheld devices
The requirement for information on portable, handheld devices demands the realization of increasingly complex applications for increasingly small and ubiquitous devices. This trend promotes the migration of technologies that were originally developed for desktop computers to handheld devices. With the onset of grid computing, users of handheld devices should be able to accomplish much more complex tasks by accessing the processing and storage resources of the grid. This paper describes the development, features, and performance aspects of a grid-enabled analysis environment designed for handheld devices. We also describe some differences in the technologies required to run these applications on desktop machines and handheld devices. In addition, we propose a prototype agent-based distributed architecture for carrying out high-speed analysis of physics data on handheld devices.
Energy Saving Potential of Idle Pacman Supercomputing Nodes
To determine the energy saving potential of suspending idle supercomputing nodes without sacrificing efficiency, my research involved the setup of a compute node power usage monitoring system. This system measures how much power each node draws at its different levels of operation using an automated Expect script. The script automates tasks with interactive command line interfaces to perform the power measurement readings. Steps required for the power usage monitoring system include remotely logging into the Pacman Penguin compute cluster power distribution units (PDUs), feeding commands to the PDUs, and storing the returned data. Using a Python script, the data is then parsed into a more coherent format and written to a common file format for analysis. With this system, the Arctic Region Supercomputing Center (ARSC) will be able to determine how much energy is used during different levels of load intensity on the Pacman supercomputer and how much energy can be saved by suspending unnecessary nodes during levels of reduced activity. Power utilization by supercomputers is of major interest to those who design and purchase them. Since 2008, the leading source of worldwide supercomputer speed rankings has also included power consumption and power efficiency values. Because digital computers utilize electricity to perform computation, larger computers tend to utilize more energy and produce more heat.
Pacman, an acronym for Pacific Area Climate Monitoring and Analysis Network, is a high performance supercomputer designed for large compute and memory intensive jobs. Pacman is composed of the following general computational nodes:
• 256 four-core compute nodes containing two dual core 2.6 GHz AMD Opteron processors each
• 20 twelve-core compute nodes containing two six core 2.6 GHz AMD Opteron processors each
• 88 sixteen-core compute nodes containing two eight core 2.3 GHz AMD Opteron processors each
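The parse-and-write step of the pipeline described above can be sketched in a few lines of Python. The PDU readout format, function names, and CSV columns here are assumptions for illustration; the abstract does not specify the Penguin PDU output format or the scripts' actual structure.

```python
import csv
import io
import re

# Hypothetical raw PDU readout captured by the Expect script (the real
# Penguin PDU format is not given in the abstract): one line per outlet.
RAW = """\
outlet 1: 0.42 kW
outlet 2: 0.00 kW
outlet 3: 1.13 kW
"""

LINE = re.compile(r"outlet\s+(\d+):\s+([\d.]+)\s+kW")

def parse_pdu(text):
    """Parse captured PDU text into (outlet, kilowatts) rows,
    skipping any lines that do not match the expected format."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in (LINE.match(line) for line in text.splitlines()) if m]

def to_csv(rows):
    """Write parsed rows to CSV, a common file format for analysis."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["outlet", "kw"])
    writer.writerows(rows)
    return buf.getvalue()
```

Separating the brittle screen-scraping (Expect) from the parsing (a regex over captured text) keeps the Python side easy to re-run against stored raw captures when the format needs debugging.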
NBS monograph
From Abstract: "This Monograph presents an outline of the methods used at the National Bureau of Standards to predict the performance of lenses from an analysis of their designs. The technique is based on the use of spot diagrams, which are analogs of star image tests, and makes extensive use of high-speed digital computers."
GraPE: fast and scalable Graph Processing and Embedding
Graph Representation Learning methods have enabled a wide range of learning
problems to be addressed for data that can be represented in graph form.
Nevertheless, several real world problems in economy, biology, medicine and
other fields raised relevant scaling problems with existing methods and their
software implementation, due to the size of real world graphs characterized by
millions of nodes and billions of edges. We present GraPE, a software resource
for graph processing and random walk based embedding, that can scale with large
and high-degree graphs and significantly speed up computation. GraPE comprises
specialized data structures, algorithms, and a fast parallel implementation
that displays several orders of magnitude improvement in empirical space and
time complexity compared to state of the art software resources, with a
corresponding boost in the performance of machine learning methods for edge and
node label prediction and for the unsupervised analysis of graphs. GraPE is
designed to run on laptop and desktop computers, as well as on high-performance
computing clusters.
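The random-walk generation underlying embedding methods of this kind can be sketched as follows. This is a plain-Python, single-threaded illustration of the general DeepWalk-style pattern, not GraPE's specialized data structures or parallel implementation; the function name and toy graph are assumptions.

```python
import random

def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks over an adjacency-list graph.

    The resulting walk 'sentences' are the typical input that
    random-walk-based embedding methods feed to a skip-gram model.
    `adj` maps each node to a list of its neighbors.
    """
    rng = random.Random(seed)
    walks = []
    for node in adj:
        for _ in range(walks_per_node):
            walk = [node]
            for _ in range(walk_length - 1):
                neighbors = adj[walk[-1]]
                if not neighbors:      # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks
```

For graphs with millions of nodes and billions of edges, the cost of sampling neighbors dominates, which is why compact adjacency representations and parallel walk generation, as in GraPE, matter far more than the skip-gram step itself.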