Lanczos eigensolution method for high-performance computers

Author: Bostic Susan W.

The theory, computational analysis, and applications of a Lanczos algorithm on high-performance computers are presented. The computationally intensive steps of the algorithm are identified as the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplies. These computational steps are optimized to exploit the vector and parallel capabilities of high-performance computers. The savings in computational time from applying optimization techniques such as variable-band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large-scale structural analysis applications are described: the buckling of a composite blade-stiffened panel with a cutout, and the vibration analysis of a high-speed civil transport. The sequential computational time for the panel problem executed on a CONVEX computer, 181.6 seconds, was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time for the transport problem with 17,000 degrees of freedom, 23 seconds, was obtained on the Cray-YMP using an average of 3.63 processors.
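
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the basic symmetric Lanczos recurrence, assuming a small dense matrix stored as nested lists. It exercises just the matrix-vector multiply step; the factorization and forward/backward solves mentioned above (needed for a shifted or generalized eigenproblem) are omitted. The function names `matvec`, `dot`, and `lanczos` are illustrative and not taken from the paper.

```python
import math

def matvec(A, x):
    # Dense matrix-vector product; the dominant per-iteration cost.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def lanczos(A, v0, m):
    """Run up to m Lanczos steps on symmetric A starting from v0.

    Returns (alphas, betas): the diagonal and off-diagonal of the
    tridiagonal matrix T whose eigenvalues approximate those of A.
    """
    norm = math.sqrt(dot(v0, v0))
    q = [x / norm for x in v0]
    q_prev = [0.0] * len(v0)
    alphas, betas = [], []
    beta = 0.0
    for _ in range(m):
        w = matvec(A, q)
        alpha = dot(w, q)
        # Three-term recurrence: orthogonalize against the last two vectors.
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        alphas.append(alpha)
        beta = math.sqrt(dot(w, w))
        if beta < 1e-12:
            break  # invariant subspace found; T is complete
        betas.append(beta)
        q_prev, q = q, [wi / beta for wi in w]
    # Only the couplings between computed alphas belong to T.
    return alphas, betas[:len(alphas) - 1]

alphas, betas = lanczos([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]],
                        [1.0, 0.0, 0.0], 3)
```

This sketch performs no reorthogonalization, which practical implementations usually need in finite-precision arithmetic.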

NASA Technical Reports Server

Open-architecture Implementation of Fragment Molecular Orbital Method for Peta-scale Computing

Authors: Aoyagi Mutsumi; Honda Hiroaki; Inadomi Yuichi; Kobayashi Taizo; Maki Jun; Nogita Rie; Ooba Jun-ichi; Takami Toshiya
Publication date: 10/01/2007

We present our perspective and goals on high-performance computing for nanoscience in accordance with the global trend toward "peta-scale computing." After reviewing our results obtained through the grid-enabled version of the fragment molecular orbital method (FMO) on the grid testbed by the Japanese Grid Project, National Research Grid Initiative (NAREGI), we show that FMO is one of the best candidates for peta-scale applications by predicting its effective performance on peta-scale computers. Finally, we introduce our new project constructing a peta-scale application in an open-architecture implementation of FMO in order to realize both goals: high performance on peta-scale computers and extensibility to multiphysics simulations. Comment: 6 pages, 9 figures, proceedings of the 2nd IEEE/ACM international workshop on high performance computing for nano-science and technology (HPCNano06)

arXiv.org e-Print Archive

CiteSeerX

Using A Nameserver to Enhance Control System Efficiency

Authors: Bickley M. H.; Sage J.; White K. S.
Publication date: 01/11/2001

The Thomas Jefferson National Accelerator Facility (Jefferson Lab) control system uses a nameserver to reduce system response time and to minimize the impact of client name resolution on front-end computers. The control system is based on the Experimental Physics and Industrial Control System (EPICS), which uses name-based broadcasts to initiate data communication. By default, when EPICS process variables (PVs) are requested by client applications, all front-end computers receive the broadcasts and perform name resolution against local channel name lists. The nameserver offloads the name resolution task to a single node, so processing formerly done on all front-end computers is now done only by the nameserver. In a control system with heavily loaded front-end computers and high peak client connection loads, a significant performance improvement is seen. This paper describes the nameserver in more detail and discusses the strengths and weaknesses of making name resolution a centralized service. Comment: ICALEPCS 200
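
To make the contrast between broadcast and centralized resolution concrete, here is a toy model. It is only an illustration under simplifying assumptions: it does not use the EPICS API or network broadcasts, and the classes `FrontEnd` and `NameServer`, the host names, and the PV names are all hypothetical.

```python
class FrontEnd:
    """A front-end computer holding a local list of channel names."""

    def __init__(self, host, names):
        self.host = host
        self.names = set(names)
        self.lookups = 0  # counts name-resolution work done on this node

    def has(self, name):
        self.lookups += 1
        return name in self.names


def broadcast_resolve(front_ends, name):
    # Default behaviour: every front end receives the request and
    # searches its own name list, even if it does not own the name.
    owner = None
    for fe in front_ends:
        if fe.has(name) and owner is None:
            owner = fe.host
    return owner


class NameServer:
    """A single node that answers lookups from a prebuilt table."""

    def __init__(self, front_ends):
        self.table = {n: fe.host for fe in front_ends for n in fe.names}

    def resolve(self, name):
        # No front-end computer is touched during resolution.
        return self.table.get(name)


nodes = [FrontEnd("ioc1", ["pv:a"]), FrontEnd("ioc2", ["pv:b"]),
         FrontEnd("ioc3", ["pv:c"])]
server = NameServer(nodes)
via_broadcast = broadcast_resolve(nodes, "pv:b")   # loads all three nodes
via_server = server.resolve("pv:b")                # loads none of them
```

The model also hints at the trade-off the abstract mentions: the table is a single point of failure and must be kept in sync with the front ends.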

arXiv.org e-Print Archive

UNT Digital Library

Distributed OpenGL Rendering in Network Bandwidth Constrained Environments

Authors: Hunkin Paul; McGregor Anthony James; Neal Braden
Publication venue: European Association for Computer Graphics
Publication date: 01/01/2011

Display walls made from multiple monitors are often used when very high resolution images are required. To utilise a display wall, rendering information must be sent to each computer that the monitors are connected to. The network is often the performance bottleneck for demanding applications, such as high-performance 3D animations. This paper introduces ClusterGL, a distribution library for OpenGL applications. ClusterGL reduces network traffic by using compression, frame differencing and multicast. Existing applications can use ClusterGL without recompilation. Benchmarks show that, for most applications, ClusterGL outperforms other systems that support unmodified OpenGL applications, including Chromium and BroadcastGL. The difference is larger for more complex scene geometries and when there are more display machines. For example, when rendering OpenArena, ClusterGL outperforms Chromium by over 300% on the Symphony display wall at The University of Waikato, New Zealand. This display has 20 monitors supported by five computers connected by gigabit Ethernet, with a full resolution of over 35 megapixels. ClusterGL is freely available via Google Code.
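
The abstract names frame differencing and compression without describing ClusterGL's actual encoding, so the sketch below only illustrates the general idea on raw byte buffers. It assumes each frame is a byte string (for example, a serialized command stream), uses XOR as the difference, and uses `zlib` for compression. The helper names and the one-byte `D`/`F` tag are hypothetical and not ClusterGL's wire format.

```python
import zlib

def encode_delta(prev, cur):
    """Encode cur relative to prev; fall back to a full frame if sizes differ."""
    if len(prev) != len(cur):
        return b"F" + zlib.compress(cur)
    # Unchanged bytes XOR to zero, and long zero runs compress very well.
    delta = bytes(a ^ b for a, b in zip(prev, cur))
    return b"D" + zlib.compress(delta)

def decode_delta(prev, payload):
    """Rebuild the current frame from the previous frame and a payload."""
    kind, body = payload[:1], zlib.decompress(payload[1:])
    if kind == b"F":
        return body
    return bytes(a ^ b for a, b in zip(prev, body))

previous = bytes(range(256)) * 16          # 4096-byte stand-in frame
current = bytearray(previous)
current[10] = 0                            # a single changed byte
current = bytes(current)

payload = encode_delta(previous, current)
restored = decode_delta(previous, payload)
```

When consecutive frames are similar, the payload is far smaller than the frame it encodes, which is the effect that reduces traffic to each display machine.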

Research Commons@Waikato

Performance Portable High Performance Conjugate Gradients Benchmark

Author: Bookey Zachary
Publication venue: DigitalCommons@CSB/SJU
Publication date: 01/04/2016

The High Performance Conjugate Gradient Benchmark (HPCG) is an international project to create a more appropriate benchmark test for the world's most powerful computers. The current LINPACK benchmark, which is the standard for measuring the performance of the top 500 fastest computers in the world, is moving computers in a direction that is no longer beneficial to many important parallel applications. HPCG is designed to exercise computations and data access patterns more commonly found in applications. The reference version of HPCG exploits only some of the parallelism available on existing supercomputers. The main focus of this work was to create a performance-portable version of HPCG that gives reasonable performance on hybrid architectures.
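
For readers unfamiliar with the underlying method, the following is a minimal sketch of the unpreconditioned conjugate gradient iteration for a symmetric positive-definite system. It is not the HPCG reference code, which works on large sparse matrices and adds a preconditioner; the dense nested-list storage and the function names are assumptions made for brevity.

```python
def matvec(A, x):
    # Matrix-vector product; in real benchmarks this is a sparse kernel.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction is conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Each iteration is dominated by one matrix-vector product plus a few dot products and vector updates, operations that tend to be limited by memory access rather than arithmetic.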

College of Saint Benedict and Saint John’s University: DigitalCommons@CSB/SJU

Enhanced Face Recognition Method Performance on Android vs Windows Platform

Authors: Abdul Adam Abdullah; Alsibai Mohammad Hayyan; Hadi Manap
Publication venue: Asian Research Publishing Network (ARPN)
Publication date: 01/01/2015

Android is becoming one of the most popular operating systems on smartphones, tablet computers and similar mobile devices. With the rapid development of mobile device specifications, it is worth considering mobile devices as a current or, at least, near-future replacement for personal computers. This paper presents an enhanced face recognition method. The method is tested on two different platforms using the Windows and Android operating systems. This is done to evaluate the method and to compare the platforms. The platforms are compared according to two factors: development simplicity and performance. The aim is to evaluate the possibility of replacing personal computers running Windows with mobile devices running Android. Face recognition was chosen because image processing and pattern recognition applications have a relatively high computing cost compared with other applications. The experimental results show acceptable performance of the method on the Android platform.

UMP Institutional Repository

High-Performance Cloud Computing: A View of Scientific Applications

Authors: Buyya Rajkumar; Pandey Suraj; Vecchiola Christian
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

Scientific computing often requires the availability of a massive number of computers for performing large-scale experiments. Traditionally, these needs have been addressed by using high-performance computing solutions and installed facilities such as clusters and supercomputers, which are difficult to set up, maintain, and operate. Cloud computing provides scientists with a completely new model of utilizing the computing infrastructure. Compute resources, storage resources, and applications can be dynamically provisioned (and integrated within the existing infrastructure) on a pay-per-use basis. These resources can be released when they are no longer needed. Such services are often offered within the context of a Service Level Agreement (SLA), which ensures the desired Quality of Service (QoS). Aneka, an enterprise Cloud computing solution, harnesses the power of compute resources by relying on private and public Clouds and delivers the desired QoS to users. Its flexible, service-based infrastructure supports multiple programming paradigms that allow Aneka to address a variety of scenarios, from finance applications to computational science. As examples of scientific computing in the Cloud, we present a preliminary case study on using Aneka for the classification of gene expression data and the execution of an fMRI brain imaging workflow. Comment: 13 pages, 9 figures, conference paper

arXiv.org e-Print Archive

CiteSeerX

Crossref