RC-BLAST: towards a portable, cost-effective open source hardware implementation
Basic Local Alignment Search Tool (BLAST) is a standard computer application that molecular biologists use to search for sequence similarity in genomic databases. This report describes an FPGA-based hardware implementation designed to accelerate the BLAST algorithm. FPGA-based custom computing machines, more widely known as Reconfigurable Computing, are supported by a number of vendors, and the basic cost of FPGA hardware is decreasing dramatically. Hence, the main objective of this project is to explore the feasibility of using this technology to realize a portable, Open Source FPGA-based accelerator for the BLAST algorithm. The present design targets an ACEII card and is based on the latest version of BLAST available from NCBI. Since the entire application does not fit in hardware, a profile study was conducted to identify the computationally intensive part of BLAST. An FPGA hardware component has been designed and implemented for this critical segment. The portability and cost-effectiveness of the design are discussed.
GPU COMPUTING FOR PARTICLE TRACKING
This is a feasibility study of using a modern Graphics Processing Unit (GPU) to parallelize an accelerator particle tracking code. To demonstrate the massive parallelization features provided by GPU computing, a simplified TracyGPU program is developed for dynamic aperture calculation. Performance, issues, and challenges from introducing the GPU are also discussed. General Purpose computation on Graphics Processing Units (GPGPU) brings massive parallel computing capabilities to numerical calculation. However, the unique architecture of the GPU requires a comprehensive understanding of the hardware and programming model in order to optimize existing applications well. In the field of accelerator physics, the dynamic aperture calculation of a storage ring, which is often the most time-consuming part of accelerator modeling and simulation, can benefit from the GPU because it is embarrassingly parallel, which fits the GPU programming model well. In this paper, we use the Tesla C2050 GPU, which consists of 14 multiprocessors (MPs) with 32 cores each, for a total of 448 cores, to host thousands of threads dynamically. A thread is a logical execution unit of the program on the GPU. In the GPU programming model, threads are grouped into a collection of blocks. Within each block, multiple threads share the same code and up to 48 KB of shared memory. Multiple thread blocks form a grid, which is executed as a GPU kernel. A simplified code that is a subset of Tracy++ [2] is developed to demonstrate the possibility of using the GPU to speed up the dynamic aperture calculation by having each thread track a particle.
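The thread-per-particle pattern described in this abstract can be made concrete with a short kernel. The following is a minimal CUDA sketch, not the TracyGPU or Tracy++ code: the Particle struct, the toy linear one-turn map, and all names and launch parameters are illustrative assumptions; a real dynamic aperture calculation would apply the full nonlinear lattice map over many turns and record which particles are lost.

// Minimal CUDA sketch of the thread-per-particle tracking model.
// NOT the TracyGPU code: the Particle struct, the toy linear map,
// and all names/parameters below are illustrative assumptions.
#include <cstdio>
#include <cuda_runtime.h>

struct Particle { float x, px, y, py; };  // simplified 4D phase-space state

// Each thread advances one particle through nTurns of a fixed linear map.
__global__ void trackParticles(Particle* p, int nParticles, int nTurns)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i >= nParticles) return;

    Particle s = p[i];
    const float c = 0.9995f, sn = 0.0316f;  // toy rotation (one-turn) coefficients
    for (int t = 0; t < nTurns; ++t) {
        float x  =  c * s.x + sn * s.px;
        float px = -sn * s.x + c * s.px;
        float y  =  c * s.y + sn * s.py;
        float py = -sn * s.y + c * s.py;
        s.x = x; s.px = px; s.y = y; s.py = py;
    }
    p[i] = s;
}

int main()
{
    const int nParticles = 1 << 16, nTurns = 1024;
    Particle* d_p;
    cudaMalloc(&d_p, nParticles * sizeof(Particle));
    cudaMemset(d_p, 0, nParticles * sizeof(Particle));

    // Threads are grouped into blocks; the blocks form the grid that is
    // executed as one GPU kernel, as described in the abstract above.
    int threadsPerBlock = 256;
    int blocks = (nParticles + threadsPerBlock - 1) / threadsPerBlock;
    trackParticles<<<blocks, threadsPerBlock>>>(d_p, nParticles, nTurns);
    cudaDeviceSynchronize();

    printf("tracked %d particles for %d turns\n", nParticles, nTurns);
    cudaFree(d_p);
    return 0;
}

Because each particle evolves independently, the kernel needs no inter-thread communication; block size is chosen only for occupancy, which is what makes the problem embarrassingly parallel.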
HPC CLOUD APPLIED TO LATTICE OPTIMIZATION
As Cloud services gain in popularity for enterprise use, vendors are now turning their focus towards providing cloud services suitable for scientific computing. Recently, Amazon Elastic Compute Cloud (EC2) introduced Cluster Compute Instances (CCI), a new instance type specifically designed for High Performance Computing (HPC) applications. At Berkeley Lab, the physicists at the Advanced Light Source (ALS) have been running Lattice Optimization on a local cluster, but the queue wait time and the limited flexibility to request compute resources on demand are not ideal for rapid development work. To explore alternatives, we investigate for the first time running the Lattice Optimization application on Amazon's new CCI to demonstrate the feasibility and trade-offs of using public cloud services for science.
Performance and Cost Analysis of the Supernova Factory on the Amazon AWS Cloud
Today, our picture of the Universe radically differs from that of just over a decade ago. We now know that the Universe is not only expanding, as Hubble discovered in 1929, but that the rate of expansion is accelerating, propelled by mysterious new physics dubbed "Dark Energy". This revolutionary discovery was made by comparing the brightness of nearby Type Ia supernovae (which exploded in the past billion years) to that of much more distant ones (from up to seven billion years ago). The reliability of this comparison hinges upon a very detailed understanding of the physics of the nearby events. To further this understanding, the Nearby Supernova Factory (SNfactory) relies upon a complex pipeline of serial processes that execute various image processing algorithms in parallel on ~10 TB of data. This pipeline traditionally runs on a local cluster. Cloud computing [Above the clouds: a Berkeley view of cloud computing, Technical Report UCB/EECS-2009-28, University of California, 2009] offers many features that make it an attractive alternative. The ability to completely control the software environment in a cloud is appealing when dealing with a community-developed science pipeline with many unique library and platform requirements. In this context we study the feasibility of porting the SNfactory pipeline to the Amazon Web Services environment. Specifically, we describe the tool set we developed to manage a virtual cluster on Amazon EC2, explore the various design options available for application data placement, and offer detailed performance results and lessons learned from each of these design options.
Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud.
Cloud computing has seen tremendous growth, particularly for commercial web applications. The on-demand, pay-as-you-go model creates a flexible and cost-effective means to access compute resources. For these reasons, the scientific computing community has shown increasing interest in exploring cloud computing. However, the underlying implementation and performance of clouds are very different from those at traditional supercomputing centers. It is therefore critical to evaluate the performance of HPC applications in today's cloud environments to understand the tradeoffs inherent in migrating to the cloud. This work represents the most comprehensive evaluation to date comparing conventional HPC platforms to Amazon EC2, using real applications representative of the workload at a typical supercomputing center. Overall results indicate that EC2 is six times slower than a typical mid-range Linux cluster, and twenty times slower than a modern HPC system. The interconnect on the EC2 cloud platform severely limits performance and causes significant variability.
An ultrahigh-resolution soft x-ray microscope for quantitative analysis of chemically heterogeneous nanomaterials.
The analysis of chemical states and morphology in nanomaterials is central to many areas of science. We address this need with an ultrahigh-resolution scanning transmission soft x-ray microscope. Our instrument provides multiple analysis tools in a compact assembly and can achieve few-nanometer spatial resolution and high chemical sensitivity via x-ray ptychography and conventional scanning microscopy. A novel scanning mechanism, coupled to advanced x-ray detectors, a high-brightness x-ray source, and high-performance computing for analysis, provides a revolutionary step forward in terms of imaging speed and resolution. We present x-ray microscopy with 8-nm full-period spatial resolution and use this capability in conjunction with operando sample environments and cryogenic imaging, which are now routinely available. Our multimodal approach will find wide use across many fields of science and facilitate correlative analysis of materials with other types of probes.