RC-BLAST: towards a portable, cost-effective open source hardware implementation
Basic Local Alignment Search Tool (BLAST) is a standard computer application that molecular biologists use to search for sequence similarity in genomic databases. This report describes an FPGA-based hardware implementation designed to accelerate the BLAST algorithm. FPGA-based custom computing machines, more widely known as Reconfigurable Computing, are supported by a number of vendors, and the basic cost of FPGA hardware is decreasing dramatically. Hence, the main objective of this project is to explore the feasibility of using this technology to realize a portable, Open Source FPGA-based accelerator for the BLAST algorithm. The present design targets an ACEII card and is based on the latest version of BLAST available from NCBI. Since the entire application does not fit in hardware, a profile study was conducted to identify the computationally intensive part of BLAST. An FPGA hardware component has been designed and implemented for this critical segment. The portability and cost-effectiveness of the design are discussed.
GPU COMPUTING FOR PARTICLE TRACKING
This is a feasibility study of using a modern Graphics Processing Unit (GPU) to parallelize an accelerator particle tracking code. To demonstrate the massive parallelization features provided by GPU computing, a simplified TracyGPU program is developed for dynamic aperture calculation. Performance, issues, and challenges from introducing the GPU are also discussed. General Purpose computation on Graphics Processing Units (GPGPU) brings massive parallel computing capabilities to numerical calculation. However, the unique architecture of the GPU requires a comprehensive understanding of the hardware and programming model in order to optimize existing applications well. In the field of accelerator physics, the dynamic aperture calculation of a storage ring, which is often the most time-consuming part of accelerator modeling and simulation, can benefit from the GPU because it is embarrassingly parallel, which fits the GPU programming model well. In this paper, we use the Tesla C2050 GPU, which consists of 14 multiprocessors (MPs) with 32 cores each, for a total of 448 cores, to host thousands of threads dynamically. A thread is a logical execution unit of the program on the GPU. In the GPU programming model, threads are grouped into a collection of blocks. Within each block, multiple threads share the same code and up to 48 KB of shared memory. Multiple thread blocks form a grid, which is executed as a GPU kernel. A simplified code that is a subset of Tracy++ [2] is developed to demonstrate the possibility of using the GPU to speed up the dynamic aperture calculation by having each thread track a particle.
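The thread-per-particle pattern described in this abstract can be made concrete with a short kernel. The following is a minimal CUDA sketch, not the TracyGPU or Tracy++ code: the Particle struct, the toy linear one-turn map, and all names and launch parameters are illustrative assumptions; a real dynamic aperture calculation would apply the full nonlinear lattice map over many turns and record which particles are lost.

// Minimal CUDA sketch of the thread-per-particle tracking model.
// NOT the TracyGPU code: the Particle struct, the toy linear map,
// and all names/parameters below are illustrative assumptions.
#include <cstdio>
#include <cuda_runtime.h>

struct Particle { float x, px, y, py; };  // simplified 4D phase-space state

// Each thread advances one particle through nTurns of a fixed linear map.
__global__ void trackParticles(Particle* p, int nParticles, int nTurns)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i >= nParticles) return;

    Particle s = p[i];
    const float c = 0.9995f, sn = 0.0316f;  // toy rotation (one-turn) coefficients
    for (int t = 0; t < nTurns; ++t) {
        float x  =  c * s.x + sn * s.px;
        float px = -sn * s.x + c * s.px;
        float y  =  c * s.y + sn * s.py;
        float py = -sn * s.y + c * s.py;
        s.x = x; s.px = px; s.y = y; s.py = py;
    }
    p[i] = s;
}

int main()
{
    const int nParticles = 1 << 16, nTurns = 1024;
    Particle* d_p;
    cudaMalloc(&d_p, nParticles * sizeof(Particle));
    cudaMemset(d_p, 0, nParticles * sizeof(Particle));

    // Threads are grouped into blocks; the blocks form the grid that is
    // executed as one GPU kernel, as described in the abstract above.
    int threadsPerBlock = 256;
    int blocks = (nParticles + threadsPerBlock - 1) / threadsPerBlock;
    trackParticles<<<blocks, threadsPerBlock>>>(d_p, nParticles, nTurns);
    cudaDeviceSynchronize();

    printf("tracked %d particles for %d turns\n", nParticles, nTurns);
    cudaFree(d_p);
    return 0;
}

Because each particle evolves independently, the kernel needs no inter-thread communication; block size is chosen only for occupancy, which is what makes the problem embarrassingly parallel.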
HPC CLOUD APPLIED TO LATTICE OPTIMIZATION
As Cloud services gain in popularity for enterprise use, vendors are now turning their focus towards providing cloud services suitable for scientific computing. Recently, Amazon Elastic Compute Cloud (EC2) introduced Cluster Compute Instances (CCI), a new instance type specifically designed for High Performance Computing (HPC) applications. At Berkeley Lab, the physicists at the Advanced Light Source (ALS) have been running Lattice Optimization on a local cluster, but the queue wait time and the limited flexibility to request compute resources on demand are not ideal for rapid development work. To explore alternatives, we investigate for the first time running the Lattice Optimization application on Amazon's new CCI to demonstrate the feasibility and trade-offs of using public cloud services for science.
Performance and Cost Analysis of the Supernova Factory on the Amazon AWS Cloud
Today, our picture of the Universe radically differs from that of just over a decade ago. We now know that the Universe is not only expanding, as Hubble discovered in 1929, but that the rate of expansion is accelerating, propelled by mysterious new physics dubbed "Dark Energy". This revolutionary discovery was made by comparing the brightness of nearby Type Ia supernovae (which exploded in the past billion years) to that of much more distant ones (from up to seven billion years ago). The reliability of this comparison hinges upon a very detailed understanding of the physics of the nearby events. To further this understanding, the Nearby Supernova Factory (SNfactory) relies upon a complex pipeline of serial processes that execute various image processing algorithms in parallel on ~10 TB of data. This pipeline traditionally runs on a local cluster. Cloud computing [Above the clouds: a Berkeley view of cloud computing, Technical Report UCB/EECS-2009-28, University of California, 2009] offers many features that make it an attractive alternative. The ability to completely control the software environment in a cloud is appealing when dealing with a community-developed science pipeline with many unique library and platform requirements. In this context we study the feasibility of porting the SNfactory pipeline to the Amazon Web Services environment. Specifically, we describe the tool set we developed to manage a virtual cluster on Amazon EC2, explore the various design options available for application data placement, and offer detailed performance results and lessons learned from each of these design options.
Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud.
Cloud computing has seen tremendous growth, particularly for commercial web applications. The on-demand, pay-as-you-go model creates a flexible and cost-effective means to access compute resources. For these reasons, the scientific computing community has shown increasing interest in exploring cloud computing. However, the underlying implementation and performance of clouds are very different from those at traditional supercomputing centers. It is therefore critical to evaluate the performance of HPC applications in today's cloud environments to understand the tradeoffs inherent in migrating to the cloud. This work represents the most comprehensive evaluation to date comparing conventional HPC platforms to Amazon EC2, using real applications representative of the workload at a typical supercomputing center. Overall results indicate that EC2 is six times slower than a typical mid-range Linux cluster, and twenty times slower than a modern HPC system. The interconnect on the EC2 cloud platform severely limits performance and causes significant variability.
An ultrahigh-resolution soft x-ray microscope for quantitative analysis of chemically heterogeneous nanomaterials.
The analysis of chemical states and morphology in nanomaterials is central to many areas of science. We address this need with an ultrahigh-resolution scanning transmission soft x-ray microscope. Our instrument provides multiple analysis tools in a compact assembly and can achieve few-nanometer spatial resolution and high chemical sensitivity via x-ray ptychography and conventional scanning microscopy. A novel scanning mechanism, coupled to advanced x-ray detectors, a high-brightness x-ray source, and high-performance computing for analysis, provides a revolutionary step forward in terms of imaging speed and resolution. We present x-ray microscopy with 8-nm full-period spatial resolution and use this capability in conjunction with operando sample environments and cryogenic imaging, which are now routinely available. Our multimodal approach will find wide use across many fields of science and facilitate correlative analysis of materials with other types of probes.