11,485 research outputs found
Breadth First Search Vectorization on the Intel Xeon Phi
Breadth First Search (BFS) is a building block for graph algorithms and has
recently been used for large scale analysis of information in a variety of
applications including social networks, graph databases and web searching. Due
to its importance, a number of different parallel programming models and
architectures have been exploited to optimize the BFS. However, due to the
irregular memory access patterns and the unstructured nature of the large
graphs, its efficient parallelization is a challenge. The Xeon Phi is a
massively parallel architecture available as an off-the-shelf accelerator,
which includes a powerful 512 bit vector unit with optimized scatter and gather
functions. Given its potential benefits, work related to graph traversing on
this architecture is an active area of research.
We present a set of experiments in which we explore architectural features of
the Xeon Phi and how best to exploit them in a top-down BFS algorithm but the
techniques can be applied to the current state-of-the-art hybrid, top-down plus
bottom-up, algorithms.
We focus on the exploitation of the vector unit by developing an improved
highly vectorized OpenMP parallel algorithm, using vector intrinsics, and
understanding the use of data alignment and prefetching. In addition, we
investigate the impact of hyperthreading and thread affinity on performance, a
topic that appears under researched in the literature. As a result, we achieve
what we believe is the fastest published top-down BFS algorithm on the version
of Xeon Phi used in our experiments. The vectorized BFS top-down source code
presented in this paper can be available on request as free-to-use software
The "MIND" Scalable PIM Architecture
MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a
Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on
each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real time execution, and active power management. This paper describes the major elements and operational methods of the MIND
architecture
Distinguishing low frequency mutations from RT-PCR and sequence errors in viral deep sequencing data
There is a high prevalence of coronary artery disease (CAD) in patients with left bundle branch block (LBBB); however there are many other causes for this electrocardiographic abnormality. Non-invasive assessment of these patients remains difficult, and all commonly used modalities exhibit several drawbacks. This often leads to these patients undergoing invasive coronary angiography which may not have been necessary. In this review, we examine the uses and limitations of commonly performed non-invasive tests for diagnosis of CAD in patients with LBBB
Massively Parallel Algorithm for Multiple Sequence Alignment Based on Artificial Bee Colony
In silico biological sequence processing is a key task in molecular biology. This scientific area requires powerful computing
resources for exploring large sets of biological data. Parallel in silico simulations based on methods and algorithms for analysis of
biological data using high-performance distributed computing is essential for accelerating the research and reducing the investment.
Multiple sequence alignment is a widely used method for biological sequence processing. The goal of this method is DNA and protein sequences alignment. This paper presents an innovative parallel algorithm MSA_BG for multiple alignment of biological sequences that is highly scalable and locality aware. The MSA_BG algorithm we describe is iterative and is based on the concept of Artificial Bee Colony metaheuristics and the concept of algorithmic and architectural spaces correlation. The metaphor of the ABC
metaheuristics has been constructed and the functionalities of the agents have been defined. The conceptual parallel model of
computation has been designed and the algorithmic framework of the designed parallel algorithm constructed. Experimental
simulations on the basis of parallel implementation of MSA_BG algorithm for multiple sequences alignment on heterogeneouc
compact computer cluster and supercomputer BlueGene/P have been carried out for the case study of the influenza virus variability
investigation. The performance estimation and profiling analyses have shown that the parallel system is well balanced both in
respect to the workload and machine size
- …