Speedup bioinformatics applications on multicore-based processor using vectorizing and multithreading strategies
Many computationally intensive bioinformatics programs written in C/C++, such as those for multiple sequence alignment and
population structure analysis, are not multicore-aware. A multicore processor is an emerging CPU technology that combines two or more
independent processors into a single package. The Single Instruction Multiple Data-stream (SIMD) paradigm is heavily utilized in this
class of processors. Nevertheless, most popular compilers, including Microsoft Visual C/C++ 6.0 and the x86 GNU C compiler (gcc), do
not automatically create SIMD code that fully utilizes the capabilities of these processors. To harness the power of the new multicore
architecture, certain compiler techniques must be considered. This paper presents a generic compiling strategy to assist the compiler
in improving the performance of bioinformatics applications written in C/C++. The proposed framework contains two main steps:
multithreading and vectorizing strategies. After following the strategies, the application can achieve higher speedup by taking
advantage of multicore architecture technology. Because the interconnection among multiple cores is extremely fast, it is suggested
that the proposed optimization could be more appropriate than parallelization on a small cluster computer, which has larger network
latency and lower bandwidth.
Palgol: A High-Level DSL for Vertex-Centric Graph Processing with Remote Data Access
Pregel is a popular distributed computing model for dealing with large-scale
graphs. However, it can be tricky to implement graph algorithms correctly and
efficiently in Pregel's vertex-centric model, especially when the algorithm has
multiple computation stages, complicated data dependencies, or even
communication over dynamic internal data structures. Some domain-specific
languages (DSLs) have been proposed to provide more intuitive ways to implement
graph algorithms, but due to the lack of support for remote access --- reading
or writing attributes of other vertices through references --- they cannot
handle the above-mentioned dynamic communication, making a class of Pregel
algorithms with fast convergence impossible to implement.
To address this problem, we design and implement Palgol, a more declarative
and powerful DSL which supports remote access. In particular, programmers can
use a more declarative syntax called chain access to naturally specify dynamic
communication as if directly reading data on arbitrary remote vertices. By
analyzing the logic patterns of chain access, we provide a novel algorithm for
compiling Palgol programs to efficient Pregel code. We demonstrate the power of
Palgol by using it to implement several practical Pregel algorithms, and the
evaluation result shows that the efficiency of Palgol is comparable with that
of hand-written code.
Comment: 12 pages, 10 figures, extended version of APLAS 2017 pape
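As a rough illustration of what "reading data on arbitrary remote vertices" can mean, the sketch below simulates pointer jumping, a classic case where each vertex reads a field of the vertex its own field points to. It is plain Python, not Palgol syntax and not Pregel code; the function name `pointer_jumping`, the parent-list encoding, and the convention that roots point to themselves are all assumptions made for this example.

```python
def pointer_jumping(parent):
    """Return the root of every vertex in a forest given as a parent list.

    Assumption for this sketch: parent[u] is the parent of vertex u,
    and every root r satisfies parent[r] == r.
    """
    d = list(parent)
    while True:
        # Chained read: vertex u looks up d[d[u]], i.e. a field stored on the
        # remote vertex d[u]. Every read uses the previous round's values,
        # which mimics one synchronous superstep.
        nxt = [d[d[u]] for u in range(len(d))]
        if nxt == d:  # no vertex changed, so every pointer now reaches a root
            return d
        d = nxt


# Two trees: 3 -> 2 -> 1 -> 0 (root 0) and 5 -> 4 (root 4).
forest = [0, 0, 1, 2, 4, 4]
roots = pointer_jumping(forest)
print(roots)
```

Each round roughly halves the distance from a vertex to its root, which is the kind of fast convergence that depends on being able to read a vertex that is not a direct neighbor.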
Topological Quantum Compiling
A method for compiling quantum algorithms into specific braiding patterns for
non-Abelian quasiparticles described by the so-called Fibonacci anyon model is
developed. The method is based on the observation that a universal set of
quantum gates acting on qubits encoded using triplets of these quasiparticles
can be built entirely out of three-stranded braids (three-braids). These
three-braids can then be efficiently compiled and improved to any required
accuracy using the Solovay-Kitaev algorithm.
Comment: 20 pages, 20 figures, published versio
Modularizing and Specifying Protocols among Threads
We identify three problems with current techniques for implementing protocols
among threads, which complicate and impair the scalability of multicore
software development: implementing synchronization, implementing coordination,
and modularizing protocols. To mend these deficiencies, we argue for the use of
domain-specific languages (DSLs) based on existing models of concurrency. To
demonstrate the feasibility of this proposal, we explain how to use the model
of concurrency Reo as a high-level protocol DSL, which offers appropriate
abstractions and a natural separation of protocols and computations. We
describe a Reo-to-Java compiler and illustrate its use through examples.
Comment: In Proceedings PLACES 2012, arXiv:1302.579
Parallel software for lattice N=4 supersymmetric Yang--Mills theory
We present new parallel software, SUSY LATTICE, for lattice studies of
four-dimensional supersymmetric Yang--Mills theory with gauge
group SU(N). The lattice action is constructed to exactly preserve a single
supersymmetry charge at non-zero lattice spacing, up to additional potential
terms included to stabilize numerical simulations. The software evolved from
the MILC code for lattice QCD, and retains a similar large-scale framework
despite the different target theory. Many routines are adapted from an existing
serial code, which SUSY LATTICE supersedes. This paper provides an overview of
the new parallel software, summarizing the lattice system, describing the
applications that are currently provided and explaining their basic workflow
for non-experts in lattice gauge theory. We discuss the parallel performance of
the code, and highlight some notable aspects of the documentation for those
interested in contributing to its future development.
Comment: Code available at https://github.com/daschaich/sus