A fast parallel algorithm for special linear systems of equations using processor arrays with reconfigurable bus systems
A parallel algorithm using Processor Arrays with Reconfigurable Bus Systems
has been designed to solve dense Symmetric Positive Definite (SPD) systems of
equations Ax = b. The key content of this report is the parallelisation of the
algorithm by Delosme & Ipson [8]. In order to design a parallel algorithm for
PARBS, many procedures involved in [8] are handled in a slightly different
way. The parallel time and processor complexities of each step of the
algorithm are calculated. The parallel time complexity is O(n) using 2n × 2n ×
5n Processing Elements.
Parallel progressive multiple sequence alignment on reconfigurable meshes
Background: One of the most fundamental and challenging tasks in bioinformatics is to identify related sequences and their hidden biological significance. The most popular and well-proven method for this task is to align multiple sequences together. However, multiple sequence alignment is a computationally intensive task. In addition, advances in DNA/RNA and protein sequencing techniques have created a vast number of sequences to be analyzed, exceeding the capability of traditional computing models. Therefore, an effective parallel multiple sequence alignment model capable of resolving these issues is in great demand.
Results: We design O(1) run-time solutions for both local and global dynamic programming pair-wise alignment algorithms on the reconfigurable mesh computing model. To align m sequences with maximum length n, we combine the parallel pair-wise dynamic programming solutions with newly designed parallel components. We successfully reduce the progressive multiple sequence alignment algorithm's run-time complexity from O(m × n^4) to O(m) using O(m × n^3) processing units for scoring schemes that use three distinct values for match/mismatch/gap-extension. The general solution to the multiple sequence alignment algorithm takes O(m × n^4) processing units and completes in O(m) time.
Conclusions: To our knowledge, this is the first time the progressive multiple sequence alignment algorithm has been completely parallelized with O(m) run-time. We also provide a new parallel algorithm for the Longest Common Subsequence (LCS) with O(1) run-time using O(n^3) processing units. This is a big improvement over the current best constant-time algorithm, which uses O(n^4) processing units.
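For reference, the LCS problem mentioned in this abstract is usually introduced through a sequential dynamic program. The sketch below is that textbook recurrence, not the reconfigurable-mesh algorithm the abstract describes; the function name `lcs_length` is chosen here purely for illustration.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (sequential DP)."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the diagonal subproblem by one.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise keep the better of dropping a character from a or b.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

This runs in time proportional to len(a) × len(b) on a single processor, which is the kind of sequential cost that the constant-time parallel formulations trade against a larger number of processing units.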
RMESH Algorithms for Parallel String Matching
The string matching problem has received much attention over the years due to its importance in applications such as text/file comparison, DNA sequencing, search engines, and spelling correction. With the introduction of search engines that handle tremendous amounts of textual information on the World Wide Web, and with ongoing research on DNA sequencing, any algorithmic or hardware improvement that speeds up string matching will benefit these applications. In this paper, we present three algorithms for string matching on reconfigurable mesh architectures. Given a text T of length n and a pattern P of length m, the first algorithm finds the exact matching between T and P in O(1) time on a 2-dimensional RMESH of size (n-m+1) × m. The second algorithm finds the approximate matching between T and P in O(k) time on a 2D RMESH, where k is the maximum edit distance between T and P. The third algorithm allows only the replacement operation in the calculation of the edit distance and finds an approximate matching between T and P in constant time on a 3D RMESH.
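The (n-m+1) × m grid size in the first algorithm suggests one comparison per processing element. The sketch below simulates that layout sequentially in Python; it assumes that element (i, j) compares T[i+j] with P[j] and that each row combines its results with a logical AND. These details and the name `rmesh_exact_match` are assumptions for illustration, not taken from the paper.

```python
def rmesh_exact_match(text: str, pattern: str) -> list[int]:
    """Sequentially simulate an (n-m+1) x m grid of comparisons."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Assumed layout: element (i, j) compares text[i + j] with pattern[j].
    grid = [
        [text[i + j] == pattern[j] for j in range(m)]
        for i in range(n - m + 1)
    ]
    # Row i signals a match at offset i only if all m comparisons succeed.
    # This simulation evaluates the row-wide AND with a loop, not a bus.
    return [i for i, row in enumerate(grid) if all(row)]
```

For example, `rmesh_exact_match("abracadabra", "abra")` returns the offsets 0 and 7. The simulation performs all (n-m+1) × m comparisons one after another, so it illustrates only the data layout, not the constant-time behaviour.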
An Optimal Design for Universal Multiport Interferometers
Universal multiport interferometers, which can be programmed to implement any
linear transformation between multiple channels, are emerging as a powerful
tool for both classical and quantum photonics. These interferometers are
typically composed of a regular mesh of beam splitters and phase shifters,
allowing for straightforward fabrication using integrated photonic
architectures and ready scalability. The current, standard design for universal
multiport interferometers is based on work by Reck et al (Phys. Rev. Lett. 73,
58, 1994). We demonstrate a new design for universal multiport interferometers
based on an alternative arrangement of beam splitters and phase shifters, which
outperforms that by Reck et al. Our design occupies half the physical footprint
of the Reck design and is significantly more robust to optical losses.
Comment: 8 pages, 4 figures
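The footprint comparison can be made concrete with a small calculation. The sketch below assumes the commonly quoted figures for N-mode meshes: N(N-1)/2 beam splitters in both layouts, an optical depth of 2N-3 for the triangular Reck arrangement, and a depth of N for the rectangular arrangement. These formulas are not stated in the abstract, and the function and key names are illustrative only.

```python
def mesh_stats(n_modes: int) -> dict[str, int]:
    """Beam splitter count and optical depth under the assumed formulas."""
    if n_modes < 3:
        raise ValueError("the assumed depth formulas are used here for n_modes >= 3")
    return {
        # Both layouts are assumed to use the same number of beam splitters.
        "beam_splitters": n_modes * (n_modes - 1) // 2,
        # Assumed depth of the triangular (Reck) arrangement.
        "reck_depth": 2 * n_modes - 3,
        # Assumed depth of the rectangular arrangement.
        "rectangular_depth": n_modes,
    }
```

Under these assumptions the depth ratio approaches one half as the number of modes grows, which is consistent with the abstract's claim of half the physical footprint.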
Scalable machine learning-assisted clear-box characterization for optimally controlled photonic circuits
Photonic integrated circuits offer a compact and stable platform for
generating, manipulating, and detecting light. They are instrumental for
classical and quantum applications. Imperfections stemming from fabrication
constraints, tolerances and operation wavelength impose limitations on the
accuracy and thus utility of current photonic integrated devices. Mitigating
these imperfections typically necessitates a model of the underlying physical
structure and the estimation of parameters that are challenging to access.
Direct solutions are currently lacking for mesh configurations extending beyond
trivial cases. We introduce a scalable and innovative method to characterize
photonic chips through an iterative machine learning-assisted procedure. Our
method is based on a clear-box approach that harnesses a fully modeled virtual
replica of the photonic chip to characterize. The process is sample-efficient
and can be carried out with a continuous-wave laser and powermeters. The model
estimates individual passive phases, crosstalk, beamsplitter reflectivity
values and relative input/output losses. Building upon the accurate
characterization results, we mitigate imperfections to enable enhanced control
over the device. We validate our characterization and imperfection mitigation
methods on a 12-mode Clements-interferometer equipped with 126 phase shifters,
achieving beyond state-of-the-art chip control with an average 99.77 %
amplitude fidelity on 100 implemented Haar-random unitary matrices
Addressing the programming challenges of practical interferometric mesh based optical processors
We demonstrate a novel mesh of Mach-Zehnder interferometers (MZIs) for
programmable optical processors. The proposed mesh, referred to as Bokun mesh,
is an architecture that merges the attributes of the prior topologies Diamond
and Clements. Similar to Diamond, Bokun provides diagonal paths passing through
every individual MZI enabling direct phase monitoring. However, unlike Diamond
and similar to Clements, Bokun maintains a minimum optical depth leading to
better scalability. Providing the monitoring option, Bokun's programming is
faster improving the total energy efficiency of the processor. The performance
of Bokun mesh enabled by an optimal optical depth is also more resilient to the
loss and fabrication imperfections compared to architectures with longer depth
such as Reck and Diamond. Employing an efficient programming scheme, the
proposed architecture improves energy efficiency by 83% maintaining the same
computation accuracy for weight matrix changes at 2 kHz
Design of an FPGA-based parallel SIMD machine for power flow analysis
Power flow analysis consists of computationally intensive calculations on large matrices, consumes several hours of computational time, and has shown the need for the implementation of application-specific parallel machines. The potential of Single-Instruction stream Multiple-Data stream (SIMD) parallel architectures for efficient operations on large matrices has been demonstrated as seen in the case of many existing supercomputers. The unsuitability of existing parallel machines for low-cost power system applications, their long design cycles, and the difficulty in using them show the need for application-specific SIMD machines. Advances in VLSI technology and Field-Programmable Gate-Arrays (FPGAs) enable the implementation of Custom Computing Machines (CCMs) which can yield better performance for specific applications. The advent of Soft-Core processors made it possible to integrate reconfigurable logic as a slave to a peripheral bus and has demonstrated the ability in the rapid prototyping of complete systems on programmable chips. This thesis aims at designing and implementing an FPGA-based SIMD machine for power flow analysis. It presents the architecture of an SIMD machine that consists of an array of processing elements with mesh interconnection and a Soft-Core processor; the latter is used as the host. The FPGA-based SIMD machine is implemented on the Annapolis Microsystems Wildstar-II board that contains multiple Virtex-II FPGAs. The Soft-Core processor used is the Xilinx Microblaze and the application targeted is matrix multiplication.
Multiple Biological Sequence Alignment: Scoring Functions, Algorithms, and Evaluations
Aligning multiple biological sequences such as protein sequences or DNA/RNA sequences is a fundamental task in bioinformatics and sequence analysis. These alignments may contain invaluable information that scientists need to predict the sequences' structures, determine the evolutionary relationships between them, or discover drug-like compounds that can bind to the sequences. Unfortunately, multiple sequence alignment (MSA) is NP-Complete. In addition, the lack of a reliable scoring method makes it very hard to align the sequences reliably and to evaluate the alignment outcomes.
In this dissertation, we have designed a new scoring method for use in multiple sequence alignment. Our scoring method encapsulates stereo-chemical properties of sequence residues and their substitution probabilities into a tree-structure scoring scheme. This new technique provides a reliable scoring scheme with low computational complexity.
In addition to the new scoring scheme, we have designed an overlapping sequence clustering algorithm for use in our three new multiple sequence alignment algorithms. One of our alignment algorithms uses a dynamic weighted guidance tree to perform multiple sequence alignment in a progressive fashion. The use of a dynamic weighted tree allows errors in the early alignment stages to be corrected in the subsequent stages. The other two algorithms utilize sequence knowledge-bases and sequence consistency to produce biologically meaningful sequence alignments. To improve the speed of multiple sequence alignment, we have developed a parallel algorithm that can be deployed on reconfigurable computer models. Analytically, our parallel algorithm is the fastest progressive multiple sequence alignment algorithm.