Search CORE

163 research outputs found

Automata Technique for The LCS Problem

Author: Nguyễn Trường Huy
Publication venue: 'Publishing House for Science and Technology, Vietnam Academy of Science and Technology'
Publication date: 18/03/2019
Field of study

In this paper, we introduce two eﬃcient algorithms in practice for computing the length of a longest common subsequence of two strings, using automata technique, in sequential and parallel ways. For two input strings of lengths m and n with m ≤ n, the parallel algorithm uses k processors (k ≤ m) and costs time complexity O(n) in the worst case, where k is an upper estimate of the length of a longest common subsequence of the two strings. These results are based on the Knapsack Shaking approach proposed by P. T. Huy et al. in 2002. Experimental results show that for the alphabet of size 256, our sequential and parallel algorithms are about 65.85 and 3.41m times faster than the standard dynamic programming algorithm proposed by Wagner and Fisher in 1974, respectively

Vietnam Academy of Science and Technology: Journals Online

Parallel progressive multiple sequence alignment on reconfigurable meshes

Author: Nguyen Ken D
Nong Ge
Pan Yi
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Abstract Background One of the most fundamental and challenging tasks in bio-informatics is to identify related sequences and their hidden biological significance. The most popular and proven best practice method to accomplish this task is aligning multiple sequences together. However, multiple sequence alignment is a computing extensive task. In addition, the advancement in DNA/RNA and Protein sequencing techniques has created a vast amount of sequences to be analyzed that exceeding the capability of traditional computing models. Therefore, an effective parallel multiple sequence alignment model capable of resolving these issues is in a great demand. Results We design <it>O</it>(1) run-time solutions for both local and global dynamic programming pair-wise alignment algorithms on reconfigurable mesh computing model. To align <it>m </it>sequences with max length <it>n</it>, we combining the parallel pair-wise dynamic programming solutions with newly designed parallel components. We successfully reduce the progressive multiple sequence alignment algorithm's run-time complexity from <it>O</it>(<it>m </it>× <it>n</it>4) to <it>O</it>(<it>m</it>) using <it>O</it>(<it>m </it>× <it>n</it>3) processing units for scoring schemes that use three distinct values for match/mismatch/gap-extension. The general solution to multiple sequence alignment algorithm takes <it>O</it>(<it>m </it>× <it>n</it>4) processing units and completes in <it>O</it>(<it>m</it>) time. Conclusions To our knowledge, this is the first time the progressive multiple sequence alignment algorithm is completely parallelized with <it>O</it>(<it>m</it>) run-time. We also provide a new parallel algorithm for the Longest Common Subsequence (LCS) with <it>O</it>(1) run-time using <it>O</it>(<it>n</it>3) processing units. This is a big improvement over the current best constant-time algorithm that uses <it>O</it>(<it>n</it>4) processing units.</p

Crossref

ScholarWorks @ Georgia State University

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Multiple Biolgical Sequence Alignment: Scoring Functions, Algorithms, and Evaluations

Author: Nguyen Ken D
Publication venue: ScholarWorks @ Georgia State University
Publication date: 14/12/2011
Field of study

Aligning multiple biological sequences such as protein sequences or DNA/RNA sequences is a fundamental task in bioinformatics and sequence analysis. These alignments may contain invaluable information that scientists need to predict the sequences\u27 structures, determine the evolutionary relationships between them, or discover drug-like compounds that can bind to the sequences. Unfortunately, multiple sequence alignment (MSA) is NP-Complete. In addition, the lack of a reliable scoring method makes it very hard to align the sequences reliably and to evaluate the alignment outcomes. In this dissertation, we have designed a new scoring method for use in multiple sequence alignment. Our scoring method encapsulates stereo-chemical properties of sequence residues and their substitution probabilities into a tree-structure scoring scheme. This new technique provides a reliable scoring scheme with low computational complexity. In addition to the new scoring scheme, we have designed an overlapping sequence clustering algorithm to use in our new three multiple sequence alignment algorithms. One of our alignment algorithms uses a dynamic weighted guidance tree to perform multiple sequence alignment in progressive fashion. The use of dynamic weighted tree allows errors in the early alignment stages to be corrected in the subsequence stages. Other two algorithms utilize sequence knowledge-bases and sequence consistency to produce biological meaningful sequence alignments. To improve the speed of the multiple sequence alignment, we have developed a parallel algorithm that can be deployed on reconfigurable computer models. Analytically, our parallel algorithm is the fastest progressive multiple sequence alignment algorithm

ScholarWorks @ Georgia State University

A dynamic programming model to solve optimisation problems using GPUs

Author: O'Connell Jonathan F.
Publication venue
Publication date
Field of study

This thesis presents a parallel, dynamic programming based model which is deployed on the GPU of a system to accelerate the solving of optimisation problems. This is achieved by simultaneously running GPU based computations, and memory transactions, allowing computation to never pause, and overcoming the memory constraints of solving large problem instances. Due to this some optimisation problems, which are currently not solved in an exact manner for real world sized instances due to their complexity, are moved into the solvable realm. The model is implemented to solve, a range of different test problems, where artificially constructed test data is used to ensure good performance even in the worst cases. Through this extensive testing, we can be confident the model will perform well when used to solve real world test cases. Testing of the model was carried out using a range of different implementation parameters in relation to deployment on the GPU, in order to identify both optimal implementation parameters, and how the model will operate when running on different systems. All problems, when implemented in parallel using the model, show run-time improvements compared to the sequential implementations, in some instances up to hundreds of times faster, but more importantly also show high efficiency metrics for the utilisation of GPU resources. Throughout testing emphasis has been placed on GPU based metrics to ensure the wider generic applicability of the model. Finally, the parallel model allows for new problems to be defined through the use of a simple file format, enabling wider usage of the model

Online Research @ Cardiff

Techniques of design optimisation for algorithms implemented in software

Author: Hopson Benjamin Thomas Ken
Publication venue: The University of Edinburgh
Publication date: 27/06/2016
Field of study

The overarching objective of this thesis was to develop tools for parallelising, optimising, and implementing algorithms on parallel architectures, in particular General Purpose Graphics Processors (GPGPUs). Two projects were chosen from different application areas in which GPGPUs are used: a defence application involving image compression, and a modelling application in bioinformatics (computational immunology). Each project had its own specific objectives, as well as supporting the overall research goal. The defence / image compression project was carried out in collaboration with the Jet Propulsion Laboratories. The specific questions were: to what extent an algorithm designed for bit-serial for the lossless compression of hyperspectral images on-board unmanned vehicles (UAVs) in hardware could be parallelised, whether GPGPUs could be used to implement that algorithm, and whether a software implementation with or without GPGPU acceleration could match the throughput of a dedicated hardware (FPGA) implementation. The dependencies within the algorithm were analysed, and the algorithm parallelised. The algorithm was implemented in software for GPGPU, and optimised. During the optimisation process, profiling revealed less than optimal device utilisation, but no further optimisations resulted in an improvement in speed. The design had hit a local-maximum of performance. Analysis of the arithmetic intensity and data-flow exposed flaws in the standard optimisation metric of kernel occupancy used for GPU optimisation. Redesigning the implementation with revised criteria (fused kernels, lower occupancy, and greater data locality) led to a new implementation with 10x higher throughput. GPGPUs were shown to be viable for on-board implementation of the CCSDS lossless hyperspectral image compression algorithm, exceeding the performance of the hardware reference implementation, and providing sufficient throughput for the next generation of image sensor as well. The second project was carried out in collaboration with biologists at the University of Arizona and involved modelling a complex biological system – VDJ recombination involved in the formation of T-cell receptors (TCRs). Generation of immune receptors (T cell receptor and antibodies) by VDJ recombination is an enormously complex process, which can theoretically synthesize greater than 1018 variants. Originally thought to be a random process, the underlying mechanisms clearly have a non-random nature that preferentially creates a small subset of immune receptors in many individuals. Understanding this bias is a longstanding problem in the field of immunology. Modelling the process of VDJ recombination to determine the number of ways each immune receptor can be synthesized, previously thought to be untenable, is a key first step in determining how this special population is made. The computational tools developed in this thesis have allowed immunologists for the first time to comprehensively test and invalidate a longstanding theory (convergent recombination) for how this special population is created, while generating the data needed to develop novel hypothesis

Edinburgh Research Archive

Recommended from our members

Mobile localization : approach and applications

Author: Rallapalli Swati
Publication venue
Publication date: 09/02/2015
Field of study

textLocalization is critical to a number of wireless network applications. In many situations GPS is not suitable. This dissertation (i) develops novel localization schemes for wireless networks by explicitly incorporating mobility information and (ii) applies localization to physical analytics i.e., understanding shoppers' behavior within retail spaces by leveraging inertial sensors, Wi-Fi and vision enabled by smart glasses. More specifically, we first focus on multi-hop mobile networks, analyze real mobility traces and observe that they exhibit temporal stability and low-rank structure. Motivated by these observations, we develop novel localization algorithms to effectively capture and also adapt to different degrees of these properties. Using extensive simulations and testbed experiments, we demonstrate the accuracy and robustness of our new schemes. Second, we focus on localizing a single mobile node, which may not be connected with multiple nodes (e.g., without network connectivity or only connected with an access point). We propose trajectory-based localization using Wi-Fi or magnetic field measurements. We show that these measurements have the potential to uniquely identify a trajectory. We then develop a novel approach that leverages multi-level wavelet coefficients to first identify the trajectory and then localize to a point on the trajectory. We show that this approach is highly accurate and power efficient using indoor and outdoor experiments. Finally, localization is a critical step in enabling a lot of applications --- an important one is physical analytics. Physical analytics has the potential to provide deep-insight into shoppers' interests and activities and therefore better advertisements, recommendations and a better shopping experience. To enable physical analytics, we build ThirdEye system which first achieves zero-effort localization by leveraging emergent devices like the Google-Glass to build AutoLayout that fuses video, Wi-Fi, and inertial sensor data, to simultaneously localize the shoppers while also constructing and updating the product layout in a virtual coordinate space. Further, ThirdEye comprises of a range of schemes that use a combination of vision and inertial sensing to study mobile users' behavior while shopping, namely: walking, dwelling, gazing and reaching-out. We show the effectiveness of ThirdEye through an evaluation in two large retail stores in the United States.Computer Science

Texas ScholarWorks

Approximate string matching algorithms in art media archives

Author: Casanovas Martín Fernando
Publication venue: Universitat Politècnica de Catalunya
Publication date: 18/05/2009
Field of study

Projecte realitzat en col.laboració amb el centre AGH University of Science and TechnologyThe implementation of a distributed electronic art platform with a considerable amount of data available involves some technical challenges. Due to the lack of a common European platform for media art, the European Commission has decided to finance the implementation of GAMA (Gateway to Archives of Media Art), with the aim of establish a common platform for media art archives. One of the universities involved in the project is the AGH University of Science and Technology from Krakow, with responsibilities concerning to the architecture and IT solutions. The AGH team involved in the project is integrated by professors and students of the University, including the author of this paper. In the following chapters, an approach to the architecture of GAMA is presented, as well as the actual problem of integration of services and the solutions used to solve it in GAMA. Also the concept of harmonization and approximate string matching is presented, along with its applications and most used implementation solutions. For the GAMA project a system has been implemented to deal with the problem, this thesis describes the methodology used and the details of the implementation

UPCommons. Portal del coneixement obert de la UPC

Fusion of Data from Heterogeneous Sensors with Distributed Fields of View and Situation Evaluation for Advanced Driver Assistance Systems

Author: Otto Carola
Publication venue: KIT Scientific Publishing
Publication date: 30/07/2019
Field of study

In order to develop a driver assistance system for pedestrian protection, pedestrians in the environment of a truck are detected by radars and a camera and are tracked across distributed fields of view using a Joint Integrated Probabilistic Data Association filter. A robust approach for prediction of the system vehicles trajectory is presented. It serves the computation of a probabilistic collision risk based on reachable sets where different sources of uncertainty are taken into account

Directory of Open Access Books (DOAB)