4 research outputs found

    A Parallel Algorithm for Large-Scale Multiple Sequence Alignment

    Multiple sequence alignment is a central topic of extensive research in computational biology. Basically, two or more protein sequences are compared to evaluate their similarity and to identify conserved regions. This work reports a methodology for parallel processing of a multiple sequence alignment algorithm (ClustalW) in an environment of networked computers. A detailed description of the modules that compose the distributed system is provided, giving special attention to the way a dynamic programming algorithm is run in multilevel parallelism. Extensive experiments were done to evaluate performance and scalability of the reported method. Results suggest that the proposed method is very promising for large-scale multiple protein sequence alignment.

    Model-driven development of data intensive applications over cloud resources

    The proliferation of sensors in recent years has generated large amounts of raw data, forming data streams that need to be processed. In many cases, cloud resources are used for such processing because of their flexibility, but these sensor streaming applications often need to support operational and control actions whose real-time and low-latency requirements go beyond the cost-effective and flexible solutions supported by existing cloud frameworks, such as Apache Kafka, Apache Spark Streaming, or Map-Reduce Streams. In this paper, we describe a model-driven, stepwise-refinement methodological approach for streaming applications executed over clouds. The central role is assigned to a set of Petri Net models for specifying functional and non-functional requirements. They support model reuse and provide a way to combine formal analysis, simulation, and approximate computation of minimal and maximal bounds of non-functional requirements when the problem is either mathematically or computationally intractable. We show how our proposal can assist developers in their design and implementation decisions from a performance perspective. Our methodology supports performance analysis throughout all stages of the engineering process: we can (i) analyse how an application can be mapped onto cloud resources, and (ii) obtain key performance indicators, including throughput or economic cost, so that developers are assisted in their development tasks and in their decision making. In order to illustrate our approach, we make use of the pipelined wavefront array

    A parallel wavefront algorithm for efficient biological sequence comparison

    In this paper we present a parallel wavefront algorithm for computing an alignment between two strings A and C, with |A| = m and |C| = n. On a distributed memory parallel computer of p processors each with O((m + n)/p) memory, the proposed algorithm requires O(p) communication rounds and O(mn/p) local computing time. The novelty of this algorithm is based on a compromise between the workload of each processor and the number of communication rounds required, expressed by a parameter called α. The proposed algorithm is expressed in terms of this parameter that can be tuned to obtain the best overall parallel time in a given implementation. We show very promising experimental results obtained on a 64-node Beowulf machine. A characteristic of the wavefront communication requirement is that each processor communicates with few other processors. This makes it very suitable as a potential application for grid computing.
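    The wavefront idea described in this abstract can be illustrated with a minimal sequential sketch, which is not the authors' implementation. It assumes a simple global-alignment scoring scheme (match +1, mismatch -1, gap -1) and a hypothetical `block` size parameter that only loosely plays the role of a workload/communication trade-off; it is not the paper's α. The dynamic programming matrix is cut into square blocks, and blocks on the same anti-diagonal ("wave") depend only on blocks from earlier waves, so they could be computed concurrently.

```python
from math import ceil

# Illustrative scoring values (assumptions, not taken from the paper).
MATCH, MISMATCH, GAP = 1, -1, -1


def init_matrix(m, n):
    """Create an (m+1) x (n+1) score matrix with gap penalties on the borders."""
    S = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        S[i][0] = i * GAP
    for j in range(1, n + 1):
        S[0][j] = j * GAP
    return S


def fill_cells(S, A, C, rows, cols):
    """Fill the cells in rows x cols; their upper and left neighbours must be done."""
    for i in rows:
        for j in cols:
            sub = MATCH if A[i - 1] == C[j - 1] else MISMATCH
            S[i][j] = max(S[i - 1][j - 1] + sub,   # align A[i-1] with C[j-1]
                          S[i - 1][j] + GAP,       # gap in C
                          S[i][j - 1] + GAP)       # gap in A


def wave_order(block_rows, block_cols):
    """Group block coordinates (bi, bj) by anti-diagonal bi + bj."""
    waves = []
    for d in range(block_rows + block_cols - 1):
        lo = max(0, d - block_cols + 1)
        hi = min(block_rows, d + 1)
        waves.append([(bi, d - bi) for bi in range(lo, hi)])
    return waves


def score_wavefront(A, C, block=2):
    """Global alignment score computed block by block in wavefront order."""
    m, n = len(A), len(C)
    S = init_matrix(m, n)
    for wave in wave_order(ceil(m / block), ceil(n / block)):
        # Blocks within one wave are independent of each other; a parallel
        # version could hand each of them to a different processor.
        for bi, bj in wave:
            rows = range(bi * block + 1, min((bi + 1) * block, m) + 1)
            cols = range(bj * block + 1, min((bj + 1) * block, n) + 1)
            fill_cells(S, A, C, rows, cols)
    return S[m][n]


def score_rowwise(A, C):
    """Reference: the same recurrence filled in plain row-major order."""
    m, n = len(A), len(C)
    S = init_matrix(m, n)
    fill_cells(S, A, C, range(1, m + 1), range(1, n + 1))
    return S[m][n]
```

    In this sketch a larger `block` means fewer waves but more work per block, which mirrors the kind of compromise between per-processor workload and communication rounds that the abstract mentions. In a distributed setting only the border rows and columns of each block would need to be exchanged between neighbouring processors.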

    A Parallel Wavefront Algorithm for Efficient Biological Sequence Comparison

    In this paper we present a parallel wavefront algorithm for computing an alignment between two strings A and C, with |A| = m and |C| = n. On a distributed memory parallel computer of p processors each with O((m + n)/p) memory, the proposed algorithm requires O(p) communication rounds and O(mn/p) local computing time. The novelty of this algorithm is based on a compromise between the workload of each processor and the number of communication rounds required, expressed by a parameter called α. The proposed algorithm is expressed in terms of this parameter that can be tuned to obtain the best overall parallel time in a given implementation. We show very promising experimental results obtained on a 64-node Beowulf machine. A characteristic of the wavefront communication requirement is that each processor communicates with few other processors. This makes it very suitable as a potential application for grid computing.