261 research outputs found

    MPI and Pthreads in parallel programming

    In parallel programming, a large problem is divided into smaller ones that are solved concurrently. Two techniques that make this possible are the Message Passing Interface (MPI) and POSIX threads (Pthreads). This article highlights the differences between these two ways of implementing parallelism in software by presenting their main characteristics and the advantages of each.

    Shared-memory parallel algorithms that determine the smallest size of a binary tree when refining a regular simplex

    In global optimization based on branch-and-bound techniques, when the search space is a regular n-simplex, the usual division rule is bisection of the longest edge, because it guarantees convergence of the algorithm. When the dimension of the n-simplex is greater than 2, there are several longest edges that can be used for the bisection. The choice of longest edge affects the size of the complete binary tree that is generated. An efficient selection of the longest edge can reduce the computational cost of these branch-and-bound algorithms. In this study we are interested in determining the size of the minimum tree or trees. Obtaining a solution to the most complex instances of the problem in a reasonable time requires the development of parallel algorithms. The complexity of the problem stems from the need to analyze each and every possible combination of longest-edge selections during refinement. Here we compare the efficiency of several proposed parallel algorithms for shared-memory systems. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.

    GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations

    OpenMP is the de facto standard application programming interface (API) for on-node parallelism. The most popular OpenMP runtimes rely on POSIX threads (pthreads) implementations that offer excellent performance for coarse-grained parallelism and match the current hardware perfectly. However, a recent trend in runtimes and applications points toward leveraging massive on-node parallelism in conjunction with fine-grained and dynamic scheduling paradigms. It has been demonstrated that lightweight thread (LWT) solutions are more appropriate for these new parallel paradigms. We have developed GLTO, an OpenMP implementation over the recently emerged Generic Lightweight Threads (GLT) API. GLT exports a common API for LWT libraries that offers the possibility of running the same application over different native LWT solutions. In this paper we use GLTO to analyze different scenarios where OpenMP implementations may benefit from the use of either LWT or pthreads. Our study reveals that none of the threading approaches obtains the best performance in all the scenarios, but that there are important gaps among them. The researchers from the Universitat Jaume I de Castelló were supported by project TIN2014-53495-R of the MINECO and FEDER, and by the Generalitat Valenciana fellowship programme Vali+d 2015. Antonio J. Peña is co-financed by the Spanish Ministry of Economy and Competitiveness under Juan de la Cierva fellowship number IJCI-2015-23266. This work was partially supported by the U.S. Dept. of Energy, Office of Science, Office of Advanced Scientific Computing Research (SC-21), under contract DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory. Peer Reviewed. Postprint (author's final draft)

    Speculative Thread Framework for Transient Management and Bumpless Transfer in Reconfigurable Digital Filters

    Many methods have been developed to mitigate the transients induced when abruptly changing dynamic algorithms such as those found in digital filters or controllers. These "bumpless transfer" methods carry a computational burden and take time to execute, causing a delay in the desired switching time. This paper develops a method that automatically reconfigures the computational resources in order to implement a transient management method without any delay in switching times. The method spawns a speculative thread when it predicts that a switch in algorithms is imminent, so that the calculations are done before the switch is made. The software framework is described and experimental results are shown for switching between filters in a filter bank. Comment: 6 pages, 7 figures, to be presented at American Controls Conference 201

    Foldalign 2.5: multithreaded implementation for pairwise structural RNA alignment

    Motivation: Structured RNAs can be hard to search for, as they are often not well conserved in their primary structure and are local in their genomic or transcriptomic context. Thus, the need for tools that can make local structural alignments of RNAs is only increasing. Results: To meet the demand for both large-scale screens and hands-on analysis through web servers, we present a new multithreaded version of Foldalign. We substantially improve execution time while maintaining all previous functionalities, including carrying out local structural alignments of sequences with low similarity. Furthermore, the improvements allow for comparing longer RNAs and increasing the sequence length. For example, lengths in the range 2000–6000 nucleotides improve execution up to a factor of five. Availability and implementation: The Foldalign software and the web server are available at http://rth.dk/resources/foldalign Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

    PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology

    PARALIGN is a rapid and sensitive similarity search tool for the identification of distantly related sequences in both nucleotide and amino acid sequence databases. Two algorithms are implemented: accelerated Smith–Waterman and ParAlign. The ParAlign algorithm is similar to Smith–Waterman in sensitivity while being as quick as BLAST for protein searches. A form of parallel computing technology known as multimedia technology, which is available in modern processors but rarely used by other bioinformatics software, has been exploited to achieve the high speed. The software is also designed to run efficiently on computer clusters using the message-passing interface standard. A public search service powered by a large computer cluster has been set up and is freely available at , where the major public databases can be searched. The software can also be downloaded free of charge for academic use.

    On the adequacy of lightweight thread approaches for high-level parallel programming models

    High-level parallel programming models (PMs) are becoming crucial for extracting the computational power of current on-node multi-threaded parallelism. The most popular PMs, such as OpenMP or OmpSs, are directive-based: the complexity of the hardware is hidden by the underlying runtime system, improving coding productivity. The implementations of OpenMP usually rely on POSIX threads (pthreads), offering excellent performance for coarse-grained parallelism and a perfect match with the current hardware. OmpSs is a task-oriented PM based on an ad hoc runtime solution called Nanos++; it is the precursor of the tasking parallelism in the OpenMP tasking specification. A recent trend in runtimes and applications points to leveraging massive on-node parallelism in conjunction with fine-grained and dynamic scheduling paradigms. In this paper we analyze the behavior of the OpenMP and OmpSs PMs on top of the recently emerged Generic Lightweight Threads (GLT) API. GLT exposes a common API for lightweight thread (LWT) libraries that offers the possibility of running the same application over different native LWT solutions. We describe the design details of those high-level PMs implemented on top of GLT and analyze different scenarios in order to assess where the use of LWTs may benefit application performance. Our work reveals the scenarios where LWTs outperform pthread-based solutions and compares the performance of an ad hoc solution with that of a generic implementation. The researchers from the Universitat Jaume I de Castelló were supported by project TIN2014-53495-R of the MINECO, Spain and FEDER, Spain, and by the Generalitat Valenciana fellowship programme, Spain Vali+d 2015. Antonio J. Peña is co-financed by the Spanish Ministry of Economy and Competitiveness, Spain under Juan de la Cierva fellowship number IJCI-2015-23266. This work was partially supported by the U.S. Dept. of Energy, Office of Science, Office of Advanced Scientific Computing Research (SC-21), under contract DE-AC02-06CH11357. We gratefully acknowledge Enrique S. Quintana-Ortí (Universitat Jaume I) and Sangmin Seo (Samsung Corp.) for their advice in this work and the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory. Peer Reviewed. Postprint (author's final draft)

    Speeding up IP lookup procedure in software routers by means of parallelization
