5 research outputs found

    A scalable architecture for ordered parallelism

    We present Swarm, a novel architecture that exploits ordered irregular parallelism, which is abundant but hard to mine with current software and hardware techniques. In this architecture, programs consist of short tasks with programmer-specified timestamps. Swarm executes tasks speculatively and out of order, and efficiently speculates thousands of tasks ahead of the earliest active task to uncover ordered parallelism. Swarm builds on prior TLS and HTM schemes, and contributes several new techniques that allow it to scale to large core counts and speculation windows, including a new execution model, speculation-aware hardware task management, selective aborts, and scalable ordered commits. We evaluate Swarm on graph analytics, simulation, and database benchmarks. At 64 cores, Swarm achieves 51--122× speedups over a single-core system, and outperforms software-only parallel algorithms by 3--18×. National Science Foundation (U.S.) (Award CAREER-145299
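    To make the execution model concrete, below is a minimal, sequential C++ sketch of the timestamped-task idea the abstract describes, using a single-source shortest-paths traversal as the workload. The tiny graph, the Task struct, and the priority-queue "task unit" are illustrative assumptions; real Swarm dispatches and commits these tasks speculatively in parallel hardware rather than dequeuing them one at a time.

```cpp
// Minimal sequential sketch of a timestamped-task model (illustrative only).
// A priority queue stands in for the hardware task unit and simply dispatches
// tasks in timestamp order; tasks may enqueue children with later timestamps.
#include <cstdint>
#include <cstdio>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Task { uint64_t ts; int vertex; };   // programmer-specified timestamp + payload
struct ByTs { bool operator()(const Task& a, const Task& b) const { return a.ts > b.ts; } };

int main() {
    // Tiny weighted graph (assumed example input): adj[u] = {(v, weight), ...}
    std::vector<std::vector<std::pair<int, int>>> adj = {
        {{1, 2}, {2, 5}}, {{2, 1}, {3, 4}}, {{3, 1}}, {}
    };
    std::vector<uint64_t> dist(adj.size(), std::numeric_limits<uint64_t>::max());

    std::priority_queue<Task, std::vector<Task>, ByTs> tq;  // stand-in for the task unit
    tq.push({0, 0});                                        // enqueue(source, timestamp = 0)

    while (!tq.empty()) {                                   // dispatch in timestamp order
        Task t = tq.top(); tq.pop();
        if (t.ts >= dist[t.vertex]) continue;               // stale task: nothing to do
        dist[t.vertex] = t.ts;                              // "commit" this vertex's distance
        for (auto [v, w] : adj[t.vertex])
            tq.push({t.ts + w, v});                         // child tasks carry later timestamps
    }
    for (size_t v = 0; v < dist.size(); ++v)
        std::printf("dist[%zu] = %llu\n", v, (unsigned long long)dist[v]);
}
```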

    A Survey on Thread-Level Speculation Techniques

    Thread-Level Speculation (TLS) is a promising technique that allows the parallel execution of sequential code without relying on a prior, compile-time dependence analysis. In this work, we introduce the technique, present a taxonomy of TLS solutions, and summarize and put into perspective the most relevant advances in this field. MICINN (Spain) and ERDF program of the European Union: HomProg-HetSys project (TIN2014-58876-P), CAPAP-H5 network (TIN2014-53522-REDT), and COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS)
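    As a concrete illustration of the TLS idea surveyed here, the following C++ sketch runs two loop "chunks" as if they executed in parallel, buffers their writes, tracks read/write sets, and squashes and re-executes the later chunk when its reads overlap the earlier chunk's writes. The array, the chunk boundaries, and the set-based conflict check are assumptions chosen for illustration, not any particular TLS design from the survey.

```cpp
// Minimal sketch of TLS-style speculation: buffer writes, track read/write
// sets, commit chunks in program order, squash on cross-chunk violations.
#include <cstdio>
#include <unordered_set>
#include <utility>
#include <vector>

int main() {
    std::vector<int> a = {3, 1, 4, 1, 5, 9, 2, 6};

    struct SpecChunk {
        std::unordered_set<int> reads, writes;     // indices touched speculatively
        std::vector<std::pair<int, int>> pending;  // buffered writes: (index, value)
    };

    // Speculative read: forward from the chunk's own write buffer if present,
    // otherwise read committed memory; record the index in the read set.
    auto spec_read = [&](SpecChunk& c, int i) {
        c.reads.insert(i);
        for (auto it = c.pending.rbegin(); it != c.pending.rend(); ++it)
            if (it->first == i) return it->second;
        return a[i];
    };

    // One loop iteration: a[i] += a[i - 1] (a loop-carried dependence).
    auto run_iter = [&](SpecChunk& c, int i) {
        int v = spec_read(c, i) + spec_read(c, i - 1);
        c.writes.insert(i);
        c.pending.push_back({i, v});               // the write stays buffered
    };

    // "Execute" two chunks as if they ran in parallel: iterations 1..3 and 4..7.
    SpecChunk c0, c1;
    for (int i = 1; i <= 3; ++i) run_iter(c0, i);
    for (int i = 4; i <= 7; ++i) run_iter(c1, i);

    // Commit in program order: c0 is the least speculative chunk.
    for (auto [i, v] : c0.pending) a[i] = v;

    // c1 must be squashed if it read anything c0 wrote (cross-chunk violation).
    bool conflict = false;
    for (int r : c1.reads)
        if (c0.writes.count(r)) conflict = true;

    if (conflict) {
        for (int i = 4; i <= 7; ++i) a[i] += a[i - 1];  // squash and re-execute safely
    } else {
        for (auto [i, v] : c1.pending) a[i] = v;        // no conflict: commit the buffer
    }

    for (int x : a) std::printf("%d ", x);              // prints 3 4 8 9 14 23 25 31
    std::printf("\n");
}
```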

    Boomerang: a Metadata-Free Architecture for Control Flow Delivery

    A data dependency recovery system for a heterogeneous multicore processor

    Multicore processors often increase the performance of applications; however, with their deeper pipelines, processors have proven increasingly difficult to improve further. In an attempt to deliver enhanced performance at lower power requirements, semiconductor microprocessor manufacturers have progressively adopted chip-multicore processors. Existing research has made extensive use of a technique known as thread-level speculation, which computes results before it is known whether they are valid. However, thread-level speculation affects operation latency and circuit timing, and it complicates data cache behaviour and code generation in the compiler. We describe a software framework, codenamed Lyuba, that handles low-level data hazards and automatically recovers the application from them without programmer intervention, targeting an asymmetric chip-multicore processor. Determining the correct execution of multiple threads when data hazards occur on conventional symmetric chip-multicore processors is a significant and ongoing challenge; however, there has been very little focus on the use of asymmetric (heterogeneous) processors with applications that have complex data dependencies. The purpose of this thesis is to: (i) describe the development of a software framework for an asymmetric (heterogeneous) chip-multicore processor; (ii) present optimal software control of the hardware for distributed processing and recovery from violations; and (iii) provide performance results for five applications using three datasets. Applications with a small dataset showed an improvement of 17% and those with a larger dataset an improvement of 16%, giving an overall improvement of 11% in performance.
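    The following single-threaded C++ sketch illustrates the general checkpoint-and-rollback style of recovery the abstract alludes to: snapshot state before a speculative region, and if a hazard is flagged, restore the snapshot and re-execute with the committed value. The workload, the simulated late write, and the hazard flag are illustrative assumptions and are not taken from the Lyuba framework itself.

```cpp
// A minimal sketch of "detect a data hazard, then roll back and re-execute".
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> data = {1, 2, 3, 4};
    int scale = 10;                              // value a logically earlier thread may still change

    std::vector<int> checkpoint = data;          // snapshot taken before speculating

    for (int& x : data) x *= scale;              // speculative region uses the (possibly stale) scale

    scale = 5;                                   // the earlier thread's write arrives late...
    bool hazard = true;                          // ...and the runtime's monitor flags the violation

    if (hazard) {
        data = checkpoint;                       // recovery: restore the snapshot
        for (int& x : data) x *= scale;          // re-execute with the committed value
    }

    for (int x : data) std::printf("%d ", x);    // prints 5 10 15 20
    std::printf("\n");
}
```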

    Design and evaluation of a Thread-Level Speculation runtime library

    In the coming years, machines with hundreds or even thousands of processors are more than likely to become commonplace. To take advantage of these machines, and given the difficulty of parallel programming, it would be desirable to have compilation or runtime systems that extract all the available parallelism from existing applications. To this end, many parallelization techniques have been proposed in recent years; however, most of them focus on simple codes, that is, codes with no dependences among their instructions. Speculative parallelization emerges as a solution for complex codes, enabling the execution of any kind of code, with or without dependences. This technique optimistically assumes that the parallel execution of any kind of code will not lead to errors and therefore needs a mechanism to detect any collision. To that end, such systems rely on a monitor that constantly checks that the execution is not erroneous, ensuring that the results obtained in parallel match those of any sequential execution. If the execution turns out to be erroneous, the threads are stopped and restarted to ensure that execution follows sequential semantics. Our contributions in this field include (1) a new, easy-to-use speculative-execution runtime library; (2) new proposals that significantly reduce the number of accesses required by speculative operations, together with advice on reducing the memory used; (3) proposals to improve scheduling methods, focused on the dynamic management of the iteration blocks used in speculative executions; (4) a hybrid solution that uses transactional memory to implement the critical sections of a speculative parallelization library; and (5) an analysis of speculative techniques on one of the most cutting-edge devices available today, the Intel Xeon Phi coprocessor. As we have been able to verify, speculative parallelization is an active research field. Our results show that this technique delivers performance improvements in a large number of applications. We therefore hope this work helps ease the adoption of speculative solutions in commercial compilers and/or shared-memory parallel programming models. Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos)
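    Contribution (3) concerns the dynamic management of iteration blocks; the C++ sketch below illustrates one plausible policy of that kind: grow the block after a successful in-order commit and shrink it after a squash. The squash "oracle" and the growth/shrink factors are assumptions for illustration, not the library's actual scheduling method.

```cpp
// Minimal sketch of dynamic iteration-block (chunk) scheduling for a
// speculative loop: shrink the block on misspeculation, grow it on success.
#include <algorithm>
#include <cstdio>

// Pretend conflict detector: a block of more than four iterations that spans
// iteration 37 gets squashed (standing in for a dependence that only bites
// when too many iterations run speculatively at once). Purely illustrative.
static bool block_squashed(int begin, int end) {
    return (end - begin) > 4 && begin <= 37 && 37 < end;
}

int main() {
    const int n = 100;
    int chunk = 8;                                   // initial block size
    int i = 0;

    while (i < n) {
        int end = std::min(i + chunk, n);
        if (block_squashed(i, end)) {
            chunk = std::max(1, chunk / 2);          // misspeculation: shrink the block and retry
            std::printf("squash at [%d,%d), chunk -> %d\n", i, end, chunk);
        } else {
            std::printf("commit [%d,%d)\n", i, end); // block committed in order
            i = end;
            chunk = std::min(64, chunk * 2);         // success: grow the block
        }
    }
}
```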