
920 research outputs found

Improving cache locality for thread-level speculation

Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2006

Thread partitioning and value prediction for exploiting speculative thread-level parallelism

Author: González Colás Antonio María
Marcuello Pedro
Tubella Murgadas Jordi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2004

Speculative thread-level parallelism has recently been proposed as a source of parallelism to improve performance in applications where parallel threads are hard to find. However, the efficiency of this execution model depends strongly on the performance of the control and data speculation techniques. Several hardware-based schemes for partitioning the program into speculative threads are analyzed and evaluated. In general, we find that spawning threads associated with loop iterations is the most effective technique. We also show that value prediction is critical to the performance of all of the spawning policies. Thus, a new value predictor, the increment predictor, is proposed. This predictor is designed specifically for this kind of architecture and clearly outperforms adapted versions of conventional value predictors such as the last-value, stride, and context-based predictors, especially for small history tables.

Peer reviewed. Postprint (published version).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC
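
The conventional predictors named in the abstract above can be illustrated with a minimal sketch. It shows textbook last-value and stride predictors only, not the paper's increment predictor, whose design the abstract does not describe. The class names, the `pc` table key, the unbounded dictionary tables, and the `hit_rate` helper are all assumptions made for illustration.

```python
class LastValuePredictor:
    """Predicts that an instruction produces the same value as last time."""

    def __init__(self):
        self.table = {}  # pc -> last observed value (unbounded; an assumption)

    def predict(self, pc):
        return self.table.get(pc)  # None when there is no history yet

    def update(self, pc, actual):
        self.table[pc] = actual


class StridePredictor:
    """Predicts the last value plus the difference between the last two values."""

    def __init__(self):
        self.table = {}  # pc -> (last value, stride)

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry is None:
            return None
        last, stride = entry
        return last + stride

    def update(self, pc, actual):
        entry = self.table.get(pc)
        stride = actual - entry[0] if entry is not None else 0
        self.table[pc] = (actual, stride)


def hit_rate(predictor, pc, values):
    """Fraction of values predicted correctly when fed in program order."""
    hits = 0
    for value in values:
        if predictor.predict(pc) == value:
            hits += 1
        predictor.update(pc, value)
    return hits / len(values)
```

On a loop induction variable such as 0, 4, 8, 12, ..., the stride predictor locks on after two observations while the last-value predictor never hits, which gives some intuition for why loop-based spawning policies lean on predictors that capture regular increments.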

Clustered multithreading for speculative execution

Author: Marukatat Rangsipan
Publication venue: The University of Edinburgh
Publication date: 01/01/2003

Edinburgh Research Archive

A Survey on Thread-Level Speculation Techniques

Author: Estébanez López Álvaro
González Escribano Arturo
Llanos Ferraris Diego Rafael
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2016

Thread-Level Speculation (TLS) is a promising technique that allows the parallel execution of sequential code without relying on a prior, compile-time dependence analysis. In this work, we introduce the technique, present a taxonomy of TLS solutions, and summarize and put into perspective the most relevant advances in this field.

MICINN (Spain) and ERDF program of the European Union: HomProg-HetSys project (TIN2014-58876-P), CAPAP-H5 network (TIN2014-53522-REDT), and COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS).

Repositorio Documental de la Universidad de Valladolid
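
The idea summarized in the abstract above, running sequential loop iterations optimistically and repairing dependence violations at run time, can be sketched as a small software simulation. This is an illustrative model under stated assumptions and does not reproduce any particular system covered by the survey: iterations run one after another here rather than on real threads, and the names `execute`, `run_speculatively`, `load`, `store`, and `window` are hypothetical.

```python
def execute(body, i, state):
    """Run one iteration, buffering its writes and recording what it read."""
    reads, writes = set(), {}

    def load(addr):
        if addr in writes:          # forward the iteration's own buffered write
            return writes[addr]
        reads.add(addr)
        return state[addr]

    def store(addr, value):
        writes[addr] = value        # buffered until commit

    body(i, load, store)
    return reads, writes


def run_speculatively(body, n_iters, memory, window=4):
    """Simulate TLS over `n_iters` iterations; returns the number of squashes."""
    squashes = 0
    for start in range(0, n_iters, window):
        snapshot = dict(memory)
        batch = range(start, min(start + window, n_iters))
        # Speculative phase: every iteration in the batch sees the pre-batch state.
        results = [execute(body, i, snapshot) for i in batch]
        written = set()
        # Commit phase: strictly in original program order.
        for i, (reads, writes) in zip(batch, results):
            if reads & written:
                # Dependence violation: an earlier iteration wrote a location
                # this one read, so squash it and re-execute on committed state.
                squashes += 1
                reads, writes = execute(body, i, memory)
            memory.update(writes)
            written.update(writes)
    return squashes
```

A loop whose iterations touch disjoint locations commits without squashes, whereas a loop-carried dependence such as a prefix sum forces most iterations in each window to re-execute, yet the final memory state matches sequential execution in both cases.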


HELIX-RC

Author: Brooks David
Brownell Kevin
Campanoni Simone
Jones Timothy M
Kanev Svilen
Wei Gu-Yeon
Publication venue: ACM SIGARCH Computer Architecture News
Publication date: 29/07/2014

Data dependences in sequential programs limit parallelization because extracted threads cannot run independently. Although thread-level speculation can avoid the need for precise dependence analysis, communication overheads required to synchronize actual dependences counteract the benefits of parallelization. To address these challenges, we propose a lightweight architectural enhancement co-designed with a parallelizing compiler, which together can decouple communication from thread execution. Simulations of these approaches, applied to a processor with 16 Intel Atom-like cores, show an average of 6.85x performance speedup for six SPEC CINT2000 benchmarks.

This work was possible thanks to the sponsorship of the Royal Academy of Engineering, EPSRC and the National Science Foundation (award number IIS-0926148).

This is the accepted manuscript. The final version is available from IEEE and ACM at http://dl.acm.org/citation.cfm?doid=2678373.2665705

Apollo (Cambridge)

Loopapalooza: Investigating Limits of Loop-Level Parallelism with a Compiler-Driven Approach

Author: Gabrielli Giacomo
Iordanou Konstantinos
Luján Mikel
Zaidi Ali
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 23/04/2021

The University of Manchester - Institutional Repository

Dynamic web worker pool management for highly parallel javascript web applications

Author: Costa Prats Juan José
Pajuelo González Manuel Alejandro
Verdú Mulà Javier
Publication venue: Wiley
Publication date: 01/01/2016

JavaScript web applications are improving in performance mainly thanks to the new standards introduced with HTML5. Among others, the Web Workers API allows multithreaded JavaScript web apps to exploit parallel processors. However, developers have difficulty determining the minimum number of web workers that provides the highest performance. Even if developers find this optimal number, it is a static value configured at the beginning of execution. Because users tend to run other applications in the background, the estimated number of web workers may be non-optimal, since it may overload or underutilize the system. In this paper, we propose a solution for highly parallel web apps to dynamically adapt the number of running web workers to the resources actually available, avoiding the hassle of estimating a static optimal number of threads. The solution consists of adding a web worker pool and a simple management algorithm to the web app. Even with co-running applications, the results show that our approach dynamically enables a number of web workers close to the optimal. Our proposal, which is independent of the web browser, overcomes the lack of knowledge of the underlying processor architecture as well as dynamic changes in resource availability.

Peer reviewed. Postprint (author's final draft).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC
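
The abstract above does not specify the management algorithm, so the following is only a sketch of one plausible choice: a hill-climbing controller that keeps changing the worker count in the same direction while measured throughput improves and reverses when it drops. It is written in Python rather than against the browser Web Workers API, and `adapt_worker_count`, `simulate`, and the throughput model passed to them are hypothetical.

```python
def adapt_worker_count(count, last_rate, rate, step, max_workers):
    """One hill-climbing step; returns the new (count, step) pair.

    `rate` is the throughput measured with `count` workers and `last_rate`
    the throughput measured in the previous interval.
    """
    if rate < last_rate:
        step = -step  # the previous change hurt throughput, so reverse direction
    new_count = max(1, min(max_workers, count + step))
    return new_count, step


def simulate(throughput_of, start=1, max_workers=16, rounds=20):
    """Drive the controller against a throughput model and return the trace."""
    count, step, last_rate = start, 1, 0.0
    trace = [count]
    for _ in range(rounds):
        rate = throughput_of(count)
        count, step = adapt_worker_count(count, last_rate, rate, step, max_workers)
        last_rate = rate
        trace.append(count)
    return trace
```

With a model whose throughput peaks at four workers, the controller climbs to the peak and then oscillates within one worker of it. A real pool would likely add damping or hysteresis to avoid that constant probing.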


Galois: a system for parallel execution of irregular algorithms

Author: Nguyen Donald Do
Publication date: 04/09/2015

A programming model which allows users to program with high productivity and which produces high performance executions has been a goal for decades. This dissertation makes progress towards this elusive goal by describing the design and implementation of the Galois system, a parallel programming model for shared-memory, multicore machines. Central to the design is the idea that scheduling of a program can be decoupled from the core computational operator and data structures. However, efficient programs often require application-specific scheduling to achieve best performance. To bridge this gap, an extensible and abstract scheduling policy language is proposed, which allows programmers to focus on selecting high-level scheduling policies while delegating the tedious task of implementing the policy to a scheduler synthesizer and runtime system. Implementations of deterministic and prioritized scheduling also are described. An evaluation of a well-studied benchmark suite reveals that factoring programs into operators, schedulers and data structures can produce significant performance improvements over unfactored approaches. Comparison of the Galois system with existing programming models for graph analytics shows significant performance improvements, often orders of magnitude more, due to (1) better support for the restrictive programming models of existing systems and (2) better support for more sophisticated algorithms and scheduling, which cannot be expressed in other systems.

Computer Science

Texas ScholarWorks
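
The factoring described in the abstract above, where the operator, the scheduling policy, and the data structure are kept separate, can be sketched with a sequential worklist algorithm. This is not the Galois API and involves no parallelism. It is a minimal illustration in which `FifoWorklist`, `PriorityWorklist`, and `sssp` are hypothetical names, and the same edge-relaxation operator runs unchanged under two interchangeable scheduling policies.

```python
import heapq
from collections import deque


class FifoWorklist:
    """Scheduling policy: process active nodes in arrival order."""

    def __init__(self):
        self._items = deque()

    def push(self, priority, item):
        self._items.append(item)  # priority is ignored by this policy

    def pop(self):
        return self._items.popleft()

    def __bool__(self):
        return bool(self._items)


class PriorityWorklist:
    """Scheduling policy: process the node with the smallest priority first."""

    def __init__(self):
        self._heap = []

    def push(self, priority, item):
        heapq.heappush(self._heap, (priority, item))

    def pop(self):
        return heapq.heappop(self._heap)[1]

    def __bool__(self):
        return bool(self._heap)


def sssp(graph, source, worklist):
    """Shortest paths; returns (distances, number of operator applications)."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    worklist.push(0, source)
    applied = 0
    while worklist:
        node = worklist.pop()
        applied += 1
        # Operator: relax the outgoing edges of one active node.
        for neighbor, weight in graph[node]:
            candidate = dist[node] + weight
            if candidate < dist[neighbor]:
                dist[neighbor] = candidate
                worklist.push(candidate, neighbor)
    return dist, applied
```

Both policies compute the same distances, but the amount of work can differ: on a small graph where a cheap path is discovered late, the FIFO policy re-relaxes nodes that the priority policy settles the first time.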