9 research outputs found

How to speedup fault-tolerant clock generation in VLSI systems-on-chip via pipelining

Author: Andreas Dielacher
Matthias Függer
Ulrich Schmid
Publication date: 01/01/2009

Fault-tolerant clocking schemes become inevitable when it comes to highly reliable chip designs. Because of the additional hardware overhead, existing solutions are considerably slower than their non-reliable counterparts. In this paper, we demonstrate that pipelining is a viable approach to speed up the distributed fault-tolerant DARTS clock generation approach introduced in (Függer, Schmid, Fuchs, Kempf, EDCC'06), where a distributed Byzantine fault-tolerant tick generation algorithm has been used to replace the traditional quartz oscillator and highly balanced clock tree in VLSI Systems-on-Chip (SoCs). We provide a pipelined version of the original DARTS algorithm, termed pDARTS, together with a novel modeling and analysis framework for hardware-implemented asynchronous fault-tolerant distributed algorithms, which is employed for rigorously analyzing its correctness and performance. Our results, which have also been confirmed by the experimental evaluation of an FPGA prototype implementation, reveal that pipelining indeed makes it possible to entirely remove the adverse effect of large interconnect delays on the achievable clock frequency, and demonstrate again that methods and results from distributed algorithms research can successfully be applied in the VLSI context.

CiteSeerX

A hardware/software framework for supporting transactional memory in a MPSoC environment

Author: C. Ferri
L. Benini
M. Herlihy
R. I. Bahar
T. Moreshet
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2007

Manufacturers are focusing on multiprocessor-system-on-a-chip (MPSoC) architectures in order to provide increased concurrency, rather than increased clock speed, for both large-scale and embedded systems. Traditionally, lock-based synchronization is provided to support concurrency; however, managing locks can be very difficult and error-prone. In addition, the performance and power cost of lock-based synchronization can be high. Transactional memories have been extensively investigated as an alternative to lock-based synchronization in general-purpose systems. It has been shown that transactional memory has advantages over locks in terms of ease of programming, performance and energy consumption. However, its applicability to embedded multi-core platforms has not been explored yet. In this paper, we demonstrate a complete hardware transactional memory solution for an embedded multi-core architecture, consisting of a cache-coherent ARM-based cluster, similar to ARM's MPCore. Using cycle-accurate power and performance models for the transactional memory hardware, we evaluate our architectural framework over a set of different system and application settings, and show that transactional memory is a promising solution, even for resource-constrained embedded multiprocessors.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A hardware/software framework for supporting transactional memory in a MPSoC environment

Author: Borkar S.
Cesare Ferri
Luca Benini
Maurice Herlihy
Moreshet T.
R. Iris Bahar
Rundberg P.
Tali Moreshet
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Reconciling fault-tolerant distributed computing and systems-on-chip

Author: A. Bar-Noy
A.J. Martin
C. Constantinescu
C. Ferri
C. Metra
C.J. Myers
D. Dolev
D. Powell
D.D. Thaker
D.L. Black
E. Normand
E.G. Friedman
E.M. Clarke
H. Attiya
H. Kopetz
I. Koren
J. Widder
J.C. Barros
J.C. Ebergen
J.Y. Halpern
K.S. Stevens
L. Lamport
L. Marino
M.J. Fischer
M.J. Gadlage
Matthias Függer
N. Lynch
P. Ramanathan
P. Teehan
P.J. Restle
R. Baumann
R. Bhamidipati
R. Friedman
R.M. Kieckhafer
S. Dolev
S. Hauck
S. Mitra
T. Karnik
T.K. Srikanth
U. Schmid
Ulrich Schmid
Y. Semiat
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Exploiting software transactional memory in the context of asymmetric architectures

Author: Baldassin Alexandro José
Publication venue: [s.n.]
Publication date: 15/08/2018

Advisor: Paulo Cesar Centoducatte. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação.
Abstract: The shift towards multicore processors taken by the semiconductor industry has initiated an era in which new languages, methodologies and tools are of paramount importance to the development of efficient concurrent systems that can be built in a timely way by all kinds of programmers. One of the main obstacles faced by programmers when dealing with shared memory programming concerns the use of synchronization mechanisms so as to avoid race conditions that could lead the system to an inconsistent state. Synchronization has traditionally been achieved by means of locks (or variations thereof), widely known for their anomalies and for being hard to get right. A new mechanism, known as transactional memory (TM), has recently been the focus of a lot of research and shows potential to simplify code synchronization as well as to deliver more parallelism and, therefore, better performance. This thesis presents three works focused on different aspects of software transactional memory (STM) systems. Firstly, we show an STM implementation for asymmetric processors, focusing on the Cell/B.E. architecture. As an important result, we find that memory transactions are indeed promising for asymmetric architectures, especially due to their scalability. Secondly, we take a different approach to STM implementation by devising a system specifically targeted at computer games. The decision was guided by the poor performance figures usually seen in current STM implementations. We also conduct a case study using a complex game that shows the system's efficiency. Finally, we present the energy consumption characterization of a state-of-the-art STM for the first time. Based on the observed characterization, we also propose a technique aimed at reducing energy consumption in highly contended scenarios. Our results show that the technique is indeed effective in such cases, improving energy consumption by up to 87%.
Doctorate. Computer Systems. Doctor in Computer Science.

Repositorio da Producao Cientifica e Intelectual da Unicamp

Performance Optimization Strategies for Transactional Memory Applications

Author: Schindewolf Martin Otto
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2013

This thesis presents tools for Transactional Memory (TM) applications that cover multiple TM systems (software, hardware, and hybrid TM) and use information from all layers of the TM software stack. To that end, the thesis addresses a number of challenges in extracting static information, information about run-time behavior, and expert-level knowledge in order to develop these new methods and strategies for the optimization of TM applications.

KITopen

Using Multiple Abstraction Levels To Speedup An MPSoC Virtual Platform Simulator

Author: Azevedo R.
Baldassin A.
Centoducatte P.
Klein F.
Moreira J.
Rigo S.

Virtual platforms are of paramount importance for design space exploration, and their use in early software development and verification is crucial. In particular, enabling accurate and fast simulation is especially useful, but these features usually conflict and tradeoffs have to be made. In this paper we describe how we integrated TLM communication mechanisms into a state-of-the-art, cycle-accurate MPSoC simulation platform. More specifically, we show how we adapted ArchC fast functional instruction set simulators to the MPARM platform in order to achieve both fast simulation speed and accuracy. Our implementation led to a much faster hybrid platform, reaching speedups of up to 2.9x and 2.1x on average, with negligible impact on power estimation accuracy (average 3.26% and 2.25% of standard deviation). © 2011 IEEE. pp. 99-105. Institution of Electrical Engineers (IEEE) Reliability Society, Karlsruhe Institute of Technology (KIT).

Repositorio da Producao Cientifica e Intelectual da Unicamp