
42 research outputs found

Object oriented execution model (OOM)

Author: Cristal Kestelman Adrián
González Blanco Ruben
Markovic Nikola
Nemirovsky Daniel
Unsal Osman Sabri
Valero Cortés Mateo
Publication venue: INRIA
Publication date: 01/01/2011

This paper considers implementing the Object Oriented Programming Model directly in hardware to serve as a base for exploiting object-level parallelism, speculation and heterogeneous computing. Towards this goal, we present a new execution model, the Object Oriented execution Model (OOM), that implements the OO Programming Models. All OOM hardware structures are objects, and the OOM Instruction Set directly utilizes objects while hiding other complex hardware structures. OOM maintains all high-level programming language information until execution time. This enables efficient extraction of available parallelism in OO serial code at execution time with minimal compiler support. Our results show that OOM utilizes the available parallelism better than the OoO (Out-of-Order) model. Peer Reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Efficient Resource Allocation on a Dynamic Simultaneous Multithreaded Architecture

Author: Ortiz-Arroyo Daniel
Publication date: 01/01/2006

VBN

Thread-spawning schemes for speculative multithreading

Author: González Colás Antonio María
Marcuello Pascual Pedro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002

Speculative multithreading has recently been proposed to boost performance by exploiting thread-level parallelism in applications that are difficult to parallelize. The performance of these processors depends heavily on the partitioning policy used to split the program into threads. Previous work uses heuristics to spawn speculative threads based on easily detectable program constructs such as loops or subroutines. In this work we propose a profile-based mechanism that divides programs into threads by searching for those parts of the code with features that could benefit from potential thread-level parallelism. Our profile-based spawning scheme is evaluated on a Clustered Speculative Multithreaded Processor, and results show large performance benefits. When the proposed spawning scheme is compared with traditional heuristics, it outperforms them by almost 20%. When a realistic value predictor and an 8-cycle thread initialization penalty are considered, the performance difference between them is maintained. The speed-up over single-thread execution is higher than 5x for a 16-thread-unit processor and close to 2x for a 4-thread-unit processor. Peer Reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Design of High performance and Low power Simultaneous Multi-Threaded Processor

Author: Arora Krishan
Mehra Parul
Singh Gill Paramveer
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/06/2013

In this paper, we present the design of a High Performance Multi-Threaded Processor. Processing of high quality images is inevitable in applications such as HD TV, Gaming Multimedia, etc., which require great processing power with low power consumption. This can be achieved with multi-threaded processors, which optimally utilise the Functional Units (FUs). The speed of processing is as good as that of multi-core processors, with less area. A conflict resolver (CR) is designed for scheduling the instructions, which involves allocation of FUs. Data move instructions are in the majority in any program; the corresponding logic blocks are replicated and the speed of execution is further improved. We illustrate the design for a two-threaded processor. However, it is possible to extend the design to any number of threads by suitably redesigning the CR and replicating the Transfer Logic and CPU Registers. DOI: http://dx.doi.org/10.11591/ijece.v3i3.253

Institute of Advanced Engineering and Science

Architecture for object-oriented programming model

Author: Cristal Kestelman Adrián
González Rubén
Markovic Nikola
Unsal Osman Sabri
Valero Cortés Mateo
Publication date: 01/01/2009

Current mainstream architectures have ISAs that cannot maintain all the information provided by the application programmer using a high-level programming language. Typically, the information that is lost in compiling to a low-level ISA is related to parallelism and speculation [14]. For example, some loops are expressed as parallel loops by the programmer, but the processor is later unable to determine this level of parallelism; conditional execution might allow control-independent execution that is basically impossible to detect at execution time; function-level and object-level parallelism is lost when code is transformed into a low-level ISA that is oblivious to programmer intentions and high-level programming structures. Object Oriented Programming Languages are arguably the most successful programming medium because they help the programmer apply well-known practices for distributing data through operations associated with that data. Object oriented models therefore express data/execution locality more naturally and efficiently. Other OO software mechanisms such as derivation and polymorphism further help the programmer to exploit locality. Once object oriented programs have been compiled, all information about data/execution locality is completely lost in current assembly code (ISA code). Maintaining this information until runtime is crucial to improving locality and security. Finally, Object Oriented Programming Models keep the idea of memory (data memory) far from the programmer. These are all desirable qualities that are mostly lost in compilation to a low-level ISA that is oblivious to the Object-Oriented Programming model. This report considers implementing the Object Oriented (OO) Programming Model directly in hardware to serve as a base for exploiting object-level parallelism, speculation and heterogeneous computing.
Towards this goal, we present a new computer architecture that implements the OO Programming Models. All its hardware structures are objects, and its Instruction Set directly utilizes objects, completely hiding the notion of memory and other complex hardware structures. It also maintains all high-level programming language information until execution time. This enables efficient extraction of available parallelism in OO serial or parallel code at execution time with minimal compiler support. We demonstrate the potential of this novel computer architecture through several examples. Postprint (published version).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Dynamic Simultaneous Multithreaded Architecture

Author: Lee B.
Ortiz-Arroyo Daniel
Publication venue: International Society of Computers and Their Applications
Publication date: 01/01/2003

VBN

Boosting single-thread performance in multi-core systems through fine-grain multi-threading

Author: Alejandro Martinez
Antonio Gonzalez
Carlos Madriles
Enric Gibert
Fernando Latorre
Josep M. Codina
Kahle J. A.
Kernighan B.
Marcuello P.
Pedro López
Raúl Martinez
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref


Analysis of the effectiveness of multithreading for interrupts on communication processors

Author: Pattery Vinu J.
Publication venue: 'Oregon State University'

High network bandwidth demands high-performance communication processors that integrate application processing, network processing, and system support functions into a single, low-cost System-On-Chip (SOC) solution. However, conventional processors, when used in network-related applications, are beset by the overhead of saving and restoring register context, cache misses due to fetching the interrupt handler from memory, and the possibility of NIC buffer overflow. Therefore, this paper analyzes the effectiveness of multithreading to service interrupts on an embedded processor from the perspective of a network processor and a communication processor. A simulation environment enhanced with a multithreaded hardware execution model is used, and our results reveal that multithreading for interrupts from a single NIC brings a fair improvement in the performance of network processors and little or no effect on communication processors. However, our analysis also shows that multithreading for interrupts has a lot of potential when applied to communication processors with multiple interrupt sources, such as Ethernet, ATM, USB, and HDLC. Index terms: Multithreading, UDP, IP, device driver, interrupt processing, communication processor

ScholarsArchive@OSU