Hardware Support for Dynamic Languages by Schleuniger, Pascal et al.
Hardware Support for Dynamic Languages
Schleuniger, Pascal; Karlsson, Sven ; Probst, Christian W.
Publication date: 2011
Document version: Publisher's PDF, also known as Version of record
Citation (APA):
Schleuniger, P., Karlsson, S., & Probst, C. W. (2011). Hardware Support for Dynamic Languages. Poster
session presented at 7th International Summer School on Advanced Computer Architecture and Compilation for
High-Performance and Embedded Systems, Fiuggi, Italy.
Hardware Support for Dynamic Languages
Pascal Schleuniger, Sven Karlsson, and Christian W. Probst
Technical University of Denmark
Motivation
• Dynamic programming languages:
  - enjoy increasing popularity
  - run on a virtual machine
  - have long execution times
• Exploiting parallelism is difficult:
  - code is executed at runtime with just-in-time compilation
  - there is no time for intensive code analysis
  - e.g. JavaScript is single-threaded by design
• Software speculation is an effective method to exploit parallelism and speed up code execution
• We aim for hardware support for software speculation
Predicated Instructions
• A predicated instruction is executed only if a condition specified in its operation code is true; otherwise the instruction is annulled
• Predicated instruction example: converting a control dependence into a data dependence
// C code sequence:
if (a == 0) { b = c + d; }
// Predicated instruction:
ADD b, c, d #a
• Eliminates some control dependencies
• Eases code analysis for the parallelization process
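The if-conversion in the example above can be sketched in plain C. This is only an illustration of the idea, not the Tinuso instruction semantics: the function names are hypothetical, and unsigned arithmetic is assumed so that the addition is well defined for all inputs. The second function always computes the sum and lets the predicate select whether it replaces `b`, which mimics an annulled instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form: the assignment is control dependent on the comparison. */
uint32_t add_if_zero_branch(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    if (a == 0) {
        b = c + d;
    }
    return b;
}

/* If-converted form: the sum is always computed; the predicate only
 * selects whether the result replaces b (data dependence on pred). */
uint32_t add_if_zero_predicated(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t sum  = c + d;
    uint32_t pred = (a == 0);        /* 1 if the predicate holds, else 0 */
    uint32_t mask = 0u - pred;       /* all ones if pred == 1, else all zeros */
    return (sum & mask) | (b & ~mask);
}

/* Usage: both forms agree whether or not the predicate holds. */
void predication_demo(void)
{
    assert(add_if_zero_branch(0, 7, 2, 3) == add_if_zero_predicated(0, 7, 2, 3));
    assert(add_if_zero_branch(1, 7, 2, 3) == add_if_zero_predicated(1, 7, 2, 3));
}
```

Because the second form has no branch, a code analyzer only has to follow data dependencies on `pred`, `sum`, and `b`.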
Hardware Support for Rollback/Commit
• Software speculation can be applied at:
  - thread level, functions, types
• We aim for hardware support for rollback/commit:
  - shadow register file with status bits
  - checkpoint/rollback/commit instructions
• Thread-level speculation example: loop iterations are handled as threads and executed speculatively in parallel. If dependencies among threads are detected, execution is rolled back to the checkpoint and the loop is executed sequentially instead.
[Figure: thread-level speculation flow, with the labels "set checkpoint", "parallel loops (threads)", "loop barrier", and "conflict check ()"]
checkpoint instruction
- take a copy of the register-file status bits
- prevent write-back
rollback instruction
- swap current register-file status bits with the checkpoint copy
- go back to checkpoint
commit instruction
- trigger write-back
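The checkpoint/rollback/commit behaviour can be approximated with a small software model. This is a minimal sketch under stated assumptions, not the authors' hardware design: the type `RegFile`, the `rf_*` function names, the register count, and the single level of speculation are all hypothetical. Speculative writes go to a shadow copy and are marked with a status bit; commit writes them back, and rollback discards them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 8  /* hypothetical size, not taken from the poster */

typedef struct {
    uint32_t committed[NUM_REGS]; /* architectural state */
    uint32_t shadow[NUM_REGS];    /* speculative values */
    bool     dirty[NUM_REGS];     /* status bit: shadow entry is valid */
    bool     speculating;         /* write-back is prevented while true */
} RegFile;

void rf_init(RegFile *rf) { memset(rf, 0, sizeof *rf); }

/* checkpoint: clear the status bits and prevent write-back */
void rf_checkpoint(RegFile *rf)
{
    assert(!rf->speculating);     /* nested checkpoints are not modelled */
    memset(rf->dirty, 0, sizeof rf->dirty);
    rf->speculating = true;
}

void rf_write(RegFile *rf, int r, uint32_t v)
{
    assert(r >= 0 && r < NUM_REGS);
    if (rf->speculating) {
        rf->shadow[r] = v;        /* buffered, not yet architectural */
        rf->dirty[r] = true;
    } else {
        rf->committed[r] = v;
    }
}

uint32_t rf_read(const RegFile *rf, int r)
{
    assert(r >= 0 && r < NUM_REGS);
    return (rf->speculating && rf->dirty[r]) ? rf->shadow[r] : rf->committed[r];
}

/* commit: trigger write-back of all speculative values */
void rf_commit(RegFile *rf)
{
    for (int r = 0; r < NUM_REGS; r++) {
        if (rf->dirty[r]) {
            rf->committed[r] = rf->shadow[r];
            rf->dirty[r] = false;
        }
    }
    rf->speculating = false;
}

/* rollback: discard speculative values, return to the checkpoint state */
void rf_rollback(RegFile *rf)
{
    memset(rf->dirty, 0, sizeof rf->dirty);
    rf->speculating = false;
}
```

In this model a rollback costs only the clearing of the status bits, since the committed values were never overwritten during speculation.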
Hardware Support for Exceptions
• Suppress exceptions while code is executed speculatively
• Hardware support for conflict checks when executing code speculatively (monitoring data dependencies)
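One way to picture a conflict check is to record the addresses each speculative thread reads and writes, then compare the sets. The following is a hedged software sketch of that idea only; the poster does not describe the actual mechanism. The `AccessLog` type, the function names, the 64-word address space, and the conservative rule (any overlap that involves a write counts as a conflict) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADDR_SPACE 64u  /* hypothetical: 64 tracked memory words, one bit each */

typedef struct {
    uint64_t read_set;   /* bit i set: word i was read */
    uint64_t write_set;  /* bit i set: word i was written */
} AccessLog;

void log_read(AccessLog *log, unsigned addr)
{
    assert(addr < ADDR_SPACE);
    log->read_set |= UINT64_C(1) << addr;
}

void log_write(AccessLog *log, unsigned addr)
{
    assert(addr < ADDR_SPACE);
    log->write_set |= UINT64_C(1) << addr;
}

/* Conservative check between an earlier and a later speculative thread:
 * report a conflict if they touch a common word and at least one of the
 * two accesses is a write (read-after-write, write-after-write, or
 * write-after-read in sequential order). */
bool has_conflict(const AccessLog *earlier, const AccessLog *later)
{
    uint64_t raw_waw = earlier->write_set & (later->read_set | later->write_set);
    uint64_t war     = earlier->read_set & later->write_set;
    return (raw_waw | war) != 0;
}
```

If `has_conflict` returns true for any pair of iterations, the speculative results would be discarded and the loop rerun sequentially.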
Hardware Support for Data Pre-fetching
• Speculative fetching of data and pre-computing
• Hides some of the memory access latency
• E.g. makes subsequent page loads of web applications faster
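As a rough software analogue of latency hiding through pre-fetching, the sketch below issues a prefetch hint a fixed distance ahead of the element being processed. It uses the GCC/Clang builtin `__builtin_prefetch` and is not the hardware mechanism the poster proposes; the function name and the prefetch distance are assumptions and would need tuning on a real system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_DISTANCE 16u  /* hypothetical look-ahead, in elements */

/* Sums an array while hinting that data[i + PREFETCH_DISTANCE] will be
 * read soon, so the memory access can overlap with the additions. */
uint64_t sum_with_prefetch(const uint32_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        if (i + PREFETCH_DISTANCE < n) {
            /* second argument 0: prefetch for read; third: low temporal locality */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 1);
        }
#endif
        sum += data[i];
    }
    return sum;
}
```

The hint does not change the result; on compilers without the builtin the loop degrades to a plain sum.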
Hardware Experimentation Platform
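[Figure: block diagram of a processing tile with two Tinuso cores, each with a dyn. lang. support unit, I$ and D$, together with an L2 $, a $ controller with dyn. lang. support, SP, and NI; and a network diagram of processing tiles (P), routers (R), and a memory controller (MC)]
Legend: P = Processing Tile, R = Router, MC = Memory Controller, NI = Network Interface, SP = Scratchpad Mem., $ = Cache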
• Tinuso processor core:
  - 32-bit, single-issue RISC processor
  - 8-stage pipeline, full forwarding
  - predicated instructions
  - instruction and data caches
  - barrel shifter, multiplication unit
  - optimized for FPGA implementation
  - Xilinx Virtex-6 (-3): 370 MHz
• Processing tile:
  - two Tinuso cores in one processing tile
  - network interface
  - 2nd-level cache*
  - scratchpad memory*
  - hardware support for cache coherency*
• Network-on-chip:
  - packet-switched, mesh-4 network
  - non-blocking, XY routing
* implementation in progress
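XY routing on a 2D mesh can be sketched as follows: a packet first travels along the X dimension until its column matches the destination, then along Y. This is a generic illustration of dimension-ordered routing, not the router implementation of the platform; the `Coord` and `Port` types, the function names, and the coordinate orientation (x grows eastwards, y grows northwards) are assumptions.

```c
#include <assert.h>

typedef enum { PORT_LOCAL, PORT_EAST, PORT_WEST, PORT_NORTH, PORT_SOUTH } Port;

typedef struct { int x; int y; } Coord;

/* Output port chosen by the router at position cur for a packet
 * addressed to dst: resolve the X offset first, then the Y offset. */
Port xy_route(Coord cur, Coord dst)
{
    if (dst.x > cur.x) return PORT_EAST;
    if (dst.x < cur.x) return PORT_WEST;
    if (dst.y > cur.y) return PORT_NORTH;
    if (dst.y < cur.y) return PORT_SOUTH;
    return PORT_LOCAL;               /* packet has arrived */
}

/* Follows the route hop by hop and returns the number of router hops. */
int xy_hops(Coord src, Coord dst)
{
    int hops = 0;
    Coord cur = src;
    for (;;) {
        Port p = xy_route(cur, dst);
        if (p == PORT_LOCAL) break;
        switch (p) {
        case PORT_EAST:  cur.x++; break;
        case PORT_WEST:  cur.x--; break;
        case PORT_NORTH: cur.y++; break;
        case PORT_SOUTH: cur.y--; break;
        default: break;
        }
        hops++;
    }
    assert(cur.x == dst.x && cur.y == dst.y);
    return hops;
}
```

Because every packet resolves X before Y, the route between two tiles is fully determined by their coordinates, which keeps the per-router decision to a pair of comparisons.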
DTU Informatics - Technical University of Denmark pass@imm.dtu.dk http://www.imm.dtu.dk
