53 research outputs found
RITSim: distributed systemC simulation
Parallel or distributed simulation is becoming more than a novel way to speed up design evaluation; it is becoming necessary for simulating modern processors in a reasonable timeframe. As architectural features become faster, smaller, and more complex, designers are interested in obtaining detailed and accurate performance and power estimations. Uniprocessor simulators may not be able to meet such demands. The RITSim project uses SystemC to model a processor microarchitecture and memory subsystem in great detail. SystemC is a C++ library built on a discrete-event simulation kernel. Many projects have successfully implemented parallel discrete-event simulation (PDES) frameworks to distribute simulation among several hosts. The field promises significant simulation speedup, possibly leading to faster turnaround time in design space exploration and commercial production. However, parallel implementation of such simulators is not an easy task. It requires modification of the simulation kernel for effective partitioning and synchronization. This thesis explores PDES techniques and presents a distributed version of the SystemC simulation environment. With minimal user interaction, SystemC models can be executed on a cluster of workstations using a message-passing library such as the Message Passing Interface (MPI). The implementation is designed for transparency; distribution and synchronization happen with little intervention by the model author. Modification of SystemC is fashioned to promote maintainability with future releases. Furthermore, only freely available libraries are used for maximum flexibility and portability.
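The abstract's distributed SystemC kernel is not shown here; as a generic illustration of the PDES synchronization problem it addresses, the following is a minimal single-process sketch of conservative (null-message) synchronization between two logical processes. All names (`LP`, `lookahead`, the ping-pong workload) are illustrative assumptions, not part of RITSim.

```python
import heapq

class LP:
    """A logical process: local event queue, local clock, per-channel clocks."""
    def __init__(self, name, lookahead):
        self.name = name
        self.lookahead = lookahead   # minimum delay on any outgoing message
        self.queue = []              # (timestamp, payload) min-heap
        self.clock = 0.0
        self.chan_time = {}          # latest timestamp seen on each input channel
        self.log = []                # events actually executed, in order

    def recv(self, src, ts, payload):
        # Channel clocks only move forward (messages are FIFO per channel).
        self.chan_time[src] = max(ts, self.chan_time.get(src, 0.0))
        if payload is not None:      # None is a null message: clock info only
            heapq.heappush(self.queue, (ts, payload))

    def safe_time(self):
        # No future message on any channel can carry a smaller timestamp.
        return min(self.chan_time.values(), default=float("inf"))

    def step(self, send):
        # Execute every queued event that is provably safe, in timestamp order.
        while self.queue and self.queue[0][0] <= self.safe_time():
            ts, payload = heapq.heappop(self.queue)
            self.clock = ts
            self.log.append((ts, payload))
            if payload < 3:          # toy reaction: bounce an incremented token
                send(self.name, ts + self.lookahead, payload + 1)

a = LP("A", lookahead=1.0)
b = LP("B", lookahead=1.0)
peers = {"A": b, "B": a}

def send(src, ts, payload):
    peers[src].recv(src, ts, payload)

heapq.heappush(a.queue, (0.0, 0))    # seed event at A
a.chan_time["B"] = 0.0               # initial null message from B
for _ in range(5):
    a.step(send); b.step(send)
    # Null messages keep both safe times advancing even when queues are empty.
    send("A", a.clock + a.lookahead, None)
    send("B", b.clock + b.lookahead, None)
```

Each process only executes events it can prove no earlier message will precede, which is the property a distributed SystemC kernel must also guarantee across MPI ranks.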
Submicron Systems Architecture Project: Semiannual Technical Report
No abstract available
Submicron Systems Architecture Project: Semiannual Technical Report
The Mosaic C is an experimental fine-grain multicomputer
based on single-chip nodes. The Mosaic C chip includes 64KB of fast dynamic RAM,
processor, packet interface, ROM for bootstrap and self-test, and a two-dimensional self-timed
router. The chip architecture provides low-overhead and low-latency handling of
message packets, and high memory and network bandwidth. Sixty-four Mosaic chips are
packaged by tape-automated bonding (TAB) in an 8 x 8 array on circuit boards that can, in
turn, be arrayed in two dimensions to build arbitrarily large machines. These 8 x 8 boards are
now in prototype production under a subcontract with Hewlett-Packard. We are planning
to construct a 16K-node Mosaic C system from 256 of these boards. The suite of Mosaic
C hardware also includes host-interface boards and high-speed communication cables. The
hardware developments and activities of the past eight months are described in section 2.1.
The programming system that we are developing for the Mosaic C is based on the
same message-passing, reactive-process, computational model that we have used with earlier
multicomputers, but the model is implemented for the Mosaic in a way that supports fine-grain
concurrency. A process executes only in response to receiving a message, and may in
execution send messages, create new processes, and modify its persistent variables before
it either exits or becomes dormant in preparation for receiving another message. These
computations are expressed in an object-oriented programming notation, a derivative of
C++ called C+-. The computational model and the C+- programming notation are
described in section 2.2. The Mosaic C runtime system, which is written in C+-, provides
automatic process placement and highly distributed management of system resources. The
Mosaic C runtime system is described in section 2.3.
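The message-passing, reactive-process model described above can be sketched in a few lines. This is an illustrative Python analogue, not the C+- runtime itself; the `Runtime` and `Counter` names and the "dormant"/"exit" return convention are assumptions made for the sketch.

```python
from collections import deque

class Runtime:
    """Toy reactive-process runtime: a process runs only when a message arrives."""
    def __init__(self):
        self.procs = {}
        self.mailbox = deque()              # FIFO of (destination, message)

    def spawn(self, name, proc):
        # Available to handlers too, so a process can create new processes.
        self.procs[name] = proc

    def send(self, dest, msg):
        self.mailbox.append((dest, msg))

    def run(self):
        while self.mailbox:
            dest, msg = self.mailbox.popleft()
            proc = self.procs.get(dest)
            if proc is not None and proc.on_message(self, msg) == "exit":
                del self.procs[dest]        # the process chose to terminate

class Counter:
    """Persistent variables survive between activations (the dormant phase)."""
    def __init__(self, limit):
        self.total, self.limit = 0, limit

    def on_message(self, rt, msg):
        self.total += msg                   # modify persistent state
        if self.total >= self.limit:
            return "exit"
        rt.send("counter", 1)               # send a message, then go dormant
        return "dormant"

rt = Runtime()
counter = Counter(limit=3)
rt.spawn("counter", counter)
rt.send("counter", 1)
rt.run()
```

The handler mirrors the model's contract: it executes only in response to a message, may send messages, spawn processes, and update persistent state, and then either exits or becomes dormant awaiting the next message.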
A distributed-processing communication method for multipoint-connection real-time applications
With the progress of virtualization technology, a wide variety of applications can now run on clouds inside the network. For real-time applications that perform multipoint communication over a wide-area network, however, establishing communication techniques that achieve low-latency end-to-end communication remains a challenge. This thesis proposes a distributed-processing communication method whose goal is to reduce end-to-end latency when applications running in the network are offered as communication services. In the proposed method, application processing is distributed among multiple servers placed at locations close to the user terminals; each user terminal selects, from these servers, the one that minimizes its end-to-end latency, and the cooperating servers broadcast their processing results to one another.
Because the communication latency between each user terminal and the servers differs, the order in which events arrive at a server over the network can differ from the order in which they actually occurred, so a mechanism for correcting the event-processing order is required. To reproduce event ordering, each server computes in advance a virtual time, delayed from the current time to the point at which ordering becomes reproducible, and replays events on that virtual clock. Each server measures the latency to every user terminal beforehand and waits according to each terminal's latency, thereby reproducing the event-occurrence order in virtual time. Determining, from the servers in the network, the one that minimizes end-to-end latency is cast as a server-selection problem that minimizes the user-terminal correction time, i.e., the difference between current time and virtual time. An analysis of the computational complexity shows that this problem is NP-hard. The problem is then formulated as a linear program whose solution yields both the virtual time that minimizes end-to-end latency and the server each user terminal should select.
Performance evaluation examines end-to-end latency for different inter-server network topologies and server placements. Comparing link topologies with identical server placements shows that topologies whose inter-server links are close to shortest-path distances, such as full mesh and ring, improve latency more, and that placing servers closer to users likewise yields greater improvement. For the server-selection problem itself, for 200 user terminals uniformly distributed in a given area, solving the formulated optimization problem is shown to select the latency-minimizing servers. To confirm the improvement under conditions close to a real topology, a typical model of a Japanese backbone network is used to compare distributed processing on servers spread across the country with centralized processing on a single server: the improvement is about 25% versus centralized processing on a server in Tokyo, and about 2% versus the best centralized case, a server in Wakayama.
As a first extension, the method is applied to applications with a maximum tolerable communication latency. A latency tolerance is introduced and the optimization problem is extended into a tolerance-aware server-selection problem with two objectives: the first minimizes the number of user terminals that cannot use the application because their latency exceeds the tolerance, and the second minimizes the user-terminal correction time. Evaluation while varying the tolerance compares centralized and distributed processing on both metrics, and shows that the extended method serves more user terminals than the centralized approach while keeping the correction time short: it offers both wider availability and better latency characteristics.
As a second extension, latency variation under network congestion is considered. A latency-variation ratio is introduced into the terminal-to-server latency of the extended optimization problem, and the maximum latency under variation is treated as the terminal-to-server latency. The sum of this maximum latency over all users is added as a third objective, yielding a server-selection problem that also minimizes latency variation. The formulated problem makes each terminal select the lowest-latency server while minimizing end-to-end latency; when latency is proportional to transmission distance, solving it can also serve as a network-design method that selects shorter inter-server links in networks where a terminal can connect to several servers. Evaluation uses the Kanto-area nodes of the backbone model above as server sites, with 200 user terminals uniformly distributed in the Kanto area. Comparing designs with and without the second extension shows that, when congestion increases the latency between terminals and a particular server, the extended design leaves fewer terminals exceeding the tolerance, demonstrating congestion-aware server selection.
As a third extension, a sequential-join method is introduced for applications in which user terminals join over time. The parameters indicating link and server usage, previously decision variables, are treated as fixed values for terminals already in service, extending the problem with the constraint that servers already selected do not change. To shorten the wait when a new user joins, computation is restricted to the new terminals, reducing the computational load. Evaluation of latency and computation shows that although the correction time under sequential join depends on the order in which terminals join, every join pattern stays below the centralized case, confirming the method's effectiveness for sequential-join applications. Because terminals already in service are constrained to keep their servers, the computation needed for server selection drops to less than 1/100 of the simultaneous-join case, and introducing the shortened join procedure reduces computation time by a further 70% or more.
These results show that the proposed distributed-processing communication method can maximize the number of served user terminals for applications with a latency tolerance and can deliver low-latency end-to-end communication to a broad range of users, and that the extensions for network congestion and sequential join make it applicable to diverse applications and communication environments. As virtualization advances and varied applications are deployed inside networks, this work makes it possible to realize communication environments with excellent latency characteristics, and network services that make applications easier to use are expected to follow.
The University of Electro-Communications, 201
A scalable architecture for ordered parallelism
We present Swarm, a novel architecture that exploits ordered irregular parallelism, which is abundant but hard to mine with current software and hardware techniques. In this architecture, programs consist of short tasks with programmer-specified timestamps. Swarm executes tasks speculatively and out of order, and efficiently speculates thousands of tasks ahead of the earliest active task to uncover ordered parallelism. Swarm builds on prior TLS and HTM schemes, and contributes several new techniques that allow it to scale to large core counts and speculation windows, including a new execution model, speculation-aware hardware task management, selective aborts, and scalable ordered commits.
We evaluate Swarm on graph analytics, simulation, and database benchmarks. At 64 cores, Swarm achieves 51–122× speedups over a single-core system, and outperforms software-only parallel algorithms by 3–18×. National Science Foundation (U.S.) (Award CAREER-145299
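Swarm's execution model — short tasks with programmer-specified timestamps, executed as if in timestamp order, where a task may create new tasks with later timestamps — has a simple sequential reference semantics. The sketch below shows that semantics with Dijkstra's algorithm as the ordered workload; the workload choice and function names are ours, and Swarm itself runs such tasks speculatively in hardware rather than sequentially.

```python
import heapq, itertools

def run_ordered(initial):
    """Execute (timestamp, fn, args) tasks in timestamp order; a running task
    may enqueue children with equal-or-later timestamps."""
    counter = itertools.count()          # tie-breaker for equal timestamps
    heap = [(ts, next(counter), fn, args) for ts, fn, args in initial]
    heapq.heapify(heap)

    def enqueue(ts, fn, *args):
        heapq.heappush(heap, (ts, next(counter), fn, args))

    while heap:
        ts, _, fn, args = heapq.heappop(heap)
        fn(enqueue, ts, *args)

# Dijkstra cast as ordered tasks: a task's timestamp is a tentative distance.
graph = {"a": [("b", 2), ("c", 5)], "b": [("c", 1)], "c": []}
dist = {}

def visit(enqueue, ts, node):
    if node in dist:            # node already settled at a smaller timestamp
        return
    dist[node] = ts
    for nbr, weight in graph[node]:
        enqueue(ts + weight, visit, nbr)   # child task, later timestamp

run_ordered([(0, visit, ("a",))])
```

The hardware's job is to preserve exactly this timestamp-ordered outcome while actually running thousands of such tasks speculatively and out of order.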