53 research outputs found

Demand-driven, concurrent discrete event simulation

Author: Smart Colin
Publication venue: The University of Edinburgh
Publication date: 01/01/2001

Edinburgh Research Archive

Area virtual time

Author: Schneiders Johannes
Publication venue: The University of Edinburgh
Publication date: 01/01/2004

Edinburgh Research Archive

RITSim: distributed systemC simulation

Author: Cox David Richard
Publication venue: RIT Scholar Works
Publication date: 07/09/2005

Parallel or distributed simulation is becoming more than a novel way to speed up design evaluation; it is becoming necessary for simulating modern processors in a reasonable timeframe. As architectural features become faster, smaller, and more complex, designers are interested in obtaining detailed and accurate performance and power estimations. Uniprocessor simulators may not be able to meet such demands. The RITSim project uses SystemC to model a processor microarchitecture and memory subsystem in great detail. SystemC is a C++ library built on a discrete-event simulation kernel. Many projects have successfully implemented parallel discrete-event simulation (PDES) frameworks to distribute simulation among several hosts. The field promises significant simulation speedup, possibly leading to faster turnaround time in design space exploration and commercial production. However, parallel implementation of such simulators is not an easy task. It requires modification of the simulation kernel for effective partitioning and synchronization. This thesis explores PDES techniques and presents a distributed version of the SystemC simulation environment. With minimal user interaction, SystemC models can be executed on a cluster of workstations using a message-passing library such as the Message Passing Interface (MPI). The implementation is designed for transparency; distribution and synchronization happen with little intervention by the model author. Modification of SystemC is fashioned to promote maintainability with future releases. Furthermore, only freely available libraries are used for maximum flexibility and portability.

RIT Scholar Works

Submicron Systems Architecture Project: Semiannual Technical Report

Author: Seitz Charles L.
Publication venue: California Institute of Technology Library
Publication date: 01/01/1989

No abstract available

Caltech Authors

Submicron Systems Architecture Project : Semiannual Technical Report

Author: Martin Alain J.
Seitz Charles L.
Van de Snepscheut Jan L. A.
Publication venue: California Institute of Technology Library
Publication date: 01/01/1992

The Mosaic C is an experimental fine-grain multicomputer based on single-chip nodes. The Mosaic C chip includes 64KB of fast dynamic RAM, processor, packet interface, ROM for bootstrap and self-test, and a two-dimensional self-timed router. The chip architecture provides low-overhead and low-latency handling of message packets, and high memory and network bandwidth. Sixty-four Mosaic chips are packaged by tape-automated bonding (TAB) in an 8 x 8 array on circuit boards that can, in turn, be arrayed in two dimensions to build arbitrarily large machines. These 8 x 8 boards are now in prototype production under a subcontract with Hewlett-Packard. We are planning to construct a 16K-node Mosaic C system from 256 of these boards. The suite of Mosaic C hardware also includes host-interface boards and high-speed communication cables. The hardware developments and activities of the past eight months are described in section 2.1. The programming system that we are developing for the Mosaic C is based on the same message-passing, reactive-process, computational model that we have used with earlier multicomputers, but the model is implemented for the Mosaic in a way that supports fine-grain concurrency. A process executes only in response to receiving a message, and may in execution send messages, create new processes, and modify its persistent variables before it either exits or becomes dormant in preparation for receiving another message. These computations are expressed in an object-oriented programming notation, a derivative of C++ called C+-. The computational model and the C+- programming notation are described in section 2.2. The Mosaic C runtime system, which is written in C+-, provides automatic process placement and highly distributed management of system resources. The Mosaic C runtime system is described in section 2.3.

Caltech Authors

A distributed-processing communication scheme for multi-site real-time applications

Author: Akio Kawabata
川端明生
Publication venue
Publication date: 07/10/2016

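Advances in virtualization technology allow a wide range of applications to run on clouds inside the network, but for real-time applications that communicate among multiple sites over a wide-area network, establishing communication technology that achieves low-latency end-to-end communication remains a challenge. This research proposes a distributed-processing communication scheme that aims to reduce end-to-end communication delay when applications running on the network are offered as a communication service. The proposed scheme processes the application in a distributed manner using multiple servers deployed in the network at locations close to user terminals. Each user terminal selects, from among these servers, the one that minimizes end-to-end delay, and the servers performing the distributed processing broadcast their processing results to one another. Because the communication delay to a server differs from one user terminal to another, the order in which events arrive at a server over the network may differ from the order in which they actually occurred, so a mechanism for correcting the event processing order is required. To reproduce event ordering, each server computes in advance a virtual time, delayed from the current time far enough that event ordering can be reproduced, and reproduces the event order in virtual time. Each server measures the communication delay to each user terminal beforehand and waits according to each terminal's delay, thereby reproducing the order of event occurrence in virtual time. The server selection problem, which determines the servers that minimize end-to-end communication delay, is posed as minimizing the user-terminal correction time, defined as the difference between the current time and the virtual time. The computational complexity of the server selection problem is evaluated, and the problem is shown to be NP-hard. The server selection problem is formulated as a linear programming problem, whose solution determines the virtual time that minimizes end-to-end communication delay and the server each user terminal selects. To evaluate the performance of the proposed scheme, end-to-end communication delay is evaluated as a function of the inter-server network topology and of server placement in the network. The topology evaluation compares the improvement across different link topologies with the same server placement, and shows that topologies such as full mesh and ring, in which servers are connected by links close to the shortest distance, improve delay characteristics more. As for server placement, deploying servers closer to users is shown to improve delay characteristics more. For the server selection problem, solving the formulated optimization problem for 200 user terminals uniformly distributed within a given area is shown to select the servers that minimize delay. To confirm the improvement under conditions close to a real network topology, a typical model of the Japanese backbone network is used to compare distributed processing on multiple servers spread across the country with centralized processing on a single server. The improvement is about 25% relative to centralized processing on a server in Tokyo, and about 2% relative to centralized processing on a server in Wakayama, which has the best delay characteristics among the centralized configurations. As a first extension, the scheme is applied to applications with a maximum tolerable communication delay. For this case, a delay tolerance is introduced. The optimization problem is extended into a server selection problem that accounts for the delay tolerance, minimizing as its first objective the number of user terminals that cannot use the application because they exceed the delay tolerance, and as its second objective the user-terminal correction time. To evaluate this first extension, centralized and distributed processing are compared in terms of the number of user terminals exceeding the delay tolerance and the user-terminal correction time as the delay tolerance is varied. These evaluations show that, with the first extension, the proposed scheme leaves fewer user terminals excluded for exceeding the delay tolerance than centralized processing and has a shorter user-terminal correction time, so it serves more users and offers better delay characteristics. As a second extension, delay variation under network congestion is considered. For this case, a delay variation rate is introduced into the communication delay between user terminals and servers in the optimization problem of the first extension, and the maximum delay under variation is treated as the delay between a user terminal and a server. In addition, the sum of this maximum delay over all users is introduced as a third objective function, formulating a server selection problem that minimizes delay variation. The formulated optimization problem first minimizes end-to-end communication delay and then has each user terminal select the server with the smallest delay from among the servers. Where delay is proportional to transmission distance, solving the optimization problem can also serve as a network design method that selects inter-server connections with shorter transmission distances in networks where user terminals can connect to multiple servers. To evaluate the second extension, the Kanto-area nodes of the Japanese backbone network model mentioned above are taken as the server sites, with 200 user terminals uniformly distributed within the Kanto area. To evaluate delay under network congestion, the servers selected by user terminals are compared between the design method with the second extension and the one without it. With the second extension, fewer user terminals exceed the delay tolerance when congestion increases the delay between user terminals and a particular server, showing that server selection can take network congestion into account. As a third extension, a sequential-join user participation method is introduced for applications in which user terminals are added over time. In the sequential-join method, the parameters indicating whether each link and each server is used, previously treated as decision variables of the optimization problem, are fixed to their already-decided values for user terminals currently in service, extending the server selection problem with the constraint that their selected servers do not change. In addition, as a participation method that shortens the waiting time when a new user joins, the computation is restricted to the new user terminals, reducing the computational cost. The third extension is evaluated in terms of delay characteristics and computational cost. With sequential joining, the user-terminal correction time varies with the order in which terminals join even when their locations are the same, but in every sequential-join pattern it is lower than with centralized processing. These results confirm the effectiveness of the distributed-processing communication scheme for applications in which user terminals join sequentially as well. To evaluate the waiting time when a user terminal starts using the application, the reduction in computation time is evaluated for the participation method extended to shorten user join time. Because the sequential-join method constrains terminals already in service to keep their current servers, the computation needed for server selection is reduced and processing time falls to 1/100 or less of that of simultaneous joining, showing that the method shortens user-terminal waiting time. Introducing the participation method with shortened join time is shown to reduce computation time by a further 70% or more. From these evaluation results, the proposed distributed-processing communication scheme can maximize the number of user terminals served for applications with a delay tolerance, and can provide low-latency end-to-end communication to a broad range of users. The scheme is also extended to handle network congestion and sequential user joining, making it applicable to a variety of applications and communication environments. As virtualization technology advances and a variety of applications are deployed inside the network, this research makes it possible to realize a communication environment with good delay characteristics, and network services that make applications easier to use are expected to follow. The University of Electro-Communications 201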

Creative Repository of Electro-Communications

An efficient graph representation for arithmetic circuit verification

Author: R.E. Bryant
Yirng-An Chen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date

Crossref

A scalable architecture for ordered parallelism

Author: Emer Joel
Jeffrey Mark Christopher
Sanchez Daniel
Subramanian Suvinay
Yan Cong
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/12/2015

We present Swarm, a novel architecture that exploits ordered irregular parallelism, which is abundant but hard to mine with current software and hardware techniques. In this architecture, programs consist of short tasks with programmer-specified timestamps. Swarm executes tasks speculatively and out of order, and efficiently speculates thousands of tasks ahead of the earliest active task to uncover ordered parallelism. Swarm builds on prior TLS and HTM schemes, and contributes several new techniques that allow it to scale to large core counts and speculation windows, including a new execution model, speculation-aware hardware task management, selective aborts, and scalable ordered commits. We evaluate Swarm on graph analytics, simulation, and database benchmarks. At 64 cores, Swarm achieves 51–122× speedups over a single-core system, and outperforms software-only parallel algorithms by 3–18×. National Science Foundation (U.S.) (Award CAREER-145299

DSpace@MIT

Crossref