182 research outputs found

Performance mapping of a class of fully decoupled architecture

Author: Crawford Alan W. R.
Publication venue: The University of Edinburgh
Publication date: 01/01/1999

Edinburgh Research Archive

Limits of a decoupled out-of-order superscalar architecture

Author: Jones Graham P.
Publication venue: The University of Edinburgh
Publication date: 01/01/1999

Edinburgh Research Archive

Prodigy: Improving the Memory Latency of Data-Indirect Irregular Workloads Using Hardware-Software Co-Design

Author: Ahmadi Agreen
Austin Todd
Behroozi Armand
Dreslinski Ronald
Kaszyk Kuba
Li Lu
Mahlke Scott
May Kyle
Morton John Magnus
Mudge Trevor
Nguyen Brandon
O'Boyle Michael F P
Sun Jiawen
Talati Nishil
Vasiladiotis Christos
Verma Tarunesh
Yang Yichen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 22/04/2021

Edinburgh Research Explorer

ADAM : a decentralized parallel computer architecture featuring fast thread and data migration and a uniform hardware abstraction

Author: Huang Andrew S. (Andrew Shane)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2002

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 247-256). The furious pace of Moore's Law is driving computer architecture into a realm where the speed of light is the dominant factor in system latencies. The number of clock cycles needed to span a chip is increasing, while the number of bits that can be accessed within a clock cycle is decreasing. Hence, it is becoming more difficult to hide latency. One alternative solution is to reduce latency by migrating threads and data, but the overhead of existing implementations has so far made migration an unserviceable solution. I present an architecture, implementation, and mechanisms that reduce the overhead of migration to the point where migration is a viable supplement to other latency hiding mechanisms, such as multithreading. The architecture is abstract, and presents programmers with a simple, uniform fine-grained multithreaded parallel programming model with implicit memory management. In other words, the spatial nature and implementation details (such as the number of processors) of a parallel machine are entirely hidden from the programmer. Compiler writers are encouraged to devise programming languages for the machine that guide a programmer to express their ideas in terms of objects, since objects exhibit an inherent physical locality of data and code. The machine implementation can then leverage this locality to automatically distribute data and threads across the physical machine by using a set of high performance migration mechanisms. An implementation of this architecture could migrate a null thread in 66 cycles, an improvement of more than a factor of 1000 over previous work. Performance also scales well; the time required to move a typical thread is only 4 to 5 times that of a null thread. Data migration performance is similar, and scales linearly with data block size. Since the performance of the migration mechanism is on par with that of an L2 cache, the implementation simulated in my work has no data caches and relies instead on multithreading and the migration mechanism to hide and reduce access latencies. By Andrew "bunnie" Huang. Ph.D.

DSpace@MIT

ADAM: A Decentralized Parallel Computer Architecture Featuring Fast Thread and Data Migration and a Uniform Hardware Abstraction

Author: Huang Andrew "bunnie"
Publication date: 01/06/2002

The furious pace of Moore's Law is driving computer architecture into a realm where the speed of light is the dominant factor in system latencies. The number of clock cycles needed to span a chip is increasing, while the number of bits that can be accessed within a clock cycle is decreasing. Hence, it is becoming more difficult to hide latency. One alternative solution is to reduce latency by migrating threads and data, but the overhead of existing implementations has so far made migration an unserviceable solution. I present an architecture, implementation, and mechanisms that reduce the overhead of migration to the point where migration is a viable supplement to other latency hiding mechanisms, such as multithreading. The architecture is abstract, and presents programmers with a simple, uniform fine-grained multithreaded parallel programming model with implicit memory management. In other words, the spatial nature and implementation details (such as the number of processors) of a parallel machine are entirely hidden from the programmer. Compiler writers are encouraged to devise programming languages for the machine that guide a programmer to express their ideas in terms of objects, since objects exhibit an inherent physical locality of data and code. The machine implementation can then leverage this locality to automatically distribute data and threads across the physical machine by using a set of high performance migration mechanisms. An implementation of this architecture could migrate a null thread in 66 cycles, an improvement of more than a factor of 1000 over previous work. Performance also scales well; the time required to move a typical thread is only 4 to 5 times that of a null thread. Data migration performance is similar, and scales linearly with data block size. Since the performance of the migration mechanism is on par with that of an L2 cache, the implementation simulated in my work has no data caches and relies instead on multithreading and the migration mechanism to hide and reduce access latencies.

DSpace@MIT

Space station automation and robotics study. Operator-systems interface

This is the final report of a Space Station Automation and Robotics Planning Study, which was a joint project of the Boeing Aerospace Company, Boeing Commercial Airplane Company, and Boeing Computer Services Company. The study is in support of the Advanced Technology Advisory Committee established by NASA in accordance with a mandate by the U.S. Congress. Boeing support complements that provided to the NASA Contractor study team by four aerospace contractors, the Stanford Research Institute (SRI), and the California Space Institute. This study identifies automation and robotics (A&R) technologies that can be advanced by requirements levied by the Space Station Program. The methodology used in the study is to establish functional requirements for the operator system interface (OSI), establish the technologies needed to meet these requirements, and forecast the availability of these technologies. The OSI would perform path planning, tracking and control, object recognition, fault detection and correction, and plan modifications in connection with extravehicular (EV) robot operations.

NASA Technical Reports Server

The Vector-Thread Architecture

Author: Brian Pharris
Christopher Batten
Jared Casper
Kitagawa K.
Krste Asanovic
Mark Hampton
Ronny Krashinsky
Steve Gerding
Zhang M.
Publication venue: Association for Computing Machinery (ACM)

Crossref

Aerospace Applications of Microprocessors

An assessment of the state of microprocessor applications is presented. Current and future requirements, and the associated technological advances that allow effective exploitation in aerospace applications, are discussed.

NASA Technical Reports Server

A study of the university role in engineering research for NASA particularized to the Stanford University case Final report

University role in engineering research for NASA

NASA Technical Reports Server