The "MIND" Scalable PIM Architecture

Author: Brodowicz Maciej
Sterling Thomas
Publication date: 01/01/2005

MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real-time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture.

Caltech Authors

Macroservers: An Execution Model for DRAM Processor-In-Memory Arrays

Author: Sterling Thomas L.
Zima Hans P.
Publication venue: California Institute of Technology Library
Publication date: 01/01/2000

The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Newer developments provide powerful parallel processing capabilities on the chip, exploiting the facility to load wide words in single memory accesses and supporting complex address manipulations in the memory. Furthermore, large arrays of PIMs can be arranged into a massively parallel architecture. In this report, we describe an object-based programming model based on the notion of a macroserver. Macroservers encapsulate a set of variables and methods; threads, spawned by the activation of methods, operate asynchronously on the variables' state space. Data distributions provide a mechanism for mapping large data structures across the memory region of a macroserver, while work distributions allow explicit control of bindings between threads and data. Both data and work distributions are first-class objects of the model, supporting the dynamic management of data and threads in memory. This offers the flexibility required for fully exploiting the processing power and memory bandwidth of a PIM array, in particular for irregular and adaptive applications. Thread synchronization is based on atomic methods, condition variables, and futures. A special type of lightweight macroserver allows the formulation of flexible scheduling strategies for the access to resources, using a monitor-like mechanism.

CiteSeerX

Caltech Authors

Continuum computer architecture for exaflops computation

Author: Sterling Thomas
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/03/2001

The ultimate computers in our long-term future will deliver exaflops-scale performance (or greater) and will look very different from today’s microprocessors and massively parallel computers. Ironically, however, their alien structures and operational behavior can be inferred from the same technology trends driving development of today’s conventional computing systems.

Caltech Authors

Dynamic adaptive parallel architecture integrates advanced technologies for petaflops-scale computing

Author: Sterling Thomas
Publication venue: Society of Photo-optical Instrumentation Engineers (SPIE)
Publication date: 17/11/2000

Teraflops-scale computing systems are becoming available to an increasingly broad range of users as the performance of the constituent processing elements increases and their relative cost (e.g. per Mflops) decreases. To the original DOE ASCI Red machine have been added the ASCI Blue systems and additional 1 Teraflops commercial systems at key national centers. Clusters of low cost PCs employing COTS network technologies (e.g. Beowulf-class systems) will make peak Teraflops performance available for less than $2M in the near future for certain classes of well-behaved problems. Future larger systems include the Japanese Earth Simulator with a peak performance of 40 Teraflops and three larger ASCI systems anticipated to provide peak performance of 10, 30, and 100 Teraflops culminating in 2005. These systems use existing or near-term conventional technologies and architectures with some specialized integration logic and networking. While the peak performance goals can be satisfied through this strategy over the next decade, two major challenges confront the high performance computing community: (1) how to aggressively accelerate performance to the operational regime beyond a Petaflops, and (2) how to achieve high efficiency for a wide range of applications. The Hybrid Technology Multithreaded (HTMT) computer is under development by an interdisciplinary team of investigators to address both problems through an innovative combination of advanced technologies and dynamic adaptive architecture. This paper describes the strategy embodied by the HTMT architecture and discusses the key factors that may enable it to achieve two to three orders of magnitude performance with respect to today's largest systems at a cost and power consumption of only a factor of two to three times those same present day systems.

Caltech Authors

The Branch on Which the Blossom Hangs

Author: Coffey Thomas Sterling
Publication venue: ScholarWorks@UARK
Publication date: 01/07/2020

The Branch on Which the Blossom Hangs is a body of paintings which address the relationship between landscape or physical presence and the primary experiences of emotion and perception. Through this examination of phenomenology and the malleability of the perceptual apparatus, the paintings express my feeling of dislocation caused by a cycle between depression, dissociation, and mental well-being. They question how an individual relates to their environment. The paintings seek to elicit the allusive and embodied qualities of poetry, framing and evoking a broader experience without defining it. By using the recognizable visual language of landscape, abstracted to the point of familiarity if not identifiability, and immersive, human scale, the works envelop the viewer in a space in which they cannot comfortably situate themselves. This elusive quality is vital – to accurately express these fluctuating states, they must balance a sense of both distance and belonging.

ScholarWorks@UARK

UARK (University of Arkansas)

Powdery mildew of poinsettia

Author: Nelson Scot
Thomas Sterling
Publication venue: University of Hawaii Press (Project Muse)
Publication date: 01/05/2012

This study discusses the symptoms of powdery mildew of poinsettia, describes the pathogen, and suggests integrated practices for effective prevention and management of the disease.

ScholarSpace at University of Hawai'i at Manoa

Custom-Enabled System Architectures for High End Computing

Author: Kogge Peter
Sterling Thomas
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2004

The US Federal Government has convened a major committee to determine future directions for government sponsored high end computing system acquisitions and enabling research. The High End Computing Revitalization Task Force was inaugurated in 2003 involving all Federal agencies for which high end computing is critical to meeting mission goals. As part of the HECRTF agenda, a multi-day community wide workshop was conducted involving experts from academia, industry, and the national laboratories and centers to provide the broadest perspective on important issues related to the HECRTF purview. Among the most critical issues in establishing future directions are the relative merits of commodity based systems such as clusters and MPPs versus custom system architecture strategies. This paper presents a perspective on the importance and value of the custom architecture approach in meeting future US requirements in supercomputing. The contents of this paper reflect the ideas of the participants of the working group chartered to explore custom enabled system architectures for high end computing. As in any such consensus presentation, while this paper captures the key ideas and tradeoffs, it does not exactly match the viewpoint of any single contributor, and there remains much room for constructive disagreement and refinement of the essential conclusions.

Caltech Authors

A survey of current software for network analysis in molecular biology

Author: Bonchev Danail
Thomas Sterling
Publication venue: VCU Scholars Compass
Publication date: 01/01/2010

Software for network motifs and modules is briefly reviewed, along with programs for network comparison. The three major software packages for network analysis, CYTOSCAPE, INGENUITY and PATHWAY STUDIO, and their associated databases, are compared in detail. A comparative test evaluated how these software packages perform the search for key terms and the creation of networks from those terms and from experimental expression data.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

VCU Scholars Compass

Analysis and Modeling of Advanced PIM Architecture Design Tradeoffs

Author: Brockman Jay
Sterling Thomas
Upchurch Ed
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/11/2004

A major trend in high performance computer architecture over the last two decades is the migration of memory in the form of high speed caches onto the microprocessor semiconductor die. Where temporal locality in the computation is high, caches prove very effective at hiding memory access latency and contention for communication resources. However, where temporal locality is absent, caches may exhibit low hit rates resulting in poor operational efficiency. Vector computing exploiting pipelined arithmetic units and memory access addresses this challenge for certain forms of data access patterns, for example involving long contiguous data sets exhibiting high spatial locality. But for many advanced applications for science, technology, and national security at least some data access patterns are not consistent with the restricted forms well handled by either caches or vector processing. An important alternative is the reverse strategy: that of migrating logic into the main memory (DRAM) and performing those operations directly on the data stored there. Processor in Memory (PIM) architecture has advanced to the point where it may fill this role and provide an important new mechanism for improving performance and efficiency of future supercomputers for a broad range of applications. One important project considering both the role of PIM in supercomputer architecture and the design of such PIM components is the Cray Cascade Project sponsored by the DARPA High Productivity Computing Program. Cascade is a Petaflops scale computer targeted for deployment at the end of the decade that merges the raw speed of an advanced custom vector architecture with the high memory bandwidth processing delivered by an innovative class of PIM architecture. The work represented here was performed under the Cascade project to explore critical design space issues that will determine the value of PIM in supercomputers and contribute to the optimization of its design. But this work also has strong relevance to hybrid systems comprising a combination of conventional microprocessors and advanced PIM based intelligent main memory.

Caltech Authors