2,381 research outputs found
The "MIND" Scalable PIM Architecture
MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a
Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on
each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real time execution, and active power management. This paper describes the major elements and operational methods of the MIND
architecture
A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems
Recent technological advances have greatly improved the performance and
features of embedded systems. With the number of just mobile devices now
reaching nearly equal to the population of earth, embedded systems have truly
become ubiquitous. These trends, however, have also made the task of managing
their power consumption extremely challenging. In recent years, several
techniques have been proposed to address this issue. In this paper, we survey
the techniques for managing power consumption of embedded systems. We discuss
the need of power management and provide a classification of the techniques on
several important parameters to highlight their similarities and differences.
This paper is intended to help the researchers and application-developers in
gaining insights into the working of power management techniques and designing
even more efficient high-performance embedded systems of tomorrow
Microprocessor control and networking for the amps breadboard
Future space missions will require more sophisticated power systems, implying higher costs and more extensive crew and ground support involvement. To decrease this human involvement, as well as to protect and most efficiently utilize this important resource, NASA has undertaken major efforts to promote progress in the design and development of autonomously managed power systems. Two areas being actively pursued are autonomous power system (APS) breadboards and knowledge-based expert system (KBES) applications. The former are viewed as a requirement for the timely development of the latter. Not only will they serve as final testbeds for the various KBES applications, but will play a major role in the knowledge engineering phase of their development. The current power system breadboard designs are of a distributed microprocessor nature. The distributed nature, plus the need to connect various external computer capabilities (i.e., conventional host computers and symbolic processors), places major emphasis on effective networking. The communications and networking technologies for the first power system breadboard/test facility are described
Power Management Techniques for Data Centers: A Survey
With growing use of internet and exponential growth in amount of data to be
stored and processed (known as 'big data'), the size of data centers has
greatly increased. This, however, has resulted in significant increase in the
power consumption of the data centers. For this reason, managing power
consumption of data centers has become essential. In this paper, we highlight
the need of achieving energy efficiency in data centers and survey several
recent architectural techniques designed for power management of data centers.
We also present a classification of these techniques based on their
characteristics. This paper aims to provide insights into the techniques for
improving energy efficiency of data centers and encourage the designers to
invent novel solutions for managing the large power dissipation of data
centers.Comment: Keywords: Data Centers, Power Management, Low-power Design, Energy
Efficiency, Green Computing, DVFS, Server Consolidatio
A comprehensive approach to DRAM power management
This paper describes a comprehensive approach for using the memory controller to improve DRAM energy efficiency and manage DRAM power. We make three contributions: (1) we describe a simple power-down policy for exploiting low power modes of modern DRAMs; (2) we show how the idea of adaptive history-based memory schedulers can be naturally extended to manage power and energy; and (3) for situations in which additional DRAM power reduction is needed, we present a throttling approach that arbitrarily reduces DRAM activity by delaying the issuance of memory commands. Using detailed microarchitectural simulators of the IBM Power5+ and a DDR2-533 SDRAM, we show that our first two techniques combine to increase DRAM energy efficiency by an average of 18.2%, 21.7%, 46.1%, and 37.1 % for the Stream, NAS, SPEC2006fp, and commercial benchmarks, respectively. We also show that our throttling approach provides performance that is within 4.4 % of an idealized oracular approach.
Adaptive runtime-assisted block prefetching on chip-multiprocessors
Memory stalls are a significant source of performance degradation in modern processors. Data prefetching is a widely adopted and well studied technique used to alleviate this problem. Prefetching can be performed by the hardware, or be initiated and controlled by software. Among software controlled prefetching we find a wide variety of schemes, including runtime-directed prefetching and more specifically runtime-directed block prefetching. This paper proposes a hybrid prefetching mechanism that integrates a software driven block prefetcher with existing hardware prefetching techniques. Our runtime-assisted software prefetcher brings large blocks of data on-chip with the support of a low cost hardware engine, and synergizes with existing hardware prefetchers that manage locality at a finer granularity. The runtime system that drives the prefetch engine dynamically selects which cache to prefetch to. Our evaluation on a set of scientific benchmarks obtains a maximum speed up of 32 and 10 % on average compared to a baseline with hardware prefetching only. As a result, we also achieve a reduction of up to 18 and 3 % on average in energy-to-solution.Peer ReviewedPostprint (author's final draft
- …