31,303 research outputs found
The "MIND" Scalable PIM Architecture
MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a
Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on
each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real time execution, and active power management. This paper describes the major elements and operational methods of the MIND
architecture
Recommended from our members
Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures
The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Several communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context switching overhead, process architecture, and message buffering). Due to a several orders of magnitude increase in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors.This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks.A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems including System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu
GPU Accelerated Particle Visualization with Splotch
Splotch is a rendering algorithm for exploration and visual discovery in
particle-based datasets coming from astronomical observations or numerical
simulations. The strengths of the approach are production of high quality
imagery and support for very large-scale datasets through an effective mix of
the OpenMP and MPI parallel programming paradigms. This article reports our
experiences in re-designing Splotch for exploiting emerging HPC architectures
nowadays increasingly populated with GPUs. A performance model is introduced
for data transfers, computations and memory access, to guide our re-factoring
of Splotch. A number of parallelization issues are discussed, in particular
relating to race conditions and workload balancing, towards achieving optimal
performances. Our implementation was accomplished by using the CUDA programming
paradigm. Our strategy is founded on novel schemes achieving optimized data
organisation and classification of particles. We deploy a reference simulation
to present performance results on acceleration gains and scalability. We
finally outline our vision for future work developments including possibilities
for further optimisations and exploitation of emerging technologies.Comment: 25 pages, 9 figures. Astronomy and Computing (2014
- …