8,878 research outputs found
ToPoliNano: Nanoarchitectures Design Made Real
Many facts about emerging nanotechnologies are yet to be assessed. There are still major concerns, for instance, about maximum achievable device density, or about which architecture is best fit for a specific application. Growing complexity requires taking into account many aspects of technology, application and architecture at the same time. Researchers face problems that are not new per se, but are now subject to very different constraints, that need to be captured by design tools. Among the emerging nanotechnologies, two-dimensional nanowire based arrays represent promising nanostructures, especially for massively parallel computing architectures. Few attempts have been done, aimed at giving the possibility to explore architectural solutions, deriving information from extensive and reliable nanoarray characterization. Moreover, in the nanotechnology arena there is still not a clear winner, so it is important to be able to target different technologies, not to miss the next big thing. We present a tool, ToPoliNano, that enables such a multi-technological characterization in terms of logic behavior, power and timing performance, area and layout constraints, on the basis of specific technological and topological descriptions. This tool can aid the design process, beside providing a comprehensive simulation framework for DC and timing simulations, and detailed power analysis. Design and simulation results will be shown for nanoarray-based circuits. ToPoliNano is the first real design tool that tackles the top down design of a circuit based on emerging technologie
Report from the MPP Working Group to the NASA Associate Administrator for Space Science and Applications
NASA's Office of Space Science and Applications (OSSA) gave a select group of scientists the opportunity to test and implement their computational algorithms on the Massively Parallel Processor (MPP) located at Goddard Space Flight Center, beginning in late 1985. One year later, the Working Group presented its report, which addressed the following: algorithms, programming languages, architecture, programming environments, the way theory relates, and performance measured. The findings point to a number of demonstrated computational techniques for which the MPP architecture is ideally suited. For example, besides executing much faster on the MPP than on conventional computers, systolic VLSI simulation (where distances are short), lattice simulation, neural network simulation, and image problems were found to be easier to program on the MPP's architecture than on a CYBER 205 or even a VAX. The report also makes technical recommendations covering all aspects of MPP use, and recommendations concerning the future of the MPP and machines based on similar architectures, expansion of the Working Group, and study of the role of future parallel processors for space station, EOS, and the Great Observatories era
Optimising Simulation Data Structures for the Xeon Phi
In this paper, we propose a lock-free architecture
to accelerate logic gate circuit simulation using SIMD multi-core
machines. We evaluate its performance on different test circuits
simulated on the Intel Xeon Phi and 2 other machines. Comparisons
are presented of this software/hardware combination with
reported performances of GPU and other multi-core simulation
platforms. Comparisons are also given between the lock free
architecture and a leading commercial simulator running on the
same Intel hardware
Memory and information processing in neuromorphic systems
A striking difference between brain-inspired neuromorphic processors and
current von Neumann processors architectures is the way in which memory and
processing is organized. As Information and Communication Technologies continue
to address the need for increased computational power through the increase of
cores within a digital processor, neuromorphic engineers and scientists can
complement this need by building processor architectures where memory is
distributed with the processing. In this paper we present a survey of
brain-inspired processor architectures that support models of cortical networks
and deep neural networks. These architectures range from serial clocked
implementations of multi-neuron systems to massively parallel asynchronous ones
and from purely digital systems to mixed analog/digital systems which implement
more biological-like models of neurons and synapses together with a suite of
adaptation and learning mechanisms analogous to the ones found in biological
nervous systems. We describe the advantages of the different approaches being
pursued and present the challenges that need to be addressed for building
artificial neural processing systems that can display the richness of behaviors
seen in biological systems.Comment: Submitted to Proceedings of IEEE, review of recently proposed
neuromorphic computing platforms and system
Recommended from our members
LPS Algorithms: A Detailed Examination
LPS is a Logic Programming System currently under development and specifically targeted for implementation on massively parallel architectures. We present a detailed explanation of algorithms under development for parallel execution of LPS programs. The explanation is significantly more detailed than those published previously. An abstract proof procedure is developed which encompasses these algorithms and several variants, as well as the standard sequential Prolog algorithm. This abstract procedure provides a conceptual basis for our discussion and. in a companion paper, for a critical analysis of various execution strategies. The algorithms have been successfully implemented and demonstrated in simulation on a number of small programs. Work is currently underway to transfer this implementation to a working prototype machine based on the DADO parallel architecture. Due to the depth of our treatment we assume that the reader has read previously published literature in the area
Selective Decoding in Associative Memories Based on Sparse-Clustered Networks
Associative memories are structures that can retrieve previously stored
information given a partial input pattern instead of an explicit address as in
indexed memories. A few hardware approaches have recently been introduced for a
new family of associative memories based on Sparse-Clustered Networks (SCN)
that show attractive features. These architectures are suitable for
implementations with low retrieval latency, but are limited to small networks
that store a few hundred data entries. In this paper, a new hardware
architecture of SCNs is proposed that features a new data-storage technique as
well as a method we refer to as Selective Decoding (SD-SCN). The SD-SCN has
been implemented using a similar FPGA used in the previous efforts and achieves
two orders of magnitude higher capacity, with no error-performance penalty but
with the cost of few extra clock cycles per data access.Comment: 4 pages, Accepted in IEEE Global SIP 2013 conferenc
JANUS: an FPGA-based System for High Performance Scientific Computing
This paper describes JANUS, a modular massively parallel and reconfigurable
FPGA-based computing system. Each JANUS module has a computational core and a
host. The computational core is a 4x4 array of FPGA-based processing elements
with nearest-neighbor data links. Processors are also directly connected to an
I/O node attached to the JANUS host, a conventional PC. JANUS is tailored for,
but not limited to, the requirements of a class of hard scientific applications
characterized by regular code structure, unconventional data manipulation
instructions and not too large data-base size. We discuss the architecture of
this configurable machine, and focus on its use on Monte Carlo simulations of
statistical mechanics. On this class of application JANUS achieves impressive
performances: in some cases one JANUS processing element outperfoms high-end
PCs by a factor ~ 1000. We also discuss the role of JANUS on other classes of
scientific applications.Comment: 11 pages, 6 figures. Improved version, largely rewritten, submitted
to Computing in Science & Engineerin
Parallel VLSI architecture emulation and the organization of APSA/MPP
The Applicative Programming System Architecture (APSA) combines an applicative language interpreter with a novel parallel computer architecture that is well suited for Very Large Scale Integration (VLSI) implementation. The Massively Parallel Processor (MPP) can simulate VLSI circuits by allocating one processing element in its square array to an area on a square VLSI chip. As long as there are not too many long data paths, the MPP can simulate a VLSI clock cycle very rapidly. The APSA circuit contains a binary tree with a few long paths and many short ones. A skewed H-tree layout allows every processing element to simulate a leaf cell and up to four tree nodes, with no loss in parallelism. Emulation of a key APSA algorithm on the MPP resulted in performance 16,000 times faster than a Vax. This speed will make it possible for the APSA language interpreter to run fast enough to support research in parallel list processing algorithms
- …