Optimization of FPGA Based Neural Network Processor
Neural information processing is an emerging field that provides an alternative
form of computation for demanding tasks, such as pattern recognition problems,
that are usually reserved for human attention. Neural network computation is
sought after where the classification of input data is difficult to express using
equations or sets of rules.
Technological advances in integrated circuits such as Field Programmable Gate
Array (FPGA) systems have made it easier to develop and implement hardware
devices based on these neural network architectures. The motivation for hardware
implementation of neural networks is their fast processing speed and suitability
for parallel and pipelined processing.
The project revolves around the design of an optimized neural network processor.
The processor design is based on the feedforward network architecture type with
BackPropagation trained weights for the Exclusive-OR non-linear problem.
Among the highlights of the project is an improvement in neural network
architecture through reconfigurable, recursive computation of a single hidden
layer for multiple-layer applications. Improvements in processor organization
were also made, which enable the design to process in parallel with similar
processors. Other improvements include design considerations that reduce the
amount of logic required for implementation without much sacrifice of
processing speed.
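The feedforward pass the abstract describes can be sketched in software. The snippet below is a minimal illustration of a 2-2-1 feedforward network solving the XOR problem; the weight values are hand-picked assumptions for clarity (the actual processor would use weights obtained from backpropagation training, which the abstract does not list).

```python
import math

def sigmoid(z):
    """Standard logistic activation."""
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative (not trained) weights: hidden unit 0 approximates OR,
# hidden unit 1 approximates AND; the output combines them as
# OR AND (NOT AND), which is XOR.
W_hidden = [[20.0, 20.0], [20.0, 20.0]]
b_hidden = [-10.0, -30.0]
W_out = [20.0, -20.0]
b_out = -10.0

def xor_net(x1, x2):
    """One feedforward pass through the 2-2-1 network."""
    h = [sigmoid(w[0] * x1 + w[1] * x2 + b)
         for w, b in zip(W_hidden, b_hidden)]
    return sigmoid(W_out[0] * h[0] + W_out[1] * h[1] + b_out)
```

In a hardware realization, each multiply-accumulate in `xor_net` maps to FPGA logic, and reusing one physical hidden layer recursively (as the abstract proposes) trades latency for area.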
Report from the MPP Working Group to the NASA Associate Administrator for Space Science and Applications
NASA's Office of Space Science and Applications (OSSA) gave a select group of scientists the opportunity to test and implement their computational algorithms on the Massively Parallel Processor (MPP) located at Goddard Space Flight Center, beginning in late 1985. One year later, the Working Group presented its report, which addressed the following: algorithms, programming languages, architecture, programming environments, the relation to theory, and measured performance. The findings point to a number of demonstrated computational techniques for which the MPP architecture is ideally suited. For example, besides executing much faster on the MPP than on conventional computers, systolic VLSI simulation (where distances are short), lattice simulation, neural network simulation, and image problems were found to be easier to program on the MPP's architecture than on a CYBER 205 or even a VAX. The report also makes technical recommendations covering all aspects of MPP use, and recommendations concerning the future of the MPP and machines based on similar architectures, expansion of the Working Group, and study of the role of future parallel processors for space station, EOS, and the Great Observatories era.
Memory and information processing in neuromorphic systems
A striking difference between brain-inspired neuromorphic processors and
current von Neumann processor architectures is the way in which memory and
processing are organized. As Information and Communication Technologies continue
to address the need for increased computational power through the increase of
cores within a digital processor, neuromorphic engineers and scientists can
complement this need by building processor architectures where memory is
distributed with the processing. In this paper we present a survey of
brain-inspired processor architectures that support models of cortical networks
and deep neural networks. These architectures range from serial clocked
implementations of multi-neuron systems to massively parallel asynchronous ones
and from purely digital systems to mixed analog/digital systems which implement
more biological-like models of neurons and synapses together with a suite of
adaptation and learning mechanisms analogous to the ones found in biological
nervous systems. We describe the advantages of the different approaches being
pursued and present the challenges that need to be addressed for building
artificial neural processing systems that can display the richness of behaviors
seen in biological systems.
Comment: Submitted to Proceedings of the IEEE; a review of recently proposed
neuromorphic computing platforms and systems.
Runtime Optimizations for Prediction with Tree-Based Models
Tree-based models have proven to be an effective solution for web ranking as
well as other problems in diverse domains. This paper focuses on optimizing the
runtime performance of applying such models to make predictions, given an
already-trained model. Although exceedingly simple conceptually, most
implementations of tree-based models do not efficiently utilize modern
superscalar processor architectures. By laying out data structures in memory in
a more cache-conscious fashion, removing branches from the execution flow using
a technique called predication, and micro-batching predictions using a
technique called vectorization, we are able to better exploit modern processor
architectures and significantly improve the speed of tree-based models over
hard-coded if-else blocks. Our work contributes to the exploration of
architecture-conscious runtime implementations of machine learning algorithms.
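The predication technique the abstract mentions can be illustrated with a small sketch: storing a complete binary tree in flat arrays and folding each comparison result into index arithmetic, so traversal contains no data-dependent branches for the processor to mispredict. The tiny tree below is a made-up example, not a model or layout taken from the paper.

```python
# Branch-free ("predicated") traversal of a complete binary decision tree.
# Nodes are laid out in level order in flat arrays, so node i's children
# sit at indices 2i+1 (left) and 2i+2 (right). The boolean comparison
# result (0 or 1) selects the child index arithmetically, replacing the
# if/else branch of a hard-coded tree.

feature   = [0, 1, 1]                 # feature tested at each internal node
threshold = [0.5, 0.3, 0.7]           # split threshold at each internal node
leaf      = [-1.0, 1.0, 2.0, -2.0]    # values of the 4 leaves (depth 2)
DEPTH = 2

def predict(x):
    i = 0
    for _ in range(DEPTH):
        # predication: the comparison yields 0 or 1, steering the index
        i = 2 * i + 1 + (x[feature[i]] > threshold[i])
    return leaf[i - 3]                # leaves occupy indices 3..6 in level order
```

The fixed-depth loop also makes the micro-batching ("vectorization") step natural: the same index update can be applied to a whole batch of inputs per tree level, improving cache and pipeline utilization.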
Parallel data compression
Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested.