116 research outputs found
DANNA2: Dynamic Adaptive Neural Network Arrays
Traditional von Neumann architectures have been at the center of computing for decades, thanks in part to Moore's Law and Dennard scaling. However, MOSFET scaling is rapidly approaching its physical limits, spelling the end of an era and prompting researchers to examine alternative solutions. Neuromorphic computing is a paradigm shift that may offer increased capabilities and efficiency by borrowing concepts from biology and incorporating them into an alternative computing platform. The TENNLab group explores these architectures and the associated challenges. The group currently has a mature hardware platform referred to as Dynamic Adaptive Neural Network Arrays (DANNA). DANNA is a digital, discrete spiking neural network architecture with software, FPGA, and VLSI implementations. This work introduces a successor architecture built on the lessons learned from prior models. The DANNA2 model offers an order of magnitude improvement over DANNA in both simulation speed and hardware clock frequency while expanding functionality and improving effective density.
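The discrete spiking behavior described above can be sketched as a clocked accumulate-and-fire element. This is a minimal illustration only: the threshold, charge, and refractory parameters below are assumed values for demonstration, not the actual DANNA2 element design.

```python
# Minimal sketch of a discrete-time spiking element, in the spirit of the
# digital spiking architecture described above. All parameters here are
# illustrative assumptions, not the DANNA2 specification.

class SpikingNeuron:
    def __init__(self, threshold=4, refractory=2):
        self.threshold = threshold    # accumulated charge needed to fire
        self.refractory = refractory  # clock cycles of silence after a spike
        self.charge = 0
        self.cooldown = 0

    def step(self, incoming_charge):
        """Advance one discrete clock cycle; return True if the element fires."""
        if self.cooldown > 0:
            self.cooldown -= 1
            return False
        self.charge += incoming_charge
        if self.charge >= self.threshold:
            self.charge = 0
            self.cooldown = self.refractory
            return True
        return False

neuron = SpikingNeuron()
spikes = [neuron.step(c) for c in [1, 1, 1, 1, 1, 1, 1]]
# Fires on the fourth cycle, then sits out its refractory period.
```

A hardware implementation evaluates every element in parallel each clock cycle, which is where the order-of-magnitude clock-frequency gains over a sequential simulation come from.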
Scalable High-Speed Communications for Neuromorphic Systems
Field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and other chip/multi-chip level implementations can be used to implement Dynamic Adaptive Neural Network Arrays (DANNA). In some applications, DANNA interfaces with a traditional computing system to provide neural network configuration information, provide network input, process network outputs, and monitor the state of the network. The present host-to-DANNA network communication setup uses a Cypress USB 3.0 peripheral controller (FX3) to enable host-to-array communication over USB 3.0. This communications setup has to run commands in batches and does not have enough bandwidth to meet the maximum throughput requirements of the DANNA device, resulting in output packet loss. Also, the FX3 is unable to scale to support larger single-chip or multi-chip configurations. To alleviate these communication limitations and to expand scalability, a new communications solution is presented which takes advantage of the GTX/GTH high-speed serial transceivers found on Xilinx FPGAs. A Xilinx VC707 evaluation kit is used to prototype the new communications board. The high-speed transceivers communicate with the host computer via PCIe and with the DANNA arrays via the Aurora link-layer protocol. The new communications board outperforms the FX3, reducing communication latency and increasing data throughput. This new communications setup will further DANNA research by allowing DANNA arrays to scale to larger sizes and allowing multiple DANNA arrays to be connected to a single communications board.
Middleware and Services for Dynamic Adaptive Neural Network Arrays
Dynamic Adaptive Neural Network Arrays (DANNAs) are neuromorphic systems that exhibit spiking behaviors and can be designed using evolutionary optimization. Array elements are rapidly reconfigurable and can function as either neurons or synapses with programmable interconnections and parameters. Visualization applications can examine DANNA element connections, parameters, and functionality, and evolutionary optimization applications can utilize DANNA to speed up neural network simulations. To facilitate interactions with DANNAs from these applications, we have developed a language-agnostic application programming interface (API) that abstracts away low-level communication details with a DANNA and provides a high-level interface for reprogramming and controlling a DANNA. The library has also been designed in modules in order to adapt to future changes in the design of DANNA, including changes to the DANNA element design, DANNA communication protocol, and connection. In addition to communicating with DANNAs, it is also beneficial for applications to store networks with known functionality. Hence, a Representational State Transfer (REST) API with a MongoDB database back-end has been developed to encourage the collection and exploration of networks.
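The store-and-retrieve workflow of such a REST service can be sketched as two handlers over a document store. The route shapes, document fields, and the in-memory dictionary (standing in for the MongoDB collection) below are illustrative assumptions, not the actual service.

```python
# Hypothetical sketch of a network-storage REST API: one handler saves a
# network document and returns its id, the other fetches it back. The
# in-memory dict stands in for a MongoDB collection; all names are
# assumptions for illustration.

import json
import uuid

_store = {}  # stand-in for a MongoDB collection of saved networks

def post_network(body: str) -> str:
    """POST /networks -- save a network description, return its new id."""
    doc = json.loads(body)
    doc_id = str(uuid.uuid4())
    _store[doc_id] = doc
    return json.dumps({"id": doc_id})

def get_network(doc_id: str) -> str:
    """GET /networks/<id> -- fetch a previously stored network."""
    return json.dumps(_store[doc_id])

resp = json.loads(post_network('{"name": "xor", "elements": 9}'))
net = json.loads(get_network(resp["id"]))
```

Keeping the storage API separate from the device-communication API lets applications share networks with known functionality without ever touching a physical DANNA.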
A Language and Hardware Independent Approach to Quantum-Classical Computing
Heterogeneous high-performance computing (HPC) systems offer novel architectures which accelerate specific workloads through judicious use of specialized coprocessors. A promising architectural approach for future scientific computations is provided by heterogeneous HPC systems integrating quantum processing units (QPUs). To this end, we present XACC (eXtreme-scale ACCelerator), a programming model and software framework that enables quantum acceleration within standard or HPC software workflows. XACC follows a coprocessor machine model that is independent of the underlying quantum computing hardware, thereby enabling quantum programs to be defined and executed on a variety of QPU types through a unified application programming interface. Moreover, XACC defines a polymorphic low-level intermediate representation and an extensible compiler frontend that enable language-independent quantum programming, thus promoting integration and interoperability across the quantum programming landscape. In this work, we define the software architecture enabling our hardware- and language-independent approach, and demonstrate its usefulness across a range of quantum computing models through illustrative examples involving the compilation and execution of gate- and annealing-based quantum programs.
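The coprocessor model described above, where one program runs unchanged on interchangeable QPU backends, can be sketched as a single abstract interface with per-target implementations. This is an illustrative analogue only: XACC itself is a C++ framework with its own intermediate representation, and every class and method name below is an assumption for demonstration.

```python
# Illustrative sketch of a backend-agnostic accelerator interface: the
# same program dispatches to interchangeable QPU backends. Names are
# assumptions; this is not the actual XACC API.

from abc import ABC, abstractmethod

class Accelerator(ABC):
    """Hardware-independent QPU interface; one concrete class per target."""
    @abstractmethod
    def execute(self, ir: list) -> dict: ...

class GateModelQPU(Accelerator):
    def execute(self, ir):
        # A real backend would lower the IR to native gates and run shots.
        return {"model": "gate", "ops": len(ir)}

class AnnealerQPU(Accelerator):
    def execute(self, ir):
        # An annealer would instead map the IR onto an Ising problem.
        return {"model": "annealing", "ops": len(ir)}

def run(program: list, backend: Accelerator) -> dict:
    """The same program runs unchanged on any registered accelerator."""
    return backend.execute(program)

ir = [("h", 0), ("cnot", 0, 1), ("measure", 0, 1)]
gate_result = run(ir, GateModelQPU())
anneal_result = run(ir, AnnealerQPU())
```

The polymorphic intermediate representation plays the role of `ir` here: frontends for different languages compile to it, and each backend lowers it to its own hardware, which is what makes the approach both language- and hardware-independent.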
Aging-Aware Request Scheduling for Non-Volatile Main Memory
Modern computing systems are embracing non-volatile memory (NVM) to implement high-capacity and low-cost main memory. Elevated operating voltages of NVM accelerate the aging of CMOS transistors in the peripheral circuitry of each memory bank. Aggressive device scaling increases power density and temperature, which further accelerates aging, challenging the reliable operation of NVM-based main memory. We propose HEBE, an architectural technique to mitigate the circuit aging-related problems of NVM-based main memory. HEBE is built on three contributions. First, we propose a new analytical model that can dynamically track the aging in the peripheral circuitry of each memory bank based on the bank's utilization. Second, we develop an intelligent memory request scheduler that exploits this aging model at run time to de-stress the peripheral circuitry of a memory bank only when its aging exceeds a critical threshold. Third, we introduce an isolation transistor to decouple parts of a peripheral circuit operating at different voltages, allowing the decoupled logic blocks to undergo long-latency de-stress operations independently and off the critical path of memory read and write accesses, improving performance. We evaluate HEBE with workloads from the SPEC CPU2017 benchmark suite. Our results show that HEBE significantly improves both performance and lifetime of NVM-based main memory. Comment: To appear in ASP-DAC 202
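The threshold-triggered de-stress policy of the second contribution can be sketched as a per-bank aging counter consulted by the scheduler. The aging model below (wear grows linearly with bank utilization and a de-stress resets it) is a deliberate simplification with made-up constants, not HEBE's analytical model.

```python
# Sketch of an aging-aware request scheduler: each access ages its bank,
# and a bank is de-stressed only when its aging crosses a critical
# threshold. Constants and the linear aging model are assumptions for
# illustration, not HEBE's analytical model.

AGING_PER_ACCESS = 1      # assumed wear contribution per serviced request
CRITICAL_THRESHOLD = 100  # de-stress only past this point

class Bank:
    def __init__(self):
        self.aging = 0
        self.destress_count = 0

    def access(self):
        self.aging += AGING_PER_ACCESS

def schedule(requests, banks):
    """Serve each request; de-stress a bank only when its aging demands it."""
    for bank_id in requests:
        bank = banks[bank_id]
        bank.access()
        if bank.aging >= CRITICAL_THRESHOLD:
            bank.aging = 0           # long-latency de-stress, off critical path
            bank.destress_count += 1

banks = [Bank() for _ in range(2)]
schedule([0] * 250 + [1] * 50, banks)
# The heavily used bank is de-stressed twice; the lightly used one never is.
```

De-stressing only past the threshold, rather than periodically, keeps the long-latency operation rare for lightly utilized banks, which is what limits the performance cost of the mitigation.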