11,753 research outputs found

    A scalable multi-core architecture with heterogeneous memory structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs)

    Full text link
    Neuromorphic computing systems comprise networks of neurons that use asynchronous events for both computation and communication. This type of representation offers several advantages in terms of bandwidth and power consumption in neuromorphic electronic systems. However, managing the traffic of asynchronous events in large scale systems is a daunting task, both in terms of circuit complexity and memory requirements. Here we present a novel routing methodology that employs both hierarchical and mesh routing strategies and combines heterogeneous memory structures for minimizing both memory requirements and latency, while maximizing programming flexibility to support a wide range of event-based neural network architectures, through parameter configuration. We validated the proposed scheme in a prototype multi-core neuromorphic processor chip that employs hybrid analog/digital circuits for emulating synapse and neuron dynamics together with asynchronous digital circuits for managing the address-event traffic. We present a theoretical analysis of the proposed connectivity scheme, describe the methods and circuits used to implement such scheme, and characterize the prototype chip. Finally, we demonstrate the use of the neuromorphic processor with a convolutional neural network for the real-time classification of visual symbols being flashed to a dynamic vision sensor (DVS) at high speed.Comment: 17 pages, 14 figure

    Simple Load Balancing for Distributed Hash Tables

    Full text link
    Distributed hash tables have recently become a useful building block for a variety of distributed applications. However, current schemes based upon consistent hashing require both considerable implementation complexity and substantial storage overhead to achieve desired load balancing goals. We argue in this paper that these goals can b e achieved more simply and more cost-effectively. First, we suggest the direct application of the "power of two choices" paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally b e extended to support other load balancing methods, including load-stealing or load-shedding schemes, as well as providing natural fault-tolerance mechanisms

    Modeling Data-Plane Power Consumption of Future Internet Architectures

    Full text link
    With current efforts to design Future Internet Architectures (FIAs), the evaluation and comparison of different proposals is an interesting research challenge. Previously, metrics such as bandwidth or latency have commonly been used to compare FIAs to IP networks. We suggest the use of power consumption as a metric to compare FIAs. While low power consumption is an important goal in its own right (as lower energy use translates to smaller environmental impact as well as lower operating costs), power consumption can also serve as a proxy for other metrics such as bandwidth and processor load. Lacking power consumption statistics about either commodity FIA routers or widely deployed FIA testbeds, we propose models for power consumption of FIA routers. Based on our models, we simulate scenarios for measuring power consumption of content delivery in different FIAs. Specifically, we address two questions: 1) which of the proposed FIA candidates achieves the lowest energy footprint; and 2) which set of design choices yields a power-efficient network architecture? Although the lack of real-world data makes numerous assumptions necessary for our analysis, we explore the uncertainty of our calculations through sensitivity analysis of input parameters

    Overview of Swallow --- A Scalable 480-core System for Investigating the Performance and Energy Efficiency of Many-core Applications and Operating Systems

    Full text link
    We present Swallow, a scalable many-core architecture, with a current configuration of 480 x 32-bit processors. Swallow is an open-source architecture, designed from the ground up to deliver scalable increases in usable computational power to allow experimentation with many-core applications and the operating systems that support them. Scalability is enabled by the creation of a tile-able system with a low-latency interconnect, featuring an attractive communication-to-computation ratio and the use of a distributed memory configuration. We analyse the energy and computational and communication performances of Swallow. The system provides 240GIPS with each core consuming 71--193mW, dependent on workload. Power consumption per instruction is lower than almost all systems of comparable scale. We also show how the use of a distributed operating system (nOS) allows the easy creation of scalable software to exploit Swallow's potential. Finally, we show two use case studies: modelling neurons and the overlay of shared memory on a distributed memory system.Comment: An open source release of the Swallow system design and code will follow and references to these will be added at a later dat

    A Multifunctional Processing Board for the Fast Track Trigger of the H1 Experiment

    Get PDF
    The electron-proton collider HERA is being upgraded to provide higher luminosity from the end of the year 2001. In order to enhance the selectivity on exclusive processes a Fast Track Trigger (FTT) with high momentum resolution is being built for the H1 Collaboration. The FTT will perform a 3-dimensional reconstruction of curved tracks in a magnetic field of 1.1 Tesla down to 100 MeV in transverse momentum. It is able to reconstruct up to 48 tracks within 23 mus in a high track multiplicity environment. The FTT consists of two hardware levels L1, L2 and a third software level. Analog signals of 450 wires are digitized at the first level stage followed by a quick lookup of valid track segment patterns. For the main processing tasks at the second level such as linking, fitting and deciding, a multifunctional processing board has been developed by the ETH Zurich in collaboration with Supercomputing Systems (Zurich). It integrates a high-density FPGA (Altera APEX 20K600E) and four floating point DSPs (Texas Instruments TMS320C6701). This presentation will mainly concentrate on second trigger level hardware aspects and on the implementation of the algorithms used for linking and fitting. Emphasis is especially put on the integrated CAM (content addressable memory) functionality of the FPGA, which is ideally suited for implementing fast search tasks like track segment linking.Comment: 6 pages, 4 figures, submitted to TN
    corecore