4,343 research outputs found

    Adaptive Routing Approaches for Networked Many-Core Systems

    Through advances in technology, System-on-Chip design is moving towards integrating tens to hundreds of intellectual property blocks into a single chip. In such a many-core system, on-chip communication becomes a performance bottleneck for high performance designs. Network-on-Chip (NoC) has emerged as a viable solution for the communication challenges in highly complex chips. The NoC architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication challenges such as wiring complexity, communication latency, and bandwidth. Furthermore, the combined benefits of 3D IC and NoC schemes provide the possibility of designing a high performance system in a limited chip area. The major advantages of 3D NoCs are the considerable reductions in average latency and power consumption. There are several factors degrading the performance of NoCs. In this thesis, we investigate three main performance-limiting factors: network congestion, faults, and the lack of efficient multicast support. We address these issues by means of routing algorithms. Congestion of data packets may lead to increased network latency and power consumption. Thus, we propose three different approaches for alleviating such congestion in the network. The first approach is based on measuring the congestion information in different regions of the network, distributing the information over the network, and utilizing this information when making a routing decision. The second approach employs a learning method to dynamically find the less congested routes according to the underlying traffic. The third approach is based on a fuzzy-logic technique to perform better routing decisions when traffic information of different routes is available. Faults affect performance significantly, because packets must then take longer paths to be routed around the faults, which in turn increases congestion around the faulty regions. We propose four methods to tolerate faults at the link and switch level by using only the shortest paths as long as such a path exists. The unique characteristic among these methods is the toleration of faults while also maintaining the performance of NoCs. To the best of our knowledge, these algorithms are the first approaches to bypassing faults prior to reaching them while avoiding unnecessary misrouting of packets. Current implementations of multicast communication result in a significant performance loss for unicast traffic. This is due to the fact that the routing rules of multicast packets limit the adaptivity of unicast packets. We present an approach in which both unicast and multicast packets can be efficiently routed within the network. While suggesting a more efficient multicast support, the proposed approach does not affect the performance of unicast routing at all. In addition, in order to reduce the overall path length of multicast packets, we present several partitioning methods along with their analytical models for latency measurement. This approach is discussed in the context of 3D mesh networks.
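As a rough illustration of the first congestion-aware idea in this abstract (use congestion information when making a routing decision while staying on shortest paths), the following is a minimal sketch, not the thesis's algorithm. It assumes a 2D mesh with (x, y) coordinates, and the names `minimal_directions`, `select_output`, and the per-port `congestion` table are hypothetical. Deadlock avoidance, which a real adaptive router must also handle, is deliberately left out.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a switch in a 2D mesh (assumed topology)


def minimal_directions(cur: Coord, dst: Coord) -> List[str]:
    """Return the output directions that keep a packet on a shortest path."""
    dirs: List[str] = []
    if dst[0] > cur[0]:
        dirs.append("E")
    elif dst[0] < cur[0]:
        dirs.append("W")
    if dst[1] > cur[1]:
        dirs.append("N")
    elif dst[1] < cur[1]:
        dirs.append("S")
    return dirs


def select_output(cur: Coord, dst: Coord, congestion: Dict[str, int]) -> str:
    """Pick the least congested minimal direction; 'LOCAL' means the packet has arrived.

    `congestion` maps a direction to a hypothetical congestion score
    (for example, buffer occupancy reported by the neighbouring region).
    Directions missing from the table are treated as uncongested.
    """
    candidates = minimal_directions(cur, dst)
    if not candidates:
        return "LOCAL"
    return min(candidates, key=lambda d: congestion.get(d, 0))
```

For a packet at (0, 0) heading to (2, 2), both east and north are minimal, so the choice falls to whichever port reports the lower score.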

    Experimental demonstration of associative memory with memristive neural networks

    When someone mentions the name of a known person we immediately recall her face and possibly many other traits. This is because we possess the so-called associative memory - the ability to correlate different memories to the same fact or event. Associative memory is such a fundamental and encompassing human ability (and not just human) that the network of neurons in our brain must perform it quite easily. The question is then whether electronic neural networks - electronic schemes that act somewhat similarly to human brains - can be built to perform this type of function. Although the field of neural networks has developed for many years, a key element, namely the synapses between adjacent neurons, has been lacking a satisfactory electronic representation. The reason for this is that a passive circuit element able to reproduce the synapse behaviour needs to remember its past dynamical history, store a continuous set of states, and be "plastic" according to the pre-synaptic and post-synaptic neuronal activity. Here we show that all this can be accomplished by a memory-resistor (memristor for short). In particular, by using simple and inexpensive off-the-shelf components we have built a memristor emulator which realizes all required synaptic properties. Most importantly, we have demonstrated experimentally the formation of associative memory in a simple neural network consisting of three electronic neurons connected by two memristor-emulator synapses. This experimental demonstration opens up new possibilities in the understanding of neural processes using memory devices, an important step forward to reproduce complex learning, adaptive and spontaneous behaviour with electronic neural networks

    Tuning the Level of Concurrency in Software Transactional Memory: An Overview of Recent Analytical, Machine Learning and Mixed Approaches

    Synchronization transparency offered by Software Transactional Memory (STM) must not come at the expense of run-time efficiency, thus demanding from the STM-designer the inclusion of mechanisms properly oriented to performance and other quality indexes. Particularly, one core issue to cope with in STM is related to exploiting parallelism while also avoiding thrashing phenomena due to excessive transaction rollbacks, caused by excessively high levels of contention on logical resources, namely concurrently accessed data portions. A means to address run-time efficiency consists in dynamically determining the best-suited level of concurrency (number of threads) to be employed for running the application (or specific application phases) on top of the STM layer. For too low levels of concurrency, parallelism can be hampered. Conversely, over-dimensioning the concurrency level may give rise to the aforementioned thrashing phenomena caused by excessive data contention—an aspect which has reflections also on the side of reduced energy-efficiency. In this chapter we overview a set of recent techniques aimed at building “application-specific” performance models that can be exploited to dynamically tune the level of concurrency to the best-suited value. Although they share some base concepts while modeling the system performance vs the degree of concurrency, these techniques rely on disparate methods, such as machine learning or analytic methods (or combinations of the two), and achieve different tradeoffs in terms of the relation between the precision of the performance model and the latency for model instantiation. Implications of the different tradeoffs in real-life scenarios are also discussed
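The tuning step described in this abstract, choosing the concurrency level that a performance model predicts to be best, can be sketched as below. This is only an illustration under stated assumptions: `best_concurrency` and `toy_model` are hypothetical names, and the toy throughput curve (linear speedup damped by a contention term that grows with the square of the extra threads) is invented for the example. It stands in for whatever analytical or machine-learned model a real technique would build.

```python
from typing import Callable


def best_concurrency(predict_throughput: Callable[[int], float], max_threads: int) -> int:
    """Return the thread count in 1..max_threads with the highest predicted throughput."""
    if max_threads < 1:
        raise ValueError("max_threads must be at least 1")
    return max(range(1, max_threads + 1), key=predict_throughput)


def toy_model(threads: int, abort_cost: float = 0.05) -> float:
    """Hypothetical model: parallel speedup damped by contention-induced rollbacks."""
    return threads / (1.0 + abort_cost * (threads - 1) ** 2)
```

With such a curve, too few threads leave parallelism unused and too many push the predicted throughput back down, which is the thrashing behaviour the abstract refers to.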

    Computer Architecture for Object Recognition and Sensing

    The notion of using many, most likely different, sensory subsystems in a computer object recognition system immediately provokes several questions: - How will multiple sensors be used in conjunction? - What object qualities are best described by which sensor, and how is sensor utilization optimized? - To what extent does the information provided by each sensor overlap with that provided by others, and how then is this used

    A distributed QoS Routing and CAC framework: performance evaluation of its SSRA and InterD Agents

    In order to support multimedia communication, it is necessary to develop routing algorithms that use more than one QoS parameter for routing. This is because new services such as video on demand and remote meeting systems require better QoS. Also, for admission control of multimedia applications, different QoS parameters should be considered. In our previous work, we proposed an intelligent routing and CAC strategy using cooperative agents. In this paper, we propose and evaluate the performance of the SSRA algorithm and a GA-based InterD agent. Performance evaluation shows that the proposed agents behave well. Peer Reviewed. Postprint (published version).

    Path-Based partitioning methods for 3D Networks-on-Chip with minimal adaptive routing

    © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    Combining the benefits of 3D ICs and Networks-on-Chip (NoCs) schemes provides a significant performance gain in Chip Multiprocessor (CMP) architectures. As multicast communication is commonly used in cache coherence protocols for CMPs and in various parallel applications, the performance of these systems can be significantly improved if multicast operations are supported at the hardware level. In this paper, we present several partitioning methods for the path-based multicast approach in 3D mesh-based NoCs, each with different levels of efficiency. In addition, we develop novel analytical models for unicast and multicast traffic to explore the efficiency of each approach. In order to distribute the unicast and multicast traffic more efficiently over the network, we propose the Minimal and Adaptive Routing (MAR) algorithm for the presented partitioning methods. The analytical and experimental results show that an advantageous method named Recursive Partitioning (RP) outperforms the other approaches. RP recursively partitions the network until all partitions contain a comparable number of switches and thus the multicast traffic is equally distributed among several subsets and the network latency is considerably decreased. The simulation results reveal that the RP method can achieve performance improvement across all workloads while performance can be further improved by utilizing the MAR algorithm. Nineteen percent average and 42 percent maximum latency reduction are obtained on SPLASH-2 and PARSEC benchmarks running on a 64-core CMP.
    Ebrahimi, M.; Daneshtalab, M.; Liljeberg, P.; Plosila, J.; Flich Cardo, J.; Tenhunen, H. (2014). Path-Based partitioning methods for 3D Networks-on-Chip with minimal adaptive routing. IEEE Transactions on Computers. 63(3):718-733. doi:10.1109/TC.2012.255

    Wind energy harvester interface for sensor nodes

    The research topic is the development of a power-converting interface for the novel FLEHAP wind energy harvester, allowing the produced energy to be used for powering small wireless nodes. The harvester's electrical characteristics were studied and a strategy was developed to control and maintain maximum power transfer. The electronic power converter interface was designed, containing an AC/DC Buck-Boost converter and controlled with a low-power microcontroller. Different prototypes were developed that evolved by reducing the sources of power loss and rendering the system more efficient. The system was validated through simulations in the COSMIC/DITEN lab using generated signals, and follow-up experiments were then conducted with a controllable wind tunnel in the DIFI department of the University of Genoa. The experimental results proved the functionality of the control algorithm as well as the efficiency gained from the hardware solutions that were implemented, and generally met the requirement to provide a power source for low-power sensor nodes.
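The abstract does not say which control strategy keeps the converter at maximum power transfer. One common generic technique for this kind of problem is perturb-and-observe, sketched below purely as an illustration and not as the authors' method. The function name, the `measure_power` callback, and the duty-cycle limits are all assumptions made for the example.

```python
from typing import Callable


def perturb_and_observe(
    measure_power: Callable[[float], float],
    duty: float = 0.5,
    step: float = 0.01,
    iterations: int = 200,
    lo: float = 0.05,
    hi: float = 0.95,
) -> float:
    """Nudge the converter duty cycle toward the point of maximum harvested power.

    `measure_power(duty)` is a hypothetical callback returning the power
    observed at a given duty cycle. The search direction is reversed
    whenever the last perturbation reduced the measured power.
    """
    direction = 1
    last_power = measure_power(duty)
    for _ in range(iterations):
        # Apply the perturbation, keeping the duty cycle inside its allowed range.
        duty = min(hi, max(lo, duty + direction * step))
        power = measure_power(duty)
        if power < last_power:
            direction = -direction  # the last move made things worse: turn around
        last_power = power
    return duty
```

Once it reaches the peak, this loop keeps oscillating within about one step of it, so the step size trades tracking speed against steady-state ripple.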