623 research outputs found

    On the Effectiveness of Source Throttling for Networks-on-Chip in Chip Multiprocessor Designs

    Get PDF
    In modern chip-multiprocessor (CMP) designs, with the increasing number of cores, traffic between different cores keeps increasing. Consequently, on-chip interconnection networks experience increasingly large communication bandwidth demand. This thesis focuses on Quality-of-Service (QoS) of Networks-on-Chip (NoC). NoC is considered as a scalable approach of interconnection network compared to conventional bus-based architecture. Like Ethernet, NoC faces common QoS issues such as bandwidth utilization and fairness. This thesis is a study on the effectiveness of source throttling for NoC, including fairness and overall performance such as program run time and packet latency. Source throttling is a well-known technique for traffic regulation. It is shown to be effective for bufferless NoC in previous studies. Due to different traffic behaviors and characteristics, however, it is not obvious if source throttling is effective for general buffered NoC. The first part of this research is a set of network simulations on various synthetic traffic cases. The results indicate that source throttling can reduce application runtime when (1) the network is congested, (2) there are dependencies among communication requests, and (3) the width of the dependence graph must be sufficiently large. The second part is full system simulations on public benchmark suites. Source throttling does not bring benefit for these relative realistic cases. Further experiment reveals that the aforementioned conditions are not satisfied. This explains why source throttling is of little use for general buffered NoC in CMP designs

    SpinLink: An interconnection system for the SpiNNaker biologically inspired multi-computer

    No full text
    SpiNNaker is a large-scale biologically-inspired multi-computer designed to model very heavily distributed problems, with the flagship application being the simulation of large neural networks. The project goal is to have one million processors included in a single machine, which consequently span many thousands of circuit boards. A computer of this scale imposes large communication requirements between these boards, and requires an extensible method of connecting to external equipment such as sensors, actuators and visualisation systems. This paper describes two systems that can address each of these problems.Firstly, SpinLink is a proposed method of connecting the SpiNNaker boards by using time-division multiplexing (TDM) to allow eight SpiNNaker links to run at maximum bandwidth between two boards. SpinLink will be deployed on Spartan-6 FPGAs and uses a locally generated clock that can be paused while the asynchronous links from SpiNNaker are sending data, thus ensuring a fast and glitch-free response. Secondly, SpiNNterceptor is a separate system, currently in the early stages of design, that will build upon SpinLink to address the important external I/O issues faced by SpiNNaker. Specifically, spare resources in the FPGAs will be used to implement the debugging and I/O interfacing features of SpiNNterceptor

    A Survey and Comparative Study of Hard and Soft Real-time Dynamic Resource Allocation Strategies for Multi/Many-core Systems

    Get PDF
    Multi-/many-core systems are envisioned to satisfy the ever-increasing performance requirements of complex applications in various domains such as embedded and high-performance computing. Such systems need to cater to increasingly dynamic workloads, requiring efficient dynamic resource allocation strategies to satisfy hard or soft real-time constraints. This article provides an extensive survey of hard and soft real-time dynamic resource allocation strategies proposed since the mid-1990s and highlights the emerging trends for multi-/many-core systems. The survey covers a taxonomy of the resource allocation strategies and considers their various optimization objectives, which have been used to provide comprehensive comparison. The strategies employ various principles, such as market and biological concepts, to perform the optimizations. The trend followed by the resource allocation strategies, open research challenges, and likely emerging research directions have also been provided

    A Survey of Prediction and Classification Techniques in Multicore Processor Systems

    Get PDF
    In multicore processor systems, being able to accurately predict the future provides new optimization opportunities, which otherwise could not be exploited. For example, an oracle able to predict a certain application\u27s behavior running on a smart phone could direct the power manager to switch to appropriate dynamic voltage and frequency scaling modes that would guarantee minimum levels of desired performance while saving energy consumption and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continue to operate in a reactive manner. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and of multicore processor systems. Prediction transforms from simple forecasting to sophisticated machine learning based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. This can be exploited by novel optimization techniques that can span across all layers of the computing stack. In this survey paper, we present a discussion of the most popular techniques on prediction and classification in the general context of computing systems with emphasis on multicore processors. The paper is far from comprehensive, but, it will help the reader interested in employing prediction in optimization of multicore processor systems

    Progressive congestion management based on packet marking and validation techniques

    Full text link
    © 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Congestion management in multistage interconnection networks is a serious problem, which is not solved completely. In order to avoid the degradation of network performance when congestion appears, several congestion management mechanisms have been proposed. Most of these mechanisms are based on explicit congestion notification. For this purpose, switches detect congestion and depending on the applied strategy, packets are marked to warn the source hosts. In response, source hosts apply some corrective actions to adjust their packet injection rate. Although these proposals seem quite effective, they either exhibit some drawbacks or are partial solutions. Some of them introduce some penalties over the flows not responsible for congestion, whereas others can cope only with congestion situations that last for a short time. In this paper, we present an overview of the different strategies to detect and correct congestion in multistage interconnection networks, and propose a new mechanism referred to as Marking and Validation Congestion Management (MVCM), targeted to this kind of lossless networks, and based on a more refined packet marking strategy combined with a fair set of corrective actions, that makes the mechanism able to effectively manage congestion regardless of the congestion degree. Evaluation results show the effectiveness and robustness of the proposed mechanism.This work was supported by the Spanish MEC and MICINN, as well as European Commission FEDER funds, under Grants CSD2006-00046 and TIN2009-14475-C04-01.Ferrer Pérez, JL.; Baydal Cardona, ME.; Robles Martínez, A.; López Rodríguez, PJ.; Duato Marín, JF. (2012). Progressive congestion management based on packet marking and validation techniques. IEEE Transactions on Computers. 61(9):1296-1309. doi:10.1109/TC.2011.146S1296130961

    Electronic and photonic switching in the atm era

    Get PDF
    Broadband networks require high-capacity switches in order to properly manage large amounts of traffic fluxes. Electronic and photonic technologies are being used to achieve this objective both allowing different multiplexing and switching techniques. Focusing on the asynchronous transfer mode (ATM), the inherent different characteristics of electronics and photonics makes different architectures feasible. In this paper, different switching structures are described, several ATM switching architectures which have been recently implemented are presented and the implementation characteristics discussed. Three diverse points of view are given from the electronic research, the photonic research and the commercial switches. Although all the architectures where successfully tested, they should also follow different market requirements in order to be commercialised. The characteristics are presented and the architectures projected over them to evaluate their commercial capabilities.Peer ReviewedPostprint (published version

    A Reactive and Cycle-True IP Emulator for MPSoC Exploration

    Get PDF
    The design of MultiProcessor Systems-on-Chip (MPSoC) emphasizes intellectual-property (IP)-based communication-centric approaches. Therefore, for the optimization of the MPSoC interconnect, the designer must develop traffic models that realistically capture the application behavior as executing on the IP core. In this paper, we introduce a Reactive IP Emulator (RIPE) that enables an effective emulation of the IP-core behavior in multiple environments, including bitand cycle-true simulation. The RIPE is built as a multithreaded abstract instruction-set processor, and it can generate reactive traffic patterns. We compare the RIPE models with cycle-true functional simulation of complex application behavior (tasksynchronization, multitasking, and input/output operations). Our results demonstrate high-accuracy and significant speedups. Furthermore, via a case study, we show the potential use of the RIPE in a design-space-exploration context

    Introduction to Multiprocessor I/O Architecture

    Get PDF
    The computational performance of multiprocessors continues to improve by leaps and bounds, fueled in part by rapid improvements in processor and interconnection technology. I/O performance thus becomes ever more critical, to avoid becoming the bottleneck of system performance. In this paper we provide an introduction to I/O architectural issues in multiprocessors, with a focus on disk subsystems. While we discuss examples from actual architectures and provide pointers to interesting research in the literature, we do not attempt to provide a comprehensive survey. We concentrate on a study of the architectural design issues, and the effects of different design alternatives
    • …
    corecore