261 research outputs found

    Verifiable Network-Performance Measurements

    Get PDF
    In the current Internet, there is no clean way for affected parties to react to poor forwarding performance: when a domain violates its Service Level Agreement (SLA) with a contractual partner, the partner must resort to ad-hoc probing-based monitoring to determine the existence and extent of the violation. Instead, we propose a new, systematic approach to the problem of forwarding-performance verification. Our mechanism relies on voluntary reporting, allowing each domain to disclose its loss and delay performance to its neighbors; it does not disclose any information regarding the participating domains' topology or routing policies beyond what is already publicly available. Most importantly, it enables verifiable performance measurements, i.e., domains cannot abuse it to significantly exaggerate their performance. Finally, our mechanism is tunable, allowing each participating domain to determine how many resources to devote to it independently (i.e., without any inter-domain coordination), exposing a controllable trade-off between performance-verification quality and resource consumption. Our mechanism comes at the cost of deploying modest functionality at the participating domains' border routers; we show that it requires reasonable processing and memory resources within modern network capabilities.Comment: 14 page

    Adaptive Routing Approaches for Networked Many-Core Systems

    Get PDF
    Through advances in technology, System-on-Chip design is moving towards integrating tens to hundreds of intellectual property blocks into a single chip. In such a many-core system, on-chip communication becomes a performance bottleneck for high performance designs. Network-on-Chip (NoC) has emerged as a viable solution for the communication challenges in highly complex chips. The NoC architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication challenges such as wiring complexity, communication latency, and bandwidth. Furthermore, the combined benefits of 3D IC and NoC schemes provide the possibility of designing a high performance system in a limited chip area. The major advantages of 3D NoCs are the considerable reductions in average latency and power consumption. There are several factors degrading the performance of NoCs. In this thesis, we investigate three main performance-limiting factors: network congestion, faults, and the lack of efficient multicast support. We address these issues by the means of routing algorithms. Congestion of data packets may lead to increased network latency and power consumption. Thus, we propose three different approaches for alleviating such congestion in the network. The first approach is based on measuring the congestion information in different regions of the network, distributing the information over the network, and utilizing this information when making a routing decision. The second approach employs a learning method to dynamically find the less congested routes according to the underlying traffic. The third approach is based on a fuzzy-logic technique to perform better routing decisions when traffic information of different routes is available. Faults affect performance significantly, as then packets should take longer paths in order to be routed around the faults, which in turn increases congestion around the faulty regions. We propose four methods to tolerate faults at the link and switch level by using only the shortest paths as long as such path exists. The unique characteristic among these methods is the toleration of faults while also maintaining the performance of NoCs. To the best of our knowledge, these algorithms are the first approaches to bypassing faults prior to reaching them while avoiding unnecessary misrouting of packets. Current implementations of multicast communication result in a significant performance loss for unicast traffic. This is due to the fact that the routing rules of multicast packets limit the adaptivity of unicast packets. We present an approach in which both unicast and multicast packets can be efficiently routed within the network. While suggesting a more efficient multicast support, the proposed approach does not affect the performance of unicast routing at all. In addition, in order to reduce the overall path length of multicast packets, we present several partitioning methods along with their analytical models for latency measurement. This approach is discussed in the context of 3D mesh networks.Siirretty Doriast

    Domain-specific Architectures for Data-intensive Applications

    Full text link
    Graphs' versatile ability to represent diverse relationships, make them effective for a wide range of applications. For instance, search engines use graph-based applications to provide high-quality search results. Medical centers use them to aid in patient diagnosis. Most recently, graphs are also being employed to support the management of viral pandemics. Looking forward, they are showing promise of being critical in unlocking several other opportunities, including combating the spread of fake content in social networks, detecting and preventing fraudulent online transactions in a timely fashion, and in ensuring collision avoidance in autonomous vehicle navigation, to name a few. Unfortunately, all these applications require more computational power than what can be provided by conventional computing systems. The key reason is that graph applications present large working sets that fail to fit in the small on-chip storage of existing computing systems, while at the same time they access data in seemingly unpredictable patterns, thus cannot draw benefit from traditional on-chip storage. In this dissertation, we set out to address the performance limitations of existing computing systems so to enable emerging graph applications like those described above. To achieve this, we identified three key strategies: 1) specializing memory architecture, 2) processing data near its storage, and 3) message coalescing in the network. Based on these strategies, this dissertation develops several solutions: OMEGA, which employs specialized on-chip storage units, with co-located specialized compute engines to accelerate the computation; MessageFusion, which coalesces messages in the interconnect; and Centaur, providing an architecture that optimizes the processing of infrequently-accessed data. Overall, these solutions provide 2x in performance improvements, with negligible hardware overheads, across a wide range of applications. Finally, we demonstrate the applicability of our strategies to other data-intensive domains, by exploring an acceleration solution for MapReduce applications, which achieves a 4x performance speedup, also with negligible area and power overheads.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/163186/1/abrahad_1.pd

    Network Interface Design for Network-on-Chip

    Get PDF
    In the culture of globalized integrated circuit (IC, a.k.a chip) production, the use of Intellectual Property (IP) cores, computer aided design tools (CAD) and testing services from un-trusted vendors are prevalent to reduce the time to market. Unfortunately, the globalized business model potentially creates opportunities for hardware tampering and modification from adversary, and this tampering is known as hardware Trojan (HT). Network-on-chip (NoC) has emerged as an efficient on-chip communication infrastructure. In this work, the security aspects of NoC network interface (NI), one of the most critical components in NoC will be investigated and presented. Particularly, the NI design, hardware attack models and countermeasures for NI in a NoC system are explored. An OCP compatible NI is implemented in an IBM0.18ìm CMOS technology. The synthesis results are presented and compared with existing literature. Second, comprehensive hardware attack models targeted for NI are presented from system level to circuit level. The impact of hardware Trojans on NoC functionality and performance are evaluated. Finally, a countermeasure method is proposed to address the hardware attacks in NIs

    AI/ML Algorithms and Applications in VLSI Design and Technology

    Full text link
    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual; thus, time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. It, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced in the past towards VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations

    Reliability-aware and energy-efficient system level design for networks-on-chip

    Get PDF
    2015 Spring.Includes bibliographical references.With CMOS technology aggressively scaling into the ultra-deep sub-micron (UDSM) regime and application complexity growing rapidly in recent years, processors today are being driven to integrate multiple cores on a chip. Such chip multiprocessor (CMP) architectures offer unprecedented levels of computing performance for highly parallel emerging applications in the era of digital convergence. However, a major challenge facing the designers of these emerging multicore architectures is the increased likelihood of failure due to the rise in transient, permanent, and intermittent faults caused by a variety of factors that are becoming more and more prevalent with technology scaling. On-chip interconnect architectures are particularly susceptible to faults that can corrupt transmitted data or prevent it from reaching its destination. Reliability concerns in UDSM nodes have in part contributed to the shift from traditional bus-based communication fabrics to network-on-chip (NoC) architectures that provide better scalability, performance, and utilization than buses. In this thesis, to overcome potential faults in NoCs, my research began by exploring fault-tolerant routing algorithms. Under the constraint of deadlock freedom, we make use of the inherent redundancy in NoCs due to multiple paths between packet sources and sinks and propose different fault-tolerant routing schemes to achieve much better fault tolerance capabilities than possible with traditional routing schemes. The proposed schemes also use replication opportunistically to optimize the balance between energy overhead and arrival rate. As 3D integrated circuit (3D-IC) technology with wafer-to-wafer bonding has been recently proposed as a promising candidate for future CMPs, we also propose a fault-tolerant routing scheme for 3D NoCs which outperforms the existing popular routing schemes in terms of energy consumption, performance and reliability. To quantify reliability and provide different levels of intelligent protection, for the first time, we propose the network vulnerability factor (NVF) metric to characterize the vulnerability of NoC components to faults. NVF determines the probabilities that faults in NoC components manifest as errors in the final program output of the CMP system. With NVF aware partial protection for NoC components, almost 50% energy cost can be saved compared to the traditional approach of comprehensively protecting all NoC components. Lastly, we focus on the problem of fault-tolerant NoC design, that involves many NP-hard sub-problems such as core mapping, fault-tolerant routing, and fault-tolerant router configuration. We propose a novel design-time (RESYN) and a hybrid design and runtime (HEFT) synthesis framework to trade-off energy consumption and reliability in the NoC fabric at the system level for CMPs. Together, our research in fault-tolerant NoC routing, reliability modeling, and reliability aware NoC synthesis substantially enhances NoC reliability and energy-efficiency beyond what is possible with traditional approaches and state-of-the-art strategies from prior work

    On Dynamic Monitoring Methods for Networks-on-Chip

    Get PDF
    Rapid ongoing evolution of multiprocessors will lead to systems with hundreds of processing cores integrated in a single chip. An emerging challenge is the implementation of reliable and efficient interconnection between these cores as well as other components in the systems. Network-on-Chip is an interconnection approach which is intended to solve the performance bottleneck caused by traditional, poorly scalable communication structures such as buses. However, a large on-chip network involves issues related to congestion problems and system control, for instance. Additionally, faults can cause problems in multiprocessor systems. These faults can be transient faults, permanent manufacturing faults, or they can appear due to aging. To solve the emerging traffic management, controllability issues and to maintain system operation regardless of faults a monitoring system is needed. The monitoring system should be dynamically applicable to various purposes and it should fully cover the system under observation. In a large multiprocessor the distances between components can be relatively long. Therefore, the system should be designed so that the amount of energy-inefficient long-distance communication is minimized. This thesis presents a dynamically clustered distributed monitoring structure. The monitoring is distributed so that no centralized control is required for basic tasks such as traffic management and task mapping. To enable extensive analysis of different Network-on-Chip architectures, an in-house SystemC based simulation environment was implemented. It allows transaction level analysis without time consuming circuit level implementations during early design phases of novel architectures and features. The presented analysis shows that the dynamically clustered monitoring structure can be efficiently utilized for traffic management in faulty and congested Network-on-Chip-based multiprocessor systems. The monitoring structure can be also successfully applied for task mapping purposes. Furthermore, the analysis shows that the presented in-house simulation environment is flexible and practical tool for extensive Network-on-Chip architecture analysis.Siirretty Doriast

    Floorplan-Aware High Performance NoC Design

    Full text link
    Las actuales arquitecturas de m�ltiples n�cleos como los chip multiprocesadores (CMP) y soluciones multiprocesador para sistemas dentro del chip (MPSoCs) han adoptado a las redes dentro del chip (NoC) como elemento -ptimo para la inter-conexi-n de los diversos elementos de dichos sistemas. En este sentido, fabricantes de CMPs y MPSoCs han adoptado NoCs sencillas, generalmente con una topolog'a en malla o anillo, ya que son suficientes para satisfacer las necesidades de los sistemas actuales. Sin embargo a medida que los requerimientos del sistema -- baja latencia y alto rendimiento -- se hacen m�s exigentes, estas redes tan simples dejan de ser una soluci-n real. As', la comunidad investigadora ha propuesto y analizado NoCs m�s complejas. No obstante, estas soluciones son m�s dif'ciles de implementar -- especialmente los enlaces largos -- haciendo que este tipo de topolog'as complejas sean demasiado costosas o incluso inviables. En esta tesis, presentamos una metodolog'a de dise-o que minimiza la p�rdida de prestaciones de la red debido a su implementaci-n real. Los principales problemas que se encuentran al implementar una NoC son los conmutadores y los enlaces largos. En esta tesis, el conmutador se ha hecho modular, es decir, formado como uni-n de m-dulos m�s peque-os. En nuestro caso, los m-dulos son id�nticos, donde cada m-dulo es capaz de arbitrar, conmutar, y almacenar los mensajes que le llegan. Posteriormente, flexibilizamos la colocaci-n de estos m-dulos en el chip, permitiendo que m-dulos de un mismo conmutador est�n distribuidos por el chip. Esta metodolog'a de dise-o la hemos aplicado a diferentes escenarios. Primeramente, hemos introducido nuestro conmutador modular en NoCs con topolog'as conocidas como la malla 2D. Los resultados muestran como la modularidad y la distribuci-n del conmutador reducen la latencia y el consumo de potencia de la red. En segundo lugar, hemos utilizado nuestra metodolog'a de dise-o para implementar un crossbar distribuidRoca Pérez, A. (2012). Floorplan-Aware High Performance NoC Design [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17844Palanci

    Reducing Internet Latency : A Survey of Techniques and their Merit

    Get PDF
    Bob Briscoe, Anna Brunstrom, Andreas Petlund, David Hayes, David Ros, Ing-Jyh Tsang, Stein Gjessing, Gorry Fairhurst, Carsten Griwodz, Michael WelzlPeer reviewedPreprin
    corecore