10 research outputs found

    Efficient techniques to provide scalability for token-based cache coherence protocols

    Full text link
    Cache coherence protocols based on tokens can provide low latency without relying on non-scalable interconnects thanks to the use of efficient requests that are unordered. However, when these unordered requests contend for the same memory block, they may cause protocols races. To resolve the races and ensure the completion of all the cache misses, token protocols use a starvation prevention mechanism that is inefficient and non-scalable in terms of required storage structures and generated traffic. Besides, token protocols use non-silent invalidations which increase the latency of write misses proportionally to the system size. All these problems make token protocols non-scalable. To overcome the main problems of token protocols and increase their scalability, we propose a new starvation prevention mechanism named Priority Requests. This mechanism resolves contention by an efficient, elegant, and flexible method based on ordered requests. Furthermore, thanks to Priority Requests, efficient techniques can be applied to limit the storage requirements of the starvation prevention mechanism, to reduce the total traffic generated for managing protocol races, and to reduce the latency of write misses. Thus, the main problems of token protocols can be solved, which, in turn, contributes to wide their efficiency and scalability.Cuesta Sáez, BA. (2009). Efficient techniques to provide scalability for token-based cache coherence protocols [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/6024Palanci

    Broadcast-oriented wireless network-on-chip : fundamentals and feasibility

    Get PDF
    Premi extraordinari doctorat UPC curs 2015-2016, àmbit Enginyeria de les TICRecent years have seen the emergence and ubiquitous adoption of Chip Multiprocessors (CMPs), which rely on the coordinated operation of multiple execution units or cores. Successive CMP generations integrate a larger number of cores seeking higher performance with a reasonable cost envelope. For this trend to continue, however, important scalability issues need to be solved at different levels of design. Scaling the interconnect fabric is a grand challenge by itself, as new Network-on-Chip (NoC) proposals need to overcome the performance hurdles found when dealing with the increasingly variable and heterogeneous communication demands of manycore processors. Fast and flexible NoC solutions are needed to prevent communication become a performance bottleneck, situation that would severely limit the design space at the architectural level and eventually lead to the use of software frameworks that are slow, inefficient, or less programmable. The emergence of novel interconnect technologies has opened the door to a plethora of new NoCs promising greater scalability and architectural flexibility. In particular, wireless on-chip communication has garnered considerable attention due to its inherent broadcast capabilities, low latency, and system-level simplicity. Most of the resulting Wireless Network-on-Chip (WNoC) proposals have set the focus on leveraging the latency advantage of this paradigm by creating multiple wireless channels to interconnect far-apart cores. This strategy is effective as the complement of wired NoCs at moderate scales, but is likely to be overshadowed at larger scales by technologies such as nanophotonics unless bandwidth is unrealistically improved. This dissertation presents the concept of Broadcast-Oriented Wireless Network-on-Chip (BoWNoC), a new approach that attempts to foster the inherent simplicity, flexibility, and broadcast capabilities of the wireless technology by integrating one on-chip antenna and transceiver per processor core. This paradigm is part of a broader hybrid vision where the BoWNoC serves latency-critical and broadcast traffic, tightly coupled to a wired plane oriented to large flows of data. By virtue of its scalable broadcast support, BoWNoC may become the key enabler of a wealth of unconventional hardware architectures and algorithmic approaches, eventually leading to a significant improvement of the performance, energy efficiency, scalability and programmability of manycore chips. The present work aims not only to lay the fundamentals of the BoWNoC paradigm, but also to demonstrate its viability from the electronic implementation, network design, and multiprocessor architecture perspectives. An exploration at the physical level of design validates the feasibility of the approach at millimeter-wave bands in the short term, and then suggests the use of graphene-based antennas in the terahertz band in the long term. At the link level, this thesis provides an insightful context analysis that is used, afterwards, to drive the design of a lightweight protocol that reliably serves broadcast traffic with substantial latency improvements over state-of-the-art NoCs. At the network level, our hybrid vision is evaluated putting emphasis on the flexibility provided at the network interface level, showing outstanding speedups for a wide set of traffic patterns. At the architecture level, the potential impact of the BoWNoC paradigm on the design of manycore chips is not only qualitatively discussed in general, but also quantitatively assessed in a particular architecture for fast synchronization. Results demonstrate that the impact of BoWNoC can go beyond simply improving the network performance, thereby representing a possible game changer in the manycore era.Avenços en el disseny de multiprocessadors han portat a una àmplia adopció dels Chip Multiprocessors (CMPs), que basen el seu potencial en la operació coordinada de múltiples nuclis de procés. Generacions successives han anat integrant més nuclis en la recerca d'alt rendiment amb un cost raonable. Per a que aquesta tendència continuï, però, cal resoldre importants problemes d'escalabilitat a diferents capes de disseny. Escalar la xarxa d'interconnexió és un gran repte en ell mateix, ja que les noves propostes de Networks-on-Chip (NoC) han de servir un tràfic eminentment variable i heterogeni dels processadors amb molts nuclis. Són necessàries solucions ràpides i flexibles per evitar que les comunicacions dins del xip es converteixin en el pròxim coll d'ampolla de rendiment, situació que limitaria en gran mesura l'espai de disseny a nivell d'arquitectura i portaria a l'ús d'arquitectures i models de programació lents, ineficients o poc programables. L'aparició de noves tecnologies d'interconnexió ha possibilitat la creació de NoCs més flexibles i escalables. En particular, la comunicació intra-xip sense fils ha despertat un interès considerable en virtut de les seva baixa latència, simplicitat, i bon rendiment amb tràfic broadcast. La majoria de les Wireless NoC (WNoC) proposades fins ara s'han centrat en aprofitar l'avantatge en termes de latència d'aquest nou paradigma creant múltiples canals sense fils per interconnectar nuclis allunyats entre sí. Aquesta estratègia és efectiva per complementar a NoCs clàssiques en escales mitjanes, però és probable que altres tecnologies com la nanofotònica puguin jugar millor aquest paper a escales més grans. Aquesta tesi presenta el concepte de Broadcast-Oriented WNoC (BoWNoC), un nou enfoc que intenta rendibilitzar al màxim la inherent simplicitat, flexibilitat, i capacitats broadcast de la tecnologia sense fils integrant una antena i transmissor/receptor per cada nucli del processador. Aquest paradigma forma part d'una visió més àmplia on un BoWNoC serviria tràfic broadcast i urgent, mentre que una xarxa convencional serviria fluxos de dades més pesats. En virtut de la escalabilitat i del seu suport broadcast, BoWNoC podria convertir-se en un element clau en una gran varietat d'arquitectures i algoritmes poc convencionals que milloressin considerablement el rendiment, l'eficiència, l'escalabilitat i la programabilitat de processadors amb molts nuclis. El present treball té com a objectius no només estudiar els aspectes fonamentals del paradigma BoWNoC, sinó també demostrar la seva viabilitat des dels punts de vista de la implementació, i del disseny de xarxa i arquitectura. Una exploració a la capa física valida la viabilitat de l'enfoc usant tecnologies longituds d'ona milimètriques en un futur proper, i suggereix l'ús d'antenes de grafè a la banda dels terahertz ja a més llarg termini. A capa d'enllaç, la tesi aporta una anàlisi del context de l'aplicació que és, més tard, utilitzada per al disseny d'un protocol d'accés al medi que permet servir tràfic broadcast a baixa latència i de forma fiable. A capa de xarxa, la nostra visió híbrida és avaluada posant èmfasi en la flexibilitat que aporta el fet de prendre les decisions a nivell de la interfície de xarxa, mostrant grans millores de rendiment per una àmplia selecció de patrons de tràfic. A nivell d'arquitectura, l'impacte que el concepte de BoWNoC pot tenir sobre el disseny de processadors amb molts nuclis no només és debatut de forma qualitativa i genèrica, sinó també avaluat quantitativament per una arquitectura concreta enfocada a la sincronització. Els resultats demostren que l'impacte de BoWNoC pot anar més enllà d'una millora en termes de rendiment de xarxa; representant, possiblement, un canvi radical a l'era dels molts nuclisAward-winningPostprint (published version

    Broadcast-oriented wireless network-on-chip : fundamentals and feasibility

    Get PDF
    Premi extraordinari doctorat UPC curs 2015-2016, àmbit Enginyeria de les TICRecent years have seen the emergence and ubiquitous adoption of Chip Multiprocessors (CMPs), which rely on the coordinated operation of multiple execution units or cores. Successive CMP generations integrate a larger number of cores seeking higher performance with a reasonable cost envelope. For this trend to continue, however, important scalability issues need to be solved at different levels of design. Scaling the interconnect fabric is a grand challenge by itself, as new Network-on-Chip (NoC) proposals need to overcome the performance hurdles found when dealing with the increasingly variable and heterogeneous communication demands of manycore processors. Fast and flexible NoC solutions are needed to prevent communication become a performance bottleneck, situation that would severely limit the design space at the architectural level and eventually lead to the use of software frameworks that are slow, inefficient, or less programmable. The emergence of novel interconnect technologies has opened the door to a plethora of new NoCs promising greater scalability and architectural flexibility. In particular, wireless on-chip communication has garnered considerable attention due to its inherent broadcast capabilities, low latency, and system-level simplicity. Most of the resulting Wireless Network-on-Chip (WNoC) proposals have set the focus on leveraging the latency advantage of this paradigm by creating multiple wireless channels to interconnect far-apart cores. This strategy is effective as the complement of wired NoCs at moderate scales, but is likely to be overshadowed at larger scales by technologies such as nanophotonics unless bandwidth is unrealistically improved. This dissertation presents the concept of Broadcast-Oriented Wireless Network-on-Chip (BoWNoC), a new approach that attempts to foster the inherent simplicity, flexibility, and broadcast capabilities of the wireless technology by integrating one on-chip antenna and transceiver per processor core. This paradigm is part of a broader hybrid vision where the BoWNoC serves latency-critical and broadcast traffic, tightly coupled to a wired plane oriented to large flows of data. By virtue of its scalable broadcast support, BoWNoC may become the key enabler of a wealth of unconventional hardware architectures and algorithmic approaches, eventually leading to a significant improvement of the performance, energy efficiency, scalability and programmability of manycore chips. The present work aims not only to lay the fundamentals of the BoWNoC paradigm, but also to demonstrate its viability from the electronic implementation, network design, and multiprocessor architecture perspectives. An exploration at the physical level of design validates the feasibility of the approach at millimeter-wave bands in the short term, and then suggests the use of graphene-based antennas in the terahertz band in the long term. At the link level, this thesis provides an insightful context analysis that is used, afterwards, to drive the design of a lightweight protocol that reliably serves broadcast traffic with substantial latency improvements over state-of-the-art NoCs. At the network level, our hybrid vision is evaluated putting emphasis on the flexibility provided at the network interface level, showing outstanding speedups for a wide set of traffic patterns. At the architecture level, the potential impact of the BoWNoC paradigm on the design of manycore chips is not only qualitatively discussed in general, but also quantitatively assessed in a particular architecture for fast synchronization. Results demonstrate that the impact of BoWNoC can go beyond simply improving the network performance, thereby representing a possible game changer in the manycore era.Avenços en el disseny de multiprocessadors han portat a una àmplia adopció dels Chip Multiprocessors (CMPs), que basen el seu potencial en la operació coordinada de múltiples nuclis de procés. Generacions successives han anat integrant més nuclis en la recerca d'alt rendiment amb un cost raonable. Per a que aquesta tendència continuï, però, cal resoldre importants problemes d'escalabilitat a diferents capes de disseny. Escalar la xarxa d'interconnexió és un gran repte en ell mateix, ja que les noves propostes de Networks-on-Chip (NoC) han de servir un tràfic eminentment variable i heterogeni dels processadors amb molts nuclis. Són necessàries solucions ràpides i flexibles per evitar que les comunicacions dins del xip es converteixin en el pròxim coll d'ampolla de rendiment, situació que limitaria en gran mesura l'espai de disseny a nivell d'arquitectura i portaria a l'ús d'arquitectures i models de programació lents, ineficients o poc programables. L'aparició de noves tecnologies d'interconnexió ha possibilitat la creació de NoCs més flexibles i escalables. En particular, la comunicació intra-xip sense fils ha despertat un interès considerable en virtut de les seva baixa latència, simplicitat, i bon rendiment amb tràfic broadcast. La majoria de les Wireless NoC (WNoC) proposades fins ara s'han centrat en aprofitar l'avantatge en termes de latència d'aquest nou paradigma creant múltiples canals sense fils per interconnectar nuclis allunyats entre sí. Aquesta estratègia és efectiva per complementar a NoCs clàssiques en escales mitjanes, però és probable que altres tecnologies com la nanofotònica puguin jugar millor aquest paper a escales més grans. Aquesta tesi presenta el concepte de Broadcast-Oriented WNoC (BoWNoC), un nou enfoc que intenta rendibilitzar al màxim la inherent simplicitat, flexibilitat, i capacitats broadcast de la tecnologia sense fils integrant una antena i transmissor/receptor per cada nucli del processador. Aquest paradigma forma part d'una visió més àmplia on un BoWNoC serviria tràfic broadcast i urgent, mentre que una xarxa convencional serviria fluxos de dades més pesats. En virtut de la escalabilitat i del seu suport broadcast, BoWNoC podria convertir-se en un element clau en una gran varietat d'arquitectures i algoritmes poc convencionals que milloressin considerablement el rendiment, l'eficiència, l'escalabilitat i la programabilitat de processadors amb molts nuclis. El present treball té com a objectius no només estudiar els aspectes fonamentals del paradigma BoWNoC, sinó també demostrar la seva viabilitat des dels punts de vista de la implementació, i del disseny de xarxa i arquitectura. Una exploració a la capa física valida la viabilitat de l'enfoc usant tecnologies longituds d'ona milimètriques en un futur proper, i suggereix l'ús d'antenes de grafè a la banda dels terahertz ja a més llarg termini. A capa d'enllaç, la tesi aporta una anàlisi del context de l'aplicació que és, més tard, utilitzada per al disseny d'un protocol d'accés al medi que permet servir tràfic broadcast a baixa latència i de forma fiable. A capa de xarxa, la nostra visió híbrida és avaluada posant èmfasi en la flexibilitat que aporta el fet de prendre les decisions a nivell de la interfície de xarxa, mostrant grans millores de rendiment per una àmplia selecció de patrons de tràfic. A nivell d'arquitectura, l'impacte que el concepte de BoWNoC pot tenir sobre el disseny de processadors amb molts nuclis no només és debatut de forma qualitativa i genèrica, sinó també avaluat quantitativament per una arquitectura concreta enfocada a la sincronització. Els resultats demostren que l'impacte de BoWNoC pot anar més enllà d'una millora en termes de rendiment de xarxa; representant, possiblement, un canvi radical a l'era dels molts nuclisAward-winningPostprint (published version

    Fault-Tolerant Coherence Protocol for Distributed Shared Memory Systems

    Get PDF
    Distributed Shared Memory (DSM) systems are becoming increasingly more significant as a result of being used more extensively in modern computing environments. DSM gives the illusion of shared memory on a loosely couple system. In a scenario where systems are connected across a network, DSM coherence protocols should be able to scale well to larger networks. When real-time applications run on distributed systems, providing a high degree of reliability is an inherently error-prone environment is a formidable task. Regardless, fault tolerance, in terms of highly available data-access and uninterrupted service, should be provided. Recovery is the process of restoring a system to its normal operational state in the event of a failure. Reliability ensures the consistency of the data after recovery. Existing DSM systems provide reliability by replicating data, either in stable storage or in the main memories of different processors. But these systems suspend the DSM service during recovery. In time-critical applications, providing uninterrupted DSM service to the greatest possible extent is a necessity and a challenge. Research has been reported in the literature on architectures with a single server and multiple clients. This thesis reports on investigations on finding a solution where the server is not a single point of failure, faster recovery is possible in the event of a failure, and increased throughput can be obtained during normal operation of the system. It was found that better performance can be obtained by using the multi-server protocol when the user application exhibits locality of reference. The server was not a single point of failure and recovery form a single site failure was appoximately 50% faster when 2 servers, instead of 1, were used

    Mobility-based Routing Overhead Management in Reconfigurable Wireless Ad hoc Networks

    Get PDF
    Mobility-Based Routing Overhead Management in Reconfigurable Wireless Ad Hoc Networks Routing Overheads are the non-data message packets whose roles are establishment and maintenance of routes for data packets as well as neighbourhood discovery and maintenance. They have to be broadcasted in the network either through flooding or other techniques that can ensure that a path exists before data packets can be sent to various destinations. They can be sent reactively or periodically to neighbours so as to keep nodes updated on their neighbourhoods. While we cannot do without these overhead packets, they occupy much of the limited wireless bandwidth available in wireless networks. In a reconfigurable wireless ad hoc network scenario, these packets have more negative effects, as links need to be confirmed more frequently than in traditional networks mainly because of the unpredictable behaviour of the ad hoc networks. We therefore need suitable algorithms that will manage these overheads so as to allow data packet to have more access to the wireless medium, save node energy for longer life of the network, increased efficiency, and scalability. Various protocols have been suggested in the research area. They mostly address routing overheads for suitability of particular protocols leading to lack of standardisation and inapplicability to other protocol classes. In this dissertation ways of ensuring that the routing overheads are kept low are investigated. The issue is addressed both at node and network levels with a common goal of improving efficiency and performance of ad hoc networks without dedicating ourselves to a particular class of routing protocol. At node level, a method hereby referred to as "link availability forecast", that minimises routing overheads used for maintenance of neighbourhood, is derived. The targeted packets are packets that are broadcasted periodically (e.g. hello messages). The basic idea in this method is collection of mobility parameters from the neighbours and predictions or forecasts of these parameters in future. Using these parameters in simple calculations helps in identifying link availabilities between nodes participating in maintenance of networks backbone. At the network level, various approaches have been suggested. The first approach is the cone flooding method that broadcasts route request messages through a predetermined cone shaped region. This region is determined through computation using last known mobility parameters of the destination. Another approach is what is hereby referred as "destination search reverse zone method". In this method, a node will keep routes to destinations for a long time and use these routes for tracing the destination. The destination will then initiate route search in a reverse manner, whereby the source selects the best route for next delivery. A modification to this method is for the source node to determine the zone of route search and define the boundaries within which the packet should be broadcasted. The later method has been used for simulation purposes. The protocol used for verification of the improvements offered by the schemes was the AODV. The link availability forecast scheme was implemented on the AODV and labelled AODV_LA while the network level implementation was labelled AODV_RO. A combination of the two schemes was labelled AODV_LARO

    Energy-efficient electrical and silicon-photonic networks in many core systems

    Full text link
    Thesis (Ph.D.)--Boston UniversityDuring the past decade, the very large scale integration (VLSI) community has migrated towards incorporating multiple cores on a single chip to sustain the historic performance improvement in computing systems. As the core count continuously increases, the performance of network-on-chip (NoC), which is responsible for the communication between cores, caches and memory controllers, is increasingly becoming critical for sustaining the performance improvement. In this dissertation, we propose several methods to improve the energy efficiency of both electrical and silicon-photonic NoCs. Firstly, for electrical NoC, we propose a flow control technique, Express Virtual Channel with Taps (EVC-T), to transmit both broadcast and data packets efficiently in a mesh network. A low-latency notification tree network is included to maintain t he order of broadcast packets. The EVC-T technique improves the NoC latency by 24% and the system energy efficiency in terms of energy-delay product (EDP) by 13%. In the near future, the silicon-photonic links are projected to replace the electrical links for global on-chip communication due to their lower data-dependent power and higher bandwidth density, but the high laser power can more than offset these advantages. Therefore, we propose a silicon-photonic multi-bus NoC architecture and a methodology that can reduce the laser power by 49% on average through bandwidth reconfiguration at runtime based on the variations in bandwidth requirements of applications. We also propose a technique to reduce the laser power by dynamically activating/deactivating the 12 cache banks and switching ON/ OFF the corresponding silicon-photonic links in a crossbar NoC. This cache-reconfiguration based technique can save laser power by 23.8% and improves system EDP by 5.52% on average. In addition, we propose a methodology for placing and sharing on-chip laser sources by jointly considering the bandwidth requirements, thermal constraints and physical layout constraints. Our proposed methodology for placing and sharing of on-chip laser sources reduces laser power. In addition to reducing the laser power to improve the energy efficiency of silicon-photonic NoCs, we propose to leverage the large bandwidth provided by silicon-photonic NoC to share computing resources. The global sharing of floating-point units can save system area by 13.75% and system power by 10%

    Intellectual Property: Law & The Information Society Selected Statutes & Treaties

    Get PDF
    This book is a collection of the primary sources of Federal intellectual property law, the set of private legal rights that allows individuals and corporations to control intangible creations and marks—from drugs, to books, to logos—and the exceptions and limitations that define those rights

    Intellectual Property: Law & The Information Society

    Get PDF
    This book is an introduction to intellectual property law, the set of private legal rights that allows individuals and corporations to control intangible creations and marks—from logos to novels to drug formulae—and the exceptions and limitations that define those rights. It focuses on the three main forms of US federal intellectual property—trademark, copyright and patent—but many of the ideas discussed here apply far beyond those legal areas and far beyond the law of the United States

    Evaluation of Spatio-Temporal Queries in Sensor Networks

    Get PDF
    corecore