NoCo: ILP-based worst-case contention estimation for mesh real-time manycores
Manycores can meet the computational demands of functionally advanced critical applications in domains such as automotive and avionics. In manycores a network-on-chip (NoC) provides access to shared caches and memories and hence concentrates most of the contention that tasks suffer, with effects on the worst-case contention delay (WCD) of packets and on tasks' WCET. While several proposals minimize the impact of individual NoC parameters on WCD, e.g. mapping and routing, there are strong dependences among these NoC parameters. Hence, finding the optimal NoC configuration requires optimizing all parameters simultaneously, which is a multidimensional optimization problem. In this paper we propose NoCo, a novel approach that combines ILP and stochastic optimization to find NoC configurations in terms of packet routing, application mapping, and arbitration weight allocation. Our results show that NoCo improves on other techniques that optimize only a subset of NoC parameters.
This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2015-65316-P and the HiPEAC Network of Excellence. It also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (agreement No. 772773). Carles Hernández is jointly supported by the MINECO and FEDER funds through grant TIN2014-60404-JIN. Jaume Abella has been partially supported by the Spanish Ministry of Economy and Competitiveness under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717. Enrico Mezzetti has been partially supported by the Spanish Ministry of Economy and Competitiveness under Juan de la Cierva-Incorporación postdoctoral fellowship number IJCI-2016-27396. Peer Reviewed. Postprint (author's final draft)
Voltage island based heterogeneous NoC design through constraint programming
This paper discusses heterogeneous Network-on-Chip (NoC) design from a Constraint Programming (CP) perspective and extends the formulation to solve the Voltage-Frequency Island (VFI) problem. In general, VFI is a superior design alternative in terms of thermal constraints, power consumption and performance. Given a Communication Task Graph (CTG) and the subsequent assignment of tasks to cores, the first stage places the cores at the best possible positions on the chip to minimize the overall communication cost among them. We then solve the application scheduling problem to determine the optimum core types from a list of technological alternatives and to minimize the makespan. Moreover, an elegant CP model is proposed to solve the VFI problem by mapping and grouping cores at the same time as scheduling the computation tasks, formulated as a limited-capacity resource allocation model. The paper reports results based on real benchmark datasets from the literature. © 2014 Elsevier Ltd. All rights reserved
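The first stage described in this abstract (placing cores so that the total communication cost is minimal) can be illustrated with a small sketch. It is not the paper's CP model: exhaustive search stands in for the constraint solver, and the cost function (traffic volume times Manhattan hop distance on a mesh), the function names and the example traffic values are all assumptions made for illustration.

```python
from itertools import permutations

def manhattan(a, b):
    """Hop distance between two mesh tiles given as (x, y) coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def best_mapping(traffic, tiles):
    """Exhaustively place cores on tiles to minimize volume * hop distance.

    traffic: dict mapping (src_core, dst_core) -> communication volume
    tiles:   list of (x, y) tile coordinates, at least one per core
    """
    cores = sorted({core for edge in traffic for core in edge})
    best_cost, best_placement = None, None
    for perm in permutations(tiles, len(cores)):
        placement = dict(zip(cores, perm))
        cost = sum(vol * manhattan(placement[s], placement[d])
                   for (s, d), vol in traffic.items())
        if best_cost is None or cost < best_cost:
            best_cost, best_placement = cost, placement
    return best_cost, best_placement

# Hypothetical communication task graph on a 2x2 mesh (illustrative values).
traffic = {("A", "B"): 10, ("B", "C"): 1, ("C", "D"): 10, ("A", "D"): 1}
tiles = [(0, 0), (0, 1), (1, 0), (1, 1)]
cost, placement = best_mapping(traffic, tiles)
print(cost, placement)
```

Exhaustive search only works for a handful of cores, since the number of placements grows factorially; that growth is the reason a CP or ILP formulation is attractive for realistic sizes.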
Automatic synthesis and optimization of chip multiprocessors
The microprocessor technology has experienced an enormous growth during the last decades. Rapid downscale of the CMOS technology has led to higher operating frequencies and performance densities, facing the fundamental issue of power dissipation. Chip Multiprocessors (CMPs) have become the latest paradigm to improve the power-performance efficiency of computing systems by exploiting the parallelism inherent in applications. Industrial and prototype implementations have already demonstrated the benefits achieved by CMPs with hundreds of cores.
CMP architects are challenged to take many complex design decisions. Only a few of them are:
- What should be the ratio between the core and cache areas on a chip?
- Which core architectures to select?
- How many cache levels should the memory subsystem have?
- Which interconnect topologies provide efficient on-chip communication?
These and many other aspects create a complex multidimensional space for architectural exploration. Design Automation tools become essential to make the architectural exploration feasible under the hard time-to-market constraints. The exploration methods have to be efficient and scalable to handle future generation on-chip architectures with hundreds or thousands of cores.
Furthermore, once a CMP has been fabricated, the need for efficient deployment of the many-core processor arises. Intelligent techniques for task mapping and scheduling onto CMPs are necessary to guarantee the full usage of the benefits brought by the many-core technology. These techniques have to consider the peculiarities of the modern architectures, such as availability of enhanced power saving techniques and presence of complex memory hierarchies.
This thesis has several objectives. The first objective is to elaborate the methods for efficient analytical modeling and architectural design space exploration of CMPs. The efficiency is achieved by using analytical models instead of simulation, and replacing the exhaustive exploration with an intelligent search strategy. Additionally, these methods incorporate high-level models for physical planning. The related contributions are described in Chapters 3, 4 and 5 of the document.
The second objective of this work is to propose a scalable task mapping algorithm onto general-purpose CMPs with power management techniques, for efficient deployment of many-core systems. This contribution is explained in Chapter 6 of this document.
Finally, the third objective of this thesis is to address the issues of the on-chip interconnect design and exploration, by developing a model for simultaneous topology customization and deadlock-free routing in Networks-on-Chip. The developed methodology can be applied to various classes of the on-chip systems, ranging from general-purpose chip multiprocessors to application-specific solutions. Chapter 7 describes the proposed model.
The presented methods have been thoroughly tested experimentally and the results are described in this dissertation. At the end of the document several possible directions for the future research are proposed
Network-on-Chip
Addresses the Challenges Associated with System-on-Chip Integration. Network-on-Chip: The Next Generation of System-on-Chip Integration examines the current issues restricting chip-on-chip communication efficiency, and explores Network-on-chip (NoC), a promising alternative that equips designers with the capability to produce a scalable, reusable, and high-performance communication backbone by allowing for the integration of a large number of cores on a single system-on-chip (SoC). This book provides a basic overview of topics associated with NoC-based design: communication infrastructure design, communication methodology, evaluation framework, and mapping of applications onto NoC. It details the design and evaluation of different proposed NoC structures, low-power techniques, signal integrity and reliability issues, application mapping, testing, and future trends. Utilizing examples of chips that have been implemented in industry and academia, this text presents the full architectural design of components verified through implementation in industrial CAD tools. It describes NoC research and developments, incorporates theoretical proofs strengthening the analysis procedures, and includes algorithms used in NoC design and synthesis. In addition, it considers other upcoming NoC issues, such as low-power NoC design, signal integrity issues, NoC testing, reconfiguration, synthesis, and 3-D NoC design.
This text comprises 12 chapters and covers:
- The evolution of NoC from SoC, with its research and developmental challenges
- NoC protocols, elaborating flow control, available network topologies, routing mechanisms, fault tolerance, quality-of-service support, and the design of network interfaces
- The router design strategies followed in NoCs
- The evaluation mechanism of NoC architectures
- The application mapping strategies followed in NoCs
- Low-power design techniques specifically followed in NoCs
- The signal integrity and reliability issues of NoC
- The details of NoC testing strategies reported so far
- The problem of synthesizing application-specific NoCs
- Reconfigurable NoC design issues
- Direction of future research and development in the field of NoC
Network-on-Chip: The Next Generation of System-on-Chip Integration covers the basic topics, technology, and future trends relevant to NoC-based design, and can be used by engineers, students, researchers and other industry professionals interested in computer architecture, embedded systems, and parallel/distributed systems
Cross-layer modeling and optimization of next-generation internet networks
Scaling traditional telecommunication networks so that they are able to cope with the volume of future traffic demands and the stringent European Commission (EC) regulations on emissions would entail unaffordable investments. For this very reason, the design of an innovative ultra-high bandwidth power-efficient network architecture is nowadays a bold topic within the research community. So far, the independent evolution of network layers has resulted in isolated, and hence, far-from-optimal contributions, which have eventually led to the issues today's networks are facing such as inefficient energy strategy, limited network scalability and flexibility, reduced network manageability and increased overall network and customer services costs. Consequently, there is currently large consensus among network operators and the research community that cross-layer interaction and coordination is fundamental for the proper architectural design of next-generation Internet networks.
This thesis actively contributes to this goal by addressing the modeling, optimization and performance analysis of a set of potential technologies to be deployed in future cross-layer network architectures. By applying a transversal design approach (i.e., joint consideration of several network layers), we aim to maximize the integration of the different network layers involved in each specific problem. To this end, Part I provides a comprehensive evaluation of optical transport networks (OTNs) based on layer 2 (L2) sub-wavelength switching (SWS) technologies, also taking into consideration the impact of physical layer impairments (PLIs) (L0 phenomena). Indeed, the recent and relevant advances in optical technologies have dramatically increased the impact that PLIs have on the optical signal quality, particularly in the context of SWS networks. Then, in Part II of the thesis, we present a set of case studies showing that applying operations research (OR) methodologies in the design/planning stage of future cross-layer Internet network architectures leads to the successful joint optimization of key network performance indicators (KPIs) such as cost (i.e., CAPEX/OPEX), resource usage and energy consumption. OR can play an important role by allowing network designers/architects to obtain good near-optimal solutions to real-sized problems within practical running times
On thermal sensor calibration and software techniques for many-core thermal management
The high power density of a many-core processor results in increased temperature which negatively impacts system reliability and performance. Dynamic thermal management applies thermal-aware techniques at run time to avoid overheating using temperature information collected from on-chip thermal sensors. Temperature sensing and thermal control schemes are two critical technologies for successfully maintaining thermal safety. In this dissertation, on-line thermal sensor calibration schemes are developed to provide accurate temperature information.
Software-based dynamic thermal management techniques are proposed using calibrated thermal sensors. Due to process variation and silicon aging, on-chip thermal sensors require periodic calibration before use in DTM. However, the calibration cost for thermal sensors can be prohibitively high as the number of on-chip sensors increases. Linear models which are suitable for on-line calculation are employed to estimate temperatures at multiple sensor locations using performance counters. The estimated temperature and the actual sensor thermal profile show a very high similarity with correlation coefficient ~0.9 for SPLASH2 and SPEC2000 benchmarks.
A calibration approach is proposed to combine potentially inaccurate temperature values obtained from two sources: thermal sensor readings and temperature estimations. A data fusion strategy based on Bayesian inference, which combines information from these two sources, is demonstrated. The result shows the strategy can effectively recalibrate sensor readings in response to inaccuracies caused by process variation and environmental noise. The average absolute error of the corrected sensor temperature readings is
A dynamic task allocation strategy is proposed to address localized overheating in many-core systems. Our approach employs reinforcement learning, a dynamic machine learning algorithm that performs task allocation based on current temperatures and a prediction regarding which assignment will minimize the peak temperature. Our results show that the proposed technique is fast (scheduling performed in <1 ms) and can efficiently reduce peak temperature by up to 8 °C in a 49-core processor (6% on average) versus a leading competing task allocation approach for a series of SPLASH-2 benchmarks. Reinforcement learning has also been applied to 3D integrated circuits to allocate tasks with thermal awareness
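The data fusion step mentioned in this abstract, which combines a sensor reading with a model-based temperature estimate through Bayesian inference, can be sketched for the simplest case. The sketch assumes both sources are independent Gaussian measurements of the same true temperature with known variances. The abstract does not state this noise model, and the function name and numbers below are hypothetical.

```python
def fuse(sensor_temp, sensor_var, est_temp, est_var):
    """Posterior mean and variance for two independent Gaussian measurements.

    Each source is weighted by the inverse of its variance, so the less
    noisy source dominates the fused value.
    """
    weight_sensor = est_var / (sensor_var + est_var)
    fused_temp = weight_sensor * sensor_temp + (1.0 - weight_sensor) * est_temp
    fused_var = sensor_var * est_var / (sensor_var + est_var)
    return fused_temp, fused_var

# Illustrative values only (degrees Celsius and squared degrees), not measured data.
fused_temp, fused_var = fuse(sensor_temp=80.0, sensor_var=4.0,
                             est_temp=76.0, est_var=4.0)
print(fused_temp, fused_var)
```

With equal variances the fused value is the midpoint of the two inputs, and the fused variance is always smaller than either input variance, which is why combining two imperfect sources can still improve accuracy.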
Thermal aware design techniques for multiprocessor architectures in three dimensions
Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, defended on 28-11-2013
Supporting differentiated classes of resilience in multilayer networks
Services provided over telecommunications networks typically have different resilience requirements and networks need to be able to support different levels of resilience in an efficient manner. This dissertation investigates the problem of supporting differentiated classes of resilience in multilayer networks, including the most stringent resilience class required by critical services. We incorporate an innovative technique of embedding a subnetwork, termed the spine, with comparatively higher availability values at the physical layer. The spine lays a foundation for differentiation between multiple classes of flows that can be leveraged to achieve both high resilience and differentiation. The aim of this research is mainly to explore, design, and evaluate the proposed spine concept model in multilayer networks. The dissertation has four major parts. First, we explore the spine concept through numerical analysis of simple topologies illustrating the potential benefits and the cost considerations of the spine. We develop heuristic algorithms to find suitable spines for a network based on the structural properties of the network topology. Second, an optimization problem is formulated to determine the spine. The problem encompasses estimates of link availability improvements, associated costs, and a total budget. Third, we propose a crosslayer mapping and spine-aware routing design problem with protection given mainly at the lower layer. The problem is designed to transfer lower layer differentiation capability to the upper layer network and flows. We provide two joint routing-mapping optimization formulations and evaluate their performance in a multilayer scenario. Fourth, the joint routing-mapping problem is redesigned with protection given in the upper network layer instead. This creates two isolated logical networks: one mapped to the spine and the other mapped freely on the network.
Flows are assigned a path or path-pair based on their class of resilience. This approach can provide more routing options yielding different availability levels. The joint routing-mapping design problems are formulated as Integer Linear Programming (ILP) models. The goal is to achieve a wider range of availability values across layers and high availability levels for mission-critical services without the need to use higher order protection configurations. The proposed models are evaluated with extensive numerical results using real network topologies
Energy-aware synthesis for networks on chip architectures
The Network on Chip (NoC) paradigm was introduced as a scalable communication infrastructure for future System-on-Chip applications. Designing application-specific customized communication architectures is critical for obtaining low-power, high-performance solutions. Two significant design automation problems are the creation of an optimized configuration, given application requirements, and the implementation of this on-chip network. Automating the design of on-chip networks requires models for estimating area and energy, algorithms to effectively explore the design space, and network component libraries and tools to generate the hardware description. Chip architects are faced with managing a wide range of customization options for individual components, routers and topology. As energy is of paramount importance, the effectiveness of any custom NoC generation approach lies in the availability of good energy models to effectively explore the design space. This thesis describes a complete NoC synthesis flow, called NoCGEN, for creating energy-efficient custom NoC architectures. Three major automation problems are addressed: custom topology generation, energy modeling and generation. An iterative algorithm is proposed to generate application-specific point-to-point and packet-switched networks. The algorithm explores the design space for efficient topologies using characterized models and a system-level floorplanner for evaluating placement and wire energy. Prior to our contribution, building an energy model required careful analysis of transistor or gate implementations. To alleviate the burden, an automated linear regression-based methodology is proposed to rapidly extract energy models for many router designs. The resulting models are cycle-accurate with low complexity, are found to be within 10% of gate-level energy simulations, and execute several orders of magnitude faster than gate-level simulations.
A hardware description of the custom topology is generated using a parameterizable library and custom HDL generator. Fully reusable and scalable network components (switches, crossbars, arbiters, routing algorithms) are described using a template approach and are used to compose arbitrary topologies. A methodology for building and composing routers and topologies using a template engine is described. The entire flow is implemented as several demonstrable extensible tools with powerful visualization functionality. Several experiments are performed to demonstrate the design space exploration capabilities and compare it against a competing min-cut topology generation algorithm
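The regression-based energy modeling mentioned in this abstract can be illustrated with a one-variable ordinary least squares fit. This is only a sketch of the general technique, not the NoCGEN methodology: the single predictor (flits switched per cycle), the function name and the sample values are assumptions, and the data is synthetic rather than taken from any gate-level simulation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic training samples: router activity versus energy per cycle (pJ).
flits_per_cycle = [0, 1, 2, 3, 4]
energy_pj = [5.0, 7.0, 9.0, 11.0, 13.0]

slope, intercept = fit_line(flits_per_cycle, energy_pj)

def predict_energy(flits):
    """Estimate energy per cycle from the fitted linear model."""
    return slope * flits + intercept

print(slope, intercept, predict_energy(6))
```

Here the intercept plays the role of activity-independent energy and the slope the incremental energy per flit; a realistic model would use several activity counters as predictors and be validated against reference simulations.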