11 research outputs found

    A Network Traffic Generator Model for Fast Network-on-Chip Simulation

    Get PDF
    For Systems-on-Chip (SoCs) development, a predomi-nant part of the design time is the simulation time. Perfor-mance evaluation and design space exploration of such sys-tems in bit- and cycle-true fashion is becoming prohibitive. We propose a traffic generation (TG) model that provides a fast and effective Network-on-Chip (NoC) development and debugging environment. By capturing the type and the timestamp of communication events at the boundary of an IP core in a reference environment, the TG can subsequently emulate the core’s communication behavior in different en-vironments. Access patterns and resource contention in a system are dependent on the interconnect architecture, and our TG is designed to capture the resulting reactiveness. The regenerated traffic, which represents a realistic work-load, can thus be used to undertake faster architectural ex-ploration of interconnection alternatives, effectively decou-pling simulation of IP cores and of interconnect fabrics. The results with the TG on an AMBA interconnect show a sim-ulation time speedup above a factor of 2 over a complete system simulation, with close to 100 % accuracy.

    A Reactive and Cycle-True IP Emulator for MPSoC Exploration

    Get PDF
    The design of MultiProcessor Systems-on-Chip (MPSoC) emphasizes intellectual-property (IP)-based communication-centric approaches. Therefore, for the optimization of the MPSoC interconnect, the designer must develop traffic models that realistically capture the application behavior as executing on the IP core. In this paper, we introduce a Reactive IP Emulator (RIPE) that enables an effective emulation of the IP-core behavior in multiple environments, including bitand cycle-true simulation. The RIPE is built as a multithreaded abstract instruction-set processor, and it can generate reactive traffic patterns. We compare the RIPE models with cycle-true functional simulation of complex application behavior (tasksynchronization, multitasking, and input/output operations). Our results demonstrate high-accuracy and significant speedups. Furthermore, via a case study, we show the potential use of the RIPE in a design-space-exploration context

    A probabilistic approach to early communication performance estimation for electronic system-level design

    Get PDF
    Today\u27s embedded system designers face the challenges of ever increasing complexity and shorter time-to-market deadlines. System-level methodologies emerge to meet these challenges. Refinement-based methodologies, such as the SpecC methodology and Transaction Level Modeling, continue to gain popularity in the embedded system designers\u27 community. However, as more communication-dominated applications and architectures appear in the market, designers find that the lack of models allowing system-level communication analysis is a major limiting factor in current system-level design methodologies. Thus, modeling for system-level communication analysis is key for a design methodology to thrive with today\u27s embedded system designers. This work presents a new approach to system-level modeling that allows better communication analysis earlier in the design process. This approach defines a new model that utilizes random variables to include the communication details at higher abstraction levels. This work proposes a probabilistic model to include and evaluate the system communication features in the higher abstraction level. Guidelines to include the proposed model into a refinement-based methodology are presented, and methods for performance estimation are shown

    Modelling, Synthesis, and Configuration of Networks-on-Chips

    Get PDF

    MPSoCBench : um framework para avaliação de ferramentas e metodologias para sistemas multiprocessados em chip

    Get PDF
    Orientador: Rodolfo Jardim de AzevedoTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Recentes metodologias e ferramentas de projetos de sistemas multiprocessados em chip (MPSoC) aumentam a produtividade por meio da utilização de plataformas baseadas em simuladores, antes de definir os últimos detalhes da arquitetura. No entanto, a simulação só é eficiente quando utiliza ferramentas de modelagem que suportem a descrição do comportamento do sistema em um elevado nível de abstração. A escassez de plataformas virtuais de MPSoCs que integrem hardware e software escaláveis nos motivou a desenvolver o MPSoCBench, que consiste de um conjunto escalável de MPSoCs incluindo quatro modelos de processadores (PowerPC, MIPS, SPARC e ARM), organizado em plataformas com 1, 2, 4, 8, 16, 32 e 64 núcleos, cross-compiladores, IPs, interconexões, 17 aplicações paralelas e estimativa de consumo de energia para os principais componentes (processadores, roteadores, memória principal e caches). Uma importante demanda em projetos MPSoC é atender às restrições de consumo de energia o mais cedo possível. Considerando que o desempenho do processador está diretamente relacionado ao consumo, há um crescente interesse em explorar o trade-off entre consumo de energia e desempenho, tendo em conta o domínio da aplicação alvo. Técnicas de escalabilidade dinâmica de freqüência e voltagem fundamentam-se em gerenciar o nível de tensão e frequência da CPU, permitindo que o sistema alcance apenas o desempenho suficiente para processar a carga de trabalho, reduzindo, consequentemente, o consumo de energia. Para explorar a eficiência energética e desempenho, foram adicionados recursos ao MPSoCBench, visando explorar escalabilidade dinâmica de voltaegem e frequência (DVFS) e foram validados três mecanismos com base na estimativa dinâmica de energia e taxa de uso de CPUAbstract: Recent design methodologies and tools aim at enhancing the design productivity by providing a software development platform before the definition of the final Multiprocessor System on Chip (MPSoC) architecture details. However, simulation can only be efficiently performed when using a modeling and simulation engine that supports system behavior description at a high abstraction level. The lack of MPSoC virtual platform prototyping integrating both scalable hardware and software in order to create and evaluate new methodologies and tools motivated us to develop the MPSoCBench, a scalable set of MPSoCs including four different ISAs (PowerPC, MIPS, SPARC, and ARM) organized in platforms with 1, 2, 4, 8, 16, 32, and 64 cores, cross-compilers, IPs, interconnections, 17 parallel version of software from well-known benchmarks, and power consumption estimation for main components (processors, routers, memory, and caches). An important demand in MPSoC designs is the addressing of energy consumption constraints as early as possible. Whereas processor performance comes with a high power cost, there is an increasing interest in exploring the trade-off between power and performance, taking into account the target application domain. Dynamic Voltage and Frequency Scaling techniques adaptively scale the voltage and frequency levels of the CPU allowing it to reach just enough performance to process the system workload while meeting throughput constraints, and thereby, reducing the energy consumption. To explore this wide design space for energy efficiency and performance, both for hardware and software components, we provided MPSoCBench features to explore dynamic voltage and frequency scalability (DVFS) and evaluated three mechanisms based on energy estimation and CPU usage rateDoutoradoCiência da ComputaçãoDoutora em Ciência da Computaçã

    A Network Traffic Generator Model for Fast Network-on-Chip Simulation

    No full text

    System Level Performance Evaluation of Distributed Embedded Systems

    Get PDF
    In order to evaluate the feasibility of the distributed embedded systems in different application domains at an early phase, the System Level Performance Evaluation (SLPE) must provide reliable estimates of the nonfunctional properties of the system such as end-to-end delays and packet losses rate. The values of these non-functional properties depend not only on the application layer of the OSI model but also on the technologies residing at the MAC, transport and Physical layers. Therefore, the system level performance evaluation methodology must provide functionally accurate models of the protocols and technologies operating at these layers. After conducting a state of the art survey, it was found that the existing approaches for SLPE are either specialized for a particular domain of systems or apply a particular model of computation (MOC) for modeling the communication and synchronization between the different components of a distributed application. Therefore, these approaches abstract the functionalities of the data-link, Transport and MAC layers by the highly abstract message passing methods employed by the different models of computation. On the other hand, network simulators such as OMNeT++, ns-2 and Opnet do not provide the models for platform components of devices such as processors and memories and totally abstract the application processing by delays obtained via traffic generators. Therefore the system designer is not able to determine the potential impact of an application in terms of utilization of the platform used by the device. Hence, for a system level performance evaluation approach to estimate both the platform utilization and the non-functional properties which are a consequence of the lower layers of OSI models (such as end-to-end delays), it must provide the tools for automatic workload extraction of application workload models at various levels of refinement and functionally correct models of lower layers of OSI model (Transport MAC and Physical layers). Since ABSOLUT is not restricted to a particular domain and also does not depend on any MOC, therefore it was selected for the extension to a system level performance evaluation approach for distributed embedded systems. The models of data-link and Transport layer protocols and automatic workload generation of system calls was not available in ABSOLUT performance evaluation methodology. The, thesis describes the design and modelling of these OSI model layers and automatic workload generation tool for system calls. The tools and models integrated to ABSOLUT methodology were used in a number of case studies. The accuracy of the protocols was compared to network simulators and real systems. The results were 88% accurate for user space code of the application layer and provide an improvement of over 50% as compared to manual models for external libraries and system calls. The ABSOLUT physical layer models were found to be 99.8% accurate when compared to analytical models. The MAC and transport layer models were found to be 70-80% accurate when compared with the same scenarios simulated by ns-2 and OMNeT++ simulators. The bit error rates, frame error probability and packet loss rates show close correlation with the analytical methods .i.e., over 99%, 92% and 80% respectively. Therefore the results of ABSOLUT framework for application layer outperform the results of performance evaluation approaches which employ virtual systems and at the same time provide as accurate estimates of the end-to-end delays and packet loss rate as network simulators. The results of the network simulators also vary in absolute values but they follow the same trend. Therefore, the extensions made to ABSOLUT allow the system designer to identify the potential bottlenecks in the system at different OSI model layers and evaluate the non-functional properties with a high level of accuracy. Also, if the system designer wants to focus entirely on the application layer, different models of computations can be easily instantiated on top of extended ABSOLUT framework to achieve higher simulation speeds as described in the thesis

    Energy-aware synthesis for networks on chip architectures

    Full text link
    The Network on Chip (NoC) paradigm was introduced as a scalable communication infrastructure for future System-on-Chip applications. Designing application specific customized communication architectures is critical for obtaining low power, high performance solutions. Two significant design automation problems are the creation of an optimized configuration, given application requirement the implementation of this on-chip network. Automating the design of on-chip networks requires models for estimating area and energy, algorithms to effectively explore the design space and network component libraries and tools to generate the hardware description. Chip architects are faced with managing a wide range of customization options for individual components, routers and topology. As energy is of paramount importance, the effectiveness of any custom NoC generation approach lies in the availability of good energy models to effectively explore the design space. This thesis describes a complete NoC synthesis flow, called NoCGEN, for creating energy-efficient custom NoC architectures. Three major automation problems are addressed: custom topology generation, energy modeling and generation. An iterative algorithm is proposed to generate application specific point-to-point and packet-switched networks. The algorithm explores the design space for efficient topologies using characterized models and a system-level floorplanner for evaluating placement and wire-energy. Prior to our contribution, building an energy model required careful analysis of transistor or gate implementations. To alleviate the burden, an automated linear regression-based methodology is proposed to rapidly extract energy models for many router designs. The resulting models are cycle accurate with low-complexity and found to be within 10% of gate-level energy simulations, and execute several orders of magnitude faster than gate-level simulations. A hardware description of the custom topology is generated using a parameterizable library and custom HDL generator. Fully reusable and scalable network components (switches, crossbars, arbiters, routing algorithms) are described using a template approach and are used to compose arbitrary topologies. A methodology for building and composing routers and topologies using a template engine is described. The entire flow is implemented as several demonstrable extensible tools with powerful visualization functionality. Several experiments are performed to demonstrate the design space exploration capabilities and compare it against a competing min-cut topology generation algorithm
    corecore