92 research outputs found

    Performance Analysis of Mesh-based NoC’s on Routing Algorithms

    Get PDF
    The advent of System-on-Chip (SoCs), has brought about a need to increase the scale of multi-core chip networks. Bus Based communications have proved to be limited in terms of performance and ease of scalability, the solution to both bus – based and Point-to-Point (P2P) communication systems is to use a communication infrastructure called Network-on-Chip (NoC). Performance of NoC depends on various factors such as network topology, routing strategy and switching technique and traffic patterns. In this paper, we have taken the initiative to compile together a comparative analysis of different Network on Chip infrastructures based on the classification of routing algorithm, switching technique, and traffic patterns. The goal is to show how varied combinations of the three factors perform differently based on the size of the mesh network, using NOXIM, an open source SystemC Simulator of mesh-based NoC. The analysis has shown tenable evidence highlighting the novelty of XY routing algorithm

    Networks on Chips: Structure and Design Methodologies

    Get PDF

    Coupling Routing Algorithm and Data Encoding for Low Power Networks on Chip

    Get PDF
    The routing algorithm used in a Network-on-Chip (NoC) has a strong impact on both the functional and non functional indices of the overall system. Traditionally, routing algorithms have been designed considering performance and cost as the main objectives. In this study we focus on two important non functional metrics, namely, power dissipation and energy consumption. We propose a selection policy that can be coupled with any multi-path routing function and whose primary goal is reducing power dissipation. As technology shrinks, the power dissipated by the network links represents an ever more significant fraction of the total power budget. Based on this, the proposed selection policy tries to reduce link power dissipation by selecting the output port of the router which minimises the switching activity of the output link. A set of experiments carried out on both synthetic and real traffic scenarios is presented. When the proposed selection policy is used in conjunction with a data encoding technique, on average, 31% of energy reduction and 37% of power saving is observed. An architectural implementation of the selection policy is also presented and its impact on cost (silicon area) and power dissipation of the baseline router is discussed

    Physical parameter-aware Networks-on-Chip design

    Get PDF
    PhD ThesisNetworks-on-Chip (NoCs) have been proposed as a scalable, reliable and power-efficient communication fabric for chip multiprocessors (CMPs) and multiprocessor systems-on-chip (MPSoCs). NoCs determine both the performance and the reliability of such systems, with a significant power demand that is expected to increase due to developments in both technology and architecture. In terms of architecture, an important trend in many-core systems architecture is to increase the number of cores on a chip while reducing their individual complexity. This trend increases communication power relative to computation power. Moreover, technology-wise, power-hungry wires are dominating logic as power consumers as technology scales down. For these reasons, the design of future very large scale integration (VLSI) systems is moving from being computation-centric to communication-centric. On the other hand, chip’s physical parameters integrity, especially power and thermal integrity, is crucial for reliable VLSI systems. However, guaranteeing this integrity is becoming increasingly difficult with the higher scale of integration due to increased power density and operating frequencies that result in continuously increasing temperature and voltage drops in the chip. This is a challenge that may prevent further shrinking of devices. Thus, tackling the challenge of power and thermal integrity of future many-core systems at only one level of abstraction, the chip and package design for example, is no longer sufficient to ensure the integrity of physical parameters. New designtime and run-time strategies may need to work together at different levels of abstraction, such as package, application, network, to provide the required physical parameter integrity for these large systems. This necessitates strategies that work at the level of the on-chip network with its rising power budget. This thesis proposes models, techniques and architectures to improve power and thermal integrity of Network-on-Chip (NoC)-based many-core systems. The thesis is composed of two major parts: i) minimization and modelling of power supply variations to improve power integrity; and ii) dynamic thermal adaptation to improve thermal integrity. This thesis makes four major contributions. The first is a computational model of on-chip power supply variations in NoCs. The proposed model embeds a power delivery model, an NoC activity simulator and a power model. The model is verified with SPICE simulation and employed to analyse power supply variations in synthetic and real NoC workloads. Novel observations regarding power supply noise correlation with different traffic patterns and routing algorithms are found. The second is a new application mapping strategy aiming vii to minimize power supply noise in NoCs. This is achieved by defining a new metric, switching activity density, and employing a force-based objective function that results in minimizing switching density. Significant reductions in power supply noise (PSN) are achieved with a low energy penalty. This reduction in PSN also results in a better link timing accuracy. The third contribution is a new dynamic thermal-adaptive routing strategy to effectively diffuse heat from the NoC-based threedimensional (3D) CMPs, using a dynamic programming (DP)-based distributed control architecture. Moreover, a new approach for efficient extension of two-dimensional (2D) partially-adaptive routing algorithms to 3D is presented. This approach improves three-dimensional networkon- chip (3D NoC) routing adaptivity while ensuring deadlock-freeness. Finally, the proposed thermal-adaptive routing is implemented in field-programmable gate array (FPGA), and implementation challenges, for both thermal sensing and the dynamic control architecture are addressed. The proposed routing implementation is evaluated in terms of both functionality and performance. The methodologies and architectures proposed in this thesis open a new direction for improving the power and thermal integrity of future NoC-based 2D and 3D many-core architectures

    Custom wireless sensor for monitoring grazing of free-range cattle

    Get PDF
    Scope and Method of Study: The purpose of this study was to develop a wireless sensor device capable of sensing cattle grazing activity. This included design and build of a miniaturized PCB, sensor specification, data processing, and experimental validation. Experiments were conducted in cooperation with the Oklahoma State University, Animal Science and Biosystems Engineering departments. The primary objective of this study was to provide information supporting the use of an accelerometer sensor for monitoring free-range cattle grazing activity. A wireless sensor platform was also developed for sensor and wireless communication development needs. Secondary objectives included exploring alternative applications, such as monitoring cattle waste excretion events, and identifying wireless network functionality for agricultural environments.Findings and Conclusions: During this study, parameters for using an accelerometer based grazing sensor were established relative to the head motion of grazing cattle. Initially, a survey of literature and video analysis of foraging livestock animals were conducted, where 0.5-8 bites/sec was confirmed as animal bite rate range. The preliminary video analysis provided guidelines for establishing a sensing strategy. Sensor data processing algorithm development and sampling rate selection were driven by video provided characteristics and sensor platform capability. The Fast Fourier Transform (FFT) was selected as the core component of the sensor's algorithm. The FFT was able to characterize grazing motions because of the animal's near-continuous periodic head movements. At least five bite cycles and a 32 Hz sampling rate were required for proper algorithm implementation. A sample size of 256 data points were collected for each accelerometer axis, and proved to be adequate for the FFT computations. A revised sample rate of 21.74 Hz was presented once the FFT was implemented in firmware. This new rate retained well performing FFT calculations based on the understanding that bite rates faster than 4 bites/sec were due to nibbling and partial bites. The FFT's Spectral power was binned and stored for the purpose of data compression and reduced wireless transmissions.The wireless sensor device platform was built using the CC1010 microcontroller/transceiver IC. The CC1010 provided integrated features commendable for fast FFT processing and conservative PCB layout design. The radio was configured for robust operation by using a 915 MHz carrier frequency, Manchester encoding, and 64 kHz frequency spread. A small, helical, and omnidirectional antenna was mounted directly to the PCB. Link budget was estimated to be 81 dBm, which equated to a 282 m (925 ft) transmission distance in optimum conditions. The device's dimensions were 19.6 mm (0.77 in) X 71.8 mm (2.83 in) X 11.0 mm (0.43 in). A custom PVC enclosure was used to house the device. For deploying experiments, the enclosure was fastened to a standard nylon turnout halter. A miniature GPS logger was also attached to the halter, which allowed for constructing grazing maps.Additionally, the proposed wireless sensor device was used to detect cattle urination and defecation events. This was accomplished by attaching the device to an animal's tail and sensing its elevated movements. Tilt measurements in the z-axis (front-to-back) direction provided the most prominent evidence of a distinct tail movement pattern during excretion events. A pattern recognition strategy was shown as a viable sensing method.An outline for a multilevel-networked system was also generated. This included cellular and internet communications, along with a customized application software for base/node management

    Proposal and development of a highly modular and scalable self-adaptive hardware architecture with parallel processing capability

    Get PDF
    This dissertation describes a novel unconventional self-adaptive hardware architecture with capacity for parallel processing. For scalability issues, this bioinspired architecture is based on a regular array of homogeneous cells. The proposed programmable architecture implements in a distributed way self-adaptive capabilities including self-placement and self-routing which, due to its intrinsic design, enable the development of systems with runtime reconfiguration, self-repair and/or fault tolerance capabilities. The physical implementation of this architecture is composed of two-layers, interconnected cells in the first level and interconnected switch and pin matrices in the second level. The cell is the basic element of the proposed self-adaptive architecture. Any application scheduled to the system has to be organized in components, where each component is composed by one or more interconnected cells. The interconnection of cells inside a component is made at cell level (first layer), while the physical interconnections of components are made in the second layer. Additionally, two layers are defined as conceptual organization for the implementation of general purpose applications: the SANE and the SANE assembly. The Self-Adaptive Networked Entity (SANE) is composed by a group of components. This is the basic self-adaptive computing system. It has the ability to monitor its local environment and its internal computation process. The SANE-Assembly (SANE-ASM) is composed by a group of interconnected SANEs. The processing capabilities of the cell are included in its Functional Unit (FU), which can be described as a four-core configurable multicomputer. The FU includes twelve programmable configuration modes, i.e., each cell permits to select from one to four processors working in parallel, with different size of program and data memories. The self-adaptive capabilities of the cell are executed mainly by the Cell Configuration Unit (CCU). The self-placement algorithm is responsible for finding out the most suitable position in the cell array to insert the new cell of a component. The self-routing algorithm permits interconnecting the ports of the FU of two cells through the cell ports. The self-placement and self-routing processes allow for performing complex functionality changes in real time, these processes endow the system with enhanced functionality, enabling the system to change itself, this allows for the implementation of run-time self-configuration, without the need for any configuration manager. The architecture proposed includes two mechanisms of fault tolerance. One of these is the Dynamic Fault Tolerance Scaling Technique, that has the ability to create and eliminate the redundant copies of the functional section of a specific application. The other mechanism of fault tolerance is a dedicated or static Fault Tolerance System. It provides redundant processing capabilities that are working continuously. When a failure in the execution of a program is detected, the processors of the cell are stopped and the self-elimination and self-replication processes start for the cell (or cells) involved in the failure. An FPGA-based prototype and a software tool have been built for demonstration purposes. The prototype includes all the self-adaptive capabilities described in this dissertation. With the purpose of having a complete development system, the software tool SANE Project Developer (SPD) has been implemented. The SPD is an Integrated Development Environment (IDE) that allows generating the memory initialization data for the control microprocessor inside the prototype.Esta tesis doctoral describe una arquitectura de hardware auto-adaptable novedosa y no convencional con capacidad de procesamiento en paralelo. Por razones de escalabilidad, esta arquitectura bioinspirada está basada en una matriz regular de células homogéneas. La arquitectura propuesta es programable, e implementa de manera distribuida diversas capacidades auto-adaptables incluyendo el auto-emplazamiento y auto-enrutamiento, los cuales debido a su diseño intrínseco, permiten el desarrollo de sistemas reconfigurables en tiempo de ejecución, así como de sistemas autoreparables y/o con capacidades de tolerancia a fallos. La implementación física de esta arquitectura esta compuesta de dos capas, que incluyen células interconectadas en el primer nivel y matrices de conmutación y pines en el segundo nivel. La célula es el elemento básico de la arquitectura propuesta. Cualquier aplicación que se quiera programar en el sistema debe estar organizada en componentes, donde cada componente está compuesto por una o más células interconectadas. La interconexión de células dentro de un componente es realizado en el mismo nivel de la matriz de células, mientras que la interconexión de componentes es realizada en la segunda capa. Adicionalmente, se definen dos capas conceptuales que son usadas con propósitos organizativos en aplicaciones de propósito general, estas son: el SANE y el SANE-assembly (o conjunto de SANEs). La entidad auto-adaptable interconectada o SANE está compuesta por un grupo de componentes. Este es el sistema de computación auto-adaptable básico, el cual tiene la habilidad de monitorizar su entorno local y su proceso de computación interno. Las capacidades de procesamiento de la célula están incluidas en su unidad funcional (FU). Esta puede ser definida como un multicomputador configurable con cuatro núcleos, los cuales son agrupados o no dependiendo del modo de configuración. La FU tiene doce modos de configuración programables, por lo que cada célula permite seleccionar entre uno y cuatro procesadores trabajando en paralelo con diversas capacidades en las memorias de programa y datos. Las capacidades auto-adaptables de la célula son ejecutadas principalmente por la unidad de configuración de la célula (CCU). El algoritmo de auto-emplazamiento es el encargado de encontrar la posición mas adecuada dentro de la matriz de células para insertar la nueva célula de un componente. El algoritmo de auto-enrutamiento permite interconectar los puertos de las FU de dos células. Los procesos de auto-emplazamiento y auto-enrutamiento permiten realizar en tiempo real cambios funcionales complejos; estos procesos dotan al sistema de una mayor funcionalidad, permitiendo que el sistema cambie por si mismo, lo que permite la implementación de la auto-configuración en tiempo real, sin la necesidad de ningún gestor de configuración. La arquitectura propuesta incluye dos mecanismos de tolerancia a fallos. Uno de estos es una técnica escalonada y dinámica de tolerancia a fallos, que tiene la habilidad de crear y eliminar copias redundantes de la unidad funcional (o de cómputo) de una aplicación específica. El otro mecanismo de tolerancia a fallos es el Sistema de Tolerancia a Fallos dedicado o estático. Este provee capacidades de procesamiento redundante que están en funcionamiento continuamente. Cuando un fallo en la ejecución de un programa es detectado, los procesadores de la célula son detenidos y los procesos de auto-eliminación y auto-replicación se inician para la célula (o células) implicada en el fallo. Se desarrolló un prototipo basado en FPGAs y una herramienta de software para comprobar la funcionalidad del sistema. El prototipo incluye todas las características de los sistemas auto-adaptable descritas en este trabajo. El SANE Project developer (SPD) es un ambiente integrado de desarrollo (IDE) que permite generar y descargar la memoria de inicialización de datos para el Microprocesador de Control dentro del prototipo
    corecore