1,033 research outputs found

    Worst-case end-to-end delays evaluation for SpaceWire networks

    Get PDF
    SpaceWire is a standard for on-board satellite networks chosen by the ESA as the basis for multiplexing payload and control traffic on future data-handling architectures. However, network designers need tools to ensure that the network is able to deliver critical messages on time. Current research fails to address this needs for SpaceWire networks. On one hand, many papers only seek to determine probabilistic results for end-to-end delays on Wormhole networks like SpaceWire. This does not provide sufficient guarantee for critical traffic. On the other hand, a few papers give methods to determine maximum latencies on wormhole networks that, unlike SpaceWire, have dedicated real-time mechanisms built-in. Thus, in this paper, we propose an appropriate method to compute an upper-bound on the worst-case end-to-end delay of a packet in a SpaceWire network

    The Chameleon Architecture for Streaming DSP Applications

    Get PDF
    We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm2^2 in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool

    Simulation Of Multi-core Systems And Interconnections And Evaluation Of Fat-Mesh Networks

    Get PDF
    Simulators are very important in computer architecture research as they enable the exploration of new architectures to obtain detailed performance evaluation without building costly physical hardware. Simulation is even more critical to study future many-core architectures as it provides the opportunity to assess currently non-existing computer systems. In this thesis, a multiprocessor simulator is presented based on a cycle accurate architecture simulator called SESC. The shared L2 cache system is extended into a distributed shared cache (DSC) with a directory-based cache coherency protocol. A mesh network module is extended and integrated into SESC to replace the bus for scalable inter-processor communication. While these efforts complete an extended multiprocessor simulation infrastructure, two interconnection enhancements are proposed and evaluated. A novel non-uniform fat-mesh network structure similar to the idea of fat-tree is proposed. This non-uniform mesh network takes advantage of the average traffic pattern, typically all-to-all in DSC, to dedicate additional links for connections with heavy traffic (e.g., near the center) and fewer links for lighter traffic (e.g., near the periphery). Two fat-mesh schemes are implemented based on different routing algorithms. Analytical fat-mesh models are constructed by presenting the expressions for the traffic requirements of personalized all-to-all traffic. Performance improvements over the uniform mesh are demonstrated in the results from the simulator. A hybrid network consisting of one packet switching plane and multiple circuit switching planes is constructed as the second enhancement. The circuit switching planes provide fast paths between neighbors with heavy communication traffic. A compiler technique that abstracts the symbolic expressions of benchmarks' communication patterns can be used to help facilitate the circuit establishment

    Carbon nanotubes as interconnect for next generation network on chip

    Get PDF
    Multi-core processors provide better performance when compared with their single-core equivalent. Recently, Networks-on-Chip (NoC) have emerged as a communication methodology for multi core chips. Network-on-Chip uses packet based communication for establishing a communication path between multiple cores connected via interconnects. Clock frequency, energy consumption and chip size are largely determined by these interconnects. According to the International Technology Roadmap for Semiconductors (ITRS), in the next five years up to 80% of microprocessor power will be consumed by interconnects. In the sub 100nm scaling range, interconnect behavior limits the performance and correctness of VLSI systems. The performance of copper interconnects tend to get reduced in the sub 100nm range and hence we need to examine other interconnect options. Single Wall Carbon Nanotubes exhibit better performance in sub 100nm processing technology due to their very large current carrying capacity and large electron mean free paths. This work suggests using Single Wall Carbon Nanotubes (SWCNT) as interconnects for Networks-on-Chip as they consume less energy and gives more throughput and bandwidth when compared with traditional Copper wires

    K-ary n-cube based off-chip communications architecture for high-speed packet processors

    Get PDF
    A k-ary n-cube interconnect architecture is proposed, as an off-chip communications architecture for line cards, to increase the throughput of the currently used memory system. The k-ary n-cube architecture allows multiple packet processing elements on a line card to access multiple memory modules. The main advantage of the proposed architecture is that it can sustain current line rates and higher while distributing the load among multiple memories. Moreover, the proposed interconnect can scale to adopt more memories and/or processors and as a result increasing the line card processing power. Our results portray that k-ary n-cube sustained higher incoming traffic load while keeping latency lower than its shared-bus competitor. © 2005 IEEE

    Synthesis Approach of 2D Mesh Network Inter Communication (2D-2D) using Network on Chip

    Get PDF
    The solution for the multiprocessor system architecture is Application specific Network on Chip (NOC) architectures which are emerging as a leading technology. Modeling and simulation of multilevel network structure and synthesis for custom NOC can beneficial in addressing several requirements such as bandwidth, inter process communication, multitasking application use, deadlock avoidance, router structures and port bandwidth. The paper emphasizes on the network on chip modeling and synthesis of 2D network and intercommunication among multilevel 2D networks. NOC synthesis environment provides transaction level network modeling and address all the requirements together in an integrated chip. In the paper consideration is done for 2D, 8 x 8 network and similar networks are considered which are identified by their specific network address. NOC chip is developed using VHDL programming language. Design is implemented in Xilinx 14.2 VHDL software, functional simulation is carried out in Modelsim 10.1 b, student edition and synthesis process is carried out on Digilent Sparten -3E FPGA

    Networks on Chips: Structure and Design Methodologies

    Get PDF

    A Shared memory multiprocessor system architecture utilizing a uniform

    Get PDF
    Due to VLSI lithography problems and the limitation of additional architectural enhancements uniprocessor systems are nearing the end of their life cycle. Therefore, it is believed that Symmetric Multiprocessing (SMP) systems will be the next mainstream computer. These systems allow multiple processors, accessing the same memory image, to cooperate on a number of computational tasks as a single entity. While multiprocessor systems can offer a substantial performance increase compared to uniprocessor systems, major design considerations must be addressed to achieve desired system efficiency levels. Managing cache coherence is a significant problem in multiprocessor systems. Current implementations cope with this problem by utilizing a cache coherence protocol. This protocol puts a large amount of overhead on the system bus to ensure proper program execution, effectively decreasing overall system performance. This thesis approaches the cache coherence problem from a new angle. Instead of utilizing a cache coherence protocol, a new memory system is proposed which eliminates the need for a cache coherence protocol, by utilizing a shared level 2 data-only cache. This new architecture allows for better utilization of the system and improved performance and scalability. A data rate analysis is conducted to demonstrate the potential performance increase from the proposed architecture over conventional approaches. The data rate model clearly shows an increase in system performance and utilization when using the architecture proposed in this thesis
    corecore