
    Proceedings of the Second International Workshop on HyperTransport Research and Applications (WHTRA2011)

    Proceedings of the Second International Workshop on HyperTransport Research and Applications (WHTRA2011), which was held on Feb. 9th, 2011 in Mannheim, Germany. The Second International Workshop for Research on HyperTransport is an international, high-quality forum for scientists, researchers and developers working in the area of HyperTransport. This includes not only developments and research on HyperTransport itself, but also work which is based on or enabled by HyperTransport. HyperTransport (HT) is an interconnection technology typically used as the system interconnect in modern computer systems, connecting the CPUs to each other and to the I/O bridges. Primarily designed as an interconnect between high-performance CPUs, it provides extremely low latency, high bandwidth and excellent scalability. The definition of the HTX connector allows the use of HT even for add-in cards. In contrast to other peripheral interconnect technologies such as PCI Express, no protocol conversion or intermediate bridging is necessary: HT is a direct connection between device and CPU with minimal latency. Another advantage is the possibility of cache-coherent devices. Because of these properties, HT is of high interest for high-performance I/O such as networking and storage, but also for co-processing and acceleration based on ASIC or FPGA technologies. Acceleration in particular sees a resurgence of interest today; one reason is the possibility to reduce power consumption through the use of accelerators. In the area of parallel computing, the low-latency communication allows for fine-grain communication schemes and is perfectly suited for scalable systems. In summary, HT technology offers key advantages and great performance to any research related to or based on interconnects. For more information please consult the workshop website (http://whtra.uni-hd.de).

    Extending HyperTransport Protocol for Improved Scalability

    HyperTransport 3.10 is a leading open-standard communication technology for chip-to-chip interconnects. However, its extraordinary features are of little help when building mid- and large-scale systems, because it is unable to natively scale beyond 8 computing nodes and must therefore be complemented by other interconnect technologies. The HyperTransport Consortium has intensively stimulated discussion among its high-level members in order to overcome this shortcoming. The result is the High Node Count HyperTransport Specification, which defines protocol extensions to the HyperTransport I/O Link Specification Rev. 3.10 that enable HyperTransport to natively support the high numbers of computing nodes typical of large-scale server clustering and High Performance Computing (HPC) applications. The extension has been carefully designed to raise the maximum number of connected devices to a value large enough to support current and future scalability requirements, while preserving the excellent features that made HyperTransport successful and keeping full backward compatibility with it.
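
    To make the scaling idea concrete, the sketch below shows one way such a protocol extension can be pictured: a legacy HT request, whose addressing fields only cover a small local domain, is carried inside an extended header with a much wider destination node ID. This is only an illustration of the concept; the field names, widths and layout are assumptions and do not reproduce the actual High Node Count HyperTransport packet format.

        /* Illustrative only: not the actual High Node Count packet layout. */
        #include <stdint.h>

        struct ht_request {              /* simplified legacy HT 3.10 request      */
            uint8_t  unit_id;            /* 5-bit UnitID in the real protocol      */
            uint8_t  command;            /* read/write/etc.                        */
            uint64_t address;
        };

        struct hnc_extended_request {    /* hypothetical extension wrapper         */
            uint16_t dest_node;          /* e.g. 16 bits -> tens of thousands of   */
            uint16_t src_node;           /* nodes instead of the native 8          */
            struct ht_request payload;   /* original request carried unchanged,    */
        };                               /* which is what preserves compatibility  */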

    Proceedings of the First International Workshop on HyperTransport Research and Applications (WHTRA2009) (revised 08/2009)

    Proceedings of the First International Workshop on HyperTransport Research and Applications (WHTRA2009), which was held on Feb. 12th, 2009 in Mannheim, Germany. The 1st International Workshop for Research on HyperTransport is an international, high-quality forum for scientists, researchers and developers working in the area of HyperTransport. This includes not only developments and research on HyperTransport itself, but also work which is based on or enabled by HyperTransport. HyperTransport (HT) is an interconnection technology typically used as the system interconnect in modern computer systems, connecting the CPUs to each other and to the I/O bridges. Primarily designed as an interconnect between high-performance CPUs, it provides extremely low latency, high bandwidth and excellent scalability. The definition of the HTX connector allows the use of HT even for add-in cards. In contrast to other peripheral interconnect technologies such as PCI Express, no protocol conversion or intermediate bridging is necessary: HT is a direct connection between device and CPU with minimal latency. Another advantage is the possibility of cache-coherent devices. Because of these properties, HT is of high interest for high-performance I/O such as networking and storage, but also for co-processing and acceleration based on ASIC or FPGA technologies. Acceleration in particular sees a resurgence of interest today; one reason is the possibility to reduce power consumption through the use of accelerators. In the area of parallel computing, the low-latency communication allows for fine-grain communication schemes and is perfectly suited for scalable systems. In summary, HT technology offers key advantages and great performance to any research related to or based on interconnects. For more information please consult the workshop website (http://whtra.uni-hd.de).

    Proceedings of the First International Workshop on HyperTransport Research and Applications (WHTRA2009)

    Proceedings of the First International Workshop on HyperTransport Research and Applications (WHTRA2009), which was held on Feb. 12th, 2009 in Mannheim, Germany. The 1st International Workshop for Research on HyperTransport is an international, high-quality forum for scientists, researchers and developers working in the area of HyperTransport. This includes not only developments and research on HyperTransport itself, but also work which is based on or enabled by HyperTransport. HyperTransport (HT) is an interconnection technology typically used as the system interconnect in modern computer systems, connecting the CPUs to each other and to the I/O bridges. Primarily designed as an interconnect between high-performance CPUs, it provides extremely low latency, high bandwidth and excellent scalability. The definition of the HTX connector allows the use of HT even for add-in cards. In contrast to other peripheral interconnect technologies such as PCI Express, no protocol conversion or intermediate bridging is necessary: HT is a direct connection between device and CPU with minimal latency. Another advantage is the possibility of cache-coherent devices. Because of these properties, HT is of high interest for high-performance I/O such as networking and storage, but also for co-processing and acceleration based on ASIC or FPGA technologies. Acceleration in particular sees a resurgence of interest today; one reason is the possibility to reduce power consumption through the use of accelerators. In the area of parallel computing, the low-latency communication allows for fine-grain communication schemes and is perfectly suited for scalable systems. In summary, HT technology offers key advantages and great performance to any research related to or based on interconnects. For more information please consult the workshop website (http://whtra.uni-hd.de).

    HyperTransport Over Ethernet - A Scalable, Commodity Standard for Resource Sharing in the Data Center

    Future data center configurations are driven by total cost of ownership (TCO) for specific performance capabilities. Low-latency interconnects are central to performance, while the use of commodity interconnects is central to cost. This paper reports on an effort to combine a very high-performance commodity interconnect (HyperTransport) with a high-volume interconnect (Ethernet). Previous approaches to extending HyperTransport (HT) over a cluster used custom FPGA cards [5] and proprietary extensions to coherence schemes [22], but these solutions have mainly been adopted in research-oriented clusters. The new HyperShare strategy from the HyperTransport Consortium proposes several new ways to create low-cost commodity clusters that can support scalable high performance computing, whether in clusters or in the data center. HyperTransport over Ethernet (HToE) is the newest specification in the HyperShare strategy; it aims to combine favorable market trends with a high-bandwidth, low-latency hardware solution for non-coherent sharing of resources in a cluster. This paper illustrates the motivation behind using 10, 40, or 100 Gigabit Ethernet as an encapsulation layer for HyperTransport, the requirements of the HToE specification, and engineering solutions for implementing key portions of the specification.
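
    The encapsulation idea can be sketched as an ordinary Ethernet frame whose payload is an HT packet plus a small adaptation header. The EtherType value, header fields and sizes below are assumptions made for illustration and are not taken from the HToE specification itself.

        #include <stdint.h>

        #define ETHERTYPE_HTOE 0x88B5            /* placeholder: an IEEE "local experimental" value */

        struct eth_header {
            uint8_t  dst_mac[6];
            uint8_t  src_mac[6];
            uint16_t ethertype;                  /* would identify HToE traffic    */
        };

        struct htoe_header {                     /* hypothetical adaptation header */
            uint16_t session_id;                 /* identifies the HT tunnel/peer  */
            uint16_t seq;                        /* detects loss and reordering    */
        };

        struct htoe_frame {
            struct eth_header  eth;
            struct htoe_header htoe;
            uint8_t            ht_packet[64];    /* encapsulated non-coherent HT packet */
        };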

    A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing

    FPGAs, as reconfigurable devices, play an important role in both rapid prototyping and high performance reconfigurable computing. Usually, FPGA vendors help users with pre-designed cores, for instance for various communication protocols; however, this is only true for widely used protocols. In the use case described here, the target application may benefit from a tight integration of the FPGA into the computing system, demands that typical commodity protocols like PCI Express may not fulfill. HyperTransport (HT), on the other hand, allows connecting directly to a processor interface, without intermediate bridges or protocol conversion. As a result, communication costs between the FPGA unit and both the processor and main memory are minimal. In this paper we present an HT3 interface for Stratix IV based FPGAs, which allows for minimal latencies and high bandwidths between processor and device as well as between main memory and device. Designs targeting an HT connection can now be prototyped in real-world systems. Furthermore, this design can be leveraged for acceleration tasks, with the minimal communication costs allowing fine-grain work deployment and the use of cost-efficient main memory instead of size-limited and costly on-device memory.
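
    The following host-side sketch illustrates the kind of fine-grain work deployment such a directly attached FPGA enables: the device's registers are mapped into the host address space over HT, and the host writes a small descriptor pointing at operands that stay in ordinary main memory. The register layout and names are invented for illustration and may differ from the interface of the presented HT3 core.

        #include <stdint.h>

        struct work_descriptor {
            uint64_t src_addr;    /* DMA address of input data in host RAM */
            uint64_t dst_addr;    /* where the FPGA writes its results     */
            uint32_t length;      /* bytes to process                      */
            uint32_t opcode;      /* which accelerator function to run     */
        };

        /* 'regs' would point at the FPGA's memory-mapped register window. */
        static void submit_work(volatile struct work_descriptor *regs,
                                const struct work_descriptor *d)
        {
            regs->src_addr = d->src_addr;
            regs->dst_addr = d->dst_addr;
            regs->length   = d->length;
            regs->opcode   = d->opcode;   /* written last; starts the job  */
        }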

    Analysis of Inter-Chip Communication Patterns on Multi-Core Distributed Shared-Memory Computers

    Multi-core multi-socket distributed shared-memory computers (DSM computers, for short) have become an important node architecture in scientific computing, as they provide substantial computational capacity with relatively low space and power requirements. Compared to conventional computer networks, inter-chip networks used in DSM computers feature higher bandwidth, lower latency and tighter integration with the CPU. The inter-chip network is a resource shared among the user application and many other services, which can lead to considerable variation in the execution times of identical communication tasks. In this work, we explore traffic patterns resulting from MPI collective communication primitives and investigate the question of whether inter-chip link load is a reliable indicator and predictor of the execution time of collective communication primitives on a DSM computer. Our experiments on a Sun Fire X4600 M2 DSM computer with 32 cores (eight quad-core CPUs) indicate that specific single-link loads are positively correlated with the execution time of MPI_Allreduce. Observing patterns over multiple links allows refinement of the single-link observation.
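
    A minimal timing loop of the kind implied by these experiments is sketched below; the message size, repetition count and reduction operation are arbitrary choices rather than the paper's exact setup, and the inter-chip link loads would be sampled separately (for example via hardware performance counters) and correlated with the measured times.

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            enum { N = 4096, REPS = 1000 };
            static double in[N], out[N];

            MPI_Init(&argc, &argv);
            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            MPI_Barrier(MPI_COMM_WORLD);              /* align the start of the measurement */
            double t0 = MPI_Wtime();
            for (int i = 0; i < REPS; i++)
                MPI_Allreduce(in, out, N, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
            double dt = (MPI_Wtime() - t0) / REPS;

            if (rank == 0)
                printf("average MPI_Allreduce time: %g s\n", dt);
            MPI_Finalize();
            return 0;
        }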

    PGAS Model for the Implementation of Scalable Cluster Systems

    This paper introduces an extended version of the traditional Partitioned Global Address Space (PGAS) model for the implementation of scalable cluster systems, on which the HyperTransport Consortium Advanced Technology Group (ATG) is working. Using the Simics and GEMS simulators, we developed a software module that approximates the behavior of a PGAS cluster. This approach provides a simple mechanism to evaluate how much the PGAS infrastructure will affect overall application performance. The aim of this work is to study the feasibility of the ATG's PGAS model for running applications with high memory requirements. Such a model will let manufacturers build clusters capable of executing applications whose memory requirements make it impossible to run them on a single processor or on a single multi-processor machine.
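
    The core mechanism of a PGAS address space can be sketched as splitting a global address into a node identifier and a local offset, so that each access is either served from local memory or forwarded to the owning node. The bit split and the remote-access primitive below are assumptions made for illustration and are not taken from the ATG's model.

        #include <stdint.h>

        #define NODE_BITS   10u                      /* assumed: up to 1024 nodes */
        #define OFFSET_BITS (64u - NODE_BITS)
        #define OFFSET_MASK ((UINT64_C(1) << OFFSET_BITS) - 1)

        static inline uint32_t pgas_node(uint64_t gaddr)   { return (uint32_t)(gaddr >> OFFSET_BITS); }
        static inline uint64_t pgas_offset(uint64_t gaddr) { return gaddr & OFFSET_MASK; }

        /* Hypothetical interconnect primitive for accesses that leave the node. */
        extern uint64_t remote_read64(uint32_t node, uint64_t offset);

        static uint64_t pgas_read64(uint64_t gaddr, uint32_t my_node, uint8_t *local_base)
        {
            uint32_t node = pgas_node(gaddr);
            if (node == my_node)                     /* local partition: plain load */
                return *(uint64_t *)(local_base + pgas_offset(gaddr));
            return remote_read64(node, pgas_offset(gaddr));   /* remote partition   */
        }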

    Keeping Secrets in Hardware: the Microsoft Xbox(TM) Case Study

    This paper discusses the hardware foundations of the cryptosystem employed by the Xbox(TM) video game console from Microsoft. A secret boot block overlay is buried within a system ASIC. This secret boot block decrypts and verifies portions of an external FLASH-type ROM. The presence of the secret boot block is camouflaged by a decoy boot block in the external ROM. The code contained within the secret boot block is transferred to the CPU in the clear over a set of high-speed buses, where it can be extracted using simple custom hardware. The paper concludes with recommendations for improving the Xbox security system. One lesson of this study is that the use of a high-performance bus alone is not a sufficient security measure, given the advent of inexpensive, fast rapid-prototyping services and high-performance FPGAs.
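
    The decrypt-and-verify flow described above can be sketched generically as follows; the cipher, the integrity check, the offsets and all names are placeholders for illustration and are not the Xbox's actual algorithm, layout or keys.

        #include <stdint.h>
        #include <stddef.h>

        extern void flash_read(uint32_t offset, uint8_t *dst, size_t len);   /* assumed ROM access */
        extern void cipher_decrypt(const uint8_t *key, size_t keylen,
                                   uint8_t *buf, size_t len);                /* placeholder cipher */
        extern void jump_to(void *entry);

        #define LOADER_OFFSET 0x0u        /* illustrative flash offset of the second-stage loader */
        #define LOADER_SIZE   0x6000u
        #define BOOT_MAGIC    0x12345678u /* placeholder integrity-check value                    */

        static const uint8_t secret_key[16];   /* would live only inside the on-die boot block    */

        void secret_boot(void)
        {
            static uint8_t loader[LOADER_SIZE];
            flash_read(LOADER_OFFSET, loader, LOADER_SIZE);          /* fetch encrypted stage     */
            cipher_decrypt(secret_key, sizeof secret_key, loader, LOADER_SIZE);
            if (*(uint32_t *)(loader + LOADER_SIZE - 4) != BOOT_MAGIC)
                for (;;) { }                                         /* verification failed: halt */
            jump_to(loader);                                         /* run the decrypted loader  */
        }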