8,923 research outputs found

    Next-Generation Transport Networks Leveraging Universal Traffic Switching and Flexible Optical Transponders

    Recent developments in communication technology have contributed to exponential growth in network traffic. The cost per bit must necessarily follow an inverse trend, posing several challenges to network operators, and optical transport networks are no exception. On one hand, they have to keep up with expectations of data speed, volume, and growth at the agreed quality of service (QoS); on the other hand, the steep downward trend in cost per bit is a matter of concern. Thus, the proper selection of network architecture, technology, resiliency schemes, and traffic handling contributes to the total cost of ownership (TCO). In this context, this chapter looks into network architectures, including the optical transport network (OTN) switch (both traditional and universal), resiliency schemes (protection and restoration), flexible-rate line interfaces, and an overall handover strategy between metro and core networks. A design framework is also described and used to support the case studies reported in this chapter.

    NVIDIA Tensor Core Programmability, Performance & Precision

    The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called a "Tensor Core," that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflop/s in mixed precision. In this paper, we investigate current approaches to programming NVIDIA Tensor Cores, their performance, and the precision loss due to computation in mixed precision. Currently, NVIDIA provides three different ways of programming matrix-multiply-and-accumulate on Tensor Cores: the CUDA Warp Matrix Multiply Accumulate (WMMA) API; CUTLASS, a templated library based on WMMA; and cuBLAS GEMM. After experimenting with the different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflop/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision, respectively. A WMMA implementation of batched GEMM reaches a performance of 4 Tflop/s. While precision loss due to matrix multiplication with half-precision input might be critical in many HPC applications, it can be considerably reduced at the cost of increased computation. Our results indicate that HPC applications using matrix multiplications can strongly benefit from using NVIDIA Tensor Cores.
    Comment: This paper has been accepted by the Eighth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) 201
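    The precision loss the abstract refers to comes from rounding the matrix inputs to half precision, even when the products are accumulated in a wider format (as Tensor Cores do with fp32 accumulators). A minimal pure-Python sketch, not taken from the paper: it simulates fp16 input storage via IEEE-754 half-precision rounding and compares against a double-precision reference.

    ```python
    import random
    import struct

    def to_half(x):
        """Round a float to IEEE-754 half precision and back (simulates fp16 storage)."""
        return struct.unpack('e', struct.pack('e', x))[0]

    def matmul(A, B):
        """Naive square matmul; accumulation happens in Python doubles,
        loosely analogous to the Tensor Core fp32 accumulator."""
        n = len(A)
        return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    random.seed(0)
    n = 32
    A = [[random.random() for _ in range(n)] for _ in range(n)]
    B = [[random.random() for _ in range(n)] for _ in range(n)]

    ref = matmul(A, B)  # full double-precision reference

    # "Mixed precision": inputs rounded to fp16, wide accumulation
    A16 = [[to_half(x) for x in row] for row in A]
    B16 = [[to_half(x) for x in row] for row in B]
    mixed = matmul(A16, B16)

    rel_err = max(abs(mixed[i][j] - ref[i][j]) / abs(ref[i][j])
                  for i in range(n) for j in range(n))
    ```

    With uniformly random inputs the relative error stays around the fp16 unit roundoff (roughly 5e-4), which illustrates why the loss is often tolerable but can matter for sensitive HPC workloads.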

    On Content-centric Wireless Delivery Networks

    The flux of social media and the convenience of mobile connectivity have created a mobile data phenomenon that is expected to overwhelm mobile cellular networks in the foreseeable future. Despite the advent of 4G/LTE, the growth rate of wireless data has far exceeded the capacity increase of the mobile networks. A fundamentally new design paradigm is required to tackle the ever-growing wireless data challenge. In this article, we investigate the problem of massive content delivery over wireless networks and present a systematic view of content-centric network design and its underlying challenges. Towards this end, we first review some of the recent advancements in Information Centric Networking (ICN), which provides the basis for how media contents can be labeled, distributed, and placed across the networks. We then formulate the content delivery task as a content rate maximization problem over a shared wireless channel, which, contrasting the conventional wisdom that attempts to increase the bit-rate of a unicast system, maximizes the content delivery capability with a fixed amount of wireless resources. This conceptually simple change enables us to exploit the "content diversity" and the "network diversity" by leveraging the abundant computation resources (through application-layer encoding, pushing and caching, etc.) within the existing wireless networks. A network architecture that enables wireless network crowdsourcing for content delivery is then described, followed by an exemplary campus wireless network that encompasses the above concepts.
    Comment: 20 pages, 7 figures, accepted by IEEE Wireless Communications, Sept. 201
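    The core intuition behind maximizing content rate rather than bit-rate can be shown with a toy model of our own construction (not the paper's formulation, and the function name is hypothetical): if a fraction h of requests are served from caches, only the remaining (1 - h) fraction consumes the shared channel capacity W, so the aggregate content delivery rate scales as W / (1 - h) even though the raw bit-rate is unchanged.

    ```python
    def content_rate(W, h):
        """Toy aggregate content delivery rate over a shared channel.

        W: shared wireless channel capacity (e.g. Mbit/s of content per second)
        h: cache hit ratio in [0, 1); cached requests bypass the channel.
        """
        if not 0.0 <= h < 1.0:
            raise ValueError("hit ratio must be in [0, 1)")
        return W / (1.0 - h)

    # No caching: delivery rate equals channel capacity.
    baseline = content_rate(100.0, 0.0)   # -> 100.0

    # A 50% hit ratio doubles the effective content rate
    # without any increase in wireless resources.
    cached = content_rate(100.0, 0.5)     # -> 200.0
    ```

    The model ignores cache placement cost and popularity skew, but it captures why exploiting "content diversity" via caching and pushing can outpace simply raising the unicast bit-rate.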

    Dependable Information Exchange for the Next Generation Mobile Cyber-Physical Systems

    Mobile cyber-physical systems (M-CPSs) are envisaged as an integral part of our digital future. The dependability of M-CPSs is subject to timely, reliable, and secure information exchange among M-CPS entities. Information exchange provisioning in such systems is conventionally built with sole reliance on wireless connectivity. The conventional approaches, however, fail to efficiently exploit dynamism and heterogeneity, and to incorporate computing and cooperation as alternative system-wide tools for information exchange. To address these issues, we approach M-CPS dependability from the information exchange perspective and define dependable exchange of information (DeX), denoting the collective M-CPS capability for information exchange provisioning. We then propose a cloud-based architecture for DeX provisioning as a service to facilitate versatile development of dependable M-CPSs.