386 research outputs found

    Caching Techniques in Next Generation Cellular Networks

    Get PDF
    Content caching will be an essential feature in the next generations of cellular networks. Indeed, a network equipped with caching capabilities allows users to retrieve content with reduced access delays and consequently reduces the traffic passing through the network backhaul. However, the deployment of the caching nodes in the network is hindered by the following two challenges. First, the storage space of a cache is limited as well as expensive. So, it is not possible to store in the cache every content that can be possibly requested by the user. This calls for efficient techniques to determine the contents that must be stored in the cache. Second, efficient ways are needed to implement and control the caching node. In this thesis, we investigate caching techniques focussing to address the above-mentioned challenges, so that the overall system performance is increased. In order to tackle the challenge of the limited storage capacity, smart proactive caching strategies are needed. In the context of vehicular users served by edge nodes, we believe a caching strategy should be adapted to the mobility characteristics of the cars. In this regard, we propose a scheme called RICH (RoadsIde CacHe), which optimally caches content at the edge nodes where connected vehicles require it most. In particular, our scheme is designed to ensure in-order delivery of content chunks to end users. Unlike blind popularity decisions, the probabilistic caching used by RICH considers vehicular trajectory predictions as well as content service time by edge nodes. We evaluate our approach on realistic mobility datasets against a popularity-based edge approach called POP, and a mobility-aware caching strategy known as netPredict. In terms of content availability, our RICH edge caching scheme provides an enhancement of up to 33% and 190% when compared with netPredict and POP respectively. At the same time, the backhaul penalty bandwidth is reduced by a factor ranging between 57% and 70%. Caching node is an also a key component in Named Data Networking (NDN) that is an innovative paradigm to provide content based services in future networks. As compared to legacy networks, naming of network packets and in-network caching of content make NDN more feasible for content dissemination. However, the implementation of NDN requires drastic changes to the existing network infrastructure. One feasible approach is to use Software Defined Networking (SDN), according to which the control of the network is delegated to a centralized controller, which configures the forwarding data plane. This approach leads to large signaling overhead as well as large end-to-end (e2e) delays. In order to overcome these issues, in this work, we provide an efficient way to implement and control the NDN node. We propose to enable NDN using a stateful data plane in the SDN network. In particular, we realize the functionality of an NDN node using a stateful SDN switch attached with a local cache for content storage, and use OpenState to implement such an approach. In our solution, no involvement of the controller is required once the OpenState switch has been configured. We benchmark the performance of our solution against the traditional SDN approach considering several relevant metrics. Experimental results highlight the benefits of a stateful approach and of our implementation, which avoids signaling overhead and significantly reduces e2e delays

    PASSION: Parallel And Scalable Software for Input-Output

    Get PDF
    We are developing a software system called PASSION: Parallel And Scalable Software for Input-Output which provides software support for high performance parallel I/O. PASSION provides support at the language, compiler, runtime as well as file system level. PASSION provides runtime procedures for parallel access to files (read/write), as well as for out-of-core computations. These routines can either be used together with a compiler to translate out-of-core data parallel programs written in a language like HPF, or used directly by application programmers. A number of optimizations such as Two-Phase Access, Data Sieving, Data Prefetching and Data Reuse have been incorporated in the PASSION Runtime Library for improved performance. PASSION also provides an initial framework for runtime support for out-of-core irregular problems. The goal of the PASSION compiler is to automatically translate out- of-core data parallel programs to node programs for distributed memory machines, with calls to the PASSION Runtime Library. At the language level, PASSION suggests extensions to HPF for out-of-core programs. At the file system level, PASSION provides support for buffering and prefetching data from disks. A portable parallel file system is also being developed as part of this project, which can be used across homogeneous or heterogeneous networks of workstations. PASSION also provides support for integrating data and task parallelism using parallel I/O techniques. We have used PASSION to implement a number of out-of-core applications such as a Laplace\u27s equation solver, 2D FFT, Matrix Multiplication, LU Decomposition, image processing applications as well as unstructured mesh kernels in molecular dynamics and computational fluid dynamics. We are currently in the process of using PASSION in applications in CFD (3D turbulent flows), molecular structure calculations, seismic computations, and earth and space science applications such as Four-Dimensional Data Assimilation. PASSION is currently available on the Intel Paragon, Touchstone Delta and iPSC/860. Efforts are underway to port it to the IBM SP-1 and SP-2 using the Vesta Parallel File System

    Building high-performance web-caching servers

    Get PDF

    Prefetch throttling and data pinning for improving performance of shared caches

    Get PDF
    In this paper, we (i) quantify the impact of compilerdirected I/O prefetching on shared caches at I/O nodes. The experimental data collected shows that while I/O prefetching brings some benefits, its effectiveness reduces significantly as the number of clients (compute nodes) is increased; (ii) identify interclient misses due to harmful I/O prefetches as one of the main sources for this reduction in performance with increased number of clients; and (iii) propose and experimentally evaluate prefetch throttling and data pinning schemes to improve performance of I/O prefetching. Prefetch throttling prevents one or more clients from issuing further prefetches if such prefetches are predicted to be harmful, i.e., replace from the memory cache the useful data accessed by other clients. Data pinning on the other hand makes selected data blocks immune to harmful prefetches by pinning them in the memory cache. We show that these two schemes can be applied in isolation or combined together, and they can be applied at a coarse or fine granularity. Our experiments with these two optimizations using four disk-intensive applications reveal that they can improve performance by 9.7% and 15.1% on average, over standard compiler-directed I/O prefetching and no-prefetch case, respectively, when 8 clients are used. © 2008 IEEE

    LEAP Scratchpads: Automatic Memory and Cache Management for Reconfigurable Logic [Extended Version]

    Get PDF
    CORRECTION: The authors for entry [4] in the references should have been "E. S. Chung, J. C. Hoe, and K. Mai".Developers accelerating applications on FPGAs or other reconfigurable logic have nothing but raw memory devices in their standard toolkits. Each project typically includes tedious development of single-use memory management. Software developers expect a programming environment to include automatic memory management. Virtual memory provides the illusion of very large arrays and processor caches reduce access latency without explicit programmer instructions. LEAP scratchpads for reconfigurable logic dynamically allocate and manage multiple, independent, memory arrays in a large backing store. Scratchpad accesses are cached automatically in multiple levels, ranging from shared on-board, RAM-based, set-associative caches to private caches stored in FPGA RAM blocks. In the LEAP framework, scratchpads share the same interface as on-die RAM blocks and are plug-in replacements. Additional libraries support heap management within a storage set. Like software developers, accelerator authors using scratchpads may focus more on core algorithms and less on memory management. Two uses of FPGA scratchpads are analyzed: buffer management in an H.264 decoder and memory management within a processor microarchitecture timing model

    Prefetching and Caching Techniques in File Systems for Mimd Multiprocessors

    Get PDF
    The increasing speed of the most powerful computers, especially multiprocessors, makes it difficult to provide sufficient I/O bandwidth to keep them running at full speed for the largest problems. Trends show that the difference in the speed of disk hardware and the speed of processors is increasing, with I/O severely limiting the performance of otherwise fast machines. This widening access-time gap is known as the “I/O bottleneck crisis.” One solution to the crisis, suggested by many researchers, is to use many disks in parallel to increase the overall bandwidth. \par This dissertation studies some of the file system issues needed to get high performance from parallel disk systems, since parallel hardware alone cannot guarantee good performance. The target systems are large MIMD multiprocessors used for scientific applications, with large files spread over multiple disks attached in parallel. The focus is on automatic caching and prefetching techniques. We show that caching and prefetching can transparently provide the power of parallel disk hardware to both sequential and parallel applications using a conventional file system interface. We also propose a new file system interface (compatible with the conventional interface) that could make it easier to use parallel disks effectively. \par Our methodology is a mixture of implementation and simulation, using a software testbed that we built to run on a BBN GP1000 multiprocessor. The testbed simulates the disks and fully implements the caching and prefetching policies. Using a synthetic workload as input, we use the testbed in an extensive set of experiments. The results show that prefetching and caching improved the performance of parallel file systems, often dramatically

    Effective data parallel computing on multicore processors

    Get PDF
    The rise of chip multiprocessing or the integration of multiple general purpose processing cores on a single chip (multicores), has impacted all computing platforms including high performance, servers, desktops, mobile, and embedded processors. Programmers can no longer expect continued increases in software performance without developing parallel, memory hierarchy friendly software that can effectively exploit the chip level multiprocessing paradigm of multicores. The goal of this dissertation is to demonstrate a design process for data parallel problems that starts with a sequential algorithm and ends with a high performance implementation on a multicore platform. Our design process combines theoretical algorithm analysis with practical optimization techniques. Our target multicores are quad-core processors from Intel and the eight-SPE IBM Cell B.E. Target applications include Matrix Multiplications (MM), Finite Difference Time Domain (FDTD), LU Decomposition (LUD), and Power Flow Solver based on Gauss-Seidel (PFS-GS) algorithms. These applications are popular computation methods in science and engineering problems and are characterized by unit-stride (MM, LUD, and PFS-GS) or 2-point stencil (FDTD) memory access pattern. The main contributions of this dissertation include a cache- and space-efficient algorithm model, integrated data pre-fetching and caching strategies, and in-core optimization techniques. Our multicore efficient implementations of the above described applications outperform nai¨ve parallel implementations by at least 2x and scales well with problem size and with the number of processing cores

    Extending the reach of microprocessors : column and curious caching

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.Includes bibliographical references (p. 162-167).by Derek T. Chiou.Ph.D

    Life-cycle management and placement of service function chains in MEC-enabled 5G networks

    Get PDF
    Recent advancements in mobile communication technology have led to the fifth generation of mobile cellular networks (5G), driven by the proliferation in data traffic demand, stringent latency requirements, and the desire for a fully connected world. This transformation calls for novel technology solutions such as Multi-access Edge Computing (MEC) and Network Function Virtualization (NFV) to satisfy service requirements while providing dynamic and instant service deployment. MEC and NFV are two principal and complementary enablers for 5G networks whose co-existence can lead to numerous benefits. Despite the numerous advantages MEC offers, physical resources at the edge are extremely scarce and require efficient utilization. In this doctoral dissertation, we first attempt to optimize resource utilization at the network edge for the scenario of live video streaming. We specifically utilize the real-time Radio Access Network (RAN) information available at the MEC servers to develop a machine learning-based prediction solution and anticipate user requests. Consequently, Integer Linear Programming (ILP) models are used to prefetch/cache video contents from a centralized video server. Regarding the advantages of NFV technology for the deployment of NFs, the second problem that this dissertation address is the proper association of the users to the gNBs along with efficient placement of SFCs on the substrate network. Our primary purpose is to find a proper embedding of the SFC in a hierarchical 5G network. The problem is formulated as a Mixed Integer Linear Programming (MILP) model, having the objective to minimize service provisioning cost, link utilization, and the effect of VNF migration on users' perceived quality of experience. After rigorously analyzing the proposed SFC placement and considering mobile networks' dynamicity, our next goal is to develop an ILP-based model that minimizes the resource provisioning cost by dynamically embed and scale SFCs so that provisioning cost is minimized while user requirements are met
    corecore