
    vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design

    The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU. This restriction hampers researchers' flexibility to study different machine learning algorithms, forcing them either to use a less desirable network architecture or to parallelize the processing across multiple GPUs. We propose a runtime memory manager that virtualizes the memory usage of DNNs so that both GPU and CPU memory can be utilized simultaneously for training larger DNNs. Our virtualized DNN (vDNN) reduces the average GPU memory usage of AlexNet by up to 89%, OverFeat by 91%, and GoogLeNet by 95%, a significant reduction in the memory requirements of DNNs. Similar experiments on VGG-16, one of the deepest and most memory-hungry DNNs to date, demonstrate the memory efficiency of our proposal: vDNN enables VGG-16 with batch size 256 (requiring 28 GB of memory) to be trained on a single NVIDIA Titan X GPU card containing 12 GB of memory, with an 18% performance loss compared to a hypothetical, oracular GPU with enough memory to hold the entire DNN.
    Comment: Published as a conference paper at the 49th IEEE/ACM International Symposium on Microarchitecture (MICRO-49), 2016.
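
    The core idea lends itself to a small illustration. The following is a minimal, self-contained Python sketch of the offload/prefetch pattern the abstract describes; the class and method names are invented for illustration and do not reflect the authors' actual implementation, which manages real GPU allocations inside the framework's runtime.

        # Hypothetical sketch of vDNN-style memory virtualization.
        # "GPU" and "host" buffers are plain Python objects here; a real
        # implementation would move data between device and host DRAM.

        class VirtualizedMemoryManager:
            def __init__(self, gpu_budget_bytes):
                self.gpu_budget = gpu_budget_bytes
                self.gpu_used = 0
                self.gpu_resident = {}  # layer name -> activation kept on GPU
                self.host_store = {}    # layer name -> activation offloaded to CPU

            def after_forward(self, layer, activation, nbytes):
                # Keep the activation on the GPU only while it fits in the
                # budget; otherwise offload it to host memory.
                if self.gpu_used + nbytes <= self.gpu_budget:
                    self.gpu_resident[layer] = (activation, nbytes)
                    self.gpu_used += nbytes
                else:
                    self.host_store[layer] = activation

            def before_backward(self, layer):
                # Bring an offloaded activation back from host memory; in
                # vDNN this transfer is overlapped with GPU computation.
                if layer in self.host_store:
                    return self.host_store.pop(layer)
                activation, nbytes = self.gpu_resident.pop(layer)
                self.gpu_used -= nbytes
                return activation

    Under this pattern, activations stay on the GPU until the budget is exhausted, after which each subsequent layer's output is parked in CPU memory and fetched back only when its gradient is computed; this is what lets a working set larger than device DRAM (e.g., 28 GB on a 12 GB card) remain trainable.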

    Instruction manual: Photogrammetry as a non-contact measurement system in large scale structural testing

    Photogrammetry is a non-contact measurement method used in large-scale structural experimentation to extract information about the overall geometry of the specimen as well as the XYZ motion of selected points on the structure during testing. This is made possible by high-resolution still cameras that capture several photographs of the specimen, which are then processed using photogrammetric software. The following document focuses specifically on the application of PhotoModeler® as the image post-processing tool. This instruction manual aims to provide guidance to researchers who would like to adopt photogrammetric techniques to acquire experimental test data, especially in cases where a high-density grid of displacement measurements is desired at a relatively low cost.
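
    As a concrete illustration of the kind of result such a workflow produces, here is a minimal Python sketch that turns the XYZ coordinates of tracked targets, as recovered by photogrammetric software at two load steps, into per-point displacement vectors and magnitudes. The data layout (a mapping from target label to coordinates) is an assumption for illustration, not PhotoModeler's export format.

        import math

        def displacements(before, after):
            # before/after: {target label: (x, y, z)} in consistent units.
            # Returns {label: (dx, dy, dz, magnitude)} for every target
            # present at both load steps.
            out = {}
            for label, (x0, y0, z0) in before.items():
                if label not in after:
                    continue  # target lost between epochs
                x1, y1, z1 = after[label]
                dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
                out[label] = (dx, dy, dz, math.sqrt(dx*dx + dy*dy + dz*dz))
            return out

        # Example with two hypothetical targets (coordinates in millimetres):
        step0 = {"T1": (0.0, 0.0, 0.0), "T2": (500.0, 0.0, 0.0)}
        step1 = {"T1": (0.2, -0.1, 0.0), "T2": (500.0, -3.4, 0.1)}
        print(displacements(step0, step1))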

    Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology

    Processing-in-memory (PIM) has been explored for decades by computer architects, yet it has never seen the light of day in real-world products due to its high design overheads and the lack of a killer application. With the advent of critical memory-intensive workloads, several commercial PIM technologies have been introduced to the market, ranging from domain-specific to more general-purpose PIM architectures. In this work, we take a deep dive into UPMEM's commercial PIM technology, a general-purpose, PIM-enabled parallel architecture that is highly programmable. Our first key contribution is the development of a flexible simulation framework for PIM. The simulator we developed (PIMulator) compiles UPMEM-PIM source code into machine-level instructions, which are subsequently consumed by our cycle-level performance simulator. Using PIMulator, we demystify UPMEM's PIM design through a detailed characterization study. Building on this characterization, we conduct a series of case studies to pathfind important architectural features that we deem critical for future PIM architectures to support.
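
    To make the pipeline concrete, here is a minimal Python sketch of a trace-driven, cycle-level back end of the kind the abstract describes: it consumes already-compiled machine-level instructions and charges a fixed latency per opcode. The instruction format and latency table are invented for illustration and are not UPMEM's ISA or PIMulator's actual timing model.

        # Assumed instruction form: (opcode, destination, source operands...),
        # where a source is either a register name or an integer literal.
        LATENCY = {"add": 1, "mul": 3, "ld": 20}  # cycles per opcode (assumed)

        def simulate(program):
            cycle, regs, mem = 0, {}, {}
            for op, dst, *srcs in program:
                vals = [regs[s] if isinstance(s, str) else s for s in srcs]
                if op == "add":
                    regs[dst] = vals[0] + vals[1]
                elif op == "mul":
                    regs[dst] = vals[0] * vals[1]
                elif op == "ld":
                    regs[dst] = mem.get(vals[0], 0)  # load from a flat memory
                cycle += LATENCY[op]  # charge the opcode's fixed latency
            return cycle, regs

        # Example: r2 = (5 + 7) * 3 costs 1 + 3 = 4 cycles under this model.
        print(simulate([("add", "r1", 5, 7), ("mul", "r2", "r1", 3)]))

    A real cycle-level simulator would additionally model the pipeline, memory banks, and thread scheduling, but the structure is the same: a compiled instruction stream in, a cycle count and architectural state out.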