3 research outputs found


    The present work, a systematic review of the literature, is the result of previous work on the construction of a Direct Memory Access (DMA) controller using the VHDL hardware description language, field-programmable gate array (FPGA) devices, and some algorithms for programming these devices. The main objective of the study is to provide an effective methodology for modeling the DMA controller that ensures proper access to data while optimizing resources for each of the relevant transactions. A second objective is to survey the existing architectures and device configurations, with a view to a simpler and more optimal implementation. The literature research includes a framework of the Systematic Literature Review process applied to primary studies, focused on the search for articles on the architectures, designs, and algorithms used to build a DMA controller with FPGA and VHDL. The results of the review show that there is a great variety of DMA architectures, and that the choice among them depends on the type of transmission required and the types of data involved in the transaction. There are also several design models in multiple programming and modeling languages. Among the DMA architectures, there is an improved controller architecture that greatly helps to reduce processing latency, as well as a specific architecture needed for reading and writing image and video data.

    TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments

    Deep neural networks (DNNs) have become core computation components within low latency Function as a Service (FaaS) prediction pipelines, including image recognition, object detection, natural language processing, speech synthesis, and personalized recommendation pipelines. Cloud computing, as the de facto backbone of modern computing infrastructure for both enterprise and consumer applications, has to be able to handle user-defined pipelines of diverse DNN inference workloads while maintaining isolation and latency guarantees and minimizing resource waste. The current solution for guaranteeing isolation within FaaS is suboptimal, suffering from "cold start" latency. A major cause of this inefficiency is the need to move large amounts of model data within and across servers. We propose TrIMS as a novel solution to address these issues. Our proposed solution consists of a persistent model store across the GPU, CPU, local storage, and cloud storage hierarchy; an efficient resource management layer that provides isolation; and a succinct set of application APIs and container technologies for easy and transparent integration with FaaS, Deep Learning (DL) frameworks, and user code. We demonstrate our solution by interfacing TrIMS with the Apache MXNet framework and show up to 24x speedup in latency for image classification models and up to 210x speedup for large models. We achieve up to 8x system throughput improvement. Comment: In Proceedings CLOUD 201