24 research outputs found

    NICE 2023 Zero-shot Image Captioning Challenge

    In this report, we introduce the NICE project (https://nice.lgresearch.ai/) and share the results and outcomes of the NICE 2023 challenge. This project is designed to challenge the computer vision community to develop robust image captioning models that advance the state of the art in terms of both accuracy and fairness. Through the challenge, the image captioning models were tested using a new evaluation dataset that includes a large variety of visual concepts from many domains. No specific training data was provided for the challenge, so the challenge entries were required to adapt to new types of image descriptions that had not been seen during training. This report includes information on the newly proposed NICE dataset, evaluation methods, challenge results, and technical details of top-ranking entries. We expect that the outcomes of the challenge will contribute to the improvement of AI models on various vision-language tasks.
    Comment: Tech report, project page https://nice.lgresearch.ai

    Hardware-Friendly Compression Methods for Low-Cost Deep Neural Networks

    Department of Electrical Engineering

    Efficient Execution of Stream Graphs on Coarse-Grained Reconfigurable Architectures

    Coarse-grained reconfigurable architectures (CGRAs) can provide extremely energy-efficient acceleration for applications that are rich in arithmetic operations, such as digital signal processing and multimedia applications. Since those applications are often naturally represented by stream graphs, it is very compelling to develop optimization strategies for stream graphs on CGRAs. One unique property of stream graphs is that they contain many kernels or loops, which creates both advantages and challenges when it comes to mapping them to CGRAs. This paper addresses two main problems with this mapping, namely the many-buffer problem and the control overhead problem, and presents our results on optimizing the execution of stream graphs on CGRAs, including our low-cost architecture extensions. Our evaluation results demonstrate that our software and hardware optimizations can help generate highly efficient mappings of stream applications to CGRAs, with a 3.4x speedup on average at the application level over CPU-only execution, which is significant.

    Design Space Exploration of FPGA Accelerators for Convolutional Neural Networks

    The increasing use of machine learning algorithms, such as Convolutional Neural Networks (CNNs), makes the hardware accelerator approach very compelling. However, the question of how best to design an accelerator for a given CNN has not been answered yet, even on a very fundamental level. This paper addresses that challenge by providing a novel framework that can universally and accurately evaluate and explore various architectural choices for CNN accelerators on FPGAs. Our exploration framework is more extensive than that of any previous work in terms of the design space, and takes into account various FPGA resources to maximize performance, including DSP resources, on-chip memory, and off-chip memory bandwidth. Our experimental results using some of the largest CNN models, including one that has 16 convolutional layers, demonstrate the efficacy of our framework, as well as the need for such a high-level architecture exploration approach to find the best architecture for a CNN model.

    Automated Log-Scale Quantization for Low-Cost Deep Neural Networks

    Quantization plays an important role in deep neural network (DNN) hardware. In particular, logarithmic quantization has multiple advantages for DNN hardware implementations, and its weakness of lower performance at high precision compared with linear quantization has recently been remedied by what we call selective two-word logarithmic quantization (STLQ). However, there is a lack of training methods designed for STLQ, or even for logarithmic quantization in general. In this paper, we propose a novel STLQ-aware training method, which significantly outperforms the previous state-of-the-art training method for STLQ. Moreover, our training results demonstrate that with our new training method, STLQ applied to the weight parameters of ResNet-18 can achieve the same level of performance as the state-of-the-art quantization method, APoT, at 3-bit precision. We also apply our method to various DNNs in image enhancement and semantic segmentation, showing competitive results.
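
    As a rough illustration of the idea named in this abstract, the sketch below rounds a weight to the nearest signed power of two in the log domain, and optionally spends a second power-of-two word on the residual when the one-word error is large. It is not the paper's algorithm or training method: the function names, the exponent range, and the fixed error threshold used to decide when a second word is added are all assumptions made for this example.

```python
import math


def log_quantize(w, min_exp=-7, max_exp=0):
    """Round w to a signed power of two (nearest in the log2 domain).

    min_exp and max_exp are assumed exponent limits for this sketch.
    Zero stays zero; any other value is clamped into the exponent range.
    """
    if w == 0.0:
        return 0.0
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))
    return math.copysign(2.0 ** exp, w)


def two_word_quantize(w, threshold=0.05, min_exp=-7, max_exp=0):
    """Hypothetical selective two-word variant.

    The first word is the one-word log quantization of w. A second word,
    quantizing the residual, is added only when the one-word error
    exceeds `threshold` (an assumed selection rule).
    """
    first = log_quantize(w, min_exp, max_exp)
    residual = w - first
    if abs(residual) <= threshold:
        return first
    return first + log_quantize(residual, min_exp, max_exp)


if __name__ == "__main__":
    for w in (0.7, -0.3, 0.26):
        print(w, log_quantize(w), two_word_quantize(w))
```

    For a weight of 0.7, one word gives 0.5 and two words give 0.75, which shows how the second word recovers precision where a single power of two is coarse.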