8,396 research outputs found

    s-LWSR: Super Lightweight Super-Resolution Network

    Full text link
    Deep learning (DL) architectures for superresolution (SR) normally contain tremendous parameters, which has been regarded as the crucial advantage for obtaining satisfying performance. However, with the widespread use of mobile phones for taking and retouching photos, this character greatly hampers the deployment of DL-SR models on the mobile devices. To address this problem, in this paper, we propose a super lightweight SR network: s-LWSR. There are mainly three contributions in our work. Firstly, in order to efficiently abstract features from the low resolution image, we build an information pool to mix multi-level information from the first half part of the pipeline. Accordingly, the information pool feeds the second half part with the combination of hierarchical features from the previous layers. Secondly, we employ a compression module to further decrease the size of parameters. Intensive analysis confirms its capacity of trade-off between model complexity and accuracy. Thirdly, by revealing the specific role of activation in deep models, we remove several activation layers in our SR model to retain more information for performance improvement. Extensive experiments show that our s-LWSR, with limited parameters and operations, can achieve similar performance to other cumbersome DL-SR methods

    Online Video Super-Resolution with Convolutional Kernel Bypass Graft

    Full text link
    Deep learning-based models have achieved remarkable performance in video super-resolution (VSR) in recent years, but most of these models are less applicable to online video applications. These methods solely consider the distortion quality and ignore crucial requirements for online applications, e.g., low latency and low model complexity. In this paper, we focus on online video transmission, in which VSR algorithms are required to generate high-resolution video sequences frame by frame in real time. To address such challenges, we propose an extremely low-latency VSR algorithm based on a novel kernel knowledge transfer method, named convolutional kernel bypass graft (CKBG). First, we design a lightweight network structure that does not require future frames as inputs and saves extra time costs for caching these frames. Then, our proposed CKBG method enhances this lightweight base model by bypassing the original network with ``kernel grafts'', which are extra convolutional kernels containing the prior knowledge of external pretrained image SR models. In the testing phase, we further accelerate the grafted multi-branch network by converting it into a simple single-path structure. Experiment results show that our proposed method can process online video sequences up to 110 FPS, with very low model complexity and competitive SR performance

    Human-Machine Interface for Remote Training of Robot Tasks

    Full text link
    Regardless of their industrial or research application, the streamlining of robot operations is limited by the proximity of experienced users to the actual hardware. Be it massive open online robotics courses, crowd-sourcing of robot task training, or remote research on massive robot farms for machine learning, the need to create an apt remote Human-Machine Interface is quite prevalent. The paper at hand proposes a novel solution to the programming/training of remote robots employing an intuitive and accurate user-interface which offers all the benefits of working with real robots without imposing delays and inefficiency. The system includes: a vision-based 3D hand detection and gesture recognition subsystem, a simulated digital twin of a robot as visual feedback, and the "remote" robot learning/executing trajectories using dynamic motion primitives. Our results indicate that the system is a promising solution to the problem of remote training of robot tasks.Comment: Accepted in IEEE International Conference on Imaging Systems and Techniques - IST201

    LHDR: HDR Reconstruction for Legacy Content using a Lightweight DNN

    Full text link
    High dynamic range (HDR) image is widely-used in graphics and photography due to the rich information it contains. Recently the community has started using deep neural network (DNN) to reconstruct standard dynamic range (SDR) images into HDR. Albeit the superiority of current DNN-based methods, their application scenario is still limited: (1) heavy model impedes real-time processing, and (2) inapplicable to legacy SDR content with more degradation types. Therefore, we propose a lightweight DNN-based method trained to tackle legacy SDR. For better design, we reform the problem modeling and emphasize degradation model. Experiments show that our method reached appealing performance with minimal computational cost compared with others.Comment: Accepted in ACCV202

    MetaISP -- Exploiting Global Scene Structure for Accurate Multi-Device Color Rendition

    Full text link
    Image signal processors (ISPs) are historically grown legacy software systems for reconstructing color images from noisy raw sensor measurements. Each smartphone manufacturer has developed its ISPs with its own characteristic heuristics for improving the color rendition, for example, skin tones and other visually essential colors. The recent interest in replacing the historically grown ISP systems with deep-learned pipelines to match DSLR's image quality improves structural features in the image. However, these works ignore the superior color processing based on semantic scene analysis that distinguishes mobile phone ISPs from DSLRs. Here, we present MetaISP, a single model designed to learn how to translate between the color and local contrast characteristics of different devices. MetaISP takes the RAW image from device A as input and translates it to RGB images that inherit the appearance characteristics of devices A, B, and C. We achieve this result by employing a lightweight deep learning technique that conditions its output appearance based on the device of interest. In this approach, we leverage novel attention mechanisms inspired by cross-covariance to learn global scene semantics. Additionally, we use the metadata that typically accompanies RAW images and estimate scene illuminants when they are unavailable.Comment: VMV 2023, Project page: https://www.github.com/vccimaging/MetaIS

    Achieving on-Mobile Real-Time Super-Resolution with Neural Architecture and Pruning Search

    Full text link
    Though recent years have witnessed remarkable progress in single image super-resolution (SISR) tasks with the prosperous development of deep neural networks (DNNs), the deep learning methods are confronted with the computation and memory consumption issues in practice, especially for resource-limited platforms such as mobile devices. To overcome the challenge and facilitate the real-time deployment of SISR tasks on mobile, we combine neural architecture search with pruning search and propose an automatic search framework that derives sparse super-resolution (SR) models with high image quality while satisfying the real-time inference requirement. To decrease the search cost, we leverage the weight sharing strategy by introducing a supernet and decouple the search problem into three stages, including supernet construction, compiler-aware architecture and pruning search, and compiler-aware pruning ratio search. With the proposed framework, we are the first to achieve real-time SR inference (with only tens of milliseconds per frame) for implementing 720p resolution with competitive image quality (in terms of PSNR and SSIM) on mobile platforms (Samsung Galaxy S20)
    • …
    corecore