563 research outputs found

    Dynamic Low-Rank Instance Adaptation for Universal Neural Image Compression

    Full text link
    The latest advancements in neural image compression show great potential in surpassing the rate-distortion performance of conventional standard codecs. Nevertheless, there exists an indelible domain gap between the datasets used for training (i.e., natural images) and those used for inference (e.g., artistic images). Our proposal involves a low-rank adaptation approach aimed at addressing the rate-distortion drop observed on out-of-domain datasets. Specifically, we perform low-rank matrix decomposition to update certain adaptation parameters of the client's decoder. These updated parameters, along with image latents, are encoded into a bitstream and transmitted to the decoder in practical scenarios. Due to the low-rank constraint imposed on the adaptation parameters, the resulting bit rate overhead is small. Furthermore, the bit rate allocation of low-rank adaptation is \emph{non-trivial}, considering that diverse inputs require varying adaptation bitstreams. We thus introduce a dynamic gating network on top of the low-rank adaptation method to decide which decoder layers should employ adaptation. The dynamic adaptation network is optimized end-to-end using a rate-distortion loss. Our proposed method exhibits universality across diverse image datasets. Extensive results demonstrate that this paradigm significantly mitigates the domain gap, surpassing non-adaptive methods with an average BD-rate improvement of approximately 19% on out-of-domain images. Furthermore, it outperforms the most advanced instance-adaptive methods by roughly 5% BD-rate. Ablation studies confirm our method's ability to universally enhance various image compression architectures. Comment: Accepted by ACM MM 2023, 13 pages, 12 figures
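
    To make the mechanism concrete, here is a minimal sketch of per-instance low-rank decoder adaptation in PyTorch. It is our illustration, not the authors' code: the class name, the scalar `gate` (a stand-in for the paper's dynamic gating network), and all hyperparameters are assumptions.

```python
# Sketch of per-instance low-rank adaptation of a frozen decoder layer
# (illustrative, not the authors' implementation). Only the low-rank
# factors A and B are overfitted per image and sent in the bitstream;
# a gate decides how much of the adaptation the layer applies.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankAdaptedConv(nn.Module):
    def __init__(self, conv: nn.Conv2d, rank: int = 4):
        super().__init__()
        self.conv = conv  # frozen, pretrained decoder convolution
        for p in self.conv.parameters():
            p.requires_grad_(False)
        out_ch, in_ch, kh, kw = conv.weight.shape
        # Low-rank factors: the only parameters updated and transmitted.
        self.A = nn.Parameter(torch.randn(out_ch, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, in_ch * kh * kw))
        # Scalar gate: a simplified stand-in for the dynamic gating
        # network that decides whether this layer is adapted at all.
        self.gate = nn.Parameter(torch.tensor(1.0))

    def forward(self, x):
        delta = (self.A @ self.B).view_as(self.conv.weight)
        w = self.conv.weight + torch.sigmoid(self.gate) * delta
        return F.conv2d(x, w, self.conv.bias,
                        stride=self.conv.stride, padding=self.conv.padding)
```

    In the paper's setting, the rate-distortion loss would additionally charge for the bits needed to encode A and B, which the low-rank constraint keeps small.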

    An Introduction to Neural Data Compression

    Full text link
    Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression algorithms to be learned end-to-end from data using powerful generative models such as normalizing flows, variational autoencoders, diffusion probabilistic models, and generative adversarial networks. The present article aims to introduce this field of research to a broader machine learning audience by reviewing the necessary background in information theory (e.g., entropy coding, rate-distortion theory) and computer vision (e.g., image quality assessment, perceptual metrics), and providing a curated guide through the essential ideas and methods in the literature thus far.
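
    As a concrete anchor for the rate-distortion theory the article reviews, a common training objective in neural compression (a generic sketch, not code from the article; the tensor shapes and `lam` value are assumptions) trades off the bit cost of the latent under a learned prior against reconstruction distortion:

```python
# Generic rate-distortion objective R + lambda * D used across neural
# compression (a sketch, not from the article). Rate is the negative
# log-likelihood of the quantized latent under a learned prior,
# converted to bits per pixel; distortion is mean squared error.
import torch

def rate_distortion_loss(x, x_hat, latent_likelihoods, lam=0.01):
    n, _, h, w = x.shape
    num_pixels = n * h * w
    # Rate: total bits for the latent, normalized to bits per pixel.
    bpp = -torch.log2(latent_likelihoods).sum() / num_pixels
    # Distortion: mean squared error in pixel space.
    mse = torch.mean((x - x_hat) ** 2)
    return bpp + lam * mse
```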

    Leveraging progressive model and overfitting for efficient learned image compression

    Full text link
    Deep learning has been overwhelmingly dominant in computer vision and image/video processing for the last decade. For image and video compression, however, it lags behind traditional techniques based on the discrete cosine transform (DCT) and linear filters. Built on autoencoder architectures, learned image compression (LIC) systems have drawn enormous attention in recent years. Nevertheless, the proposed LIC systems are still inferior to state-of-the-art traditional techniques, such as the Versatile Video Coding (VVC/H.266) standard, in either compression performance or decoding complexity. Although some LIC systems are claimed to outperform VVC/H.266 on a limited bit rate range, they can take over 40 seconds to decode a 2K image on a GPU system. In this paper, we introduce a powerful and flexible LIC framework with a multi-scale progressive (MSP) probability model and a latent representation overfitting (LOF) technique. With different predefined profiles, the proposed framework can achieve various balance points between compression efficiency and computational complexity. Experiments show that the proposed framework achieves 2.5%, 1.0%, and 1.3% Bjontegaard delta bit rate (BD-rate) reduction over the VVC/H.266 standard on three benchmark datasets over a wide bit rate range. More importantly, the decoding complexity is reduced from O(n) to O(1) compared with many other LIC systems, resulting in a more than 20-fold speedup when decoding 2K images.
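
    The latent overfitting idea can be pictured with a short sketch (our rendering under assumed interfaces `decoder` and `prior`; the step count, learning rate, and straight-through quantizer are illustrative, not the paper's exact procedure): the network weights stay fixed while the latent of a single image is refined against the rate-distortion loss before entropy coding.

```python
# Sketch of latent representation overfitting (LOF): refine one image's
# latent against the rate-distortion loss, leaving all network weights
# untouched. Interfaces and hyperparameters are illustrative.
import torch

def overfit_latent(y, decoder, prior, x, lam=0.01, steps=100, lr=1e-2):
    y = y.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Straight-through rounding: quantize forward, identity backward.
        y_hat = y + (torch.round(y) - y).detach()
        x_hat = decoder(y_hat)
        bits = -torch.log2(prior(y_hat)).sum()  # rate under learned prior
        loss = bits / x.numel() + lam * ((x - x_hat) ** 2).mean()
        loss.backward()
        opt.step()
    return torch.round(y.detach())  # quantized latent to entropy-code
```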

    Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data

    Full text link
    We develop the first universal password model -- a password model that, once pre-trained, can automatically adapt to any password distribution. To achieve this result, the model does not need to access any plaintext passwords from the target set. Instead, it exploits users' auxiliary information, such as email addresses, as a proxy signal to predict the underlying target password distribution. The model uses deep learning to capture the correlation between the auxiliary data of a group of users (e.g., users of a web application) and their passwords. It then exploits those patterns to create a tailored password model for the target community at inference time. No further training steps, targeted data collection, or prior knowledge of the community's password distribution is required. Besides defining a new state of the art for password strength estimation, our model enables any end-user (e.g., system administrators) to autonomously generate tailored password models for their systems without the often unworkable requirement of collecting suitable training data and fitting the underlying password model. Ultimately, our framework enables the democratization of well-calibrated password models to the community, addressing a major challenge in the deployment of password security solutions on a large scale. Comment: v0.0
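
    One way to picture the conditioning mechanism (our own simplified rendering, not the paper's architecture; every name and dimension here is an assumption): pool the embeddings of a community's auxiliary records into a single context vector and use it to initialize a character-level password generator.

```python
# Simplified sketch of a password model conditioned on auxiliary data
# (our illustration, not the paper's architecture): a community's
# auxiliary records are pooled into one context vector that seeds a
# character-level generator, so adaptation needs no further training.
import torch
import torch.nn as nn

class ConditionedPasswordModel(nn.Module):
    def __init__(self, vocab_size=128, dim=256):
        super().__init__()
        self.char_emb = nn.Embedding(vocab_size, dim)
        self.aux_encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, password_chars, aux_embeddings):
        # Pool the community's auxiliary embeddings into one context
        # vector; at inference time this is the entire adaptation step.
        context = self.aux_encoder(aux_embeddings.mean(dim=0))
        h0 = context.expand(1, password_chars.size(0), -1).contiguous()
        out, _ = self.rnn(self.char_emb(password_chars), h0)
        return self.head(out)  # next-character logits
```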

    Bayesian Modelling Approaches for Quantum States -- The Ultimate Gaussian Process States Handbook

    Full text link
    Accurately capturing the correlations emerging between the constituents of many-body systems is one of the key challenges for the appropriate description of various systems whose properties are underpinned by quantum mechanical fundamentals. This thesis discusses novel tools and techniques for the (classical) modelling of quantum many-body wavefunctions with the ultimate goal of introducing a universal framework for finding accurate representations from which system properties can be extracted efficiently. It is outlined how synergies with standard machine learning approaches can be exploited to enable an automated inference of the most relevant intrinsic characteristics through rigorous Bayesian regression techniques. Based on the probabilistic framework forming the foundation of the introduced ansatz, coined the Gaussian Process State, different compression techniques are explored to extract numerically feasible representations of relevant target states within stochastic schemes. By following intuitively motivated design principles, the resulting model carries a high degree of interpretability and offers an easily applicable tool for the numerical study of quantum systems, including ones which are notoriously difficult to simulate due to strong intrinsic correlation. The practical applicability of the Gaussian Process States framework is demonstrated within several benchmark applications, in particular, ground state approximations for prototypical quantum lattice models, Fermi-Hubbard models and $J_1$-$J_2$ models, as well as simple ab-initio quantum chemical systems. Comment: PhD Thesis, King's College London, 202 pages
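
    As a rough pointer to the form of the ansatz (our paraphrase of the Gaussian Process States literature, not an excerpt from the thesis; the symbols are our own), the wavefunction amplitude of a many-body configuration is modelled as an exponentiated kernel expansion over a set of support configurations, with weights inferred by Bayesian regression:

```latex
% Schematic Gaussian Process State ansatz (our paraphrase): the
% amplitude of configuration x is an exponentiated kernel expansion
% over M support configurations x'_m with regression weights w_m.
\[
  \Psi_{\mathrm{GPS}}(x) \;=\; \exp\!\Big( \sum_{m=1}^{M} w_m \, k(x, x'_m) \Big)
\]
```

    Here $M$, the number of support configurations, governs the cost of evaluating the model.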

    Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey

    Full text link
    Transformer-based Large Language Models (LLMs) have been applied in diverse areas such as knowledge bases, human interfaces, and dynamic agents, marking a stride towards achieving Artificial General Intelligence (AGI). However, current LLMs are predominantly pretrained on short text snippets, which compromises their effectiveness in processing the long-context prompts frequently encountered in practical scenarios. This article offers a comprehensive survey of recent advancements in Transformer-based LLM architectures aimed at enhancing the long-context capabilities of LLMs throughout the entire model lifecycle, from pre-training through to inference. We first delineate and analyze the problems of handling long-context input and output with current Transformer-based models. We then provide a taxonomy and a landscape of upgrades to the Transformer architecture that address these problems. Afterwards, we investigate widely used evaluation necessities tailored to long-context LLMs, including datasets, metrics, and baseline models, as well as optimization toolkits such as libraries, frameworks, and compilers that boost the efficacy of LLMs across different stages of runtime. Finally, we discuss the challenges and potential avenues for future research. A curated and continuously updated repository of relevant literature is available at https://github.com/Strivin0311/long-llms-learning. Comment: 40 pages, 3 figures, 4 tables
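
    As one concrete instance from the family of architectural upgrades such surveys catalogue (our illustrative pick, not code from the survey; the function name and `window` parameter are assumptions), a sliding-window attention mask restricts each token to its most recent neighbors, reducing the quadratic attention cost to roughly O(n * w):

```python
# Illustrative long-context upgrade (our pick, not from the survey):
# a sliding-window attention mask limiting each query token to its
# `window` most recent key positions, so attention cost grows as
# O(seq_len * window) instead of O(seq_len ** 2).
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    i = torch.arange(seq_len).unsqueeze(1)  # query positions (column)
    j = torch.arange(seq_len).unsqueeze(0)  # key positions (row)
    # Allowed: causal (j <= i) and within the local window.
    return (j <= i) & (j > i - window)

# Usage: pass as a boolean mask (True = may attend) to a masked
# attention implementation.
mask = sliding_window_mask(seq_len=8, window=3)
```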