
    Operational one-to-one mapping between coherence and entanglement measures

    We establish a general operational one-to-one mapping between coherence measures and entanglement measures: Any entanglement measure of bipartite pure states is the minimum of a suitable coherence measure over product bases. Any coherence measure of pure states, with extension to mixed states by convex roof, is the maximum entanglement generated by incoherent operations acting on the system and an incoherent ancilla. Remarkably, the generalized CNOT gate is the universal optimal incoherent operation. In this way, all convex-roof coherence measures, including the coherence of formation, are endowed with (additional) operational interpretations. By virtue of this connection, many results on entanglement can be translated to the coherence setting, and vice versa. As applications, we provide tight observable lower bounds for generalized entanglement concurrence and coherence concurrence, which enable experimentalists to quantify entanglement and coherence of the maximal dimension in real experiments.
    Comment: 14 pages, 1 figure, new results added, published in PR
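    A rough schematic of the stated correspondence, in our own notation rather than the paper's: writing C for a coherence measure, E for the induced entanglement measure, and IO for the set of incoherent operations, the two directions of the mapping read approximately as follows.

```latex
% Direction 1: entanglement of a bipartite pure state as a minimum of coherence
% over product bases (notation is ours, for illustration only)
E\bigl(\lvert\psi\rangle_{AB}\bigr)
  = \min_{\{\lvert a_i\rangle \otimes \lvert b_j\rangle\}} C\bigl(\lvert\psi\rangle_{AB}\bigr)

% Direction 2: coherence of a pure state as the maximal entanglement generated by
% an incoherent operation acting on the system S and an incoherent ancilla A
C\bigl(\lvert\psi\rangle_{S}\bigr)
  = \max_{\Lambda \in \mathrm{IO}}
    E\Bigl(\Lambda\bigl[\lvert\psi\rangle\!\langle\psi\rvert_{S}
      \otimes \lvert 0\rangle\!\langle 0\rvert_{A}\bigr]\Bigr)
```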

    Lossy Image Compression with Quantized Hierarchical VAEs

    Recent research has shown a strong theoretical connection between variational autoencoders (VAEs) and rate-distortion theory. Motivated by this, we consider the problem of lossy image compression from the perspective of generative modeling. Starting with ResNet VAEs, which were originally designed for modeling data (image) distributions, we redesign their latent variable model using a quantization-aware posterior and prior, enabling easy quantization and entropy coding at test time. Together with an improved neural network architecture, we present a powerful and efficient model that outperforms previous methods on natural-image lossy compression. Our model compresses images in a coarse-to-fine fashion and supports parallel encoding and decoding, leading to fast execution on GPUs. Code is available at https://github.com/duanzhiihao/lossy-vae.
    Comment: WACV 2023 Best Algorithms Paper Award, revised version
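    A minimal sketch of the generic quantization-aware trick used throughout learned compression (additive uniform noise as a differentiable training surrogate for rounding); this only illustrates the idea behind a quantization-aware posterior and is not the paper's actual latent variable model.

```python
import torch

def quantize_latent(z_mean: torch.Tensor, training: bool) -> torch.Tensor:
    """Generic quantization-aware handling of a latent variable (illustrative only).

    During training, additive uniform noise in [-0.5, 0.5] approximates rounding
    while keeping gradients usable; at test time the latent is hard-rounded so it
    can be entropy-coded losslessly.
    """
    if training:
        noise = torch.empty_like(z_mean).uniform_(-0.5, 0.5)
        return z_mean + noise      # soft (noisy) quantization surrogate
    return torch.round(z_mean)     # hard quantization for entropy coding
```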

    An Improved Upper Bound on the Rate-Distortion Function of Images

    Recent work has shown that Variational Autoencoders (VAEs) can be used to upper-bound the information rate-distortion (R-D) function of images, i.e., the fundamental limit of lossy image compression. In this paper, we report an improved upper bound on the R-D function of images, obtained by (1) introducing a new VAE model architecture, (2) applying variable-rate compression techniques, and (3) proposing a novel \ourfunction{} to stabilize training. We demonstrate that at least a 30% BD-rate reduction w.r.t. the intra-prediction mode of the VVC codec is achievable, suggesting that there is still great potential for improving lossy image compression. Code is made publicly available at https://github.com/duanzhiihao/lossy-vae.
    Comment: Conference paper at ICIP 2023. The first two authors contributed equally
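    The flavor of variational upper bound underlying this line of work, paraphrased in our own notation (a standard result, not the paper's exact statement): any encoder q(z|x), prior p(z), and decoder \hat{x}(z) whose expected distortion is at most D certify an achievable rate, and hence an upper bound on R(D).

```latex
% If the expected distortion satisfies
%   E_{x, z ~ q(z|x)} [ d(x, \hat{x}(z)) ] <= D,
% then the rate-distortion function is upper-bounded by the expected KL term:
R(D) \;\le\; \mathbb{E}_{x \sim p_{\mathrm{data}}}
  \Bigl[ D_{\mathrm{KL}}\bigl( q(z \mid x) \,\big\|\, p(z) \bigr) \Bigr]
```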

    Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation

    Text-to-Image (T2I) generation with diffusion models allows users to control the semantic content of synthesized images through text conditions. As a further step toward more customized image creation, we introduce a new multi-modality generation setting that synthesizes images based not only on semantic-level textual input but also on pixel-level visual conditions. Existing literature first converts the given visual information to a semantic-level representation by connecting it to language, and then incorporates it into the original denoising process. Although seemingly intuitive, such a design loses the pixel values during the semantic transition and thus fails in task scenarios where preservation of low-level visual detail is desired (e.g., the identity of a given face image). To this end, we propose Cyclic One-Way Diffusion (COW), a training-free framework for creating customized images with respect to semantic text and pixel-level visual conditions. Notably, we observe that sub-regions of an image interfere with one another, much like physical diffusion, before reaching ultimate harmony along the denoising trajectory. We therefore propose to repeatedly utilize the given visual condition in a cyclic way: we plant the visual condition as a high-concentration "seed" at the initialization step of the denoising process and "diffuse" it into a harmonious picture by controlling a one-way information flow from the visual condition. We repeat this destroy-and-construct process multiple times to gradually but steadily impose the internal diffusion process within the image. Experiments on the challenging one-shot face- and text-conditioned image synthesis task demonstrate our superiority in speed, image quality, and conditional fidelity over learning-based text-vision conditional methods. Project page is available at: https://bigaandsmallq.github.io/COW/
    Comment: Project page is available at: https://bigaandsmallq.github.io/COW
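    A hypothetical sketch of the destroy-and-construct loop described above; the function arguments and the cycle structure are our reading of the abstract, not the authors' implementation.

```python
def cyclic_one_way_diffusion(x, seed, mask, denoise_step, add_noise,
                             num_cycles: int, num_steps: int):
    """Illustrative-only sketch of a cyclic one-way diffusion loop.

    x            : current (noisy) image tensor
    seed         : pixel-level visual condition (e.g., a given face crop)
    mask         : region where the visual condition is planted
    denoise_step : callable (x, t) -> x, one text-conditioned reverse-diffusion step
    add_noise    : callable (x) -> x, partially re-noises the image ("destroy")
    """
    for cycle in range(num_cycles):
        for t in reversed(range(num_steps)):
            x = denoise_step(x, t)
            # One-way information flow: the seeded region keeps overwriting the
            # canvas so the visual condition diffuses outward but is never erased.
            x = mask * seed + (1 - mask) * x
        if cycle < num_cycles - 1:
            x = add_noise(x)  # destroy, then construct again in the next cycle
    return x
```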