10 research outputs found

Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models

Author: Babenko Artem
Baranchuk Dmitry
Khrulkov Valentin
Starodubcev Nikita
Publication date: 09/04/2023

Recent advances in diffusion models enable many powerful instruments for image editing. One of these instruments is text-driven image manipulation: editing semantic attributes of an image according to the provided text description. Existing diffusion-based methods already achieve high-quality image manipulations for a broad range of text prompts. However, in practice, these methods require high computation costs even with a high-end GPU. This greatly limits potential real-world applications of diffusion-based image editing, especially when running on user devices. In this paper, we address the efficiency of recent text-driven editing methods based on unconditional diffusion models and develop a novel algorithm that learns image manipulations 4.5-10 times faster and applies them 8 times faster. We carefully evaluate the visual quality and expressiveness of our approach on multiple datasets using human annotators. Our experiments demonstrate that our algorithm achieves the quality of much more expensive methods. Finally, we show that our approach can adapt the pretrained model to the user-specified image and text description on the fly in just 4 seconds. In this setting, we notice that more compact unconditional diffusion models can be considered as a rational alternative to the popular text-conditional counterparts.

arXiv.org e-Print Archive

TabDDPM: Modelling Tabular Data with Diffusion Models

Author: Babenko Artem
Baranchuk Dmitry
Kotelnikov Akim
Rubachev Ivan
Publication date: 30/09/2022

Denoising diffusion probabilistic models are currently becoming the leading paradigm of generative modeling for many important data modalities. Being the most prevalent in the computer vision community, diffusion models have also recently gained some attention in other domains, including speech, NLP, and graph-like data. In this work, we investigate if the framework of diffusion models can be advantageous for general tabular problems, where datapoints are typically represented by vectors of heterogeneous features. The inherent heterogeneity of tabular data makes it quite challenging for accurate modeling, since the individual features can be of completely different nature, i.e., some of them can be continuous and some of them can be discrete. To address such data types, we introduce TabDDPM -- a diffusion model that can be universally applied to any tabular dataset and handles any type of feature. We extensively evaluate TabDDPM on a wide set of benchmarks and demonstrate its superiority over existing GAN/VAE alternatives, which is consistent with the advantage of diffusion models in other fields. Additionally, we show that TabDDPM is eligible for privacy-oriented setups, where the original datapoints cannot be publicly shared. Comment: code https://github.com/rotot0/tab-ddp

arXiv.org e-Print Archive

Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models

Author: Babenko Artem
Baranchuk Dmitry
Fedorov Artem
Starodubcev Nikita
Publication date: 05/04/2024

Knowledge distillation methods have recently been shown to be a promising direction to speed up the synthesis of large-scale diffusion models by requiring only a few inference steps. While several powerful distillation methods were recently proposed, the overall quality of student samples is typically lower compared to the teacher ones, which hinders their practical usage. In this work, we investigate the relative quality of samples produced by the teacher text-to-image diffusion model and its distilled student version. As our main empirical finding, we discover that a noticeable portion of student samples exhibit superior fidelity compared to the teacher ones, despite the "approximate" nature of the student. Based on this finding, we propose an adaptive collaboration between student and teacher diffusion models for effective text-to-image synthesis. Specifically, the distilled model produces the initial sample, and then an oracle decides whether it needs further improvements with a slow teacher model. Extensive experiments demonstrate that the designed pipeline surpasses state-of-the-art text-to-image alternatives for various inference budgets in terms of human preference. Furthermore, the proposed approach can be naturally used in popular applications such as text-guided image editing and controllable generation. Comment: CVPR2024 camera ready v

arXiv.org e-Print Archive

Results of the NeurIPS’21 Challenge on Billion-Scale Approximate Nearest Neighbor Search

Author: Aumüller Martin
Babenko Artem
Baranchuk Dmitry
Douze Matthijs
Hosseini Lucas
Krishnaswamy Ravishankar
Simhadri Harsha
Srinivasa Gopal
Subramanya Suhas Jayaram
Wang Jingdong
Williams George
Publication date: 01/01/2022

The IT University of Copenhagen's Repository

magdalendobson/big-ann-benchmarks: Final artifact release

Author: Abdelrahman Ezzat
Akira Naruse
alemagnani
AliHashish
Amir Ingber
Ben Landrum
C. George Williams
Carlos Eduardo Rosar Kós Lassance
Dappur
Dmitry Baranchuk
Dmitry Kan
Harsha Vardhan Simhadri
Khylon
laziyu
Martin Aumüller
Masajiro Iwasaki
Match_yc
Matthijs Douze
Max Irwin
Niklas Hansson
NJU-yasuo
nk2014yj
Patrick Weizhi Xu
veaaaab
yangming
Yong Wang
Yutaro Oguri
Ziad Ahmed
Zixu Li
Publication venue: Zenodo
Publication date: 29/11/2023

Framework for evaluating ANNS algorithms on billion-scale datasets

ZENODO