ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation
Diffusion models have revolutionized image generation in recent years, yet
they are still limited to a few sizes and aspect ratios. We propose
ElasticDiffusion, a novel training-free decoding method that enables pretrained
text-to-image diffusion models to generate images of various sizes.
ElasticDiffusion attempts to decouple the generation trajectory of a pretrained
model into local and global signals. The local signal controls low-level pixel
information and can be estimated on local patches, while the global signal is
used to maintain overall structural consistency and is estimated with a
reference image. We test our method on CelebA-HQ (faces) and LAION-COCO
(objects/indoor/outdoor scenes). Our experiments and qualitative results show
superior image coherence across aspect ratios compared to
MultiDiffusion and the standard decoding strategy of Stable Diffusion. Project
page: https://elasticdiffusion.github.io
Comment: Accepted at CVPR 2024.
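To make the decoding idea concrete, here is a minimal sketch (not the authors' implementation) of the global-local separation: the local signal is averaged from patch-wise denoiser outputs, while the global signal comes from denoising a downsampled reference view and upsampling it back. The function `denoise_patch`, the patch and reference sizes, the nearest-neighbor resampling, and the guidance blend are all illustrative assumptions standing in for a pretrained diffusion UNet and the paper's actual fusion rule.

```python
import numpy as np

def local_signal(latent, denoise_patch, patch=64, stride=64):
    """Average patch-wise denoiser outputs: the pretrained model only
    ever sees crops at (roughly) its training resolution."""
    h, w = latent.shape[:2]
    out = np.zeros_like(latent)
    hits = np.zeros_like(latent)
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            out[y:y + patch, x:x + patch] += denoise_patch(latent[y:y + patch, x:x + patch])
            hits[y:y + patch, x:x + patch] += 1.0
    return out / np.maximum(hits, 1.0)

def global_signal(latent, denoise_patch, ref=64):
    """Denoise a small reference view of the whole latent, then upsample:
    a cheap stand-in for the paper's reference-image estimate."""
    h, w = latent.shape[:2]
    fy, fx = h // ref, w // ref
    small = denoise_patch(latent[::fy, ::fx])      # crude downsample
    return np.kron(small, np.ones((fy, fx, 1)))    # crude upsample

def decode_step(latent, denoise_patch, guidance=1.0):
    """One decoding step: local detail steered toward global structure."""
    local = local_signal(latent, denoise_patch)
    glob = global_signal(latent, denoise_patch)
    return local + guidance * (glob - local)

rng = np.random.default_rng(0)
latent = rng.normal(size=(128, 192, 4))  # non-square latent, multiples of 64
denoise = lambda p: 0.9 * p              # stand-in for one UNet denoising call
latent = decode_step(latent, denoise)
```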
VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs
We propose VidStyleODE, a spatiotemporally continuous disentangled
video representation based upon StyleGAN and
Neural-ODEs. Effective traversal of the latent space learned by
Generative Adversarial Networks (GANs) has been the basis for recent
breakthroughs in image editing. However, the applicability of such advancements
to the video domain has been hindered by the difficulty of representing and
controlling videos in the latent space of GANs. In particular, videos are
composed of content (i.e., appearance) and complex motion components that
require a special mechanism to disentangle and control. To achieve this,
VidStyleODE encodes the video content in a pre-trained StyleGAN
space and benefits from a latent ODE component to summarize the spatiotemporal
dynamics of the input video. Our novel continuous video generation process then
combines the two to generate high-quality and temporally consistent videos with
varying frame rates. We show that our proposed method enables a variety of
applications on real videos: text-guided appearance manipulation, motion
manipulation, image animation, and video interpolation and extrapolation.
Project website: https://cyberiada.github.io/VidStyleODE
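A hedged sketch of the latent-ODE component described above: a static content code (standing in for a StyleGAN latent) stays fixed, while a small dynamics state is integrated through a learned vector field and read out at arbitrary, possibly non-uniform timestamps, which is what enables varying frame rates. The dynamics function, the Euler integrator, the dimensions, and the final concatenation are assumptions, not the paper's architecture.

```python
import numpy as np

def ode_dynamics(z, t, W):
    """Hypothetical learned vector field dz/dt = f(z, t); a fixed
    tanh-linear map stands in for a trained network."""
    return np.tanh(W @ z)

def sample_trajectory(z0, timestamps, W, dt=0.01):
    """Euler-integrate the latent ODE and read out the state at each
    requested (possibly non-uniform) timestamp."""
    states, z, t = [], z0.copy(), 0.0
    for t_target in sorted(timestamps):
        while t < t_target:
            z = z + dt * ode_dynamics(z, t, W)
            t += dt
        states.append(z.copy())
    return states

rng = np.random.default_rng(0)
content = rng.normal(size=(512,))   # static appearance code (StyleGAN-like)
z0 = rng.normal(size=(16,))         # initial dynamics state
W = 0.1 * rng.normal(size=(16, 16))
# Non-uniform timestamps -> frames at a varying frame rate.
dynamics = sample_trajectory(z0, [0.1, 0.25, 0.6, 1.0], W)
frames = [np.concatenate([content, z]) for z in dynamics]  # per-frame code
```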
Erratum: Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017
Interpretation: By quantifying levels and trends in exposures to risk factors and the resulting disease burden, this assessment offers insight into where past policy and programme efforts might have been successful and highlights current priorities for public health action. Decreases in behavioural, environmental, and occupational risks have largely offset the effects of population growth and ageing in relation to trends in absolute burden. Conversely, the combination of increasing metabolic risks and population ageing will probably continue to drive the increasing trends in non-communicable diseases at the global level, which presents both a public health challenge and an opportunity. We see considerable spatiotemporal heterogeneity in levels of risk exposure and risk-attributable burden. Although levels of development underlie some of this heterogeneity, observed-to-expected (O/E) ratios show risks for which countries are overperforming or underperforming relative to their level of development. As such, these ratios provide a benchmarking tool to help focus local decision making. Our findings reinforce the importance of both risk exposure monitoring and epidemiological research to assess causal connections between risks and health outcomes, and they highlight the usefulness of the GBD study in synthesising data to draw comprehensive and robust conclusions that help to inform good policy and strategic health planning.
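As a minimal illustration of the O/E benchmarking idea mentioned above, assuming expected exposure levels derived from a country's development level, a ratio above 1 flags a country doing worse on a risk than its development would predict. All names and numbers below are invented:

```python
# Hypothetical risk-exposure levels: observed vs. expected-for-development.
observed = {"country_a": 0.30, "country_b": 0.18}
expected = {"country_a": 0.22, "country_b": 0.20}

for country in observed:
    oe = observed[country] / expected[country]  # observed-to-expected ratio
    status = "underperforming" if oe > 1 else "overperforming"
    print(f"{country}: O/E = {oe:.2f} ({status})")
```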