24 research outputs found

    FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing

    Text-to-video editing aims to edit the visual appearance of a source video conditioned on textual prompts. A major challenge in this task is to ensure that all frames in the edited video are visually consistent. Most recent works apply advanced text-to-image diffusion models to this task by inflating the 2D spatial attention in the U-Net into spatio-temporal attention. Although spatio-temporal attention adds temporal context, it may introduce irrelevant information for each patch and therefore cause inconsistency in the edited video. In this paper, for the first time, we introduce optical flow into the attention module of the diffusion model's U-Net to address the inconsistency issue in text-to-video editing. Our method, FLATTEN, enforces patches on the same flow path across different frames to attend to each other in the attention module, thus improving the visual consistency of the edited videos. Additionally, our method is training-free and can be seamlessly integrated into any diffusion-based text-to-video editing method to improve its visual consistency. Experimental results on existing text-to-video editing benchmarks show that our proposed method achieves new state-of-the-art performance. In particular, our method excels at maintaining visual consistency in the edited videos. Comment: Project page: https://flatten-video-editing.github.io
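
The idea of restricting attention to patches on the same flow path can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `flow_guided_attention`, the `traj` array of precomputed patch trajectories, and the omission of learned query/key/value projections are all assumptions made for illustration, and the paths are assumed not to overlap.

```python
import numpy as np

def flow_guided_attention(X, traj):
    """Self-attention restricted to patches that lie on the same flow path.

    X    : (T, N, d) array of patch features for T frames with N patches each.
    traj : (K, T) integer array (hypothetical input); traj[k, t] is the patch
           index visited by path k in frame t, e.g. derived from optical flow.
    Returns a copy of X in which every patch on a path has been replaced by an
    attention-weighted mix of the patches on that same path.
    """
    T, N, d = X.shape
    frames = np.arange(T)
    # Gather the tokens along each path: shape (K, T, d).
    tokens = X[frames[None, :], traj]
    # Scaled dot-product scores between frames of the same path: (K, T, T).
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    mixed = weights @ tokens  # (K, T, d)
    # Scatter the mixed tokens back; patches on no path stay unchanged.
    Y = X.copy()
    Y[frames[None, :], traj] = mixed
    return Y
```

Because each token only attends to the T tokens on its own path, patches unrelated to that path cannot contribute to its output in this sketch.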

    A multiobjective evolutionary algorithm based on objective-space localization selection

    This article proposes a simple yet effective multiobjective evolutionary algorithm (EA) for problems with irregular Pareto fronts. The proposed algorithm does not need to predefine weight vectors or calculate indicators during the search. It is mainly based on the idea of adaptively selecting multiple promising search directions according to crowdedness information in local objective spaces. Concretely, the proposed algorithm repeatedly deletes one individual of poor quality until only as many individuals remain as are to survive into the next generation. In this environmental selection process, the proposed algorithm considers two or three individuals in the most crowded area, which is determined by local information in the objective space, according to a probability selection mechanism, and deletes the worst of them from the current population. Thus, the surviving individuals are representative of promising search directions. The performance of the proposed algorithm is verified and compared with seven state-of-the-art algorithms [including four general multi/many-objective EAs and three algorithms specially designed for problems with an irregular Pareto-optimal front (PF)] on a variety of complicated problems with numbers of objectives ranging from 2 to 15. Empirical results demonstrate that the proposed algorithm is highly competitive in terms of both performance and algorithm compactness, and that it copes well with different types of problems with irregular PFs and with different numbers of objectives. This work was supported in part by the National Natural Science Foundation of China under Grant 61773410 and Grant 61673403; in part by the Science and Technology Program of Guangzhou under Grant 202002030355; and in part by the Fundamental Research Funds for the Central Universities under Grant 2019MS088.
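
The deletion-based environmental selection described above might look roughly like the following Python sketch. The abstract does not specify the crowdedness measure, the probability mechanism, or the quality criterion, so the choices here are assumptions: crowdedness is the nearest-neighbour distance in normalized objective space, the hypothetical parameter `p_three` is the probability of comparing three individuals instead of two, and "worst" means the largest sum of normalized objectives (minimization).

```python
import numpy as np

def local_deletion_selection(F, n_survive, p_three=0.5, rng=None):
    """Delete individuals one at a time from the most crowded region.

    F         : (n, m) matrix of objective values (minimization assumed).
    n_survive : number of individuals to keep.
    p_three   : assumed probability of comparing three candidates, not two.
    Returns the indices (into F) of the surviving individuals.
    """
    rng = np.random.default_rng() if rng is None else rng
    F = np.asarray(F, dtype=float)
    # Normalize each objective to [0, 1] so distances are comparable.
    span = F.max(axis=0) - F.min(axis=0)
    span[span == 0] = 1.0
    G = (F - F.min(axis=0)) / span
    alive = list(range(len(G)))
    while len(alive) > n_survive:
        P = G[alive]
        # Pairwise Euclidean distances among the current survivors.
        D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
        np.fill_diagonal(D, np.inf)
        # Most crowded individual: smallest nearest-neighbour distance.
        i = int(np.argmin(D.min(axis=1)))
        # Compare it with its one or two nearest neighbours.
        k = 2 if rng.random() < p_three else 1
        k = min(k, len(alive) - 1)
        group = np.concatenate(([i], np.argsort(D[i])[:k]))
        # Assumed quality criterion: largest sum of normalized objectives.
        worst = group[np.argmax(P[group].sum(axis=1))]
        alive.pop(int(worst))
    return alive
```

For example, `local_deletion_selection([[0, 1], [0.5, 0.5], [0.51, 0.5], [1, 0]], 3)` removes one of the two nearly coincident middle points and keeps both extremes.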