
Training-Free Video Editing with Pretrained Flow-Based Generative Models

Supervisor

Suitable for

MSc in Advanced Computer Science
Computer Science, Part C

Abstract

Training-free generative methods have recently become popular for image editing, where a pretrained model is guided by a text prompt to modify an input image without any additional training. Such methods are widely used for style changes, attribute edits, and super-resolution; the goal is to alter selected visual attributes while preserving the overall scene and the identity of the input.
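To make this paradigm concrete, the sketch below shows one common training-free recipe in the spirit of SDEdit [2] applied to a flow-matching model [1]: partially noise the input image, then integrate the learned velocity field back to the data end under the edit prompt. The velocity_model(x, t, prompt) callable and the linear-interpolation noise convention are illustrative assumptions, not the interface of any particular library.

    # Minimal sketch of SDEdit-style, training-free editing with a pretrained
    # flow-matching model. `velocity_model` is a hypothetical callable
    # v(x, t, prompt) returning the predicted velocity dx/dt; it stands in for
    # any text-conditioned flow model and is not a real library API.
    import torch

    @torch.no_grad()
    def edit_image(x_src, prompt, velocity_model, t_start=0.6, num_steps=30):
        # Convention assumed here: t = 0 is data, t = 1 is Gaussian noise, and
        # the model was trained on straight paths x_t = (1 - t) * x_0 + t * noise.
        noise = torch.randn_like(x_src)
        x = (1.0 - t_start) * x_src + t_start * noise  # jump to an intermediate time

        ts = torch.linspace(t_start, 0.0, num_steps + 1)
        for t_cur, t_next in zip(ts[:-1], ts[1:]):
            v = velocity_model(x, t_cur, prompt)  # velocity under the edit prompt
            x = x + (t_next - t_cur) * v          # Euler step back toward t = 0
        return x

In this kind of scheme, t_start controls the edit strength: larger values follow the prompt more closely but preserve less of the source image.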


Extending this paradigm to video is more challenging, as video edits must stay consistent across time, maintain object identity, and avoid frame-wise drift or flicker. This project will explore how training-free image-editing techniques can be adapted to video using pretrained flow-based generative models. The work will evaluate whether this approach can produce coherent, prompt-aligned edits and will compare its performance with inversion-based and optimisation-based baselines.
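As a point of reference for this difficulty, the simplest extension is to run the same image edit on each frame independently. Even if the initial noise is shared across frames, the independently integrated trajectories typically flicker, which is exactly the failure mode the project aims to address. The sketch below is only an illustrative baseline under the same hypothetical velocity_model interface as above, not the project's intended method.

    # Naive frame-wise baseline: edit every frame with the same prompt and a
    # single shared noise sample. No temporal consistency is enforced, so the
    # output usually flickers; this serves as a comparison point only.
    import torch

    @torch.no_grad()
    def edit_video_framewise(frames, prompt, velocity_model, t_start=0.6, num_steps=30):
        # frames: tensor of shape (T, C, H, W)
        shared_noise = torch.randn_like(frames[0])
        ts = torch.linspace(t_start, 0.0, num_steps + 1)
        edited = []
        for x_src in frames:
            x = (1.0 - t_start) * x_src + t_start * shared_noise
            for t_cur, t_next in zip(ts[:-1], ts[1:]):
                v = velocity_model(x, t_cur, prompt)
                x = x + (t_next - t_cur) * v
            edited.append(x)
        return torch.stack(edited)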


Pre-requisites:

Suitable for those who have taken a course in machine learning. Some familiarity with PyTorch would be beneficial.


References:

[1] Lipman, Yaron, et al. "Flow matching for generative modeling." International Conference on Learning Representations (ICLR), 2023. arXiv:2210.02747.

[2] Meng, Chenlin, et al. "SDEdit: Guided image synthesis and editing with stochastic differential equations." International Conference on Learning Representations (ICLR), 2022. arXiv:2108.01073.

[3] Kulikov, Vladimir, et al. "FlowEdit: Inversion-free text-based editing using pre-trained flow models." Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025.

[4] Mokady, Ron, et al. "Null-text inversion for editing real images using guided diffusion models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.