| Title | Venue | Year | Code | Keywords | Task |
| --- | --- | --- | --- | --- | --- |
| Dragging with Geometry: From Pixels to Geometry-Guided Image Editing | ICML | 2026 | [Code] | interactive point-based image editing, geometry-guided dragging, 3D cues | Image Editing |
| Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control | ICLR | 2026 | [Code] | shape-aware image editing | Image Editing |
| UTDesign: A Unified Framework for Stylized Text Editing and Generation in Graphic Design Images | SIGGRAPH Asia | 2025 | [Code] | stylized text editing and generation | Image Editing |
| EditInfinity: Image Editing with Binary-Quantized Generative Models | NeurIPS | 2025 | [Code] | text-driven image editing, binary-quantized generative models | Image Editing |
| Prompt-Softbox-Prompt: A Free-Text Embedding Control for Image Editing | MM | 2025 | [Code] | free-text embedding control | Image Editing |
| Exploring Optimal Latent Trajectory for Zero-shot Image Editing | arXiv | 2025 | - | zero-shot image editing | Image Editing |
| FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors | arXiv | 2025 | - | interactive image editing | Image Editing |
| Energy-Guided Optimization for Personalized Image Editing with Pretrained Text-to-Image Diffusion Models | arXiv | 2025 | - | personalized image editing | Image Editing |
| Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing | arXiv | 2025 | - | instruction-guided image editing | Image Editing |
| PartEdit: Fine-Grained Image Editing using Pre-Trained Diffusion Models | arXiv | 2025 | - | fine-grained image editing | Image Editing |
| S2Edit: Text-Guided Image Editing with Precise Semantic and Spatial Control | arXiv | 2025 | - | text-guided image editing | Image Editing |
| REED-VAE: RE-Encode Decode Training for Iterative Image Editing with Diffusion Models | arXiv | 2025 | - | iterative image editing | Image Editing |
| Towards Efficient Exemplar Based Image Editing with Multimodal VLMs | arXiv | 2025 | - | exemplar-based image editing | Image Editing |
| ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mid-Step Feature Extraction and Attention Adaptation | arXiv | 2025 | - | text-guided image editing | Image Editing |
| LUSD: Localized Update Score Distillation for Text-Guided Image Editing | arXiv | 2025 | - | text-guided image editing | Image Editing |
| Guiding Instruction-based Image Editing via Multimodal Large Language Models | ICLR | 2024 | [Code] | LLM-guided, diffusion, concise instruction loss, supervised fine-tuning | Image Editing |
| HIVE: Harnessing Human Feedback for Instructional Visual Editing | CVPR | 2024 | [Code] | RLHF, diffusion, data augmentation | Image Editing |
| InstructBrush: Learning Attention-based Instruction Optimization for Image Editing | arXiv | 2024 | [Code] | diffusion, attention-based | Image Editing |
| FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing | arXiv | 2024 | [Code] | controllable diffusion | Image Editing |
| Pix2Pix-OnTheFly: Leveraging LLMs for Instruction-Guided Image Editing | arXiv | 2024 | [Code] | on-the-fly, tuning-free, training-free | Image Editing |
| EffiVED: Efficient Video Editing via Text-instruction Diffusion Models | arXiv | 2024 | [Code] | video editing, decoupled classifier-free guidance | Image Editing |
| Grounded-Instruct-Pix2Pix: Improving Instruction Based Image Editing with Automatic Target Grounding | ICASSP | 2024 | [Code] | diffusion, mask generation, image editing | Image Editing |
| TexFit: Text-Driven Fashion Image Editing with Diffusion Models | AAAI | 2024 | [Code] | fashion editing, region location, diffusion | Image Editing |
| InstructGIE: Towards Generalizable Image Editing | arXiv | 2024 | [Code] | diffusion, context matching | Image Editing |
| An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control | arXiv | 2024 | [Code] | freestyle, diffusion, group attention | Image Editing |
| Text-Driven Image Editing via Learnable Regions | CVPR | 2024 | [Code] | region generation, diffusion, mask-free | Image Editing |
| ChartReformer: Natural Language-Driven Chart Image Editing | ICDAR | 2024 | [Code] | chart editing | Image Editing |
| GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models | arXiv | 2024 | [Code] | hybrid, direction transfer | Image Editing |
| StyleBooth: Image Style Editing with Multimodal Instruction | arXiv | 2024 | [Code] | style editing, diffusion | Image Editing |
| ZONE: Zero-Shot Instruction-Guided Local Editing | CVPR | 2024 | [Code] | local editing, localization | Image Editing |
| Inversion-Free Image Editing with Natural Language | CVPR | 2024 | [Code] | consistency models, unified attention | Image Editing |
| Focus on Your Instruction: Fine-grained and Multi-instruction Image Editing by Attention Modulation | CVPR | 2024 | [Code] | diffusion, multi-instruction | Image Editing |
| MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers | arXiv | 2024 | [Code] | MoE, LLM-powered | Image Editing |
| InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists | ICLR | 2024 | [Code] | diffusion, LLM-based, classifier-free | Image Editing |
| Iterative Multi-Granular Image Editing Using Diffusion Models | WACV | 2024 | - | diffusion, iterative editing | Image Editing |
| Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing | NeurIPS | 2024 | [Code] | diffusion, dynamic prompt | Image Editing |
| Object-Aware Inversion and Reassembly for Image Editing | ICLR | 2024 | [Code] | diffusion, multi-object | Image Editing |
| Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models | arXiv | 2024 | [Code] | video editing, zero-shot | Image Editing |
| Video-P2P: Video Editing with Cross-attention Control | CVPR | 2024 | [Code] | decoupled-guidance attention control, video editing | Image Editing |
| NeRF-Insert: 3D Local Editing with Multimodal Control Signals | arXiv | 2024 | - | 3D editing | Image Editing |
| BlenderAlchemy: Editing 3D Graphics with Vision-Language Models | arXiv | 2024 | [Code] | 3D editing | Image Editing |
| AudioScenic: Audio-Driven Video Scene Editing | arXiv | 2024 | - | audio-based instruction | Image Editing |
| LocInv: Localization-aware Inversion for Text-Guided Image Editing | CVPR-AI4CC | 2024 | [Code] | localization-aware inversion | Image Editing |
| SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models | arXiv | 2024 | [Code] | audio-driven | Image Editing |
| Exploring Text-Guided Single Image Editing for Remote Sensing Images | arXiv | 2024 | [Code] | remote sensing images | Image Editing |
| GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting | arXiv | 2024 | [Code] | fashion editing | Image Editing |
| TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing | arXiv | 2024 | - | chain of thought | Image Editing |
| Unified Editing of Panorama, 3D Scenes, and Videos Through Disentangled Self-Attention Injection | arXiv | 2024 | [Code] | diffusion, self-attention injection | Image Editing |
| Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning | arXiv | 2024 | [Code] | music editing, diffusion | Image Editing |
| Text Guided Image Editing with Automatic Concept Locating and Forgetting | arXiv | 2024 | - | diffusion, concept forgetting | Image Editing |
| FreeEdit: Mask-free Reference-based Image Editing with Multi-modal Instruction | arXiv | 2024 | [Code] | diffusion, instruction-driven editing | Image Editing |
| Revealing Directions for Text-guided 3D Face Editing | arXiv | 2024 | - | text-guided 3D face editing | Image Editing |
| Vision-guided and Mask-enhanced Adaptive Denoising for Prompt-based Image Editing | arXiv | 2024 | - | text-to-image, editing, diffusion | Image Editing |
| Hyper-parameter Tuning for Text Guided Image Editing | arXiv | 2024 | [Code] | text-guided image editing | Image Editing |
| Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models | arXiv | 2024 | - | text-guided object insertion | Image Editing |
| GenMix: Effective Data Augmentation with Generative Diffusion Model Image Editing | arXiv | 2024 | - | diffusion image augmentation | Image Editing |
| SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion | arXiv | 2024 | - | text-guided image editing | Image Editing |
| FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing | arXiv | 2024 | - | semantic image editing | Image Editing |
| FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers | arXiv | 2024 | - | disentangled semantic editing | Image Editing |
| UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency | arXiv | 2024 | - | instruction-based image editing | Image Editing |
| CA-Edit: Causality-Aware Condition Adapter for High-Fidelity Local Facial Attribute Editing | arXiv | 2024 | - | facial attribute editing | Image Editing |
| Unsupervised Region-Based Image Editing of Denoising Diffusion Models | arXiv | 2024 | - | region-based image editing | Image Editing |
| Edicho: Consistent Image Editing in the Wild | arXiv | 2024 | - | consistent image editing | Image Editing |
| UIP2P: Unsupervised Instruction-based Image Editing via Edit Reversibility Constraint | arXiv | 2024 | - | instruction-based image editing | Image Editing |
| InstructPix2Pix: Learning To Follow Image Editing Instructions | CVPR | 2023 | [Code] | core paper, diffusion | Image Editing |
| Visual Instruction Inversion: Image Editing via Image Prompting | NeurIPS | 2023 | [Code] | diffusion, visual instruction | Image Editing |
| Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions | ICCV | 2023 | [Code] | 3D scene editing | Image Editing |
| Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D Conversion | arXiv | 2023 | [Code] | 3D editing, dynamic scaling | Image Editing |
| InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models | arXiv | 2023 | [Code] | music editing, diffusion | Image Editing |
| EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models | arXiv | 2023 | [Code] | authorized editing, diffusion | Image Editing |
| Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis | arXiv | 2023 | [Code] | video editing, cross-time attention | Image Editing |
| AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models | NeurIPS | 2023 | [Code] | audio, diffusion | Image Editing |
| InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following | arXiv | 2023 | [Code] | refinement prior, instruction tuning | Image Editing |
| Learning to Follow Object-Centric Image Editing Instructions Faithfully | EMNLP | 2023 | [Code] | diffusion, additional supervision | Image Editing |
| StableVideo: Text-driven Consistency-aware Diffusion Video Editing | ICCV | 2023 | [Code] | diffusion, video | Image Editing |
| Vox-E: Text-Guided Voxel Editing of 3D Objects | ICCV | 2023 | [Code] | diffusion, 3D | Image Editing |
| UniTune: Text-Driven Image Editing by Fine Tuning a Diffusion Model on a Single Image | TOG | 2023 | [Code] | diffusion, fine-tuning | Image Editing |
| Dreamix: Video Diffusion Models are General Video Editors | arXiv | 2023 | [Code] | cascaded diffusion, video | Image Editing |
| DialogPaint: A Dialog-based Image Editing Model | arXiv | 2023 | - | dialog-based | Image Editing |
| iEdit: Localised Text-guided Image Editing with Weak Supervision | arXiv | 2023 | - | localized diffusion | Image Editing |
| ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation | NeurIPS | 2023 | - | example-based instruction | Image Editing |
| Null-text Inversion for Editing Real Images Using Guided Diffusion Models | CVPR | 2023 | [Code] | null-text embedding, diffusion, CLIP | Image Editing |
| Imagic: Text-Based Real Image Editing with Diffusion Models | CVPR | 2023 | [Code] | diffusion, embedding interpolation | Image Editing |
| PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models | arXiv | 2023 | [Code] | diffusion, dual-branch concept | Image Editing |
| InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions | arXiv | 2023 | [Code] | diffusion, LLM-powered | Image Editing |
| InstructDiffusion: A Generalist Modeling Interface for Vision Tasks | arXiv | 2023 | [Code] | multi-task, multi-turn, diffusion, LLM | Image Editing |
| Emu Edit: Precise Image Editing via Recognition and Generation Tasks | arXiv | 2023 | [Code] | diffusion, multi-task, multi-turn | Image Editing |
| SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models | arXiv | 2023 | [Code] | MLLM, diffusion | Image Editing |
| ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation | arXiv | 2023 | [Code] | LLM, diffusion | Image Editing |
| Prompt-to-Prompt Image Editing with Cross-Attention Control | ICLR | 2023 | [Code] | diffusion, cross-attention | Image Editing |
| Target-Free Text-Guided Image Manipulation | AAAI | 2023 | [Code] | 3D editing | Image Editing |
| Paint by Example: Exemplar-based Image Editing with Diffusion Models | CVPR | 2023 | [Code] | diffusion, example-based | Image Editing |
| DE-Net: Dynamic Text-Guided Image Editing Adversarial Networks | AAAI | 2023 | [Code] | GAN, multi-task | Image Editing |
| Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting | CVPR | 2023 | [Code] | diffusion, benchmark, CLIP | Image Editing |
| Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation | CVPR | 2023 | [Code] | diffusion, feature injection | Image Editing |
| MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing | ICCV | 2023 | [Code] | diffusion, mutual self-attention | Image Editing |
| LIME: Localized Image Editing via Attention Regularization in Diffusion Models | arXiv | 2023 | - | localized image editing | Image Editing |
| LDEdit: Towards Generalized Text Guided Image Manipulation via Latent Diffusion Models | BMVC | 2022 | - | latent diffusion | Image Editing |
| StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and Manipulation | WACV | 2022 | [Code] | GAN, CLIP | Image Editing |
| Blended Diffusion for Text-Driven Editing of Natural Images | CVPR | 2022 | [Code] | diffusion, CLIP, blending | Image Editing |
| VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance | ECCV | 2022 | [Code] | GAN, CLIP | Image Editing |
| StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators | TOG | 2022 | [Code] | GAN, CLIP | Image Editing |
| DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation | CVPR | 2022 | [Code] | diffusion, CLIP, noise combination | Image Editing |
| GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models | ICML | 2022 | [Code] | diffusion, CLIP, classifier-free guidance | Image Editing |
| DiffEdit: Diffusion-based Semantic Image Editing with Mask Guidance | ICLR | 2022 | [Code] | diffusion, DDIM, mask generation | Image Editing |
| Text2Mesh: Text-Driven Neural Stylization for Meshes | CVPR | 2022 | [Code] | 3D editing | Image Editing |
| ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation | CVPR | 2022 | [Code] | GAN, multi-entity | Image Editing |
| Text2LIVE: Text-Driven Layered Image and Video Editing | ECCV | 2022 | [Code] | GAN, CLIP, video editing | Image Editing |
| SpeechPainter: Text-Conditioned Speech Inpainting | Interspeech | 2022 | [Code] | speech editing | Image Editing |
| Talk-to-Edit: Fine-Grained Facial Editing via Dialog | ICCV | 2021 | [Code] | GAN, dialog, semantic field | Image Editing |
| ManiGAN: Text-Guided Image Manipulation | CVPR | 2020 | [Code] | GAN, affine combination | Image Editing |
| SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning | EMNLP | 2020 | [Code] | GAN, cross-task consistency | Image Editing |
| Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions | ECCV | 2020 | [Code] | GAN | Image Editing |
| Sequential Attention GAN for Interactive Image Editing | MM | 2020 | - | GAN, dialog, sequential attention | Image Editing |
| Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation | NeurIPS | 2020 | [Code] | lightweight GAN | Image Editing |
| Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction | ICCV | 2019 | [Code] | GAN | Image Editing |
| Language-Based Image Editing With Recurrent Attentive Models | CVPR | 2018 | [Code] | GAN, recurrent attention | Image Editing |
| Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language | NeurIPS | 2018 | [Code] | GAN, simple | Image Editing |
| Dreamcrafter: Immersive Editing of 3D Radiance Fields Through Flexible, Generative Inputs and Outputs | CHI | 2025 | - | 3D radiance field editing | Media Editing |
| SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing | ICCV | 2025 | [Code] | head reconstruction and real-time editing | Media Editing |
| Edit as You See: Image-guided Video Editing via Masked Motion Modeling | arXiv | 2025 | - | image-guided video editing | Media Editing |
| CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing | arXiv | 2025 | - | text-based CAD editing | Media Editing |
| MRHaD: Mixed Reality-based Hand-Drawn Map Editing Interface for Mobile Robot Navigation | arXiv | 2025 | - | mixed reality map editing | Media Editing |
| ScanEdit: Hierarchically-Guided Functional 3D Scan Editing | arXiv | 2025 | - | 3D scan editing | Media Editing |
| Vidi: Large Multimodal Models for Video Understanding and Editing | arXiv | 2025 | - | video understanding and editing | Media Editing |
| Rethinking Score Distilling Sampling for 3D Editing and Generation | arXiv | 2025 | - | 3D editing and generation | Media Editing |
| BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing | arXiv | 2025 | - | 3D visual editing | Media Editing |
| EditIQ: Automated Cinematic Editing of Static Wide-Angle Videos via Dialogue Interpretation and Saliency Cues | arXiv | 2025 | - | automated cinematic editing | Media Editing |
| VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing | arXiv | 2025 | - | multi-grained video editing | Media Editing |
| VEU-Bench: Towards Comprehensive Understanding of Video Editing | arXiv | 2025 | - | video editing benchmark | Media Editing |
| Controllable Pedestrian Video Editing for Multi-View Driving Scenarios via Motion Sequence | arXiv | 2025 | - | pedestrian video editing | Media Editing |
| TexGS-VolVis: Expressive Scene Editing for Volume Visualization via Textured Gaussian Splatting | arXiv | 2025 | - | volume scene editing | Media Editing |
| SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing | SIGGRAPH Asia | 2024 | [Code] | diffusion, scene graph, image editing | Media Editing |
| Audio-Agent: Leveraging LLMs For Audio Generation, Editing and Composition | arXiv | 2024 | - | text-to-audio, multimodal | Media Editing |
| AudioEditor: A Training-Free Diffusion-Based Audio Editing Framework | arXiv | 2024 | [Code] | diffusion-based text-to-audio | Media Editing |
| Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis | BMVC | 2024 | [Code] | diffusion-based local image manipulation | Media Editing |
| Steer-by-prior Editing of Symbolic Music Loops | MML | 2024 | [Code] | masked language modelling, music instruments | Media Editing |
| Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning | ISMIR | 2024 | [Code] | diffusion-based text-to-audio | Media Editing |
| GroupDiff: Diffusion-based Group Portrait Editing | ECCV | 2024 | [Code] | diffusion-based image editing | Media Editing |
| RegionDrag: Fast Region-Based Image Editing with Diffusion Models | ECCV | 2024 | [Code] | diffusion-based image editing | Media Editing |
| SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing | arXiv | 2024 | - | multi-view consistency | Media Editing |
| DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation | arXiv | 2024 | [Code] | diffusion-based editing | Media Editing |
| MEDIC: Zero-shot Music Editing with Disentangled Inversion Control | arXiv | 2024 | - | audio editing | Media Editing |
| 3DEgo: 3D Editing on the Go! | ECCV | 2024 | [Code] | monocular 3D scene synthesis | Media Editing |
| MedEdit: Counterfactual Diffusion-based Image Editing on Brain MRI | SASHIMI | 2024 | - | biomedical editing | Media Editing |
| FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing | ECCV | 2024 | - | image editing | Media Editing |
| LEMON: Localized Editing with Mesh Optimization and Neural Shaders | arXiv | 2024 | - | mesh editing | Media Editing |
| Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images | arXiv | 2024 | - | image editing | Media Editing |
| Streamlining Image Editing with Layered Diffusion Brushes | arXiv | 2024 | - | image editing | Media Editing |
| SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing | arXiv | 2024 | [Code] | image editing dataset | Media Editing |
| Environment Maps Editing using Inverse Rendering and Adversarial Implicit Functions | arXiv | 2024 | - | inverse rendering, HDR editing | Media Editing |
| HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion | arXiv | 2024 | - | hair editing, diffusion models | Media Editing |
| DiffuMask-Editor: A Novel Paradigm of Integration Between the Segmentation Diffusion Model and Image Editing to Improve Segmentation Ability | arXiv | 2024 | - | synthetic data generation | Media Editing |
| Taming Rectified Flow for Inversion and Editing | arXiv | 2024 | [Code] | image inversion | Media Editing |
| Pathways on the Image Manifold: Image Editing via Video Generation | arXiv | 2024 | - | video-based editing, Frame2Frame, Temporal Editing Caption | Media Editing |
| PrEditor3D: Fast and Precise 3D Shape Editing | arXiv | 2024 | - | 3D shape editing | Media Editing |
| Diffusion-Based Attention Warping for Consistent 3D Scene Editing | arXiv | 2024 | - | 3D scene editing | Media Editing |
| MIVE: New Design and Benchmark for Multi-Instance Video Editing | arXiv | 2024 | - | multi-instance video editing | Media Editing |
| DriveEditor: A Unified 3D Information-Guided Framework for Controllable Object Editing in Driving Scenes | arXiv | 2024 | - | 3D object editing | Media Editing |
| MAKIMA: Tuning-free Multi-Attribute Open-domain Video Editing via Mask-Guided Attention Modulation | arXiv | 2024 | - | multi-attribute video editing | Media Editing |
| EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting | arXiv | 2024 | - | 3D scene editing | Media Editing |