
Instruction-based image editing improves the controllability and flexibility of image manipulation via natural commands without elaborate descriptions or regional masks. However, human instructions are sometimes too brief for current methods to capture and follow. Multimodal large language models (MLLMs) show promising capabilities in cross-modal understanding and visual-aware response generation via LMs. We investigate how MLLMs facilitate edit instructions and present MLLM-Guided Image Editing (MGIE). MGIE learns to derive expressive instructions and provides explicit guidance. The editing model jointly captures this visual imagination and performs manipulation through end-to-end training. We evaluate various aspects of Photoshop-style modification, global photo optimization, and local editing. Extensive experimental results demonstrate that expressive instructions are crucial to instruction-based image editing, and our MGIE can lead to a notable improvement in automatic metrics and human evaluation while maintaining competitive inference efficiency.
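Special notice regarding [ICLR’24] MGIE
The [ICLR’24] MGIE information provided by 鸟瑞导航 comes from the internet. We do not guarantee the accuracy or completeness of external links, and 鸟瑞导航 does not control where those links point. When this page was indexed on September 10, 2025 at 7:04 PM, its content was lawful and compliant. If the content later becomes non-compliant, please report it to the site administrator and we will remove it. 鸟瑞导航 accepts no liability.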




