
Depth Anything
This work presents Depth Anything, a highly practical solution for robust monocular depth estimation. Without pursuing novel technical modules, we aim to build a simple yet powerful foundation model that handles any image under any circumstances. To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M), which significantly enlarges the data coverage and thus reduces the generalization error. We investigate two simple yet effective strategies that make data scaling-up promising. First, a more challenging optimization target is created by leveraging data augmentation tools. It compels the model to actively seek extra visual knowledge and acquire robust representations. Second, an auxiliary supervision is developed to make the model inherit rich semantic priors from pre-trained encoders. We evaluate its zero-shot capabilities extensively, on six public datasets and randomly captured photos, where it demonstrates impressive generalization ability. Further, fine-tuning it with metric depth information from NYUv2 and KITTI sets new state-of-the-art results. Our better depth model also results in a much better depth-conditioned ControlNet. All models have been released.
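The two strategies in the abstract can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration and not the released implementation: the function names, the brightness jitter standing in for the "data augmentation tools", the plain L1 loss on pseudo labels, and the cosine-based alignment term with an optional `margin` are all hypothetical choices.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def feature_alignment_loss(student_feats, frozen_feats, margin=None):
    """Auxiliary supervision (assumed form): mean of 1 - cos(student, frozen encoder).

    If `margin` is given, positions that are already more similar than the
    margin are skipped, so the student is not forced to copy the encoder exactly.
    """
    terms = []
    for s, f in zip(student_feats, frozen_feats):
        sim = cosine_similarity(s, f)
        if margin is not None and sim > margin:
            continue
        terms.append(1.0 - sim)
    return sum(terms) / len(terms) if terms else 0.0


def strong_perturb(image, rng, jitter=0.3):
    """Toy brightness jitter, a stand-in for heavier augmentations (assumption)."""
    scale = 1.0 + rng.uniform(-jitter, jitter)
    return [[min(1.0, max(0.0, p * scale)) for p in row] for row in image]


def pseudo_label_loss(teacher, student, image, rng, jitter=0.3):
    """Harder optimization target: the teacher labels the clean image,
    while the student must match that label from a perturbed copy.

    `teacher` and `student` are hypothetical callables mapping an image
    (list of rows) to a depth map of the same shape; L1 is an assumed loss.
    """
    target = teacher(image)
    pred = student(strong_perturb(image, rng, jitter))
    flat_t = [v for row in target for v in row]
    flat_p = [v for row in pred for v in row]
    return sum(abs(t - p) for t, p in zip(flat_t, flat_p)) / len(flat_t)
```

In a training loop, the two terms would typically be summed on each unlabeled batch, with labeled images contributing an ordinary supervised loss alongside them.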
We thank the MagicEdit team for providing some video examples for video depth estimation, and Tiancheng Shen for evaluating the depth maps with MagicEdit. The middle video is generated by MiDaS-based ControlNet, while the last video is generated by Depth Anything-based ControlNet.
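Special notice regarding Depth Anything
The Depth Anything information provided on this site, 鸟瑞导航, comes from the internet. We do not guarantee the accuracy or completeness of external links, and 鸟瑞导航 does not actually control where those links point. When this page was indexed at 7:03 PM on September 10, 2025, its content was lawful and compliant. If the page's content later becomes non-compliant, please contact the site administrator to report it and we will remove it; 鸟瑞导航 assumes no responsibility.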
Related sites

Chaos develops visualization technologies that empower artists & designers to create photorealistic imagery and animation across all creative industries

CSM — The fastest way to create 3D with AI
Common Sense Machines builds industry-leading 3D generative-AI models that transform images, text, and sketches into game-ready 3D assets and worlds. Trusted by world leading game studios, product designers and industrial designers.

StyleDrop: Text
StyleDrop: Text-to-Image Generation in Any Style

Openflow
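OpenFlow | 慧言AI provides services for building vertical, application-layer AI solutions covering workflows, knowledge flows, and flow states. We help industry pioneers set up hands-on AI platforms with a low barrier to entry, and offer consulting and enablement to industry partners.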

Artefacts
Artefacts is a 3D AI toolkit that enables users to effortlessly transform text or 2D images into 3D assets. Unleash your creativity with Artefacts - the future of 3D content creation.
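站酷ZCOOL
站酷ZCOOL is an interactive platform for Chinese designers. After eighteen years of focus on the design field, it has gathered 18 million designers, photographers, illustrators, artists, and creatives, and holds considerable influence and appeal within the design and creative community.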

OneStory
OneStory is an innovative AI story-generation assistant that uses AI to quickly generate characters and stories with continuity and consistency.

SOLART – 3D创易
Build your virtual world in a single step




