Nankai University
We present Depth Anything at Any Condition (DepthAnything-AC), a foundation monocular depth estimation (MDE) model capable of handling diverse environmental conditions. Previous foundation MDE models achieve impressive performance across general scenes but do not perform well in complex open-world environments with challenging conditions, such as illumination variations, adverse weather, and sensor-induced distortions.
To overcome the challenges of data scarcity and the difficulty of generating high-quality pseudo-labels from corrupted images, we propose an unsupervised consistency-regularization fine-tuning paradigm that requires only a relatively small amount of unlabeled data. Furthermore, we propose the Spatial Distance Constraint, which explicitly constrains the model to learn patch-level relative relationships, resulting in clearer semantic boundaries and more accurate details.
Experimental results demonstrate the zero-shot capabilities of DepthAnything-AC across diverse benchmarks, including real-world adverse weather benchmarks, synthetic corruption benchmarks, and general benchmarks.
The perturbation-consistency framework encourages DepthAnything-AC to produce consistent predictions under augmentations, while the frozen original model preserves its generality. To enhance semantic boundaries and details, the spatial distance constraint strengthens the model's understanding of inter-patch relationships.
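For illustration only, the snippet below is a minimal PyTorch-style sketch of this setup under our own simplifying assumptions: `student`, `frozen_teacher`, `perturb`, and `spatial_distance_loss` are placeholder names, and both losses are simplified stand-ins for the objectives described above, not the repository's actual implementation.

```python
import torch
import torch.nn.functional as F

def patch_pool(depth, patch=14):
    # Average-pool a dense depth map (B, H, W) into patch-level tokens (B, N).
    # Assumes H and W are divisible by the patch size.
    return F.avg_pool2d(depth.unsqueeze(1), kernel_size=patch).flatten(2).squeeze(1)

def spatial_distance_loss(pred, target, patch=14):
    # A simplified patch-level relative-relationship constraint: match the
    # pairwise depth differences between patches of the student prediction
    # to those of the teacher's pseudo-label.
    p = patch_pool(pred, patch)      # (B, N)
    t = patch_pool(target, patch)    # (B, N)
    dp = (p.unsqueeze(2) - p.unsqueeze(1)).abs()   # (B, N, N) pairwise distances
    dt = (t.unsqueeze(2) - t.unsqueeze(1)).abs()
    return F.l1_loss(dp, dt)

def training_step(student, frozen_teacher, images, perturb, lam=1.0):
    # 1) Pseudo-labels: the frozen original model predicts depth on clean images.
    with torch.no_grad():
        target = frozen_teacher(images)            # (B, H, W) relative depth (assumed shape)
    # 2) The student predicts on a perturbed view (e.g., darkened, noisy, weather-augmented).
    pred = student(perturb(images))                # (B, H, W)
    # 3) Perturbation-consistency loss (L1 as a stand-in) plus the patch-level constraint.
    loss_consistency = F.l1_loss(pred, target)
    loss_spatial = spatial_distance_loss(pred, target)
    return loss_consistency + lam * loss_spatial
```

In this reading, the frozen teacher supplies pseudo-labels on clean images while the student is trained only on perturbed views, so no ground-truth depth is required.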
| Dataset | Indoor | Outdoor | Samples | Used | Ratio |
|---|---|---|---|---|---|
| ADE20k | | | 20K | 20K | 100% |
| MegaDepth | | | 128K | 100K | 78% |
| DIML | | | 927K | 80K | 8.6% |
| VKITTI2 | | | 42K | 42K | 100% |
| HRWSI | | | 20K | 20K | 100% |
| SA-1B | | | 11.1M | 140K | 1.3% |
| COCO | | | 120K | 120K | 100% |
| Pascal VOC 2012 | | | 10K | 10K | 100% |
| AODRaw | | | 80K | 8K | 10% |
Datasets used for training. In total, our DepthAnything-AC is fine-tuned on 540K unlabeled images, far fewer than the number of images used by the DepthAnything series.
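As a quick check, the 540K total is the sum of the Used column: 20K + 100K + 80K + 42K + 20K + 140K + 120K + 10K + 8K = 540K.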
- Complete implementation code, pre-trained models, and usage examples
- Download links for training and evaluation datasets
- DepthAnything-AC ViT-S model checkpoint (94.6 MB) and DINOv2 backbone
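As a rough illustration, the snippet below sketches how one might load a checkpoint like this with PyTorch and run inference. The import path, class name, constructor arguments, file names, and preprocessing are assumptions; check the repository's usage examples for the real API.

```python
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image

# Hypothetical import path and class name; the repository's may differ.
from depth_anything_ac.model import DepthAnythingAC

model = DepthAnythingAC(encoder='vits')                        # ViT-S backbone (assumed constructor)
state = torch.load('depth_anything_AC_vits.pth', map_location='cpu')
model.load_state_dict(state)
model.eval()

# Assumed preprocessing: resize to a multiple of 14 (ViT patch size) and
# apply ImageNet normalization, as is common for DINOv2-based encoders.
img = np.asarray(Image.open('example.jpg').convert('RGB'), dtype=np.float32) / 255.0
x = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)        # (1, 3, H, W)
x = F.interpolate(x, size=(518, 518), mode='bilinear', align_corners=False)
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
x = (x - mean) / std

with torch.no_grad():
    depth = model(x)                                           # relative depth map

print(depth.shape)
```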
Feel free to contact us at:
boyuansun[AT]mail.nankai.edu.cn
jin_modi[AT]mail.nankai.edu.cn
For commercial licensing, please contact:
andrewhoux[AT]gmail.com
This code is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license for non-commercial use only. Any commercial use of this code requires formal permission in advance.
@article{depth_anything_AC,
  title={Depth Anything At Any Condition},
  author={Sun, Boyuan and Jin, Modi and Yin, Bowen and Hou, Qibin},
  journal={arXiv preprint},
  year={2025},
  institution={Nankai University}
}