Nankai University
We present Depth Anything at Any Condition (DepthAnything-AC), a foundation monocular depth estimation (MDE) model capable of handling diverse environmental conditions. Previous foundation MDE models achieve impressive performance across general scenes but do not perform well in complex open-world environments with challenging conditions, such as illumination variations, adverse weather, and sensor-induced distortions.
To overcome the challenges of data scarcity and the difficulty of generating high-quality pseudo-labels from corrupted images, we propose an unsupervised consistency-regularization fine-tuning paradigm that requires only a relatively small amount of unlabeled data. Furthermore, we propose the Spatial Distance Constraint, which explicitly constrains the model to learn patch-level relative relationships, resulting in clearer semantic boundaries and more accurate details.
Experimental results demonstrate the zero-shot capabilities of DepthAnything-AC across diverse benchmarks, including real-world adverse weather benchmarks, synthetic corruption benchmarks, and general benchmarks.
The perturbation-consistency framework encourages DepthAnything-AC to produce consistent predictions under augmentations, while the frozen original model preserves its generality. To enhance semantic boundaries and details, the spatial distance constraint strengthens the model's understanding of inter-patch relationships.
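For illustration only, the snippet below is a minimal PyTorch-style sketch of this setup under our own simplifying assumptions: `student`, `frozen_teacher`, `perturb`, and `spatial_distance_loss` are placeholder names, and both losses are simplified stand-ins for the objectives described above, not the repository's actual implementation.

```python
import torch
import torch.nn.functional as F

def patch_pool(depth, patch=14):
    # Average-pool a dense depth map (B, H, W) into patch-level tokens (B, N).
    # Assumes H and W are divisible by the patch size.
    return F.avg_pool2d(depth.unsqueeze(1), kernel_size=patch).flatten(2).squeeze(1)

def spatial_distance_loss(pred, target, patch=14):
    # A simplified patch-level relative-relationship constraint: match the
    # pairwise depth differences between patches of the student prediction
    # to those of the teacher's pseudo-label.
    p = patch_pool(pred, patch)      # (B, N)
    t = patch_pool(target, patch)    # (B, N)
    dp = (p.unsqueeze(2) - p.unsqueeze(1)).abs()   # (B, N, N) pairwise distances
    dt = (t.unsqueeze(2) - t.unsqueeze(1)).abs()
    return F.l1_loss(dp, dt)

def training_step(student, frozen_teacher, images, perturb, lam=1.0):
    # 1) Pseudo-labels: the frozen original model predicts depth on clean images.
    with torch.no_grad():
        target = frozen_teacher(images)            # (B, H, W) relative depth (assumed shape)
    # 2) The student predicts on a perturbed view (e.g., darkened, noisy, weather-augmented).
    pred = student(perturb(images))                # (B, H, W)
    # 3) Perturbation-consistency loss (L1 as a stand-in) plus the patch-level constraint.
    loss_consistency = F.l1_loss(pred, target)
    loss_spatial = spatial_distance_loss(pred, target)
    return loss_consistency + lam * loss_spatial
```

In this reading, the frozen teacher supplies pseudo-labels on clean images while the student is trained only on perturbed views, so no ground-truth depth is required.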
| Dataset | Indoor | Outdoor | Samples | Used | Ratio |
|---|---|---|---|---|---|
| ADE20k | | | 20K | 20K | 100% |
| MegaDepth | | | 128K | 100K | 78% |
| DIML | | | 927K | 80K | 8.6% |
| VKITTI2 | | | 42K | 42K | 100% |
| HRWSI | | | 20K | 20K | 100% |
| SA-1B | | | 11.1M | 140K | 1.3% |
| COCO | | | 120K | 120K | 100% |
| Pascal VOC 2012 | | | 10K | 10K | 100% |
| AODRaw | | | 80K | 8K | 10% |
Datasets used for training. In total, our DepthAnything-AC is fine-tuned on 540K unlabeled images, far fewer than the number of images used by the DepthAnything series.
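As a quick check, the 540K total is the sum of the Used column: 20K + 100K + 80K + 42K + 20K + 140K + 120K + 10K + 8K = 540K.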
- Complete implementation code, pre-trained models, and usage examples
- Download links for training and evaluation datasets
- DepthAnything-AC ViT-S model checkpoint (94.6 MB) and DINOv2 backbone
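As a rough illustration, the snippet below sketches how one might load a checkpoint like this with PyTorch and run inference. The import path, class name, constructor arguments, file names, and preprocessing are assumptions; check the repository's usage examples for the real API.

```python
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image

# Hypothetical import path and class name; the repository's may differ.
from depth_anything_ac.model import DepthAnythingAC

model = DepthAnythingAC(encoder='vits')                        # ViT-S backbone (assumed constructor)
state = torch.load('depth_anything_AC_vits.pth', map_location='cpu')
model.load_state_dict(state)
model.eval()

# Assumed preprocessing: resize to a multiple of 14 (ViT patch size) and
# apply ImageNet normalization, as is common for DINOv2-based encoders.
img = np.asarray(Image.open('example.jpg').convert('RGB'), dtype=np.float32) / 255.0
x = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)        # (1, 3, H, W)
x = F.interpolate(x, size=(518, 518), mode='bilinear', align_corners=False)
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
x = (x - mean) / std

with torch.no_grad():
    depth = model(x)                                           # relative depth map

print(depth.shape)
```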
Feel free to contact us at:
boyuansun[AT]mail.nankai.edu.cn
jin_modi[AT]mail.nankai.edu.cn
For commercial licensing, please contact:
andrewhoux[AT]gmail.com
This code is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license for non-commercial use only. Any commercial use of this code requires formal permission in advance.
@article{depth_anything_AC,
  title={Depth Anything At Any Condition},
  author={Sun, Boyuan and Jin, Modi and Yin, Bowen and Hou, Qibin},
  journal={arXiv preprint},
  year={2025},
  institution={Nankai University}
}