PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage
Denis Zavadski* · Damjan Kalšan* · Carsten Rother
Computer Vision and Learning Lab,
IWR, Heidelberg University
*equal contribution
ACCV 2024
PrimeDepth is a diffusion-based monocular depth estimator that leverages the rich representation of the visual world stored within Stable Diffusion. This representation, termed preimage, is extracted in a single diffusion step from frozen Stable Diffusion 2.1 and adjusted towards depth prediction. PrimeDepth yields detailed predictions while simultaneously being fast at inference time due to the single-step approach.
Introduction
This repository provides the weights and inference codebase for PrimeDepth, based on Stable Diffusion 2.1. Further details and visual examples can be found on the project page.
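For intuition about the single-step preimage extraction described above, the following is a rough, illustrative sketch using the diffusers library: the image is encoded into the latent space of frozen Stable Diffusion 2.1 and a single UNet forward pass is run while intermediate activations are collected via hooks. The choice of hooked blocks, the zero text conditioning, and the timestep are placeholders for illustration only; the actual preimage definition and depth predictor are part of this codebase.

import torch
from diffusers import AutoencoderKL, UNet2DConditionModel

device = "cuda"
model_id = "stabilityai/stable-diffusion-2-1"
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae").to(device).eval()
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet").to(device).eval()

# Collect intermediate activations from the frozen UNet via forward hooks
features = []
hooks = [blk.register_forward_hook(lambda m, inp, out: features.append(out)) for blk in unet.up_blocks]

# Dummy input image in [-1, 1]; in practice this is the preprocessed RGB image
image = torch.randn(1, 3, 512, 512, device=device)
with torch.no_grad():
    latents = vae.encode(image).latent_dist.mode() * vae.config.scaling_factor
    cond = torch.zeros(1, 77, 1024, device=device)  # placeholder for SD 2.1 text conditioning
    unet(latents, timestep=torch.tensor([1], device=device), encoder_hidden_states=cond)

for h in hooks:
    h.remove()
# `features` now holds multi-scale activations on which a depth predictor can be trained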
Installation
Create and activate a virtual environment:
conda create -n PrimeDepth python=3.9
conda activate PrimeDepth
Install dependencies:
pip3 install -r requirements.txt
Download the weights
Adjust the attribute ckpt_path in configs/inference.yaml to point to the downloaded weights from the previous step.
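The adjusted entry could look roughly as follows, where the path is only a placeholder for wherever you stored the downloaded checkpoint:

ckpt_path: /path/to/downloaded/primedepth_weights.ckpt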
Usage
from scripts.utils import InferenceEngine
config_path = "./configs/inference.yaml"
image_path = "./goodBoy.png"
ie = InferenceEngine(pd_config_path=config_path, device="cuda")
depth_ssi, depth_color = ie.predict(image_path)
PrimeDepth predicts in inverse depth space. The raw model predictions are stored in depth_ssi, while a colorized prediction depth_color is precomputed for visualization convenience:
depth_color.save("goodBoy_primedepth.png")
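To also keep the raw inverse-depth values, for example for later evaluation, they can be written to disk as well. The snippet below assumes depth_ssi is a CPU array-like (e.g. a NumPy array); verify this against the actual return type.

import numpy as np

# Persist the raw inverse-depth prediction alongside the colorized image
np.save("goodBoy_depth_ssi.npy", np.asarray(depth_ssi))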
Citation
@misc{zavadski2024primedepth,
title={PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage},
author={Denis Zavadski and Damjan Kalšan and Carsten Rother},
year={2024},
eprint={2409.09144},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2409.09144},
}