Mobile version of MiDaS for iOS / Android - Monocular Depth Estimation

Accuracy

  • Old small model - ResNet50 default-decoder 384x384
  • New small model - EfficientNet-Lite3 small-decoder 256x256

Zero-shot error (lower is better):

| Model | DIW WHDR | Eth3d AbsRel | Sintel AbsRel | Kitti δ>1.25 | NyuDepthV2 δ>1.25 | TUM δ>1.25 |
|---|---|---|---|---|---|---|
| Old small model 384x384 | 0.1248 | 0.1550 | 0.3300 | 21.81 | 15.73 | 17.00 |
| New small model 256x256 | 0.1344 | 0.1344 | 0.3370 | 29.27 | 13.43 | 14.53 |
| Relative improvement, % | -8 % | +13 % | -2 % | -34 % | +15 % | +15 % |
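
The "Relative improvement" row can be reproduced from the two error rows. The sketch below assumes the formula is the percent reduction in error relative to the old model (positive means the new model is better). The README does not state the formula, but it matches the published row after rounding. The dictionary keys and the function name are illustrative only.

```python
# Zero-shot errors copied from the table above (lower is better).
old_errors = {"DIW": 0.1248, "Eth3d": 0.1550, "Sintel": 0.3300,
              "Kitti": 21.81, "NyuDepthV2": 15.73, "TUM": 17.00}
new_errors = {"DIW": 0.1344, "Eth3d": 0.1344, "Sintel": 0.3370,
              "Kitti": 29.27, "NyuDepthV2": 13.43, "TUM": 14.53}


def relative_improvement(old_error, new_error):
    """Percent reduction in error; positive means the new model is better.

    Assumed formula, inferred from the table rather than stated by the authors.
    """
    return 100.0 * (old_error - new_error) / old_error


# Round to whole percent, as in the table.
improvements = {
    name: round(relative_improvement(old_errors[name], new_errors[name]))
    for name in old_errors
}

for name, pct in improvements.items():
    print(f"{name}: {pct:+d} %")
```

The new model is better on Eth3d, NyuDepthV2 and TUM, and worse on DIW, Sintel and Kitti, with the largest regression on Kitti.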

None of the Train/Valid/Test subsets of these datasets (DIW, Eth3d, Sintel, Kitti, NyuDepthV2, TUM) were used in training or fine-tuning.
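
For readers unfamiliar with the column names in the error table, here is a minimal sketch of two of the metrics using their standard definitions: AbsRel is the mean of |pred - gt| / gt, and δ>1.25 is the percentage of pixels where max(pred/gt, gt/pred) exceeds 1.25. This is not the authors' evaluation code, and it omits any alignment of predictions to ground truth that their evaluation may apply. WHDR (used for DIW) is not covered. The sample values are hypothetical.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error: mean(|pred - gt| / gt)."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def delta_error(pred, gt, threshold=1.25):
    """Percentage of pixels where max(pred/gt, gt/pred) > threshold."""
    bad = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) > threshold)
    return 100.0 * bad / len(gt)


# Hypothetical toy depths (flattened pixels), not benchmark data.
gt_depth = [1.0, 2.0, 4.0, 5.0]
pred_depth = [1.1, 2.0, 6.0, 4.5]

print(f"AbsRel: {abs_rel(pred_depth, gt_depth):.3f}")
print(f"delta>1.25: {delta_error(pred_depth, gt_depth):.1f} %")
```

Both metrics are errors, so lower values are better, which is consistent with the table.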

Inference speed (FPS) on iOS / Android

Frames per second (higher is better):

| Model | iPhone CPU | iPhone GPU | iPhone NPU | OnePlus8 CPU | OnePlus8 GPU | OnePlus8 NNAPI |
|---|---|---|---|---|---|---|
| Old small model 384x384 | 0.6 | N/A | N/A | 0.45 | 0.50 | 0.50 |
| New small model 256x256 | 8 | 22 | 30 | 6 | 22 | 4 |
| SpeedUp, X times | 12.8x | - | - | 13.2x | 44x | 8x |

N/A - run-time error (no data available)
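
The speedup row is the ratio of new FPS to old FPS. The sketch below recomputes it from the FPS values printed in the table. For the two CPU columns this gives about 13.3x rather than the published 12.8x and 13.2x. A plausible explanation (an assumption, not stated in the README) is that the published ratios were computed from unrounded measurements. Columns where the old model hit a run-time error are skipped.

```python
# FPS values copied from the table above; N/A columns are omitted.
old_fps = {"iPhone CPU": 0.6, "OnePlus8 CPU": 0.45,
           "OnePlus8 GPU": 0.50, "OnePlus8 NNAPI": 0.50}
new_fps = {"iPhone CPU": 8, "OnePlus8 CPU": 6,
           "OnePlus8 GPU": 22, "OnePlus8 NNAPI": 4}


def speedup(old, new):
    """How many times faster the new model runs than the old one."""
    return new / old


speedups = {
    device: round(speedup(old_fps[device], new_fps[device]), 1)
    for device in old_fps
}

for device, factor in speedups.items():
    print(f"{device}: {factor}x")
```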

Models:

  • Old small model - ResNet50 default-decoder 1x384x384x3, batch=1 FP32 (converters: PyTorch -> ONNX - onnx_tf -> (saved model) PB -> TFLite)

    (Trained on datasets: RedWeb, MegaDepth, WSVD, 3D Movies, DIML indoor)

  • New small model - EfficientNet-Lite3 small-decoder 1x256x256x3, batch=1 FP32 (custom converter: PyTorch -> TFLite)

    (Trained on datasets: RedWeb, MegaDepth, WSVD, 3D Movies, DIML indoor, HRWSI, IRS, TartanAir, BlendedMVS, ApolloScape)

Frameworks for training and conversions:

pip install torch==1.6.0 torchvision==0.7.0
pip install tf-nightly-gpu==2.5.0.dev20201031 tensorflow-addons==0.11.2 numpy==1.18.0
git clone --depth 1 --branch v1.6.0 https://github.com/onnx/onnx-tensorflow

SoC - OS - Library:

  • iPhone 11 (A13 Bionic) - iOS 13.7 - TensorFlowLiteSwift 0.0.1-nightly
  • OnePlus 8 (Snapdragon 865) - Android 10 - org.tensorflow:tensorflow-lite-task-vision:0.0.0-nightly

Citation

This repository contains code to compute depth from a single image. It accompanies our paper:

Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, Vladlen Koltun

Please cite our paper if you use this code or any of the models:

@article{Ranftl2020,
    author    = {Ren\'{e} Ranftl and Katrin Lasinger and David Hafner and Konrad Schindler and Vladlen Koltun},
    title     = {Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer},
    journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
    year      = {2020},
}