MVPSNet: Fast Generalizable Multi-view Photometric Stereo

Abstract

We propose a fast and generalizable solution to Multi-view Photometric Stereo (MVPS), called MVPSNet. The key to our approach is a feature extraction network that effectively combines images from the same view captured under multiple lighting conditions to extract geometric features from shading cues for stereo matching. We demonstrate that these features, termed 'Light Aggregated Feature Maps' (LAFM), are effective for feature matching even in textureless regions, where traditional multi-view stereo methods fail. Our method produces reconstruction results similar to those of PS-NeRF, a state-of-the-art MVPS method that optimizes a neural network per scene, while being 411x faster (105 seconds vs. 12 hours) in inference. Additionally, we introduce a new synthetic dataset for MVPS, sMVPS, which is shown to be effective for training a generalizable MVPS method.
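To make the idea of combining same-view images under different lights more concrete, the following is a minimal sketch, not the network described in the paper. It assumes that per-image features are pooled across lights with an element-wise maximum, which the abstract does not specify. The names `extract_features` and `aggregate_over_lights` are hypothetical, and the finite-difference extractor is only a stand-in for a learned feature extractor.

```python
from typing import Callable, List, Sequence

# An H x W grayscale image (or feature map) stored as nested lists.
Image = List[List[float]]


def extract_features(image: Image) -> Image:
    """Hypothetical stand-in for a learned extractor.

    Computes horizontal forward differences, so shading gradients
    produce non-zero responses. The last column is always 0.
    """
    return [
        [row[min(x + 1, len(row) - 1)] - row[x] for x in range(len(row))]
        for row in image
    ]


def aggregate_over_lights(
    images: Sequence[Image],
    extractor: Callable[[Image], Image] = extract_features,
) -> Image:
    """Fuse images of one view under several lights into one feature map.

    Assumption: element-wise max over the per-light feature maps. This
    accepts any number of lights and ignores their order.
    """
    if not images:
        raise ValueError("at least one image is required")
    feats = [extractor(img) for img in images]
    height, width = len(feats[0]), len(feats[0][0])
    return [
        [max(f[y][x] for f in feats) for x in range(width)]
        for y in range(height)
    ]


# Same view of a textureless surface, lit from two opposite directions.
lit_from_right = [[0.0, 0.5, 1.0]]
lit_from_left = [[1.0, 0.5, 0.0]]
fused = aggregate_over_lights([lit_from_right, lit_from_left])
```

A pooling operator of this kind is convenient because the fused map has the same size regardless of how many lighting conditions were captured, so it can be passed to a standard multi-view stereo matching stage.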

Results

Results on DiLiGenT-MV

MVPSNet achieves results comparable to state-of-the-art methods while being fast and generalizable.

Result on real-world capture

Result from an example real-world capture using a simple at-home setup: the user ties a flashlight to the tripod with a string and moves the light in a circle around the camera to capture MVPS imagery.

BibTeX

@article{zhao2023mvpsnet,
  title={MVPSNet: Fast Generalizable Multi-view Photometric Stereo},
  author={Zhao, Dongxu and Lichy, Daniel and Perrin, Pierre-Nicolas and Frahm, Jan-Michael and Sengupta, Soumyadip},
  journal={arXiv preprint arXiv:2305.11167},
  year={2023},
}