In-The-Wild 3D Morphable Models: Code and Data

This webpage provides the source code and data for our method for 3D reconstruction of "In-the-Wild" faces in images and videos, as presented in the following papers:

 

If you use the code or data, please cite the above papers.

 

Examples of Results


Example of our 3D face reconstruction from images:

 

Example of our 3D face reconstruction from videos:


Source code

Update 28 June 2018: The code has received a major update that addresses the problems of the previous preliminary version. The current version is fully functional.


3D Reconstruction of "In-the-Wild" faces

The source code of our method for 3D reconstruction of "In-the-Wild" faces in images and videos can be found in the following GitHub repository:

https://github.com/menpo/itwmm

 

Evaluation Code

In the above repository, under the folder "evaluation", we also provide source code that will help you evaluate and compare 3D face reconstruction methods using the 3dMDLab & 4DMaja benchmarks:

https://github.com/menpo/itwmm/tree/master/evaluation
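
For illustration, here is a minimal sketch (in Python, using the menpo3d library that the repository builds on, not the repository's own API) of the kind of comparison the evaluation code performs. File names are placeholders, and both meshes are assumed to already share the LSFM topology so that vertices correspond one-to-one:

    import numpy as np
    import menpo3d.io as m3io

    # Per-vertex Euclidean error between a predicted and a ground-truth mesh.
    # Both meshes are assumed to share the LSFM topology (one-to-one vertex
    # correspondence); the file names below are placeholders.
    pred = m3io.import_mesh('Reconstructions_dummy/example.obj')
    gt = m3io.import_mesh('Ground_Truth/example.obj')

    per_vertex_error = np.linalg.norm(pred.points - gt.points, axis=1)
    print('mean per-vertex error: {:.4f}'.format(per_vertex_error.mean()))

The actual evaluation code in the repository additionally handles rigid alignment and conversion from other model representations, so use it for any reported numbers.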

 




Data

 

3dMDLab benchmark

 

Download the data: [3dMDLab_real]  [3dMDLab_synthetic]

 

 

This benchmark was created under controlled conditions using a high-resolution 3dMD facial scanner. This results in highly detailed ground-truth facial meshes, making the benchmark well suited to evaluating 3D fitting methods under constrained acquisition conditions. In more detail, 3dMDLab includes different subjects, each performing 2 expressions (a neutral and a non-neutral one). It consists of real and synthetic subsets:

  • 3dMDLab_real: this includes 8 real images captured under ideal laboratory conditions, coming directly from one of the RGB cameras of the 3dMD face-scanning system. The images are high-resolution (2048x2448 pixels) with true colour range (24 bits per pixel).
  • 3dMDLab_synthetic: this includes 6 synthetic images created from the same scans by rendering them from different viewpoints under varying synthetic illumination. Again, the images are high-resolution (2048x2448 pixels) with true colour range (24 bits per pixel).

 

Format of the data:

In both cases (real and synthetic data), the zip files contain the following subfolders (a short loading sketch in Python follows the list):

  • Images_and_Landmarks: This contains the input images (in .bmp format) as well as the corresponding sparse landmarks per input image (in .pts format, with the same file name as the input image). These landmarks are the ones used in our evaluation, as described in our T-PAMI article (for our fitting method as well as the Classic and Linear methods). The landmarks were extracted using the CNN-based landmarker of [1].
  • Reconstructions_dummy: This contains a .obj file per input image with the 3D mesh of a “dummy” reconstruction result (always the same 3D face scan). These are “dummy” results in the sense that they do not correspond to real 3D reconstructions produced by any method; they are included only so that you can see the format your results should have and test the evaluation code that we provide.
  • Ground_Truth: This contains a .obj file per input image with the ground-truth 3D mesh. These meshes are registered with the LSFM model. If your method outputs results using the representation of a different model, the evaluation code that we provide converts them to the LSFM representation before computing the reconstruction errors.
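
As a minimal sketch, assuming the menpo/menpo3d Python libraries (which the repository builds on) and placeholder file names, one benchmark image can be loaded together with its landmarks and ground-truth mesh as follows:

    import menpo.io as mio
    import menpo3d.io as m3io

    # menpo automatically attaches the .pts file that shares the image's file
    # name; the landmarks become available under the 'PTS' group.
    image = mio.import_image('Images_and_Landmarks/example.bmp')  # placeholder
    landmarks = image.landmarks['PTS']

    # The matching ground-truth mesh, registered to the LSFM topology.
    gt_mesh = m3io.import_mesh('Ground_Truth/example.obj')  # placeholder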

 

4DMaja benchmark

 

Download the data: [4DMaja_synthetic]  [4DMaja_real]

 

 

We introduce this benchmark to quantitatively evaluate 3DMM video fitting. To the best of our knowledge, this is the first publicly available benchmark that allows detailed quantitative evaluation of 3D face reconstruction on videos. 4DMaja includes the following two videos of the same subject (Prof. Maja Pantic), exhibiting various natural expressions and head-pose variations:

  • 4DMaja_synthetic: This is a 440-frame synthetic sequence generated from high-resolution face scans captured with a DI4D face scanner, with the (virtual) camera undergoing a periodic rotation. This allows a quantitative evaluation of the 3D face reconstruction at every frame of the video.
  • 4DMaja_real: This is a 387-frame clip, under "in-the-wild" conditions, from a public talk given by the subject. We associate with this video a high-resolution neutral 3D scan of the subject, taken with the same DI4D scanner within 2 months of the talk. Given the small time difference, we can consider this 3D scan a ground truth for the identity component of the 3D facial shape in the real video. In this way, for the first time, one can quantitatively evaluate how well the 3D facial identity is estimated when different methods are run on a real "in-the-wild" video.

 

Format of the data:

The format is very similar to that of the 3dMDLab benchmark. In more detail, in both cases (real and synthetic data), the zip files contain the following subfolders (a short loading and evaluation sketch in Python follows the list):

  • Frames_and_Landmarks: This contains the input frames (in .png format, numbered from 1, with file names of the form %06d.png) as well as the corresponding sparse landmarks per input frame (in .pts format, with the same file name as the input frame). Again, these landmarks are the ones used in the relevant evaluation of our T-PAMI article and were extracted using the CNN-based landmarker of [1].
  • Reconstructions_dummy: This contains a .obj file per input frame with a “dummy” reconstruction result (always the same 3D face scan). Again, these are dummy results, since they are simply repetitions of the same 3D mesh coming from a single 3D face scan.
  • Ground_Truth: In the case of 4DMaja_synthetic, this contains a .obj file per input frame, with the 3D mesh corresponding to the ground-truth 3D shape of that frame. In the case of 4DMaja_real, this subfolder contains only the file neutral.obj, with the 3D mesh originating from the neutral 3D scan of the subject. As already mentioned, this is considered a ground truth for the identity component of the subject's 3D face shape. Again, the meshes in all .obj files are registered with the LSFM model and, if needed, you can use the evaluation code that we provide to convert your results to a compatible LSFM representation.
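
As a minimal sketch, again assuming the menpo/menpo3d Python libraries, the frames can be loaded and, for 4DMaja_real, per-frame reconstructions compared against the neutral.obj identity ground truth. The sketch assumes the reconstructions are already in LSFM topology and rigidly aligned to the neutral scan; the evaluation code in the repository performs the alignment and model conversion for you:

    import numpy as np
    import menpo.io as mio
    import menpo3d.io as m3io

    # Load all frames (each frame's .pts landmarks are attached automatically).
    frames = list(mio.import_images('Frames_and_Landmarks/*.png'))

    # The single identity ground truth provided with 4DMaja_real.
    neutral = m3io.import_mesh('Ground_Truth/neutral.obj')

    # Mean per-vertex identity error of each per-frame reconstruction, assuming
    # the reconstructions follow the same %06d numbering as the frames and are
    # already registered and aligned to the neutral scan.
    errors = []
    for i in range(1, len(frames) + 1):
        recon = m3io.import_mesh('Reconstructions_dummy/{:06d}.obj'.format(i))
        errors.append(np.linalg.norm(recon.points - neutral.points, axis=1).mean())

    print('mean identity error over the video: {:.4f}'.format(np.mean(errors)))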

 

 

References

[1] J. Deng, Y. Zhou, S. Cheng, and S. Zafeiriou, “Cascade Multi-view Hourglass Model for Robust 3D Face Alignment,” in Proc. IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2018.