Sunday, March 31, 2019

AutoFace - How It Works

The core of AutoFace is the 3DMM face identity shape network described in

A. Tran, T. Hassner, I. Masi, G. Medioni, "Regressing Robust and Discriminative 3D Morphable Models with a very Deep Neural Network", in CVPR, 2017. The preprint version is available as arXiv:1612.04904 [cs.CV].

Code for this has generously been made available on GitHub at https://github.com/fengju514/Expression-Net.

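The Expression-Net script analyzes a single input image and outputs a 3D shape in the form of weights for a set of 99 morphs labelled 00 to 98. The location of a mesh vertex is the unmorphed location plus the weighted sum of the morph offsets,

$$x = x_0 + \sum_{i=0}^{98} w_i m_i$$

where x_0 is the location of the vertex in the unmorphed mesh, m_i is the offset of that vertex in morph i, and w_i is the corresponding weight.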

The Expression-Net script also describes the 99 morphs, but only for a modified Basel Face Model (BFM), which appears to be popular among AI researchers. The BFM is a high-density face mesh, and as such rather unusable for artists, who prefer a medium-density full body mesh, which is rigged and textured. A key contribution of AutoFace is to translate the morphs from the modified BFM into shapekeys for two useful meshes, the Genesis 8 Male and Female characters used in DAZ Studio.
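As a minimal sketch of how morph weights turn into a shape, assume the usual linear shapekey convention: each morph stores a per-vertex offset, and the offsets are added to the unmorphed position in proportion to their weights. The function name `morphed_vertex` and the toy numbers are illustrative only and do not come from the AutoFace or Expression-Net code.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def morphed_vertex(base: Vec3, offsets: Sequence[Vec3], weights: Sequence[float]) -> Vec3:
    """Return base + sum_i weights[i] * offsets[i] for a single vertex.

    Assumption: morphs are stored as offsets from the unmorphed mesh,
    which is how shapekeys are normally blended.
    """
    if len(offsets) != len(weights):
        raise ValueError("need exactly one weight per morph")
    x, y, z = base
    for (dx, dy, dz), w in zip(offsets, weights):
        x += w * dx
        y += w * dy
        z += w * dz
    return (x, y, z)


# Toy example with 2 morphs instead of 99.
base = (0.0, 1.0, 0.0)
offsets = [(1.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
weights = [0.5, -0.25]
print(morphed_vertex(base, offsets, weights))  # (0.5, 1.0, -0.5)
```

The same loop applies unchanged to the BFM and to the Genesis 8 meshes; only the offset tables differ.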

The pictures below show the first few morphs for the modified BFM, Genesis 8 Male, and Genesis 8 Female. The first picture shows the original meshes with no shapekeys applied.

[Pictures: Basic shapes (unmorphed meshes), followed by Shapekey 00 through Shapekey 06.]

A number of comments are in order.
  • The shapekeys in AutoFace are ten times stronger than the original ones, and hence the coefficients must be reduced by the same factor of ten. The reason is that higher shapekeys are quite subtle and difficult to detect.
  • The Genesis shapekeys taper off near the boundary of the BFM, in order to avoid a sharp transition to the rest of the head mesh. In particular a fattened jawline, such as the one in shapekey 00, is not transferred correctly. This problem may be affecting the example with President Trump.
  • The eyes, including the skin behind the eyes, are not morphed. Instead the eye vertices are scaled and translated based on the movement of the corners of the eyes. This ensures that the eyes remain round, which is desirable when posing.
  • Similarly, the inside of the mouth is scaled based on the movement of the corners of the mouth, in order to ensure that the jaws and teeth remain inside the face.
  • The shapekeys shown in this post are more or less symmetric. Higher shapekeys are less pronounced, and some of them are asymmetric.
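Two of the points above can be sketched in code: the factor-of-ten rescaling of the coefficients, and the scale-and-translate treatment of the eyes. This is an illustration under stated assumptions, not the AutoFace implementation. In particular, the uniform scale is assumed to be the ratio of the corner-to-corner distances and the translation the shift of the corner midpoint; the names `rescale_weights` and `fit_eye` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

STRENGTH = 10.0  # AutoFace shapekeys are ten times stronger than the originals


def rescale_weights(weights: Sequence[float], strength: float = STRENGTH) -> List[float]:
    """Divide the network weights by the shapekey strength,
    so that weight * shapekey gives the same displacement as before."""
    return [w / strength for w in weights]


def midpoint(a: Vec3, b: Vec3) -> Vec3:
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2)


def fit_eye(verts: Sequence[Vec3],
            old_corners: Tuple[Vec3, Vec3],
            new_corners: Tuple[Vec3, Vec3]) -> List[Vec3]:
    """Uniformly scale and translate eye vertices so they follow the eye corners.

    Assumption: scale = ratio of corner distances, translation = shift of the
    corner midpoint. No rotation is applied, so a round eye stays round.
    """
    scale = math.dist(*new_corners) / math.dist(*old_corners)
    old_mid = midpoint(*old_corners)
    new_mid = midpoint(*new_corners)
    return [
        (new_mid[0] + scale * (v[0] - old_mid[0]),
         new_mid[1] + scale * (v[1] - old_mid[1]),
         new_mid[2] + scale * (v[2] - old_mid[2]))
        for v in verts
    ]


# Toy example: the eye corners move apart and upward after morphing the face.
old_corners = ((-1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
new_corners = ((-1.0, 1.0, 0.0), (2.0, 1.0, 0.0))
eye = [(1.0, 0.0, 0.0), (0.0, 0.5, 0.0)]
print(fit_eye(eye, old_corners, new_corners))
print(rescale_weights([2.0, -5.0]))
```

The inside of the mouth could be handled with the same function, using the corners of the mouth instead of the corners of the eyes.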
As always, something is lost in translation, so it would be better if the deep network were trained directly for the Genesis 8 meshes, or for subdivided versions of them. The high-frequency data could then be converted into a normal map. Alas, I have neither the competence, the computer resources, nor the training data to do this.



Saturday, March 30, 2019

AutoFace - Create Rigged and Textured Character from a Single Photograph

During the last month I have been working on an old idea: a collection of scripts that creates a rigged and textured character from a single photograph. Today I am proud to announce the result: AutoFace. The scripts are usable, although they are not completely finished and the results are not perfect.

AutoFace documentation is found at http://diffeomorphic.blogspot.com/p/autoface.html.

A zip file can be downloaded from https://bitbucket.org/Diffeomorphic/autoface/downloads/.

Or if you prefer to clone the Mercurial repository: https://bitbucket.org/Diffeomorphic/autoface.

AutoFace uses quite a large number of external libraries, which are listed on the Prerequisites page: https://diffeomorphic.blogspot.com/p/prerequisites-1.html.


At the core of AutoFace is a deep convolutional neural network model and Python code for robust estimation of the 3DMM face identity shape, directly from an unconstrained face image and without the use of face landmark detectors. The method is described in

A. Tran, T. Hassner, I. Masi, G. Medioni, "Regressing Robust and Discriminative 3D Morphable Models with a very Deep Neural Network", in CVPR, 2017. The preprint version is available as arXiv:1612.04904 [cs.CV].

Code from this project has been generously shared on GitHub: https://github.com/fengju514/Expression-Net. That site also contains code for estimating expressions and face poses, but AutoFace does not yet make use of this data.

Whereas the Expression-Net script does the job of recreating the 3D face from a photograph, the output is not really useful for the practising artist. The script outputs morphs for a modified Basel Face Model (BFM), which seems to be a favorite among AI researchers. The BFM is a high-density face mesh, but in practice we want a medium-density full body mesh, which is rigged and textured.

The goal of AutoFace is to transfer the output from the BFM to useful meshes, in particular the Genesis 8 Male and Female characters used in DAZ Studio. After the morphs have been transferred, we can load the characters into DAZ Studio and add hair, clothes and body morphs. The dressed characters can then be used in DAZ Studio or exported to other applications. I personally use the DAZ importer to import the characters into Blender, where they can be posed and rendered.


The pictures below show some results for images of two beloved candidates of the 2016 US presidential election. These images are included in AutoFace.

[Pictures, in order: input images, with landmarks detected by Dlib; Basel Face Model; Genesis 8 Male and Female; textures; rendered; clothes, hair and body morphs added in DAZ Studio; imported into Blender, posed and rendered.]

As you can immediately tell from these images, the results are not perfect. The obvious flaw is that the characters look far too young. A possible workaround could be to add some aging morphs in DAZ Studio, although it is not satisfactory.

For best results we should start with a frontal photograph of a face in neutral position. The picture of Secretary Clinton shows why. The deep network removes the smile from the 3D mesh, but it still lingers in the texture, in particular in the artifacts around the cheeks.