Learning a model of facial shape and expression from 4D scans
3. Model formulation
FLAME is described by a function $M(\vec{\beta}, \vec{\theta}, \vec{\psi}) : \mathbb{R}^{|\vec{\beta}| \times |\vec{\theta}| \times |\vec{\psi}|} \rightarrow \mathbb{R}^{3N}$:

$$M(\vec{\beta}, \vec{\theta}, \vec{\psi}) = W(T_P(\vec{\beta}, \vec{\theta}, \vec{\psi}), J(\vec{\beta}), \vec{\theta}, \mathcal{W})$$

$$T_P(\vec{\beta}, \vec{\theta}, \vec{\psi}) = \bar{T} + B_S(\vec{\beta}; \mathcal{S}) + B_P(\vec{\theta}; \mathcal{P}) + B_E(\vec{\psi}; \mathcal{E})$$

- $\vec{\beta}$ — coefficients describing shape
- $\vec{\theta}$ — coefficients describing pose
- $\vec{\psi}$ — coefficients describing expression
- $\bar{T} \in \mathbb{R}^{3N}$ — template mesh in the “zero pose”
- $\vec{\theta}^{\,*}$ — the “zero pose”
- $B_S(\vec{\beta}; \mathcal{S})$ — shape blendshape function to account for identity-related shape variation
- $B_P(\vec{\theta}; \mathcal{P})$ — corrective pose blendshapes to correct pose deformations that cannot be explained solely by LBS
- $B_E(\vec{\psi}; \mathcal{E})$ — expression blendshapes that capture facial expressions
- $W(T_P, J, \vec{\theta}, \mathcal{W})$ — a standard skinning function that rotates the vertices of $T_P$ around the joints $J$, linearly smoothed by blendweights $\mathcal{W} \in \mathbb{R}^{K \times N}$
- $\mathcal{J}$ — a sparse matrix defining how to compute the joint locations $J(\vec{\beta})$ from mesh vertices
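As a sanity check on the notation, a minimal numpy sketch of the skinning step; the per-joint rotations are assumed to be already composed along the neck/jaw/eye kinematic chain (that composition is omitted), and all names are illustrative rather than taken from any released code:

```python
import numpy as np

def linear_blend_skinning(T_P, J, rot_mats, weights):
    """Minimal sketch of W(T_P, J, theta, W).

    T_P      : (N, 3) morphed template vertices
    J        : (K, 3) joint locations J(beta)
    rot_mats : (K, 3, 3) per-joint rotations, assumed pre-composed
               along the kinematic chain
    weights  : (K, N) blendweights; each vertex's weights sum to 1
    """
    posed = np.zeros_like(T_P)
    for k in range(len(J)):
        # rotate all vertices around joint k, then blend by weight
        rotated = (T_P - J[k]) @ rot_mats[k].T + J[k]
        posed += weights[k][:, None] * rotated
    return posed
```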
Shape blendshapes
$$B_S(\vec{\beta}; \mathcal{S}) = \sum_{n=1}^{|\vec{\beta}|} \beta_n S_n$$

- $\vec{\beta} = [\beta_1, \ldots, \beta_{|\vec{\beta}|}]^T$ — shape coefficients
- $\mathcal{S} = [S_1, \ldots, S_{|\vec{\beta}|}] \in \mathbb{R}^{3N \times |\vec{\beta}|}$ — orthonormal shape basis, which will be learned below with PCA
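Because $\mathcal{S}$ stacks the basis vectors as columns, the sum collapses to a single matrix-vector product; a one-function numpy sketch (names illustrative):

```python
import numpy as np

def shape_blendshapes(beta, S):
    """B_S(beta; S) = sum_n beta_n * S_n, i.e. one matrix product.

    beta : (|beta|,) shape coefficients
    S    : (3N, |beta|) orthonormal shape basis, columns S_n
    Returns per-vertex offsets with shape (N, 3).
    """
    return (S @ beta).reshape(-1, 3)
```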
Pose blendshapes
$$B_P(\vec{\theta}; \mathcal{P}) = \sum_{n=1}^{9K} \left( R_n(\vec{\theta}) - R_n(\vec{\theta}^{\,*}) \right) P_n$$

- $R(\vec{\theta}) : \mathbb{R}^{|\vec{\theta}|} \rightarrow \mathbb{R}^{9K}$ — a function from a face / head / eye pose vector to a vector containing the concatenated elements of all the corresponding rotation matrices
- $R_n(\vec{\theta})$, $R_n(\vec{\theta}^{\,*})$ — the $n$-th elements of $R(\vec{\theta})$ and $R(\vec{\theta}^{\,*})$
- $P_n \in \mathbb{R}^{3N}$ — vertex offsets from the rest pose activated by $R_n$
- $\mathcal{P} = [P_1, \ldots, P_{9K}] \in \mathbb{R}^{3N \times 9K}$ — the pose space, a matrix containing all pose blendshapes
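A numpy sketch of $B_P$, assuming an axis-angle pose parameterization (as in SMPL-style models); exactly which joints enter $R(\vec{\theta})$ is a detail this sketch glosses over:

```python
import numpy as np

def rodrigues(axis_angle):
    """Axis-angle vector (3,) -> rotation matrix (3, 3)."""
    angle = np.linalg.norm(axis_angle)
    if angle < 1e-8:
        return np.eye(3)
    a = axis_angle / angle
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def pose_blendshapes(theta, theta_star, P):
    """B_P(theta; P) = sum_n (R_n(theta) - R_n(theta_star)) * P_n.

    theta, theta_star : (K, 3) axis-angle rotations for the K joints
    P                 : (3N, 9K) pose blendshape matrix
    """
    R = np.concatenate([rodrigues(t).ravel() for t in theta])
    R_star = np.concatenate([rodrigues(t).ravel() for t in theta_star])
    return (P @ (R - R_star)).reshape(-1, 3)
```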
Expression blendshapes
$$B_E(\vec{\psi}; \mathcal{E}) = \sum_{n=1}^{|\vec{\psi}|} \psi_n E_n$$

- $\vec{\psi} = [\psi_1, \ldots, \psi_{|\vec{\psi}|}]^T$ — expression coefficients
- $\mathcal{E} = [E_1, \ldots, E_{|\vec{\psi}|}] \in \mathbb{R}^{3N \times |\vec{\psi}|}$ — orthonormal expression basis
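Putting the three offset terms together gives the morphed template $T_P$ that feeds the skinning function; this short sketch reuses the helpers above (so it is not independently runnable), with $B_E$ taking the same linear form as $B_S$:

```python
def morphed_template(T_bar, beta, S, theta, theta_star, P, psi, E):
    """T_P = T_bar + B_S(beta; S) + B_P(theta; P) + B_E(psi; E)."""
    return (T_bar
            + shape_blendshapes(beta, S)
            + pose_blendshapes(theta, theta_star, P)
            + (E @ psi).reshape(-1, 3))  # B_E: same linear form as B_S
```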
Template shape
4. Temporal registration
4.1. Initial model
Shape
Pose
Expression
4.2. Single-frame registration
Model-only
estimate the model coefficients $\{\vec{\beta}, \vec{\theta}, \vec{\psi}\}$ by optimizing

$$E(\vec{\beta}, \vec{\theta}, \vec{\psi}) = E_D + \lambda_L E_L + E_P$$

- $E_D = \lambda_D \sum_{v_s} \rho\!\left( \min_{v_m \in \mathcal{M}} \| v_s - v_m \| \right)$ — measures the scan-to-mesh distance between the scan vertices and the closest points on the surface of the model
- $v_s$ — scan vertices
- $\lambda_D$ — a weight that controls the influence of the data term
- $\rho$ — a Geman-McClure robust penalty function
- $E_L$ — a landmark term, measuring the L2-norm distance between image landmarks and corresponding vertices on the model template, projected into the image using the known camera calibration
- $E_P$ — regularizes the pose coefficients $\vec{\theta}$, shape coefficients $\vec{\beta}$, and expression coefficients $\vec{\psi}$ to be close to zero by penalizing their squared values
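A numpy/scipy sketch of the data term; the exact Geman-McClure scaling is an assumption, and the closest point on the model surface is approximated by the nearest model vertex for brevity:

```python
import numpy as np
from scipy.spatial import cKDTree

def geman_mcclure(x, sigma=1.0):
    # one common form of the Geman-McClure penalty (scaling is an assumption)
    x2 = x ** 2
    return sigma ** 2 * x2 / (sigma ** 2 + x2)

def data_term(scan_verts, model_verts, lambda_d=1.0, sigma=1.0):
    """E_D: robust scan-to-mesh distance, with the closest surface point
    approximated by the nearest model vertex."""
    dists, _ = cKDTree(model_verts).query(scan_verts)
    return lambda_d * geman_mcclure(dists, sigma).sum()
```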
Coupled
allow the optimization to leave the model space by optimizing

$$E(T, \vec{\beta}, \vec{\theta}, \vec{\psi}) = E_D + \lambda_C E_C + E_R + E_P$$

- $T$ — the template mesh, now optimized freely
- $E_D$ — measures the scan-to-mesh distance from the scan to the aligned mesh $T$
- $E_C$ — constrains $T$ to be close to the current statistical model by penalizing edge differences between $T$ and the model as $E_C = \sum_{e} \lambda_e \| T_e - M_e(\vec{\beta}, \vec{\theta}, \vec{\psi}) \|$
- $T_e$, $M_e$ — the edges of $T$ and of $M(\vec{\beta}, \vec{\theta}, \vec{\psi})$
- $\lambda_e$ — an individual weight assigned to each edge
- $E_R$ — a regularization term for each vertex $v_k$ in $T$
- $\mathcal{N}(v_k)$ — the set of vertices in the one-ring neighborhood of $v_k$
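A sketch of the edge-coupling term; whether the per-edge penalty is the plain or squared norm is not captured in these notes, so the plain norm below is an assumption:

```python
import numpy as np

def edge_coupling_term(T, M_verts, edges, lambda_e):
    """E_C: penalize differences between corresponding edges of the free
    mesh T and the model mesh M (identical topology assumed).

    T, M_verts : (N, 3) vertex arrays
    edges      : (E, 2) integer vertex-index pairs
    lambda_e   : (E,) per-edge weights
    """
    T_e = T[edges[:, 0]] - T[edges[:, 1]]
    M_e = M_verts[edges[:, 0]] - M_verts[edges[:, 1]]
    return np.sum(lambda_e * np.linalg.norm(T_e - M_e, axis=1))
```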
Texture-based
extend the coupled objective with a photometric term $\lambda_T E_T$ (sketched below):

- $E_T = \sum_{l} \sum_{v} \| \Gamma(I_l^{(v)}) - \Gamma(\hat{I}_l^{(v)}) \|_F^2$ — measures the photometric error between the real images and the rendered textured images of $T$ from all views
- $\| \cdot \|_F$ — the Frobenius norm
- $\Gamma$ — ratio-of-Gaussians filters that help minimize the influence of lighting changes between real and rendered images
- $I_l^{(v)}$, $\hat{I}_l^{(v)}$ — the real and rendered images at resolution level $l$ from view $v$
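A scipy sketch of the ratio-of-Gaussians filter and the photometric sum; the two filter widths are illustrative choices, not values from the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ratio_of_gaussians(img, sigma_narrow=1.0, sigma_wide=4.0, eps=1e-6):
    """Gamma(I): ratio of a narrow to a wide Gaussian blur of a grayscale
    image. Smooth illumination differences largely cancel in the ratio,
    leaving texture detail comparable across real and rendered images."""
    return gaussian_filter(img, sigma_narrow) / (gaussian_filter(img, sigma_wide) + eps)

def photometric_term(real_images, rendered_images):
    """E_T: squared Frobenius norms summed over views / resolution levels."""
    return sum(np.sum((ratio_of_gaussians(r) - ratio_of_gaussians(s)) ** 2)
               for r, s in zip(real_images, rendered_images))
```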
4.3. Sequential registration
Personalization
- use coupled registration (the coupled objective above) and average the results across multiple sequences to get a personalized template for each subject
- randomly select one of the registrations for each subject to generate a personalized texture map
Sequence fitting
- replace the generic model template $\bar{T}$ in the model by the personalized template
- fix the shape coefficients $\vec{\beta}$ to zero, since identity is already captured by the personalized template
- initialize the model parameters from the previous frame and use the single-frame registration of Section 4.2; see the sketch below
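A schematic of the warm-started loop; `solver` stands in for the single-frame registration of Section 4.2 and is a hypothetical callable, not an API from the paper:

```python
def fit_sequence(frames, personalized_template, solver, init_params):
    """Sequential registration: each frame is initialized from the
    previous frame's parameters; beta stays fixed at zero because the
    personalized template already encodes identity."""
    params, results = init_params, []
    for scan in frames:
        params = solver(scan, personalized_template, init=params)
        results.append(params)
    return results
```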
6. Model training
decouple shape, pose, and expression variations
- $\{\mathcal{W}, \mathcal{P}, \mathcal{J}\}$ — pose parameters
- $\{\mathcal{E}\}$ — expression parameters
- $\{\bar{T}, \mathcal{S}\}$ — shape parameters
6.1. Pose parameter training
- $T_i^P$ — personalized rest-pose template of subject $i$
- $J_i^P$ — person-specific joints
- $\mathcal{W}$ — blendweights
- $\mathcal{P}$ — pose blendshapes
- $\mathcal{J}$ — joint regressor
alternate between:
- solve for the pose parameters $\vec{\theta}_j$ of each registration $j$
- optimize the subject-specific parameters $\{T_i^P, J_i^P\}$
- optimize the global parameters $\{\mathcal{W}, \mathcal{P}, \mathcal{J}\}$
objective function being optimized consists of:
- data term $E_D$ — penalizes the squared Euclidean reconstruction error of the training data
- regularization term $E_{\mathcal{P}}$ — penalizes the Frobenius norm of the pose blendshapes, $\|\mathcal{P}\|_F^2$
- regularization term $E_{\mathcal{W}}$ — penalizes large deviations of the blendweights from their initialization
To avoid $\mathcal{W}$ and $\mathcal{P}$ being affected by strong facial expressions, expression effects are removed when solving for $\mathcal{W}$ and $\mathcal{P}$. This is done by jointly solving for pose and expression parameters for each registration, subtracting the expression offsets $B_E(\vec{\psi}; \mathcal{E})$ defined above, and solving for $\mathcal{W}$ and $\mathcal{P}$ on those residuals, as in the sketch below.
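A minimal numpy sketch of the expression-removal step, assuming $\vec{\psi}$ has already been fitted jointly with pose (that solve itself is omitted):

```python
import numpy as np

def remove_expression(reg_vertices, psi, E):
    """Subtract the expression offsets B_E(psi; E) so that W and P are
    subsequently fit on expression-free residuals.

    reg_vertices : (N, 3) registered mesh vertices
    psi          : (|psi|,) expression coefficients, fitted jointly with pose
    E            : (3N, |psi|) expression basis
    """
    return reg_vertices - (E @ psi).reshape(-1, 3)
```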
6.2. Expression parameter training
- solve for the pose parameters $\vec{\theta}_j$ of each registration $j$
- unpose: remove the pose influence by applying the inverse of the transformation entailed by $\vec{\theta}_j$
- $V_j^U$ — the vertices resulting from unposing registration $j$
- $V_{s(j)}^U$ — the vertices of the neutral expression of subject $s(j)$, also unposed
- compute expression residuals $V_j^U - V_{s(j)}^U$ for each registration
- $s(j)$ — the subject index of registration $j$
- compute the expression space $\mathcal{E}$ by applying PCA to these residuals
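A numpy sketch of building $\mathcal{E}$ from the residuals; whether the residuals are mean-centered before PCA is an assumption of this sketch:

```python
import numpy as np

def expression_basis(V_unposed, V_neutral, subject_idx, n_components):
    """PCA on expression residuals V_j^U - V_{s(j)}^U.

    V_unposed   : (J, 3N) unposed registration vertices, flattened
    V_neutral   : (S, 3N) unposed neutral-expression vertices per subject
    subject_idx : (J,) subject index s(j) of each registration
    """
    residuals = V_unposed - V_neutral[subject_idx]
    centered = residuals - residuals.mean(axis=0)  # centering: assumption
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:n_components].T                     # (3N, |psi|) basis E
```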
6.3. Shape parameter training
- $\bar{T}$ — computed as the mean of these expression- and pose-normalized registrations
- $\mathcal{S}$ — formed by the first $|\vec{\beta}|$ principal components computed using PCA
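The corresponding sketch for the shape space, mirroring the PCA step above:

```python
import numpy as np

def shape_space(normalized_regs, n_components):
    """T_bar = mean of the expression- and pose-normalized registrations;
    S = first principal components of the centered data.

    normalized_regs : (J, 3N) flattened normalized registrations
    """
    T_bar = normalized_regs.mean(axis=0)
    _, _, Vt = np.linalg.svd(normalized_regs - T_bar, full_matrices=False)
    return T_bar.reshape(-1, 3), Vt[:n_components].T  # template, (3N, |beta|)
```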
6.4. Optimization structure
Due to the high capacity and flexibility of the expression space formulation, pose blendshapes should be trained before expression parameters in order to avoid expression overfitting.