FaceWarehouse: a 3D Facial Expression Database for Visual Computing
TL;DR
- Build pipeline
- Input — RGBD data ( depth maps + color images ), per identity, per expression
- Morphable model + mesh deformation --> fitted face meshes
- Active Shape Model ( ASM ) feature points --> expression meshes --> individual-specific blendshapes
- Multilinear ( bilinear ) face model
Abstract
We present FaceWarehouse, a database of 3D facial expressions for visual computing applications. We use Kinect, an off-the-shelf RGBD camera, to capture 150 individuals aged 7–80 from various ethnic backgrounds. For each person, we captured the RGBD data of her different expressions, including the neutral expression and 19 other expressions such as mouth-opening, smile, kiss, etc. For every RGBD raw data record, a set of facial feature points on the color image such as eye corners, mouth contour and the nose tip are automatically localized, and manually adjusted if better accuracy is required. We then deform a template facial mesh to fit the depth data as closely as possible while matching the feature points on the color image to their corresponding points on the mesh. Starting from these fitted face meshes, we construct a set of individual-specific expression blendshapes for each person. These meshes with consistent topology are assembled as a rank-three tensor to build a bilinear face model with two attributes, identity and expression. Compared with previous 3D facial databases, for every person in our database, there is a much richer matching collection of expressions, enabling depiction of most human facial actions. We demonstrate the potential of FaceWarehouse for visual computing with four applications: facial image manipulation, face component transfer, real-time performance-based facial image animation, and facial animation retargeting from video to image.
3. FaceWarehouse
3.1. Data capture
- 2D images + depth maps
3.2. Expression mesh and individual-specific blendshape generation
Active Shape Model ( ASM ) --> feature points on color image --> internal feature points + contour feature points
Neutral expression
- Blanz and Vetter's morphable model + mesh deformation algorithm
- morphable-model face: $F = \bar{F} + \sum_i \alpha_i P_i$
- $\bar{F}$ — average face
- $P_i$ — the $i$-th PCA vector
energy to be minimized for feature point matching:

$$E_{feat} = \sum_k \| v_k - p_k \|^2 + \sum_j \| \Pi(v_j) - u_j \|^2$$

- $p_k$ — 3D position of the $k$-th feature point
- $v_k$ — corresponding vertex on the mesh
- $u_j$ — 2D feature point on the color image
- $v_j$ — corresponding 3D feature vertex on the mesh
- $\Pi$ — projection matrix of the camera
energy term for matching the depth map:

$$E_{depth} = \frac{1}{n} \sum_{i=1}^{n} \| v_i - d_i \|^2$$

- $v_i$ — a mesh vertex
- $d_i$ — closest point to $v_i$ in the depth map
- $n$ — the number of mesh vertices
Tikhonov regularization energy term, based on the estimated probability distribution of a shape defined by the PCA coefficients:

$$E_{reg} = \sum_i \frac{\alpha_i^2}{\lambda_i}$$

- $\lambda_i$ — eigenvalues of the face covariance matrix from PCA
total energy ( $w_1, w_2$ — weights ):

$$E = E_{feat} + w_1 E_{depth} + w_2 E_{reg}$$
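The depth term, the Tikhonov prior, and the weighted total can be sketched as follows; the weight names `w_depth`/`w_reg` and the precomputed closest-point correspondences are assumptions:

```python
import numpy as np

def depth_energy(mesh_verts, closest_depth_pts):
    # average squared distance between each mesh vertex and its
    # closest point in the depth map (correspondences precomputed)
    n = len(mesh_verts)
    return np.sum((mesh_verts - closest_depth_pts) ** 2) / n

def tikhonov_energy(alpha, eigvals):
    # shape prior: penalize PCA coefficients, scaled by the
    # eigenvalues of the face covariance matrix
    return np.sum(alpha ** 2 / eigvals)

def total_energy(e_feat, e_depth, e_reg, w_depth=1.0, w_reg=1.0):
    # w_depth, w_reg are assumed weighting hyperparameters
    return e_feat + w_depth * e_depth + w_reg * e_reg
```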
Laplacian energy as the regularization term:

$$E_{lap} = \frac{1}{n} \sum_{i=1}^{n} \left\| L(v_i) - \delta_i \frac{L(v_i)}{\| L(v_i) \|} \right\|^2$$

- $L$ — discrete Laplacian operator
- $\delta_i$ — magnitude of the original Laplacian coordinate before deformation
- $n$ — vertex number of the mesh
mesh deformation energy:

$$E = E_{feat} + w_1 E_{depth} + w_2 E_{lap}$$
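A sketch of the Laplacian regularizer: each vertex's Laplacian coordinate keeps its pre-deformation magnitude while its direction is free to rotate. The uniform (umbrella) Laplacian and the adjacency-map representation are simplifying assumptions:

```python
import numpy as np

def laplacian_energy(verts, neighbors, delta0):
    """Penalize change in the magnitude of each vertex's Laplacian
    coordinate relative to its pre-deformation value delta0[i].
    `neighbors` maps vertex index -> list of adjacent vertex indices."""
    n = len(verts)
    e = 0.0
    for i in range(n):
        # uniform discrete Laplacian: vertex minus neighbor centroid
        lap = verts[i] - np.mean(verts[neighbors[i]], axis=0)
        norm = np.linalg.norm(lap)
        if norm > 0:
            # squared distance to the same-direction vector of
            # magnitude delta0[i]; equals (norm - delta0[i])**2
            e += np.sum((lap - delta0[i] * lap / norm) ** 2)
        else:
            e += delta0[i] ** 2
    return e / n
```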
Other expressions
- deformation transfer algorithm: the deformation from face mesh $F_0$ to $F_i$ mimics the deformation from the guide model $G_0$ to $G_i$
- mesh deformation algorithm ( as for the neutral expression ) refines the transferred mesh to fit the captured feature points and depth map
Individual-specific expression blendshapes
example-based facial rigging algorithm
- $\{ B_0, \dots, B_{46} \}$ — a neutral face + 46 FACS blendshapes
- $S_i$ — an expression mesh
- $\{ A_0, \dots, A_{46} \}$ — generic blendshape model
minimize:
- the difference between each expression mesh $S_i$ and the linear combination of $\{ B_j \}$ with the known weights for expression $i$
- the difference between the relative deformation from $B_0$ to $B_j$ and that from $A_0$ to $A_j$
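A toy version of this rigging objective, with the paper's deformation-gradient regularizer simplified to a ridge penalty pulling the solved blendshapes toward the generic model; the function name, matrix layout, and `reg` weight are assumptions:

```python
import numpy as np

def fit_blendshapes(S, W, B_generic, reg=0.1):
    """Solve for person-specific blendshapes B (k x d, each row a
    flattened mesh) such that W @ B approximates the expression
    meshes S (m x d, known per-expression weights W of shape m x k),
    while B stays close to the generic blendshapes B_generic.
    This ridge term is a simplification of the paper's
    relative-deformation regularizer."""
    k = W.shape[1]
    # normal equations of the ridge-regularized least squares
    A = W.T @ W + reg * np.eye(k)
    rhs = W.T @ S + reg * B_generic
    return np.linalg.solve(A, rhs)
```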
3.3. Bilinear face model
assemble the dataset into a rank-three ( 3-mode ) data tensor ( 11K vertices × 150 identities × 47 expressions )
$N$-mode SVD process:

$$T = C \times_2 U_{id} \times_3 U_{exp}$$

- $T$ — data tensor
- $C$ — core tensor
- $U_{id}, U_{exp}$ — orthonormal transform matrices, which contain the left singular vectors of the 2nd-mode ( identity ) space and 3rd-mode ( expression ) space
approximate:

$$T \approx C_r \times_2 \tilde{U}_{id} \times_3 \tilde{U}_{exp}$$

- $C_r$ — reduced core tensor produced by keeping the top-left corner of the original core tensor
- $\tilde{U}_{id}, \tilde{U}_{exp}$ — truncated matrices from $U_{id}$ and $U_{exp}$ by removing the trailing columns
- $C_r$ — bilinear face model for FaceWarehouse: any face is $V = C_r \times_2 w_{id}^T \times_3 w_{exp}^T$
- $w_{id}, w_{exp}$ — column vectors of identity and expression weights
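The N-mode SVD above can be reproduced on a toy tensor with plain NumPy; `unfold`, `mode_dot`, and `nmode_svd` are illustrative helpers. As in the paper, only the identity and expression modes are decomposed, and the vertex mode is left untouched:

```python
import numpy as np

def unfold(T, mode):
    # mode-n unfolding: the chosen mode becomes the rows,
    # all remaining modes are flattened into the columns
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_dot(T, M, mode):
    # multiply tensor T by matrix M along the given mode
    out = M @ unfold(T, mode)
    rest = [s for i, s in enumerate(T.shape) if i != mode]
    return np.moveaxis(out.reshape([M.shape[0]] + rest), 0, mode)

def nmode_svd(T):
    """T: data tensor (vertices x identities x expressions).
    Returns the core tensor and the mode-2 / mode-3 transform
    matrices (left singular vectors of each unfolding)."""
    U_id, _, _ = np.linalg.svd(unfold(T, 1), full_matrices=False)
    U_exp, _, _ = np.linalg.svd(unfold(T, 2), full_matrices=False)
    # project the data tensor onto the singular bases
    C = mode_dot(mode_dot(T, U_id.T, 1), U_exp.T, 2)
    return C, U_id, U_exp
```

Without truncation the decomposition is exact: multiplying the core tensor back by `U_id` and `U_exp` recovers the data tensor; truncating trailing columns gives the reduced model.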
4. Applications
4.1. Facial image manipulation
- learn a linear regression model that maps a set of user-specified facial attributes to the identity attribute in the bilinear face model
- compute the identity and expression weights in bilinear face model for the input image
- reconstruct a new 3D face mesh based on these weights
Facial feature analysis
multivariate linear regression to map attributes to the identity weights in the bilinear face model:

$$w_{id} = M a$$

- $w_{id}$ — identity weights ( a column vector )
- $a$ — user-specified attributes
- $M$ — a matrix mapping user-specified attributes to the identity weights
assemble the vectors of all training faces into matrices:
- $W$ — $( w_{id}^1, \dots, w_{id}^m )$
- $A$ — $( a^1, \dots, a^m )$
- $M = W A^{+}$, where $A^{+}$ is the left pseudoinverse of $A$
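A minimal sketch of the attribute-to-identity regression on synthetic data; `np.linalg.pinv` plays the role of the pseudoinverse, and the column-per-subject layout and dimensions are assumptions:

```python
import numpy as np

# toy data: m subjects, each with identity weights (k-D) and
# user-specified attributes (p-D), stored one column per subject
rng = np.random.default_rng(0)
m, k, p = 20, 5, 3
A = rng.normal(size=(p, m))      # attribute vectors
M_true = rng.normal(size=(k, p)) # ground-truth mapping (toy)
W = M_true @ A                   # identity weight vectors

# least-squares mapping from attributes to identity weights
M = W @ np.linalg.pinv(A)

# predict identity weights for a new attribute vector
a_new = rng.normal(size=p)
w_new = M @ a_new
```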
Fitting 3D face mesh to image

$$u = \Pi ( s R v + t )$$

- $s$ — scaling factor
- $R$ — 3D rotation matrix
- $t$ — translation vector
- $v$ — mesh vertex position
- $u$ — projected point position on the image
matching error:

$$E = \sum_k \| u_k - \hat{u}_k \|^2$$

- $\hat{u}_k$ — feature point positions on the image
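A sketch of the forward model and matching error for this pose-and-projection fit; the exact camera model, a pinhole with focal length `f`, is an assumption:

```python
import numpy as np

def project(verts, s, R, t, f=1.0):
    """Apply scale s, rotation R (3x3), translation t (3,), then a
    pinhole projection with focal length f. The camera model is an
    illustrative assumption, not the paper's exact formulation."""
    cam = s * (verts @ R.T) + t          # rigidly posed vertices
    return f * cam[:, :2] / cam[:, 2:3]  # perspective divide

def matching_error(verts, feat_idx, feat2d, s, R, t):
    # squared distance between projected feature vertices and the
    # detected feature point positions on the image
    return np.sum((project(verts[feat_idx], s, R, t) - feat2d) ** 2)
```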
4.2. Face component transfer
As the two input images represent the same person, their identity weights should be the same.
- unified identity weights $w_{id}$
- expression weights ( $w_{exp,1}$ and $w_{exp,2}$ )
- 1. use the method described in the last section to compute an initial estimation of the identity and expression weights
- 2. fix $w_{exp,1}$ and $w_{exp,2}$ and compute $w_{id}$ by minimizing the joint matching error over both images
- 3. $w_{exp,1}$ and $w_{exp,2}$ are solved separately with $w_{id}$ fixed
- steps 2, 3 are performed iteratively until the fitting results converge
- 2D expression flow: warp the target face to match the desired expression
- 2D alignment flow: warp the reference face to an appropriate size and position for transferring
- select a crop region from the warped reference image and blend it to the warped target image
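The alternating fit over a shared identity and two expressions reduces to coupled linear least-squares problems. A toy sketch on a small core tensor (`joint_fit`, its uniform initialization, and the fixed iteration count are illustrative, not the paper's solver):

```python
import numpy as np

def contract(C, w, mode):
    # contract core tensor C with weight vector w along `mode`
    return np.tensordot(C, w, axes=([mode], [0]))

def joint_fit(C, V1, V2, n_iter=20):
    """C: core tensor (d, ki, ke); V1, V2: target vertex vectors (d,)
    from two images of the same person. Returns one shared identity
    weight vector and two expression weight vectors."""
    ki, ke = C.shape[1], C.shape[2]
    w_id = np.ones(ki) / ki
    we1 = np.ones(ke) / ke
    we2 = np.ones(ke) / ke
    for _ in range(n_iter):
        # fix expressions, solve the shared identity by stacking
        # both images into one least-squares system
        A = np.vstack([contract(C, we1, 2), contract(C, we2, 2)])
        b = np.concatenate([V1, V2])
        w_id = np.linalg.lstsq(A, b, rcond=None)[0]
        # fix identity, solve each expression separately
        B = contract(C, w_id, 1)                 # (d, ke)
        we1 = np.linalg.lstsq(B, V1, rcond=None)[0]
        we2 = np.linalg.lstsq(B, V2, rcond=None)[0]
    return w_id, we1, we2
```

Each half-step solves its block exactly, so the joint squared residual is non-increasing across iterations.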
4.3. Real-time performance-based facial image animation
construct the expression blendshapes for the person of identity $w_{id}$:

$$B_i = C_r \times_2 w_{id}^T \times_3 ( d_i^T \tilde{U}_{exp} )$$

- $\tilde{U}_{exp}$ — truncated transform matrix for the expression mode
- $d_i$ — the expression weight vector with value 1 for the $i$-th element and 0 for other elements
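A sketch of this blendshape construction by tensor contraction; the matrix orientation of the truncated expression transform (one row per original expression) is an assumption:

```python
import numpy as np

def build_blendshapes(C, w_id, U_exp):
    """C: reduced core tensor (d, ki, ke); w_id: identity weights
    (ki,); U_exp: truncated expression transform (n_exp, ke).
    The i-th blendshape contracts C with w_id and with a one-hot
    expression vector mapped through U_exp."""
    n_exp = U_exp.shape[0]
    # contract out the identity mode once: (d, ke)
    person = np.tensordot(C, w_id, axes=([1], [0]))
    B = []
    for i in range(n_exp):
        d_i = np.zeros(n_exp)
        d_i[i] = 1.0                      # one-hot expression selector
        w_exp = U_exp.T @ d_i             # its reduced-space weights
        B.append(person @ w_exp)
    return np.stack(B)                    # (n_exp, d)
```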
generate new expressions
- real-time performance-based facial animation system to capture the dynamic expressions of an arbitrary user
- track the rigid transformation of the user’s head and the facial expressions expressed in the format of blendshapes coefficients
- hair, teeth
4.4. Facial animation retargeting from video to image
- estimate face identity and expression of the image using the algorithm described in Section 4.1
- fit a unified face identity for all frames using a simple extension of the joint fitting algorithm described in Section 4.2
- construct expression blendshapes using the method described in Section 4.3