flandmark - open-source implementation of facial landmark detector

News

11-11-2012 - New version of flandmark with better internal structure and improved MATLAB interface available!

flandmark is an open-source C library (with a MATLAB interface) implementing a facial landmark detector for static images. The code for learning the detector parameters is written solely in MATLAB and is also part of flandmark.

The input of flandmark is an image of a face. A face detector provided courtesy of Eyedea Recognition Ltd. was used to detect faces during the learning of parameters. However, the flandmark software package includes a self-contained demo application that uses the OpenCV face detector.

flandmark (version 1.06) can also be used from Python thanks to the following project: xbob.flandmark 1.0.2.

Sample results

The landmark detector processes each frame separately, i.e. temporal continuity of landmark positions is not exploited. The red rectangle is the bounding box returned by the face detector; the blue rectangle is the bounding box used to construct the input to the flandmark detector (the normalized image frame).

CNN anchorwoman. Source: CNN

Resolution: 640x360, Frames: 1383, Size: 5MB

Video captured by the head camera of humanoid robot NAO.

Resolution: 320x240, Frames: 669, Size: 13MB

Movie "In Bruges".

Resolution: 720x304, Frames: 300, Size: 2.9MB

Structured output classifier

flandmark uses a structured output classifier based on Deformable Part Models (DPM). The quality of a landmark configuration $\mathbf{s} = (\mathbf{s}_0, \dots, \mathbf{s}_{M-1}) \in \mathcal{S}$ for a given image $I$ is measured by a scoring function $f: \mathcal{I} \times \mathcal{S} \rightarrow \mathbb{R}$. The scoring function is defined as the sum of the appearance fit and the deformation cost. The exact form of $f$ follows from the graph constraints:

$$ f(I, \mathbf{s}) = \sum_{i = 0}^{M - 1}{q_i(I, \mathbf{s}_i)} + \sum_{i=1}^{M-4}{g_i(\mathbf{s}_0, \mathbf{s}_i)} + g_5(\mathbf{s}_1, \mathbf{s}_5) + g_6(\mathbf{s}_2, \mathbf{s}_6) + g_7(\mathbf{s}_0, \mathbf{s}_7) $$

The functions $q_i(I, \mathbf{s}_i)$ and $g_i(\mathbf{s}_i, \mathbf{s}_j)$ (appearance fit and deformation cost, respectively) are parameterized. Their parameters are learned from annotated examples using the structured output SVM algorithm.
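
The structure of the scoring function can be illustrated with a short C sketch. It is not taken from the flandmark sources: the `Point` type, the callback types, the `EDGES` table and the `toy_q`/`toy_g` functions are hypothetical names introduced for illustration, and the sketch assumes $M = 8$ with the seven-edge tree in which $\mathbf{s}_0$ is linked to $\mathbf{s}_1, \dots, \mathbf{s}_4$ and $\mathbf{s}_7$, $\mathbf{s}_1$ to $\mathbf{s}_5$, and $\mathbf{s}_2$ to $\mathbf{s}_6$.

```c
#include <assert.h>
#include <stddef.h>

#define M 8          /* number of landmarks (components) */
#define NUM_EDGES 7  /* a tree over M nodes has M - 1 edges */

typedef struct { int x, y; } Point;

/* Hypothetical callback types standing in for the learned q_i and g_i. */
typedef double (*AppearanceFn)(int i, Point s_i, const void *image);
typedef double (*DeformationFn)(int edge, Point s_parent, Point s_child);

/* Edge e links landmark EDGES[e][0] to landmark EDGES[e][1] (assumed tree). */
static const int EDGES[NUM_EDGES][2] = {
    {0, 1}, {0, 2}, {0, 3}, {0, 4}, {1, 5}, {2, 6}, {0, 7}
};

/* f(I, s): sum of appearance fits plus sum of deformation costs. */
double score_configuration(const void *image, const Point *s,
                           AppearanceFn q, DeformationFn g)
{
    double f = 0.0;
    assert(s != NULL && q != NULL && g != NULL);
    for (int i = 0; i < M; ++i)
        f += q(i, s[i], image);
    for (int e = 0; e < NUM_EDGES; ++e)
        f += g(e, s[EDGES[e][0]], s[EDGES[e][1]]);
    return f;
}

/* Toy stand-in: every landmark fits equally well everywhere. */
double toy_q(int i, Point s_i, const void *image)
{
    (void)i; (void)s_i; (void)image;
    return 1.0;
}

/* Toy stand-in: quadratic penalty on the offset between linked landmarks. */
double toy_g(int edge, Point s_parent, Point s_child)
{
    double dx = s_child.x - s_parent.x;
    double dy = s_child.y - s_parent.y;
    (void)edge;
    return -(dx * dx + dy * dy);
}
```

In the real detector both terms depend on learned parameters; the toy callbacks only make the sketch executable.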

The maximization of $f$ is solved by dynamic programming (DP), which is possible because the graph constraints form a directed acyclic graph. The following images show the graph constraints for the current version of flandmark with 8 components, the component dimensions, and the flandmark detection pipeline.

Graph constraints

Components

flandmark detector
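
The dynamic programming step mentioned above can be sketched as max-sum message passing over the tree of landmarks. This is an illustrative sketch only, not the library's implementation: it assumes a toy number `K` of candidate positions per landmark, precomputed score tables `q` and `g`, and a hypothetical `PARENT` array encoding the tree with landmark 0 as the root.

```c
#include <assert.h>
#include <stddef.h>

#define M 8  /* number of landmarks */
#define K 3  /* toy number of candidate positions per landmark */

/* PARENT[i] is the parent of landmark i; landmark 0 is the root.
   Every child has a larger index than its parent, which the loops rely on. */
static const int PARENT[M] = { -1, 0, 0, 0, 0, 1, 2, 0 };

/* q[i][k]:       appearance fit of landmark i at candidate k.
   g[i][kp][kc]:  deformation cost of the edge (PARENT[i], i) with the parent
                  at candidate kp and the child at candidate kc; g[0] is unused.
   Writes the maximizing candidate indices to best[] and returns the maximum. */
double maximize_score(double q[M][K], double g[M][K][K], int best[M])
{
    double belief[M][K];
    int argmax[M][K];
    assert(q != NULL && g != NULL && best != NULL);

    for (int i = 0; i < M; ++i)
        for (int k = 0; k < K; ++k)
            belief[i][k] = q[i][k];

    /* Leaves-to-root pass: fold each subtree into its parent. */
    for (int i = M - 1; i >= 1; --i) {
        for (int kp = 0; kp < K; ++kp) {
            int arg = 0;
            double top = belief[i][0] + g[i][kp][0];
            for (int kc = 1; kc < K; ++kc) {
                double v = belief[i][kc] + g[i][kp][kc];
                if (v > top) { top = v; arg = kc; }
            }
            argmax[i][kp] = arg;
            belief[PARENT[i]][kp] += top;
        }
    }

    /* Pick the best root candidate, then backtrack root-to-leaves. */
    best[0] = 0;
    for (int k = 1; k < K; ++k)
        if (belief[0][k] > belief[0][best[0]])
            best[0] = k;
    for (int i = 1; i < M; ++i)
        best[i] = argmax[i][best[PARENT[i]]];

    return belief[0][best[0]];
}
```

Because each landmark has a single parent, every edge is visited once and the cost is proportional to (M - 1) K^2 rather than K^M for exhaustive search.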

Performance evaluation

Illustration of relative displacement calculation.

The detector accuracy is measured in terms of the relative deviation, defined as the distance between the estimated and the ground truth landmark positions divided by the size of the face. The size of the face, $l_{\mathrm{face}}$, is defined as the distance between the center of the mouth and the midpoint between the centers of the eyes. With $\epsilon_i$ denoting the distance between the estimated and the ground truth position of landmark $i$, the geometric accuracy is measured by the mean relative feature displacement and the maximum relative feature displacement.

$$ \text{mean relative displacement} = \frac{\epsilon_0+\epsilon_1+\cdots+\epsilon_7}{8} \cdot \frac{1}{l_{\mathrm{face}}} $$

$$ \text{maximum relative displacement} = \max \Bigg\{ \frac{\epsilon_0}{l_{\mathrm{face}}}, \cdots, \frac{\epsilon_7}{l_{\mathrm{face}}} \Bigg\} $$
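
As a worked illustration of the two measures, the following C sketch evaluates them for one face. It assumes the per-landmark errors (the Euclidean distances between estimated and ground truth positions) and the face size have already been computed; the function names are hypothetical and not part of flandmark.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of the per-landmark errors eps[0..n-1], divided by the face size. */
double mean_relative_displacement(const double *eps, size_t n, double l_face)
{
    double sum = 0.0;
    assert(eps != NULL && n > 0 && l_face > 0.0);
    for (size_t i = 0; i < n; ++i)
        sum += eps[i];
    return sum / (double)n / l_face;
}

/* Largest per-landmark error, divided by the face size. */
double max_relative_displacement(const double *eps, size_t n, double l_face)
{
    double worst = 0.0;
    assert(eps != NULL && n > 0 && l_face > 0.0);
    for (size_t i = 0; i < n; ++i)
        if (eps[i] > worst)
            worst = eps[i];
    return worst / l_face;
}
```

For example, with errors of 1 to 8 pixels on the eight landmarks and a face size of 64 pixels, the mean relative displacement is 4.5 / 64 (about 7%) and the maximum is 8 / 64 (12.5%).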

The flandmark detector is compared with three competing detectors in terms of the accuracy of the estimated landmark positions. Specifically, it is compared with detectors based on the following approaches: Active Appearance Models (AAM), the Deformable Part Models (DPM) based detector of Everingham, Sivic and Zisserman, and binary SVMs trained independently for each landmark.

The parameters of the flandmark detector were learned on a subset of the Labeled Faces in the Wild database (LFW). All performance evaluation was done on a subset of the LFW database.

Cumulative histogram of maximal deviation

Percentage of examples from the testing set with relative deviation less than or equal to 10%
Detector	Mean deviation	Maximal deviation
AAM	8.98%	0.62%
Everingham et al.	85.28%	22.93%
Independent SVMs	85.66%	34.50%
flandmark	96.59%	53.23%
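
Each entry of such a table is one point of a cumulative histogram: the share of test examples whose relative deviation does not exceed a threshold. A minimal C sketch of that computation follows; the function name and the input array are hypothetical, and the deviations are assumed to be stored as fractions (0.10 for 10%).

```c
#include <assert.h>
#include <stddef.h>

/* Percentage of examples whose relative deviation is <= threshold. */
double percent_within(const double *deviation, size_t n, double threshold)
{
    size_t hits = 0;
    assert(deviation != NULL && n > 0);
    for (size_t i = 0; i < n; ++i)
        if (deviation[i] <= threshold)
            ++hits;
    return 100.0 * (double)hits / (double)n;
}
```

Calling the function for a range of thresholds yields the whole cumulative histogram; the table corresponds to the single threshold 0.10.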

Download

flandmark can be downloaded (or forked) from GitHub

Older versions:

Version 1.07, 2012-11-11, flandmark_v107.zip (latest zip from GitHub)
Version 1.06, 2012-04-19, flandmark_v106.zip
Version 1.05, 2012-03-27, flandmark_v105.zip (NOTE: WIP, version without learning scripts)
Version 1.04, 2012-03-01, flandmark_v104.tar
Version 1.03, 2011-08-30, flandmark_v103.zip
Version 1.02, 2011-08-10, flandmark_v102.zip

The annotated LFW database can be downloaded here:

lfw.tgz, direct link to the LFW database on its homepage
LFW_annotation.zip, LFW facial landmarks annotation (provided courtesy of Eyedea Recognition Ltd.)
08-06-12: LFW annotation updated; 15 images had a wrong face bounding box assigned. We would like to thank Alberto Albiol for noticing.

Code snippets

Usage of the flandmark detector in both C/C++ and MATLAB is very simple, as you can see from the following code snippets:

MATLAB example

	  % Load image and convert it to grayscale
	  I = rgb2gray(imread('photo.jpg'));
	  % Get face bounding box
	  bbox = [72, 72, 183, 183];
	  % Load flandmark_model file into MATLAB memory
	  model = flandmark_load_model('../data/flandmark_model.dat');
	  % Detect facial landmarks by calling the MEX function
	  landmarks = flandmark_detector(I, int32(bbox), model);

C/C++ example

	  #include <stdlib.h>           // malloc, free
	  #include <opencv/highgui.h>   // cvLoadImage
	  #include "flandmark_detector.h"

	  int main(int argc, char * argv[])
	  {
	    // load flandmark model structure and initialize
	    FLANDMARK_Model * model = flandmark_init("flandmark_model.dat");
	    if (model == NULL) return 1;

	    // load input image
	    IplImage *img = cvLoadImage("photo.jpg");
	    if (img == NULL) { flandmark_free(model); return 1; }

	    // convert image to grayscale
	    IplImage *img_grayscale = cvCreateImage(cvSize(img->width, img->height), IPL_DEPTH_8U, 1);
	    cvCvtColor(img, img_grayscale, CV_BGR2GRAY);

	    // bbox with detected face (format: top_left_col top_left_row bottom_right_col bottom_right_row)
	    int bbox[] = {72, 72, 183, 183};

	    // detect facial landmarks (output are x, y coordinates of detected landmarks)
	    float * landmarks = (float*)malloc(2*model->data.options.M*sizeof(float));
	    flandmark_detect(img_grayscale, bbox, model, landmarks);

	    // release all resources
	    free(landmarks);
	    cvReleaseImage(&img_grayscale);
	    cvReleaseImage(&img);
	    flandmark_free(model);
	    return 0;
	  }
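
After detection, the output buffer holds 2*M coordinate values. The helper below sketches one way to read landmark i from it. It is not part of the flandmark API: the `Landmark` type and `landmark_at` are hypothetical, and the interleaved layout x0, y0, x1, y1, ... is an assumption that should be checked against the library version in use.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } Landmark;

/* Returns landmark i from a flat buffer assumed to be laid out as
   x0, y0, x1, y1, ... with num_landmarks (x, y) pairs. */
Landmark landmark_at(const float *landmarks, int num_landmarks, int i)
{
    Landmark p;
    assert(landmarks != NULL && i >= 0 && i < num_landmarks);
    p.x = landmarks[2 * i];
    p.y = landmarks[2 * i + 1];
    return p;
}
```

With the variables of the example above, `landmark_at(landmarks, model->data.options.M, 7)` would return the last of the 8 landmarks under this assumption.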

Platforms

GNU/Linux, Windows

flandmark should also work on Mac, though this has not been tested.

Licensing information

FLANDMARK is licensed under the GNU/GPL version 3.

References

If you use this software in research, please cite this paper [Bibtex]

		
		@InProceedings{Uricar-Franc-Hlavac-VISAPP-2012,
		  author =      {U{\v{r}}i{\v{c}}{\'{a}}{\v{r}}, Michal and 
				Franc, Vojt{\v{e}}ch and Hlav{\'{a}}{\v{c}}, V{\'{a}}clav },
		  title =       {Detector of Facial Landmarks Learned by the Structured Output {SVM}},
		  year =        {2012},
		  pages =       {547-556},
		  booktitle =   {VISAPP '12: Proceedings of the 7th International Conference on Computer Vision Theory and Applications},
		  editor =      {Csurka, Gabriela and Braz, Jos{\'{e}}},
		  publisher =   {SciTePress --- Science and Technology Publications},
		  address =     {Portugal},
		  volume =      {1},
		  isbn =        {978-989-8565-03-7},
		  book_pages =  {747},
		  month =       {February},
		  day =         {24-26},
		  venue =       {Rome, Italy},
		  annote =      {In this paper we describe a detector of facial
		    landmarks based on the Deformable Part Models. We treat the task
		    of landmark detection as an instance of the structured output
		    classification problem. We propose to learn the parameters of the
		    detector from data by the Structured Output Support Vector
		    Machines algorithm. In contrast to the previous works, the
		    objective function of the learning algorithm is directly related
		    to the performance of the resulting detector which is controlled
		    by a user-defined loss function. The resulting detector is
		    real-time on a standard PC, simple to implement and it can be
		    easily modified for detection of a different set of landmarks.  We
		    evaluate performance of the proposed landmark detector on a
		    challenging ``Labeled Faces in the Wild'' (LFW) database. The
		    empirical results demonstrate that the proposed detector is
		    consistently more accurate than two public domain implementations
		    based on the Active Appearance Models and the Deformable Part
		    Models. We provide an open-source implementation of the proposed
		    detector and the manual annotation of the facial landmarks for all
		    images in the LFW database.},
		  keywords =    {Facial Landmark Detection, Structured Output Classification, 
				Support Vector Machines, Deformable Part Models},
		  prestige =    {important},
		  authorship =  {50-40-10},
		  status =      {published},
		  project =     {FP7-ICT-247525 HUMAVIPS, PERG04-GA-2008-239455 SEMISOL, 
				Czech Ministry of Education project 1M0567},
		  www = {http://www.visapp.visigrapp.org},
		}

M. Uricar, V. Franc and V. Hlavac, Detector of Facial Landmarks Learned by the Structured Output SVM, VISAPP '12: Proceedings of the 7th International Conference on Computer Vision Theory and Applications, 2012. Received Best Paper Award [pdf] [Bibtex]

			      
M. Uricar, Detector of facial landmarks, Master's Thesis, supervised by V. Franc, May 2011. [pdf] [Bibtex]

			    
			    @MastersThesis{Uricar-TR-2011-05,
			    author =        {U{\v r}i{\v c}{\' a}{\v r}, Michal},
			    supervisor =    {Franc, Vojt{\v e}ch},
			    title =         {Detector of Facial Landmarks},
			    school =        {Center for Machine Perception,
					      K13133 FEE Czech Technical University},
			    address =       {Prague, Czech Republic},
			    year =          {2011},
			    month =         {June},
			    day =           {7},
			    type =          {{MSc Thesis CTU--CMP--2011--05}},
			    issn =          {1213-2365},
			    pages =         {69},
			    authorship =    {100},
			    psurl =         {[Uricar-TR-2011-05.pdf]},
			    project =       {FP7-ICT-247525, PERG04-GA-2008-239455},
			    annote =        {In this thesis we develop a detector of facial
			      landmarks based on the Deformable Part Models. We treat the task of
			      landmark detection as an instance of the structured output
			      classification problem. We propose to learn the parameters of the
			      detector from data by the Structured Output Support Vector Machines
			      algorithm. In contrast to previous works, the objective function of
			      the learning algorithm is directly related to the performance of
			      the resulting detector which is controlled by a user-defined loss
			      function. The resulting detector is real-time on a standard PC,
			      simple to implement and it can be easily changed for detection of a
			      different set of landmarks. We evaluate performance of the proposed
			      landmark detector on a challenging ``Labeled Faces in the Wild''
			      database. The empirical results demonstrate that the proposed
			      detector is consistently more accurate than two public domain
			      implementations based on the Active Appearance Models and the
			      Deformable Part Models. We provide an open source implementation of
			      the proposed detector as well as the algorithm for supervised
			      learning of its parameters from data. },
			    keywords =      {Facial Landmark Detection, Support Vector Machines,
					    Structured Output Classification, Deformable Part Models},
			    }

J. Sivic, M. Everingham and A. Zisserman, "Who are you?" - Learning Person Specific Classifiers from Video, Proc. of IEEE Conference on Computer Vision and Pattern Recognition, 2009. [pdf]
G. B. Huang, M. Ramesh, T. Berg and E. Learned-Miller, Labeled faces in the wild: A database for studying face recognition in unconstrained environments, Technical Report 07-49. University of Massachusetts, Amherst, 2007. [pdf]

flandmark

Open-source implementation of facial landmark detector

Michal Uřičář, Vojtěch Franc

uricamic@cmp.felk.cvut.cz, xfrancv@cmp.felk.cvut.cz

The Center for Machine Perception

Czech Technical University in Prague