Supervised Face Recognition

Course project in AI & Deep Learning at ESIEE Paris, co-authored with Lubin Benoit and supervised by Prof. Laurent Najman.
We built an end-to-end face recognition pipeline—from raw images to identities—then measured the impact of each stage.

GitHub code · Project report (PDF, in French)

Problem & data

Two datasets:

  • Jurassic Park characters (≈218 images) used to design and ablate the pipeline.
  • A personal dataset (10 people, 50–110 photos each) to check generalization and bias.

Pipeline (what we compared)

  1. Detection — dlib HOG vs CNN detector; we crop and resize faces to 128×128.
  2. Pose/Alignment — facial landmarks (5 or 68 points) followed by an affine alignment; reduces pose variance.
  3. Encoding — 128-D embeddings (OpenFace-style) for compact, robust features.
  4. Classification — Logistic Regression, Linear SVM, kNN, and a small NN (the full flow is sketched just below).
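
A minimal sketch of the detect → encode → classify flow, assuming the face_recognition wrapper around dlib and scikit-learn; the helper name, placeholder paths, and the choice of Logistic Regression are illustrative, not the project's actual code:

```python
# Pipeline sketch: detect a face, compute its 128-D dlib embedding, and fit a
# simple classifier on the embeddings. Names and paths here are illustrative.
import face_recognition
import numpy as np
from sklearn.linear_model import LogisticRegression

def encode_faces(image_paths, detector="hog"):
    """Return one 128-D embedding per image (first detected face only)."""
    features, kept_paths = [], []
    for path in image_paths:
        image = face_recognition.load_image_file(path)
        # "hog" is the fast CPU detector; "cnn" is slower but more robust.
        boxes = face_recognition.face_locations(image, model=detector)
        encodings = face_recognition.face_encodings(image, known_face_locations=boxes)
        if encodings:                      # skip images with no detected face
            features.append(encodings[0])
            kept_paths.append(path)
    return np.array(features), kept_paths

# Usage sketch (paths and labels come from your own dataset):
# X_train, _ = encode_faces(train_paths)
# clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
```

The "hog" vs "cnn" switch mirrors the detector comparison in step 1.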

Headline result: with embeddings, several classifiers reach ~97% accuracy on the test set; Logistic Regression and Linear SVM are the fastest at inference (millisecond range).


Figure 1. Face detection and crop, before any alignment.

Figure 2. CNN depth ablation (4 conv layers vs 1), without encoding: fewer layers converged faster on this small dataset.
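
The exact convnet behind this ablation isn't reproduced here; the sketch below only illustrates how such a depth ablation can be parameterized, assuming Keras (the framework, filter counts, and dense head are all assumptions, not the report's model):

```python
# Depth-ablation sketch: a small convnet with a configurable number of conv
# blocks, trained directly on 128x128 face crops (no embeddings).
from tensorflow import keras
from tensorflow.keras import layers

def build_convnet(num_conv_blocks, num_classes, input_shape=(128, 128, 3)):
    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    for i in range(num_conv_blocks):
        # Each block halves the spatial resolution via max pooling.
        model.add(layers.Conv2D(32 * (i + 1), 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# The two depths compared in Figure 2 (num_identities depends on the dataset):
# shallow = build_convnet(num_conv_blocks=1, num_classes=num_identities)
# deep    = build_convnet(num_conv_blocks=4, num_classes=num_identities)
```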

Figure 3. Alignment via facial landmarks; reduces pose variance before training.
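
A sketch of landmark-based alignment, assuming dlib's HOG detector, its standard 5-point landmark model (downloaded separately), and get_face_chip for the affine crop; this mirrors the idea rather than the project's exact code:

```python
# Alignment sketch: detect a face, predict 5 landmarks, and return an aligned,
# cropped face chip. The model file name is dlib's standard 5-point predictor.
import dlib

detector = dlib.get_frontal_face_detector()            # HOG-based detector
predictor = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat")

def aligned_face_chip(image_path, size=128):
    image = dlib.load_rgb_image(image_path)
    detections = detector(image, 1)                     # upsample once
    if not detections:
        return None
    landmarks = predictor(image, detections[0])         # 5-point landmarks
    # get_face_chip applies the affine transform (rotates/scales so the eyes
    # are level) and returns a size x size crop.
    return dlib.get_face_chip(image, landmarks, size=size)
```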

Figure 4. Effect of alignment on validation curves: with alignment (right) vs without (left), validation dynamics are cleaner.

Figure 5. Embeddings vs raw pixels: 128-D embeddings push test accuracy to ~97% and stabilize training.
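
For context, a sketch of the two feature pipelines being compared: flattened raw pixels versus the 128-D embeddings, fed to the same linear classifier. raw_pixel_features and the reuse of encode_faces from the pipeline sketch above are illustrative:

```python
# Feature-comparison sketch: same classifier, two feature sets.
import numpy as np
from PIL import Image
from sklearn.svm import LinearSVC

def raw_pixel_features(image_paths, size=(128, 128)):
    """Flatten resized grayscale crops into one 16384-D vector per image."""
    rows = []
    for path in image_paths:
        img = Image.open(path).convert("L").resize(size)
        rows.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
    return np.array(rows)

# Usage sketch (assumes every image yields a detection, so labels stay aligned):
# X_pix = raw_pixel_features(train_paths)       # 16384-D per image
# X_emb, _ = encode_faces(train_paths)          # 128-D per image
# acc_pix = LinearSVC().fit(X_pix, y_train).score(raw_pixel_features(test_paths), y_test)
# acc_emb = LinearSVC().fit(X_emb, y_train).score(encode_faces(test_paths)[0], y_test)
```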

Figure 6. Classifier comparison (Logistic Regression, Linear SVM, kNN, small NN) on the same embeddings: all reach ~97% accuracy; Logistic Regression and Linear SVM are the quickest.
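
A sketch of how such a comparison can be run on the shared embeddings, reporting accuracy and a rough per-sample inference time; the hyperparameters are illustrative, not the report's settings:

```python
# Classifier-comparison sketch on 128-D embeddings: accuracy + inference time.
import time
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

CLASSIFIERS = {
    "LogReg":     LogisticRegression(max_iter=1000),
    "Linear SVM": SVC(kernel="linear"),
    "kNN":        KNeighborsClassifier(n_neighbors=5),
    "Small NN":   MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000),
}

def compare_classifiers(X_train, y_train, X_test, y_test):
    for name, clf in CLASSIFIERS.items():
        clf.fit(X_train, y_train)
        start = time.perf_counter()
        accuracy = clf.score(X_test, y_test)            # predicts, then scores
        ms_per_sample = 1000 * (time.perf_counter() - start) / len(X_test)
        print(f"{name:10s}  acc={accuracy:.3f}  ~{ms_per_sample:.2f} ms/sample")
```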

Figure 7. Per-class F1 scores on the personal dataset: balanced across identities, with no obvious single-class collapse.
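
Per-class F1 can be read off scikit-learn's classification report; a small sketch, assuming clf and the test split come from the earlier sketches:

```python
# Per-class F1 sketch, mirroring Figure 7: one score per identity, so a
# collapse on any single class would be immediately visible.
from sklearn.metrics import classification_report, f1_score

def per_class_f1(clf, X_test, y_test):
    y_pred = clf.predict(X_test)
    print(classification_report(y_test, y_pred))        # precision/recall/F1 per class
    return f1_score(y_test, y_pred, average=None)       # array of per-class F1
```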

Notes & takeaways

  • Alignment helps shallow convnets; deeper models were already robust at this scale.
  • Embeddings largely factor out background and lighting variation, letting simple classifiers shine.
  • Speed matters: for deployment, LogReg/SVM give near-NN accuracy with far lower latency.
  • Bias check: with a more varied personal set, accuracy stays high (≈94–98%), but performance depends on data quality and diversity—worth monitoring if scaled.