How to perform Face Recognition using KNN — with source code — Interesting Project

Abhishek Sharma
Dec 5, 2021

In today’s blog, we will see how to perform Face Recognition using KNN (the K-Nearest Neighbors algorithm) and Haar cascades. Haar cascades are very fast compared to other ways of detecting faces (like MTCNN), but there is an accuracy tradeoff: they are a bit less accurate than big boys like MTCNN.
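
To make the KNN idea concrete before we touch any images, here is a minimal from-scratch sketch in NumPy. It is only an illustration: the function name knn_predict, the toy 2-D points, and the labels are made up for this example, and the scripts in this blog use scikit-learn's KNeighborsClassifier instead.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Hypothetical helper: label x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training row
    distances = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbours' labels
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data: two small clusters standing in for two people
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_train = ["person_a", "person_a", "person_a",
           "person_b", "person_b", "person_b"]

print(knn_predict(X_train, y_train, np.array([0.3, 0.3])))  # person_a
print(knn_predict(X_train, y_train, np.array([4.8, 5.1])))  # person_b
```

In the face recognition scripts, each "point" is a flattened face image instead of a 2-D coordinate, but the voting idea is the same.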


We will be seeing 2 scripts in today’s blog:

  • The first is for adding a new face.
  • The second is for real-time Face Recognition using KNN.

Let’s do it…


Code for adding a new face…

import cv2
import numpy as np
import os
import pickle

face_data = []
i = 0

cam = cv2.VideoCapture(0)

facec = cv2.CascadeClassifier('data/haarcascade_frontalface_default.xml')

name = input('Enter your name --> ')
ret = True

# Face Recognition using KNN
while ret:
    ret, frame = cam.read()
    if ret == True:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        face_coordinates = facec.detectMultiScale(gray, 1.3, 4)

        for (x, y, w, h) in face_coordinates:
            faces = frame[y:y+h, x:x+w, :]
            resized_faces = cv2.resize(faces, (50, 50))

            if i % 10 == 0 and len(face_data) < 10:
                face_data.append(resized_faces)
            cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
        i += 1

        cv2.imshow('frames', frame)

        if cv2.waitKey(1) == 27 or len(face_data) >= 10:
            break
    else:
        print('error')
        break


cv2.destroyAllWindows()
cam.release()


face_data = np.asarray(face_data)
face_data = face_data.reshape(10, -1)

if 'names.pkl' not in os.listdir('data/'):
    names = [name]*10
    with open('data/names.pkl', 'wb') as f:
        pickle.dump(names, f)
else:
    with open('data/names.pkl', 'rb') as f:
        names = pickle.load(f)

    names = names + [name]*10
    with open('data/names.pkl', 'wb') as f:
        pickle.dump(names, f)


if 'faces.pkl' not in os.listdir('data/'):
    with open('data/faces.pkl', 'wb') as w:
        pickle.dump(face_data, w)
else:
    with open('data/faces.pkl', 'rb') as w:
        faces = pickle.load(w)

    faces = np.append(faces, face_data, axis=0)
    with open('data/faces.pkl', 'wb') as w:
        pickle.dump(faces, w)

Linewise explanation…

  • Lines 1–4: Importing the libraries required for Face Recognition using KNN.
  • Lines 6–7: Initializing variables.
  • Line 9: Creating a VideoCapture object to access the webcam. Argument 0 usually selects the built-in webcam of a PC/laptop; use 1 if you want to use an external camera.
  • Line 11: Creating a Haar cascade object to detect faces in the frame.
  • Line 13: Asking for the name of the person who is adding their face.
  • Line 14: Setting ret=True (just a formality to start the loop).
  • Let’s start the loop to perform Face Recognition using KNN…
  • Line 18: Using cam.read() to read the current frame from the webcam.
  • Line 19: This if statement says that if we are getting frames from the webcam without any error, then proceed further, because in that case ret would be True.
  • Line 20: Converting the image from BGR to grayscale, because Haar cascades detect faces efficiently in grayscale images.
  • Line 22: Detecting the faces using detectMultiScale. Now we have our face coordinates as (x, y, w, h), where (x, y) are the coordinates of the top-left corner of the rectangle around the face, w is its width, and h is its height.
  • Line 24: Traversing through the detected faces.
  • Lines 25–26: Extracting the face from the frame and resizing it to 50x50.
  • Lines 28–29: Storing the faces in the face_data list. We need just 10 faces, which is why we check the condition len(face_data) < 10, and we save a face only every 10th frame so that we get some diverse images rather than images of the same type/pose. We are doing this to make our model more robust.
  • Line 30: Drawing a rectangle around the face so that it shows up when the frame is displayed in Line 33.
  • Line 31: Simply incrementing i, which keeps track of the frame number.
  • Line 33: Finally showing our frame with a rectangle around the detected face.
  • Lines 35–36: If someone hits the ESC key or the number of stored images reaches 10, break out of the loop.
  • Lines 37–39: Else (no frame could be read), break out of the loop with an error message.
  • Lines 42–43: Just some formalities: destroy all the open image windows and release the camera object.
  • Lines 46–47: We have our 10 images of 50x50 in face_data; let’s convert that to a NumPy array and reshape it. After this step, face_data has shape 10x7500: 10 rows, one per image, where each row of 7500 values is the flattened image itself (50x50x3) (structure shown below).
  • Lines 49–52: If we don’t have ‘names.pkl’ in our data folder yet, this is the first face we are adding. So we create a file ‘names.pkl’ that contains the same name 10 times (because we are saving 10 images of every person).
  • Lines 53–59: Otherwise, ‘names.pkl’ already exists, which means this is not the first face we are adding, so we load ‘names.pkl’, add 10 entries of the current name, and save it again as ‘names.pkl’.
  • Lines 62–71: The same as we did above for names, here done for face_data.
  • Line 69: np.append with axis=0 adds the new face data row-wise.
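
The reshaping and row-wise appending described above can be checked without a webcam. The sketch below uses randomly generated arrays as stand-ins for captured faces (an assumption made purely for illustration; real data would come from the capture loop).

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for 10 captured face crops, each 50x50 pixels with 3 colour channels
face_data = [rng.integers(0, 256, size=(50, 50, 3), dtype=np.uint8) for _ in range(10)]

face_data = np.asarray(face_data)   # shape (10, 50, 50, 3)
flat = face_data.reshape(10, -1)    # shape (10, 7500): one row per image

# Pretend these rows were saved earlier for another person
existing = rng.integers(0, 256, size=(10, 7500), dtype=np.uint8)

# axis=0 stacks the new rows under the old ones
combined = np.append(existing, flat, axis=0)

print(face_data.shape)  # (10, 50, 50, 3)
print(flat.shape)       # (10, 7500)
print(combined.shape)   # (20, 7500)
```

Each row of flat is simply one image unrolled into 7500 numbers, which is the layout a KNN classifier expects.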

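The create-or-append pattern used for names.pkl can also be wrapped in a small helper. This is a sketch, not the script's actual code: append_names is a hypothetical function, the names 'alice' and 'bob' are placeholders, and it writes to a temporary directory so it does not touch your data/ folder.

```python
import os
import pickle
import tempfile

def append_names(path, name, count=10):
    """Hypothetical helper: add `count` copies of `name` to the pickled list at `path`."""
    if os.path.exists(path):
        # Not the first face: load the existing labels
        with open(path, 'rb') as f:
            names = pickle.load(f)
    else:
        # First face: start with an empty list
        names = []
    names = names + [name] * count
    with open(path, 'wb') as f:
        pickle.dump(names, f)
    return names

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'names.pkl')
    first = append_names(path, 'alice')
    second = append_names(path, 'bob')
    print(len(first), len(second))  # 10 20
```

Because both branches end with the same write, the file always holds one label per stored face row.
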
Code for real-time Face Recognition using KNN…

import cv2
import numpy as np
import pickle
from sklearn.neighbors import KNeighborsClassifier

with open('data/faces.pkl', 'rb') as w:
    faces = pickle.load(w)

with open('data/names.pkl', 'rb') as f:
    labels = pickle.load(f)

facec = cv2.CascadeClassifier('data/haarcascade_frontalface_default.xml')

cam = cv2.VideoCapture(0)

print('Shape of Faces matrix --> ', faces.shape)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(faces, labels)

# Face Recognition using KNN
while True:
    ret, fr = cam.read()
    if ret == True:
        gray = cv2.cvtColor(fr, cv2.COLOR_BGR2GRAY)
        face_coordinates = facec.detectMultiScale(gray, 1.3, 5)

        for (x, y, w, h) in face_coordinates:
            fc = fr[y:y + h, x:x + w, :]
            r = cv2.resize(fc, (50, 50)).flatten().reshape(1,-1)
            text = knn.predict(r)
            cv2.putText(fr, text[0], (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 0), 2)
            cv2.rectangle(fr, (x, y), (x + w, y + h), (0, 0, 255), 2)

        cv2.imshow('livetime face recognition', fr)
        if cv2.waitKey(1) == 27:
            break
    else:
        # No frame could be read from the webcam
        print('error')
        break

# Same cleanup as in the first script
cv2.destroyAllWindows()
cam.release()

  • Lines 1–4: Importing required libraries.
  • Lines 6–7: Loading faces (X_train) data.
  • Lines 9–10: Loading names/labels (y_train) data.
  • Lines 12–14: Creating the Haar cascade and VideoCapture objects, as discussed above.
  • Line 16: Let’s check the shape of the faces data/matrix.
  • It is 20x7500 because, at the time of writing this blog, it had faces of only 2 people (10 photos each).
  • Lines 17–18: We declare an object knn of the KNeighborsClassifier() class with n_neighbors=5, which means it will check only the 5 nearest neighbors when giving results, and fit it on the faces and labels.
  • Lines 21–29: We have discussed this earlier. We are just traversing through the frames, detecting faces in them, and resizing them.
  • Line 30: We are making predictions using knn.predict(r), where r is the resized and flattened image, an array of 7500 values reshaped to a single row.
  • Lines 31–32: We are drawing the name and the rectangle around the face on the final window.
  • Line 34: Showing our results.
  • Lines 35–36: If we hit the ESC key, break out of the loop.
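
To see why the crop is reshaped to (1, -1) before prediction, here is a small sketch that trains the same kind of classifier on synthetic data. The random arrays and the names 'person_a' and 'person_b' are stand-ins made up for this example; only KNeighborsClassifier, fit, and predict come from scikit-learn.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)

# Synthetic stand-ins for two people's flattened 50x50x3 faces:
# person_a rows are dark (values near 50), person_b rows are bright (near 200)
faces = np.vstack([
    rng.normal(50, 5, size=(10, 7500)),
    rng.normal(200, 5, size=(10, 7500)),
])
labels = ['person_a'] * 10 + ['person_b'] * 10

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(faces, labels)

# A new "face crop" that looks like person_b
crop = rng.normal(200, 5, size=(50, 50, 3))

# predict expects a 2-D array of shape (n_samples, n_features)
r = crop.flatten().reshape(1, -1)
text = knn.predict(r)

print(r.shape)   # (1, 7500)
print(text[0])   # person_b
```

predict returns one label per input row, which is why the script reads the name from text[0].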

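The cropping and rectangle coordinates are easy to mix up, because NumPy indexes rows (y) first while OpenCV points are given as (x, y). A sketch on a blank array, with a made-up detection box, shows how the two relate:

```python
import numpy as np

# Blank stand-in for a 480x640 BGR webcam frame (height, width, channels)
frame = np.zeros((480, 640, 3), dtype=np.uint8)

# Hypothetical detection: top-left corner (x, y), width w, height h
x, y, w, h = 100, 50, 120, 150

# NumPy slicing is [rows, columns], i.e. [y range, x range]
crop = frame[y:y + h, x:x + w, :]

# Corners in the (x, y) order that cv2.rectangle takes
top_left = (x, y)
bottom_right = (x + w, y + h)

print(crop.shape)     # (150, 120, 3)
print(bottom_right)   # (220, 200)
```

The bottom-right corner uses y + h, not y + w; the two only coincide when the detected box happens to be square.
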
Let’s see the results of Face Recognition using KNN…

  • I added two people’s faces: Shahrukh Khan’s and Salman Khan’s.

NOTE: This Face Recognition method using the KNN algorithm and Haar cascades is fast but not very accurate. In future blogs, we will also discuss better Face Recognition methods.

To explore more Machine Learning, Deep Learning, Computer Vision, NLP, Flask Projects visit my blog.


So that is all for this blog, folks. Thanks for reading; I hope you are taking something away with you. Till next time 👋…