In this article, we are going to discuss implementing Euclidean distance in a PostgreSQL database. Before getting into the actual implementation, let me give you some quick background on why I wrote this article. I have been working on a Face Authentication system, and to perform the Face Verification task we need to compute the distance between two faces. There are a lot of implementations out there that do this in Python; however, they did not help in my case, so I implemented the Euclidean distance computation in PostgreSQL. Let's look at the challenges and the solution in detail.
Face Recognition System:
In simple terms, my implementation of the Face Recognition system consists of two parts:
- Face Registration
- Face Verification
Face Registration:
This part consists of the following steps:
- Users register their faces with a name and an image.
- The registered photos are stored in a folder named after the user.
- The user images are fed to a Convolutional Neural Network (CNN) model, which extracts 128 measurements for each image.
- A K-Nearest Neighbour classifier is trained on the names and their corresponding 128-dimension encodings.
- The classifier is saved as a pickle file in the application directory.
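The last two registration steps can be sketched in plain Python. This is only an illustration under stated assumptions, not the code used in the system: the `train` and `predict` helpers are hypothetical stand-ins for a real K-Nearest Neighbour library, the encodings are made-up 4-dimension vectors instead of 128-dimension CNN outputs, and the model is pickled to bytes rather than to a file in the application directory.

```python
import math
import pickle
from collections import Counter


def train(encodings, names):
    # "Training" a nearest-neighbour model just memorises the labelled encodings.
    return {"encodings": [list(e) for e in encodings], "names": list(names)}


def predict(model, encoding, k=1):
    # Rank the stored encodings by Euclidean distance to the query encoding.
    ranked = sorted(
        zip(model["encodings"], model["names"]),
        key=lambda pair: math.dist(pair[0], encoding),
    )
    # Majority vote among the k closest neighbours.
    votes = Counter(name for _, name in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical encodings; real ones would hold 128 values per image.
encodings = [[0.1, 0.2, 0.3, 0.4], [0.9, 0.8, 0.7, 0.6]]
names = ["alice", "bob"]

model = train(encodings, names)

# Serialise the model; writing to disk would use pickle.dump with an open file.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

closest = predict(restored, [0.12, 0.21, 0.29, 0.41])
```

Note that the pickled object contains every registered encoding, so any change to the set of users means building and saving a new object.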
The CNN model we are using here is a ResNet network with 29 layers, trained on about 3 million images. Thanks to Davis King (dlib) for this great work and for making it available to the public. Refer to this page to understand more about this model.
Face Verification:
This part consists of the following steps:
- When a user tries to log in with their image, it is fed to the same Convolutional Neural Network (CNN), which extracts a 128-dimension vector.
- This 128-dimension vector is passed to the classifier, which compares it against all the pre-trained face encodings.
- The classifier returns the name of the encoding whose distance from the new vector is less than 0.3 (the threshold I defined).
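The distance check in the last step can be sketched as follows. This is a minimal illustration rather than the system's actual code: the `euclidean_distance` and `verify` helpers and the sample data are hypothetical, the vectors have 4 dimensions instead of 128, and returning only the single closest match under the threshold is an assumption about how ties and multiple matches are handled.

```python
import math

THRESHOLD = 0.3  # the distance threshold chosen in the article


def euclidean_distance(a, b):
    # Square root of the sum of squared per-dimension differences.
    if len(a) != len(b):
        raise ValueError("encodings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(probe, registered, threshold=THRESHOLD):
    # Find the closest registered encoding, then accept it only if it is
    # strictly closer than the threshold; otherwise report no match.
    best_name, best_distance = None, float("inf")
    for name, encoding in registered.items():
        distance = euclidean_distance(probe, encoding)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance < threshold else None


# Hypothetical registered users; real encodings would hold 128 values.
registered = {
    "alice": [0.1, 0.2, 0.3, 0.4],
    "bob": [0.9, 0.8, 0.7, 0.6],
}

match = verify([0.12, 0.21, 0.29, 0.41], registered)
no_match = verify([0.5, 0.5, 0.5, 0.5], registered)
```

Here `match` is the name of the nearby registered face, while `no_match` is `None` because the probe is farther than 0.3 from every registered encoding.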
To learn more about this Face Recognition system, please refer to this series of insightful articles from Adam Geitgey. Thanks to Adam for this awesome GitHub repository, a great place to start if you are looking for open source Face Recognition system code.
The Challenges:
This system works well; however, the challenges begin when we want to add a new user to the Face Recognition system. When a new user is registered, the encodings and name of the user have to be appended to the existing pre-trained data so that the classifier can verify that user. To do this, we have to re-train on the entire data set, that is, all the registered users, and save the new classifier. This approach is not scalable for the following reasons:
- Adding new users to the system is complicated, as it requires re-training for every user.
- Each time we add a new user, we need to train on all the existing registered user images. Generating the 128-dimension vector for every image during training is a time- and resource-consuming process that requires more GPU processing power, and repeating it for each new user is not the right way.
- Deleting an existing user is complicated, as the classifier object is stored as a pickle file.
- Parallelising across multiple servers is not feasible with this approach: because the classifier is stored as a pickle file, we need to copy this file to all the servers every time we make changes to the training data.
- We cannot add multiple images for a person, as it increases the computation.
The Solution:
After dealing with these issues for some time, I changed my approach to overcome the challenges mentioned above. The current implementation works as follows:
[Figure: Face Recognition - Computing Euclidean distance in PostgreSQL]