Machine learning tips to build a facial recognition tool

Exadel, Inc.

Even just a decade ago, it was hard to believe that computers would be able to drive cars or easily recognize pictures of a cat. The programming and AI tools available at that time could not reliably let computers see the world around them and process that information accurately.

In the last 10 years, machine learning has slowly grown from a field of research into a mature technology with real-world applications that leading organizations across multiple industries use today. At the most basic level, machine learning creates algorithms that solve some of the most complex and interesting problems that technology organizations face.

Many of these problems have moved from science fiction to established fact. Some have become positively easy, such as handwriting recognition, which is now the “Hello World” of machine learning. Drawing on Exadel’s own experience developing machine learning programs for clients, we want to share some of the do’s and don’ts of using machine learning technology.

On the surface, the result of the machine learning process looks much like traditional programming because the end product is a programmatic algorithm that processes information. How a machine learning program is created, however, looks quite different.

How to create a machine learning model

First, a few don’ts of creating a model: don’t define requirements, design a system or algorithm, write code, test or iterate as you would with traditional software development.

With machine learning, you must first characterize the problem in a way that makes it susceptible to machine learning, and then determine whether you have data that can help solve that problem. Next, you define a model, train the model with training data and test the model, hoping that your training resulted in a high probability of success. If it didn’t, you tweak your model and retrain.
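As a rough illustration of that define, train, test and tweak cycle, here is a minimal Python sketch. It is a toy and not the model we built: the one-feature threshold "model", the function names, the sample data and the step sizes are all assumptions invented for this example.

```python
from typing import List, Optional, Tuple

Example = Tuple[float, int]  # (feature value, label 0 or 1); hypothetical toy data


def accuracy(threshold: float, examples: List[Example]) -> float:
    """Fraction of examples where 'value >= threshold' agrees with the label."""
    correct = sum((x >= threshold) == bool(label) for x, label in examples)
    return correct / len(examples)


def train(examples: List[Example], step: float) -> float:
    """Toy training: pick the threshold on a grid of spacing `step`
    that fits the training examples best."""
    lo = min(x for x, _ in examples)
    hi = max(x for x, _ in examples)
    candidates = []
    t = lo
    while t <= hi:
        candidates.append(t)
        t += step
    return max(candidates, key=lambda c: accuracy(c, examples))


def fit_until_good_enough(train_set: List[Example], held_out: List[Example],
                          steps: List[float], target: float
                          ) -> Optional[Tuple[float, float]]:
    """Train, score on held-out data, and tweak the step size until the
    target accuracy is reached. Returns the best (threshold, score) seen."""
    best = None
    for step in steps:  # each new step size is one "tweak" of the model
        threshold = train(train_set, step)
        score = accuracy(threshold, held_out)
        if best is None or score > best[1]:
            best = (threshold, score)
        if score >= target:
            break  # good enough, stop retraining
    return best


train_set = [(0.0, 0), (1.0, 0), (2.0, 0), (2.6, 1), (3.0, 1), (4.0, 1)]
held_out = [(2.2, 0), (2.8, 1), (0.5, 0), (3.5, 1)]
result = fit_until_good_enough(train_set, held_out, steps=[2.0, 0.5], target=1.0)
print(result)
```

In this toy run the coarse grid misses the target on the held-out examples, so the loop retrains with a finer grid. A real project follows the same cycle, only with a far richer model and far more data.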

Facial recognition app development

One of our clients asked us to help develop an app that makes secure check-in to an office space simpler. The client had received requests to simplify the visitor check-in process and avoid duplicate data entry. When someone checks into the office, they must enter a few pieces of information, including name and phone number, into a tablet at the front desk. For privacy reasons, visitors had to re-enter this information on every visit, because we can’t simply provide a list of all the people who have previously checked in. Re-entering this information was repetitive for visitors, but the client needed it to know who was in the office and how to get hold of them.

To automate this process while keeping the information secure, we decided to use facial recognition to identify visitors and determine whether they had been in the office before. If they had, we would have a picture on file and could identify them when their picture was taken again. We decided to use machine learning and, not surprisingly, built on existing open source tools and projects as a baseline.

In the existing app, when a visitor first comes to the office, they fill in the information on a tablet and the tablet takes their picture. The check-in tool now has a profile and an image that can be used to recognize each individual.

To create this facial recognition system, we used some off-the-shelf machine learning and computer vision (CV) components:

Python: generally the language of choice for machine learning today.
TensorFlow: an open source machine learning and neural network toolkit, widely used for numerical computation and large-scale machine learning.
scikit-learn: simple and efficient tools for data mining and data analysis.
SciPy: a free and open source library for scientific and technical computing.
NumPy: a Python library supporting large, multi-dimensional arrays with a large library of functions for operating on these arrays.
OpenCV: an open-source library of functions aimed at real-time computer vision.

These are all very common tools for machine learning projects. We’ve been working with and adapting open source code to tie all of these components together, including a GitHub project for face recognition using TensorFlow.

Developing the code and tools to do facial recognition is important, but, as mentioned above, the core of machine learning is training the model. Training continues until the results on test data, which is never evaluated during training, show a high enough level of success to say that the developed neural network algorithm can recognize people in the intended setting, in this case checking in at the front desk.

Data is very important here as well. Best practices indicate that you should have training data, validation data and test data. Training data is the data the model learns from. Machine learning specialists use validation data to review the trained model and may then change or tweak inputs based on that review. This is part of the iterative process of developing the model. The model never sees test data except in the final testing steps. Test data is the gold standard, used only once the model is fully trained, and it may also be used to compare the success of two different trained models.
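A three-way split like this can be sketched in a few lines of Python. The function name and the 70/15/15 proportions below are assumptions chosen for illustration, not the split we used. In practice a helper such as scikit-learn's train_test_split is a common way to do the same job.

```python
import random
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Iterable[T], val_fraction: float = 0.15,
                  test_fraction: float = 0.15, seed: int = 0
                  ) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the samples once, then cut them into disjoint
    training, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test_set = shuffled[:n_test]                # set aside until the final evaluation
    val_set = shuffled[n_test:n_test + n_val]   # used while tweaking the model
    train_set = shuffled[n_test + n_val:]       # the only data the model learns from
    return train_set, val_set, test_set


train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

For face data, one design point worth considering is whether to split by person rather than by image, so that the test set contains faces the model has never seen in any pose.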

The pre-processing and training processes look like this:

Find the face: Find the face within the image. Real-world images contain more than the face, so you must first isolate the region that contains the face.
Posing and projecting faces: Even the best computer algorithms work better if every image has the same proportions. We needed to align the face within the image frame to improve its use with the machine learning model.
Calculate embedding from faces: A human describes the difference between faces using visual, human-readable characteristics, such as nose size, face width or eye color. We use neural networks that automatically determine machine-readable features. The resulting vector of numbers is the embedding.
Use embedding for training model: This is the step where we either train the model on the embeddings of our images or use the trained model to recognize a face.
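To make the last two steps more concrete, here is a minimal Python sketch of training on embeddings and matching against them. It assumes a neural network has already turned each aligned face into an embedding. The three-number embeddings, the visitor names and the 0.6 distance cutoff are made up for illustration, and the nearest-average matching is a simple stand-in for a trained classifier, not the model we deployed.

```python
import math
from typing import Dict, List, Optional, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_centroids(enrolled: Dict[str, List[Sequence[float]]]
                    ) -> Dict[str, List[float]]:
    """'Training' here means averaging each visitor's embeddings
    into one reference vector per visitor."""
    model = {}
    for visitor, embeddings in enrolled.items():
        count = len(embeddings)
        model[visitor] = [sum(column) / count for column in zip(*embeddings)]
    return model


def recognize(model: Dict[str, List[float]], embedding: Sequence[float],
              max_distance: float = 0.6) -> Optional[str]:
    """Return the closest known visitor, or None if nobody is close enough.
    The cutoff is an assumed value that would need tuning on validation data."""
    best_visitor, best_distance = None, float("inf")
    for visitor, centroid in model.items():
        distance = euclidean(centroid, embedding)
        if distance < best_distance:
            best_visitor, best_distance = visitor, distance
    return best_visitor if best_distance <= max_distance else None


# Hypothetical embeddings; real ones have many more dimensions.
model = train_centroids({
    "alice": [[0.0, 0.0, 1.0], [0.0, 0.2, 1.0]],
    "bob": [[1.0, 0.0, 0.0]],
})
print(recognize(model, [0.0, 0.1, 0.9]))
print(recognize(model, [5.0, 5.0, 5.0]))
```

Returning None for a face that is far from every known visitor matters for a check-in tool, because a first-time visitor should be asked to register rather than be matched to the nearest stranger.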

Once we have trained the model and tested it, we can deploy it so that it can be used by the tablet program to check newly created images to see if they match anyone who has visited the office before.

We created a web API that the tablet application uses to send in a photo, which is then matched against the image database.
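A bare-bones version of such an endpoint might look like the Python sketch below. It uses only the standard library's http.server module, which is an assumption made to keep the example self-contained and not a description of our production stack. The image_base64 field, the response shape and the recognize callable (which stands for the whole find, align, embed and match pipeline) are hypothetical.

```python
import base64
import json
from http.server import BaseHTTPRequestHandler
from typing import Callable, Optional, Tuple


def handle_match_request(body: bytes,
                         recognize: Callable[[bytes], Optional[str]]
                         ) -> Tuple[int, dict]:
    """Decode a JSON request and return (HTTP status, response payload)."""
    try:
        payload = json.loads(body)
        image_bytes = base64.b64decode(payload["image_base64"], validate=True)
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected JSON with a base64 'image_base64' field"}
    visitor_id = recognize(image_bytes)  # hypothetical face-matching pipeline
    if visitor_id is None:
        return 200, {"match": False}
    return 200, {"match": True, "visitor_id": visitor_id}


def make_handler(recognize: Callable[[bytes], Optional[str]]):
    """Build a request handler class bound to one recognize function."""

    class MatchHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            status, response = handle_match_request(self.rfile.read(length),
                                                    recognize)
            data = json.dumps(response).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return MatchHandler
```

To try it locally, one could pass make_handler(some_recognize_function) to http.server.HTTPServer(("localhost", 8000), ...) and call serve_forever(). Keeping the request logic in a plain function makes it testable without starting a server. A real deployment would also need transport encryption and authentication, since the photos are personal data.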

Machine learning is still a relatively nascent technology, but its applications are starting to become more pervasive. As we start to better understand the best practices and uses for machine learning, organizations must have the skills ready to keep up with the competition.

All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.