Face detection with or without a mask

Posted by nloding on Sun, 09 Jan 2022 16:05:27 +0100

As an undergraduate majoring in artificial intelligence, I first came into contact with face detection during my freshman year. With the help of my teachers, I successfully used the code below to train on and recognize a series of face pictures. Here I want to share the problems I encountered while training the model and the results I obtained.

Face detection and face recognition are among the important applications of deep learning.

Face detection is very common in today's society, and it is very easy for humans: to meet the needs of social life, our brains have a dedicated face detection module that is highly sensitive to faces. For a computer, however, face detection is a relatively difficult problem. The structure of the human face is fixed (eyebrows, eyes, nose and mouth) and is approximately a rigid body, but changes in pose and expression, differences in appearance between people, and the effects of illumination and occlusion make it hard for a computer to detect faces accurately under all conditions.

During the epidemic, masks have become a necessity for going out, so it is very important to recognize faces while masks are worn. Especially in crowded places such as shopping malls, it is unrealistic to ask customers to take off their masks in public; people will certainly refuse. So how can faces with masks be recognized? The biggest difficulty is detecting a face whose area is partly blocked by an obstacle.

Reference source code link:

GitHub - Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB: 💎 1MB lightweight face detection model

Code:

"""
This script uses a PyTorch model to detect faces in a video file or a live camera feed.
"""
import argparse
import sys
import cv2

from vision.ssd.config.fd_config import define_img_size

parser = argparse.ArgumentParser(
    description='detect_video')

parser.add_argument('--net_type', default="RFB", type=str,
                    help='Network architecture: RFB (higher precision) or slim (faster)')
parser.add_argument('--input_size', default=480, type=int,
                    help='define network input size,default optional value 128/160/320/480/640/1280')
parser.add_argument('--threshold', default=0.7, type=float,
                    help='score threshold')
parser.add_argument('--candidate_size', default=1000, type=int,
                    help='nms candidate size')
parser.add_argument('--path', default="imgs", type=str,
                    help='imgs dir')
parser.add_argument('--test_device', default="cpu", type=str,
                    help='cuda:0 or cpu')
parser.add_argument('--video_path', default="/home/linzai/Videos/video/16_1.MP4", type=str,
                    help='path of video')
args = parser.parse_args()

input_img_size = args.input_size
define_img_size(input_img_size)  # must put define_img_size() before 'import create_mb_tiny_fd, create_mb_tiny_fd_predictor'

from vision.ssd.mb_tiny_fd import create_mb_tiny_fd, create_mb_tiny_fd_predictor
from vision.ssd.mb_tiny_RFB_fd import create_Mb_Tiny_RFB_fd, create_Mb_Tiny_RFB_fd_predictor
from vision.utils.misc import Timer

label_path = "./models/voc-model-labels.txt"

net_type = args.net_type

#cap = cv2.VideoCapture(args.video_path)  # capture from video
cap = cv2.VideoCapture(0)  # capture from camera

with open(label_path) as f:  # close the label file once the names are read
    class_names = [name.strip() for name in f]
num_classes = len(class_names)
test_device = args.test_device

candidate_size = args.candidate_size
threshold = args.threshold

if net_type == 'slim':
    model_path = "models/pretrained/version-slim-320.pth"
    # model_path = "models/pretrained/version-slim-640.pth"
    net = create_mb_tiny_fd(len(class_names), is_test=True, device=test_device)
    predictor = create_mb_tiny_fd_predictor(net, candidate_size=candidate_size, device=test_device)
elif net_type == 'RFB':
    model_path = "models/pretrained/version-RFB-320.pth"
    # model_path = "models/pretrained/version-RFB-640.pth"
    net = create_Mb_Tiny_RFB_fd(len(class_names), is_test=True, device=test_device)
    predictor = create_Mb_Tiny_RFB_fd_predictor(net, candidate_size=candidate_size, device=test_device)
else:
    print("The net type is wrong!")
    sys.exit(1)
net.load(model_path)

timer = Timer()
total_faces = 0  # running count of detected faces over all frames
while True:
    ret, orig_image = cap.read()
    # cap.read() returns ret=False when no frame could be grabbed
    if not ret or orig_image is None:
        print("end")
        break
    # OpenCV delivers frames in BGR order; convert to RGB before prediction
    image = cv2.cvtColor(orig_image, cv2.COLOR_BGR2RGB)
    timer.start()
    # integer division keeps top_k an int
    boxes, labels, probs = predictor.predict(image, candidate_size // 2, threshold)
    interval = timer.end()
    print('Time: {:.6f}s, Detect Objects: {:d}.'.format(interval, labels.size(0)))
    for i in range(boxes.size(0)):
        box = boxes[i, :]
        label = f" {probs[i]:.2f}"
        # OpenCV drawing functions need integer pixel coordinates
        cv2.rectangle(orig_image, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 255, 0), 4)

        # cv2.putText(orig_image, label,
        #             (int(box[0]), int(box[1]) - 10),
        #             cv2.FONT_HERSHEY_SIMPLEX,
        #             0.5,  # font scale
        #             (0, 0, 255),
        #             2)  # line thickness
    orig_image = cv2.resize(orig_image, None, None, fx=0.8, fy=0.8)
    total_faces += boxes.size(0)
    cv2.imshow('annotated', orig_image)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
print("all face num:{}".format(total_faces))
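
The --threshold and --candidate_size options in the listing control how raw detections are filtered before drawing. The following is only a rough pure-Python sketch of that kind of post-processing (score filtering followed by greedy non-maximum suppression), not the repository's own implementation. The function names and the iou_threshold value of 0.3 are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, threshold=0.7, candidate_size=1000,
                      iou_threshold=0.3):  # iou_threshold is an assumed value
    """Return indices of the boxes that survive filtering, best score first."""
    # 1. Drop boxes whose score does not exceed the threshold.
    order = [i for i in range(len(boxes)) if scores[i] > threshold]
    # 2. Keep only the candidate_size highest-scoring boxes.
    order.sort(key=lambda i: scores[i], reverse=True)
    order = order[:candidate_size]
    # 3. Greedy NMS: keep a box only if it does not overlap a kept box too much.
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept
```

Raising the threshold removes weak detections at the cost of missing harder faces, such as partly occluded ones; a lower iou_threshold merges overlapping boxes more aggressively.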

1: Preparatory work

Preparation in advance:

Install Python 3

Set up a Python virtual environment and a Python development environment on Windows

Download and install the OpenCV library

Download the code above and the training data set
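
Before running the detector, it can help to confirm that the environment actually contains the required packages. The snippet below is a small sketch of such a check. The module list is an assumption based on the listing above (cv2 is OpenCV's Python module, torch is PyTorch), and find_missing is a hypothetical helper name.

```python
import importlib.util

# Assumed requirements: "cv2" (OpenCV) and "torch" (PyTorch).
REQUIRED_MODULES = ["cv2", "torch"]


def find_missing(modules):
    """Return the top-level module names that cannot be found."""
    return [name for name in modules
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```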

2: Run

First, I chose a lightweight face detection model; its main features are a very small model size and very fast inference. After creating a new project, try running the program first.

CUDA may report errors, for example on a machine without a usable GPU. In this case, change the --test_device argument from 'cuda:0' to 'cpu'.
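
Instead of editing the argument by hand, the script could fall back to the CPU automatically. This is a minimal sketch under the assumption that PyTorch is the backend; resolve_device is a hypothetical helper and not part of the referenced repository.

```python
def resolve_device(requested):
    """Return the requested device, or "cpu" if CUDA cannot be used."""
    if not requested.startswith("cuda"):
        return requested
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return requested if torch.cuda.is_available() else "cpu"
```

In the listing, this would be used as test_device = resolve_device(args.test_device).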

cv2 may raise an error beginning with "OpenCV(4.5.4) :-1:" in the rectangle call. This happens when the box coordinates are not plain integers; convert them to int, as in cv2.rectangle(orig_image, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 255, 0), 4)
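
The integer conversion can be wrapped in a small helper so that every drawing call gets valid coordinates. The sketch below is one possible way to do it; box_to_int_points is a hypothetical name, and the optional clamping to the image size is an extra that the original script does not perform.

```python
def box_to_int_points(box, width=None, height=None):
    """Convert a box (x1, y1, x2, y2) to two integer corner points.

    box may hold floats or single-element tensors; int() truncates toward zero.
    If width/height are given, the points are clamped to the image bounds.
    """
    x1, y1, x2, y2 = (int(v) for v in box)
    if width is not None:
        x1 = max(0, min(x1, width - 1))
        x2 = max(0, min(x2, width - 1))
    if height is not None:
        y1 = max(0, min(y1, height - 1))
        y2 = max(0, min(y2, height - 1))
    return (x1, y1), (x2, y2)
```

With this helper, the drawing step becomes top_left, bottom_right = box_to_int_points(box) followed by cv2.rectangle(orig_image, top_left, bottom_right, (0, 255, 0), 4).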

3: Experimental results

Face detection

Topics: AI Computer Vision Deep Learning