This is the sixth day of my participation in the First Challenge 2022

Today we implement fatigue detection. If the eyes stay closed for a while, we assume the person has started to nod off, and we sound an alarm to wake them and get their attention. We'll test on a video file to show the effect. The code for opening the camera is also kept in the script, commented out; uncomment it to use a live webcam instead.

Building the drowsiness detector with OpenCV

To start our implementation, open a new file, call it detect_drowsiness.py, and insert the following code:

# import the necessary packages
from scipy.spatial import distance as dist
from imutils.video import VideoStream
from imutils import face_utils
from threading import Thread
from PIL import Image, ImageDraw, ImageFont
import numpy as np
import playsound
import argparse
import imutils
import time
import dlib
import cv2

Import the Python packages you need.

We also need the imutils package, a set of computer vision and image processing convenience functions that make working with OpenCV easier.

If imutils is not already installed on your system, you can install/upgrade it with:

pip install --upgrade imutils

The Thread class is also imported so that we can play the alert in a separate thread from the main one, ensuring that the script does not pause while the alarm sounds.

To actually play the WAV/MP3 alarm, we need the playsound library, a pure Python, cross-platform implementation for playing simple sounds.

The playsound library can be installed easily via pip:

pip install playsound

However, if you're using macOS (as I did for this project), you'll also need to install pyobjc, otherwise you'll get AppKit-related errors when you actually try to play sounds:

pip install pyobjc

Next, we need to define the sound_alarm function, which plays the audio file:

def sound_alarm(path):
	# play an alarm sound
	playsound.playsound(path)

Define the eye_aspect_ratio function, which computes the ratio of the vertical eye landmark distances to the horizontal eye landmark distance:

def eye_aspect_ratio(eye):
	# compute the euclidean distances between the two sets of
	# vertical eye landmarks (x, y)-coordinates
	A = dist.euclidean(eye[1], eye[5])
	B = dist.euclidean(eye[2], eye[4])
	# compute the euclidean distance between the horizontal
	# eye landmark (x, y)-coordinates
	C = dist.euclidean(eye[0], eye[3])
	# compute the eye aspect ratio
	ear = (A + B) / (2.0 * C)
	# return the eye aspect ratio
	return ear
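
For reference, with the six landmarks of one eye denoted p1 through p6 (eye[0] through eye[5] above), this is the eye aspect ratio (EAR) from Soukupová and Čech's paper:

$$\mathrm{EAR} = \frac{\lVert p_2 - p_6 \rVert + \lVert p_3 - p_5 \rVert}{2\,\lVert p_1 - p_4 \rVert}$$

The EAR stays roughly constant while the eye is open and falls toward zero when the eye closes, which is what makes it a convenient signal for detecting closed eyes.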

Since OpenCV cannot draw Chinese text directly, we also need to define a helper that draws Chinese text via PIL:

def cv2ImgAddText(img, text, left, top, textColor=(0, 255, 0), textSize=20):
	# convert the OpenCV BGR image to a PIL image
	if isinstance(img, np.ndarray):
		img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
	# create an object that can draw on the given image
	draw = ImageDraw.Draw(img)
	# font format
	fontStyle = ImageFont.truetype("font/simsun.ttc", textSize, encoding="utf-8")
	# draw the text
	draw.text((left, top), text, textColor, font=fontStyle)
	# convert back to an OpenCV BGR image
	return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)

Next, define the command line arguments:

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--shape-predictor", required=True,
	help="path to facial landmark predictor")
ap.add_argument("-v", "--video", type=str, default="",
                help="path to input video file")
ap.add_argument("-a", "--alarm", type=str, default="",
	help="path alarm .WAV file")
ap.add_argument("-w", "--webcam", type=int, default=0,
	help="index of webcam on system")
args = vars(ap.parse_args())

The drowsiness detector takes one required command line argument followed by three optional ones, each detailed below:

--shape-predictor: the path to dlib's pre-trained facial landmark detector. You can download the detector along with the source code for this tutorial using the Downloads section at the bottom of this post.

--video: the path to an input video file. This article uses a video file for testing.

--alarm: an optional path to the audio file you want to use as the alarm.

--webcam: an integer controlling the index of the built-in webcam/USB camera. An example invocation using the webcam is sketched below.
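
For example, assuming the landmark model and an alarm file sit in the current directory (the file names here are placeholders; substitute your own), a webcam run might look like this:

python detect_drowsiness.py --shape-predictor shape_predictor_68_face_landmarks.dat --alarm alarm.mp3 --webcam 0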

With the command line arguments defined, we also need to define several important variables:

# define two constants, one for the eye aspect ratio to indicate
# a blink and a second constant for the number of consecutive
# frames the eye must be below the threshold to set off the
# alarm
EYE_AR_THRESH = 0.3
EYE_AR_CONSEC_FRAMES = 48
# initialize the frame counter as well as a boolean used to
# indicate if the alarm is going off
COUNTER = 0
ALARM_ON = False

EYE_AR_THRESH is the eye aspect ratio threshold: if the EAR falls below it, we start counting the frames in which the person has their eyes closed.

If the closed-eye frame count exceeds EYE_AR_CONSEC_FRAMES, we sound an alert.

In my experiments, I found that an EYE_AR_THRESH of 0.3 worked well under various conditions (although you may need to tune it for your own application).

I also set EYE_AR_CONSEC_FRAMES to 48, which means that if a person closes their eyes for 48 consecutive frames, we’ll play an alarm.

You can make a fatigue detector more sensitive by lowering EYE_AR_CONSEC_FRAMES — likewise, you can reduce the fatigue detector’s sensitivity by increasing it.
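
One way to reason about this constant is in seconds rather than frames. The helper below is a small sketch that is not part of the original script (the function name and the 30 FPS figure are assumptions; measure your stream's actual frame rate):

def frames_for_seconds(seconds, fps):
	# convert a closed-eye duration in seconds into a frame count
	# e.g. frames_for_seconds(1.6, 30) == 48
	return max(1, int(round(seconds * fps)))

EYE_AR_CONSEC_FRAMES = frames_for_seconds(1.6, 30)

At roughly 30 frames per second, the default of 48 frames corresponds to about 1.6 seconds of continuously closed eyes.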

COUNTER is the total number of consecutive frames with an eye aspect ratio below EYE_AR_THRESH.

If COUNTER exceeds EYE_AR_CONSEC_FRAMES, then we update the Boolean value ALARM_ON.

The dlib library ships with a face detector based on Histogram of Oriented Gradients (HOG) features as well as a facial landmark predictor; we instantiate both in the following code block:

# initialize dlib's face detector (HOG-based) and then create
# the facial landmark predictor
print("[INFO] loading facial landmark predictor...")
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(args["shape_predictor"])

The facial landmarks produced by dlib form an indexable list. So to extract the eye regions from a set of facial landmarks, we just need to know the correct array slice indexes:

# grab the indexes of the facial landmarks for the left and
# right eye, respectively
(lStart, lEnd) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
(rStart, rEnd) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]

Using these indexes, we will be able to easily extract the eye region through array slices.
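
If you're curious how imutils maps facial regions to index ranges, you can print its lookup table; this quick sanity check is an addition, not part of the original script:

# print the (start, end) slice indexes for each facial region
for name, (start, end) in face_utils.FACIAL_LANDMARKS_IDXS.items():
	print(name, start, end)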

We are now ready to implement the core of our drowsiness detector:

# start the video stream thread
print("[INFO] starting video stream thread...")
vs = VideoStream(src=args["webcam"]).start()
time.sleep(1.0)
# loop over frames from the video stream
while True:
	# grab the frame from the threaded video stream, resize
	# it, and convert it to grayscale
	frame = vs.read()
	frame = imutils.resize(frame, width=450)
	gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	# detect faces in the grayscale frame
	rects = detector(gray, 0)

Instantiate VideoStream.

Pause for a second to allow the camera sensor to warm up.

Start iterating through frames in the video stream.

Read the next frame, then preprocess it by resizing it to a width of 450 pixels and converting it to grayscale.

Dlib’s face detector is used to find and locate faces in images.

The next step is to apply facial marker detection to locate each important area of the face:

	# loop over the face detections
	for rect in rects:
		# determine the facial landmarks for the face region, then
		# convert the facial landmark (x, y)-coordinates to a NumPy
		# array
		shape = predictor(gray, rect)
		shape = face_utils.shape_to_np(shape)
		# extract the left and right eye coordinates, then use the
		# coordinates to compute the eye aspect ratio for both eyes
		leftEye = shape[lStart:lEnd]
		rightEye = shape[rStart:rEnd]
		leftEAR = eye_aspect_ratio(leftEye)
		rightEAR = eye_aspect_ratio(rightEye)
		# average the eye aspect ratio together for both eyes
		ear = (leftEAR + rightEAR) / 2.0

Loop over each detected face. In our implementation (specifically targeting driver drowsiness) we assume there is only one face, the driver, but I leave this for loop in place in case you want to apply the technique to videos containing multiple faces. A small sketch after this paragraph shows one way to keep only the largest detection.
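
If multiple faces do appear and you only care about the driver, one option is to keep just the largest detection, on the assumption that the driver sits closest to the camera. This filter is an addition, not part of the original script; it would go just before the for loop:

	# optional: keep only the largest detected face (assumed to be
	# the driver, who sits closest to the camera)
	if len(rects) > 1:
		rects = [max(rects, key=lambda r: r.width() * r.height())]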

For each face detected, we apply dlib’s face marker detector and convert the results into a NumPy array.

Using NumPy array slices, we can extract the (x, y) coordinates of the left and right eyes respectively.

Given the (x, y) coordinates of both eyes, we then calculate their aspect ratio.

Soukupová and Čech suggest averaging the aspect ratios of both eyes together to obtain a better estimate.

We can then visualize each eye region on the frame using the following cv2.drawContours function — this is often helpful when we are trying to debug the script and want to ensure that the eye is detected and positioned correctly:

		# compute the convex hull for the left and right eye, then
		# visualize each of the eyes
		leftEyeHull = cv2.convexHull(leftEye)
		rightEyeHull = cv2.convexHull(rightEye)
		cv2.drawContours(frame, [leftEyeHull], -1, (0, 255, 0), 1)
		cv2.drawContours(frame, [rightEyeHull], -1, (0, 255, 0), 1)

Finally, we are now ready to examine the video stream for signs of sleepiness:

		# check to see if the eye aspect ratio is below the blink
		# threshold, and if so, increment the blink frame counter
		if ear < EYE_AR_THRESH:
			COUNTER += 1
			# if the eyes were closed for a sufficient number of
			# frames, then sound the alarm
			if COUNTER >= EYE_AR_CONSEC_FRAMES:
				# if the alarm is not on, turn it on
				if not ALARM_ON:
					ALARM_ON = True
					# check to see if an alarm file was supplied,
					# and if so, start a thread to have the alarm
					# sound played in the background
					if args["alarm"] != "":
						t = Thread(target=sound_alarm,
							args=(args["alarm"],))
						t.start()
				# draw an alarm on the frame
				frame = cv2ImgAddText(frame, "Wake up!", 10, 30, (255, 0, 0), 30)
		# otherwise, the eye aspect ratio is not below the blink
		# threshold, so reset the counter and alarm
		else:
			COUNTER = 0
			ALARM_ON = False

Check whether the eye aspect ratio is below the blink/closed-eye threshold EYE_AR_THRESH.

If so, we increment COUNTER, the total number of consecutive frames in which the person has had their eyes closed.

If COUNTER exceeds EYE_AR_CONSEC_FRAMES, then we assume the person has started to nod off.

Another check is made to see if the alarm is turned on — if not, we turn it on.

Playback of the alarm sound is handled, provided an --alarm path was supplied when the script was executed. We take particular care to create a separate thread responsible for calling sound_alarm, so that the main program does not block until the sound finishes playing.
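
One small detail worth noting: marking the alarm thread as a daemon before starting it keeps a still-playing alarm from holding the process open after the main loop exits. This variant is a sketch, not part of the code above:

						# start the alarm in a daemon thread so it cannot
						# keep the process alive after the main loop exits
						t = Thread(target=sound_alarm,
							args=(args["alarm"],))
						t.daemon = True
						t.start()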

Draw the alert text ("Wake up!") on the frame; again, this is often helpful for debugging, especially if you're not using the playsound library.

Finally, the else branch handles the case where the eye aspect ratio is greater than EYE_AR_THRESH, indicating that the eyes are open. If the eyes are open, we reset the counter and make sure the alarm is off.

The final block of code in our drowsiness detector displays the output frame on the screen:

		# draw the computed eye aspect ratio on the frame to help
		# with debugging and setting the correct eye aspect ratio
		# thresholds and frame counters
		cv2.putText(frame, "EAR: {:.2f}".format(ear), (300, 30),
			cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
	# show the frame
	cv2.imshow("Frame", frame)
	key = cv2.waitKey(1) & 0xFF
	# if the `q` key was pressed, break from the loop
	if key == ord("q"):
		break
# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()

That's it for the code!

Testing the fatigue detector

Run the script with:

python detect_drowsiness.py --shape-predictor shape_predictor_68_face_landmarks.dat --video 12.mp4  --alarm alarm.mp3

Results:

When a nap is detected, the alert text is drawn on the video and a reminder is printed. The complete code is available here: download.csdn.net/download/hh…