Recognizing hand-gesture numbers with Python

Google has released MediaPipe, an open-source, cross-platform, customizable machine learning toolkit for live and streaming media (as well as ordinary video, images, etc.). If you are interested, go to mediapipe.dev/

It provides recognition and tracking for hand gestures, body poses, faces, objects, and more, with toolkits for C++, Python, and JavaScript and solutions for the iOS and Android platforms. Today we’ll look at how to use MediaPipe’s hand tracking to write Python code that recognizes the number shown by a hand gesture: 0-5.

Preparation

Python 3 must be installed on your PC; Python 3.8.x is recommended. You also need the opencv-python, mediapipe, and numpy packages, which you can install with pip:

pip install mediapipe numpy opencv-python

My machine runs Python 3.8.3, and the package versions are:

mediapipe==0.8.3.1
numpy==1.20.2
opencv-python==4.5.1.48
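
If you want to compare your own environment with the versions listed above, a minimal sketch using only the standard library might look like this. The helper names installed_version and report_versions are made up for illustration; importlib.metadata is available from Python 3.8 onward.

```python
from importlib import metadata

# Versions reported in this article; newer releases may behave differently.
ARTICLE_VERSIONS = {
    "mediapipe": "0.8.3.1",
    "numpy": "1.20.2",
    "opencv-python": "4.5.1.48",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(expected=ARTICLE_VERSIONS):
    """Return {package: (installed, expected)} for every listed package."""
    return {name: (installed_version(name), wanted) for name, wanted in expected.items()}


if __name__ == "__main__":
    for name, (found, wanted) in report_versions().items():
        print(f"{name}: installed={found}, article used {wanted}")
```

A mismatch is not necessarily a problem; it only tells you that your results may differ from what is described here.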

Prepare six pictures of hand gestures, one for each of the numbers 0 to 5.

Writing the program

  1. Write a handutil.py module containing a HandDetector class that provides methods to detect hands and retrieve their landmark data. The code is as follows; see the code comments for a detailed explanation:
import cv2
import mediapipe as mp


class HandDetector:
    """Gesture recognition class."""

    def __init__(self, mode=False, max_hands=2, detection_con=0.5, track_con=0.5):
        """
        Initialize the detector.
        :param mode: whether to treat the input as static images (default: False)
        :param max_hands: maximum number of hands to detect (default: 2)
        :param detection_con: minimum detection confidence (default: 0.5)
        :param track_con: minimum tracking confidence (default: 0.5)
        """
        self.mode = mode
        self.max_hands = max_hands
        self.detection_con = detection_con
        self.track_con = track_con

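        # Pass the options by keyword so the call does not depend on positional parameter order
        self.hands = mp.solutions.hands.Hands(
            static_image_mode=self.mode,
            max_num_hands=self.max_hands,
            min_detection_confidence=self.detection_con,
            min_tracking_confidence=self.track_con,
        )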

    def find_hands(self, img, draw=True):
        """
        Detect hands in a video frame.
        :param img: video frame image (BGR)
        :param draw: whether to draw the hand landmarks and their connections
        :return: the processed video frame image
        """
        imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        # Process the image, check for gestures, and store the data in self.results
        self.results = self.hands.process(imgRGB)
        if draw:
            if self.results.multi_hand_landmarks:
                for handlms in self.results.multi_hand_landmarks:
                    mp.solutions.drawing_utils.draw_landmarks(img, handlms, mp.solutions.hands.HAND_CONNECTIONS)
        return img

    def find_positions(self, img, hand_no=0):
        """
        Get the landmark data of one hand.
        :param img: video frame image
        :param hand_no: index of the hand (default: the first hand)
        :return: list of landmark data; each entry is [id, x, y], the landmark id and its pixel position in the frame
        """
        self.lmslist = []
        if self.results.multi_hand_landmarks:
            hand = self.results.multi_hand_landmarks[hand_no]
            for id, lm in enumerate(hand.landmark):
                h, w, c = img.shape
                cx, cy = int(lm.x * w), int(lm.y * h)
                self.lmslist.append([id, cx, cy])

        return self.lmslist
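
MediaPipe reports each landmark's x and y as values normalized by the image width and height, which is why find_positions multiplies them by w and h. The conversion can be sketched on its own, without a camera or MediaPipe; the function names to_pixel and landmarks_to_pixels and the sample values are assumptions made for illustration.

```python
def to_pixel(norm_x, norm_y, width, height):
    """Scale normalized landmark coordinates to integer pixel coordinates."""
    return int(norm_x * width), int(norm_y * height)


def landmarks_to_pixels(normalized, width, height):
    """Build [id, x, y] entries in the same shape find_positions returns."""
    return [[i, *to_pixel(x, y, width, height)] for i, (x, y) in enumerate(normalized)]


# Two made-up landmarks on a 640x480 frame
sample = [(0.5, 0.5), (0.25, 0.75)]
print(landmarks_to_pixels(sample, width=640, height=480))
# prints [[0, 320, 240], [1, 160, 360]]
```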
  2. Next, write fingercount.py, which calls the methods of the HandDetector class in handutil.py to retrieve the landmark data. Each entry consists of three numbers: id, x, and y, identifying a point on the hand and its x/y coordinates in the frame. MediaPipe numbers the points on a hand from 0 to 20.

4, 8, 12, 16 and 20 represent the tips of thumb, index finger, middle finger, ring finger and little finger respectively. The complete code is as follows:

import cv2
from handutil import HandDetector

# Open the camera (index 1 here; the default camera is usually index 0)
cap = cv2.VideoCapture(1)
# Create a gesture recognition object
detector = HandDetector()

# 6 pictures of hands, representing 0 ~ 5 respectively
finger_img_list = [
    'fingers/0.png', 'fingers/1.png', 'fingers/2.png', 'fingers/3.png', 'fingers/4.png', 'fingers/5.png',
]
finger_list = []
for fi in finger_img_list:
    i = cv2.imread(fi)
    finger_list.append(i)

# Fingertip list, representing the tips of thumb, index, middle, ring and little fingers
tip_ids = [4, 8, 12, 16, 20]

while True:
    success, img = cap.read()

    if success:
        # Detect gestures
        img = detector.find_hands(img, draw=True)
        # Get gesture data
        lmslist = detector.find_positions(img)
        if len(lmslist) > 0:
            fingers = []
            for tid in tip_ids:
                # Locate each fingertip
                x, y = lmslist[tid][1], lmslist[tid][2]
                cv2.circle(img, (x, y), 10, (0, 255, 0), cv2.FILLED)
                # Thumb: considered open if the x position of the thumb tip is greater than that of the joint just below it; otherwise closed
                if tid == 4:
                    if lmslist[tid][1] > lmslist[tid - 1][1]:
                        fingers.append(1)
                    else:
                        fingers.append(0)
                # Other fingers: considered open if the fingertip is above the second joint, i.e. its y value is smaller (image y grows downward); otherwise closed
                else:
                    if lmslist[tid][2] < lmslist[tid - 2][2]:
                        fingers.append(1)
                    else:
                        fingers.append(0)
            # fingers is a list of 5 values: 0 for closed, 1 for open
            # Count how many fingers are open
            cnt = fingers.count(1)
            # Find the corresponding gesture picture and display it in the top-left corner
            finger_img = finger_list[cnt]
            h, w, c = finger_img.shape
            img[0:h, 0:w] = finger_img
            cv2.rectangle(img, (200, 0), (300, 100), (0, 255, 0), cv2.FILLED)
            cv2.putText(img, str(cnt), (200, 100), cv2.FONT_HERSHEY_DUPLEX, 5, (0, 0, 255))

        cv2.imshow('Image', img)

    k = cv2.waitKey(1)
    if k == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

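
The open/closed test inside the loop can also be pulled out into a plain function, which makes it possible to try the rule on made-up data without a camera. The sketch below uses the same comparisons as fingercount.py; the names count_open_fingers and make_hand are invented for illustration, and make_hand only builds synthetic landmarks for testing.

```python
TIP_IDS = [4, 8, 12, 16, 20]


def count_open_fingers(lmslist):
    """Count open fingers from 21 [id, x, y] entries, using the rule from fingercount.py."""
    fingers = []
    for tid in TIP_IDS:
        if tid == 4:
            # Thumb: open when the tip's x is greater than that of the joint below it
            is_open = lmslist[tid][1] > lmslist[tid - 1][1]
        else:
            # Other fingers: open when the tip is above (smaller y) the joint two ids below
            is_open = lmslist[tid][2] < lmslist[tid - 2][2]
        fingers.append(1 if is_open else 0)
    return fingers.count(1)


def make_hand(open_ids):
    """Hypothetical helper: synthetic landmarks in which only the given tips are open."""
    hand = [[i, 100, 100] for i in range(21)]
    for tid in open_ids:
        if tid == 4:
            hand[tid][1] = 150  # move the thumb tip to the right of its joint
        else:
            hand[tid][2] = 50   # move the fingertip above its joint
    return hand


print(count_open_fingers(make_hand({8, 12})))
# prints 2
```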

Run the code and you will see that it recognizes the number shown by the hand and displays the corresponding picture and digit.
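
Note that the thumb test in fingercount.py compares x positions in one direction only, so its result flips for the other hand or for a mirrored frame. One possible variation, which is a heuristic assumed here and not part of the program above, is to measure the thumb against the base of the little finger (landmark 17) instead. The names thumb_is_open and hand_with are made up for this sketch.

```python
THUMB_TIP, THUMB_IP, PINKY_MCP = 4, 3, 17


def thumb_is_open(lmslist):
    """Heuristic: the thumb is open if its tip is farther (in x) from the
    base of the little finger than the thumb joint just below the tip."""
    tip_x = lmslist[THUMB_TIP][1]
    ip_x = lmslist[THUMB_IP][1]
    pinky_x = lmslist[PINKY_MCP][1]
    return abs(tip_x - pinky_x) > abs(ip_x - pinky_x)


def hand_with(tip_x, ip_x, pinky_x):
    """Hypothetical helper: 21 synthetic [id, x, y] entries with three x values set."""
    hand = [[i, 0, 0] for i in range(21)]
    hand[THUMB_TIP][1] = tip_x
    hand[THUMB_IP][1] = ip_x
    hand[PINKY_MCP][1] = pinky_x
    return hand


# Thumb pointing away from the little finger, on either side of the frame
print(thumb_is_open(hand_with(tip_x=250, ip_x=200, pinky_x=100)))  # prints True
print(thumb_is_open(hand_with(tip_x=150, ip_x=200, pinky_x=300)))  # prints True
# Thumb folded toward the little finger
print(thumb_is_open(hand_with(tip_x=150, ip_x=200, pinky_x=100)))  # prints False
```

Because the comparison uses distances, it does not matter on which side of the hand the thumb appears in the frame.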

Follow the “Programming Players Club” public account to learn more fun and interesting programming.