Let me show you how I used Python to build a simple OCR demo: recognizing handwriting, print, ID cards, and more, with code included!


I. OCR in your mind

How big is OCR in your mind? (Well... big, quite big, very big...)

It’s this big:

Or this big:

Anyway, in my heart it is as big as Jack Ma (Ma Yun) himself.

However big it is, it's time to make good on my earlier promise and finish the series of articles below, roughly one a month.

No. / Estimated date / Demo name and features / Status / Article links

1. Sept 3: Text translation demo (single-text and batch translation). Completed. CSDN: blog.csdn.net/qq_17623363… WeChat Official Account: mp.weixin.qq.com/s/6AMzkTtPK…
2. Sept 11: OCR demo with batch upload and recognition; in one demo you can select different OCR types (handwriting/print/ID card/form/whole question/business card) and call the platform's capabilities, with concrete implementation steps. Completed. CSDN: WeChat Official Account:
3. Oct 27: Speech recognition demo: upload a video, extract its audio, and run short-form speech recognition on it. CSDN: WeChat Official Account:
4. Sept 17: Intelligent speech evaluation demo. CSDN: WeChat Official Account:
5. Sept 24: Essay correction demo. CSDN: WeChat Official Account:
6. Sept 30: Speech synthesis demo. CSDN: WeChat Official Account:
7. Oct 15: Single-question photo search demo. CSDN: WeChat Official Account:
8. Oct 20: Picture translation demo. CSDN: WeChat Official Account:

Follow my WeChat official account to get new posts pushed to you first:

Reply via the menu; plenty of goodies and surprises await you.

II. Results Showcase

Recently I was working on verifying some paper documents: I wanted to photograph them and cross-check the text against the originals. That reminded me that I had previously used the Youdao Wisdom Cloud API to build a document translation demo. Looking at its OCR character-recognition APIs, Youdao offers a variety of OCR interfaces: handwriting, print, forms, whole-question recognition, shopping receipts, ID cards, business cards, and so on. So this time I again used the Youdao Wisdom Cloud APIs to build a small demo that tries out all of these functions, partly as practice and partly as preparation for features I may need in the future.

(1) Handwriting recognition results

(2) Print recognition results

(3) Business card recognition results

Here I used a business card template I found online; the accuracy seems OK.

(4) ID card recognition results (also from a template)

(5) Form recognition results:

(The returned JSON is super long, >_< emmm...)

(6) Whole-question recognition results:

(Formula recognition works too, but the result JSON is long and not very intuitive to read, so I won't paste it here.)

III. Preparation

First, you need to create an instance and an application on your Youdao Wisdom Cloud personal page, bind the application to the instance, and obtain the application's ID and key. The individual registration and application-creation steps are detailed in the first article listed above.
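The code later in this post assumes the application ID and key are available as two module-level constants. A minimal sketch of where they might live (the values here are placeholders, not real credentials, and the file placement is my assumption):

```python
# ocrtools.py (top of file): credentials obtained from the Youdao Wisdom
# Cloud console after binding the application to the instance.
# Placeholder values; substitute your own application ID and key.
APP_KEY = 'your-app-id'
APP_SECRET = 'your-app-secret'
```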

IV. The Development Process, Step by Step

The specific code development process is described below.

This demo is developed with Python 3 and consists of three files: maindow.py, ocrprocesser.py, and ocrtools.py.

For the interface, to keep development simple, Python's tkinter library provides file selection, recognition-type selection, and result display. ocrprocesser.py calls the appropriate API based on the selected type, completes the recognition, and returns the result; ocrtools.py wraps the various Youdao OCR APIs and dispatches calls by category.

(1) Developing the interface

Part of the interface code follows; it uses tkinter's grid layout to arrange the elements.

import tkinter as tk
from tkinter import ttk, messagebox

root = tk.Tk()
root.title("netease youdao ocr test")
frm = tk.Frame(root)
frm.grid(padx='50', pady='50')

btn_get_file = tk.Button(frm, text='Select image to be identified', command=get_files)
btn_get_file.grid(row=0, column=0, padx='10', pady='20')
text1 = tk.Text(frm, width='40', height='5')
text1.grid(row=0, column=1)

# drop-down for choosing the recognition type
combox = ttk.Combobox(frm, textvariable=tk.StringVar(), width=38)
combox["value"] = img_type_dict
combox.current(0)
combox.bind("<<ComboboxSelected>>", get_img_type)
combox.grid(row=1, column=1)

label = tk.Label(frm, text="Identification Result:")
label.grid(row=2, column=0)
text_result = tk.Text(frm, width='40', height='10')
text_result.grid(row=2, column=1)

btn_sure = tk.Button(frm, text="Begin to identify", command=ocr_files)
btn_sure.grid(row=3, column=1)
btn_clean = tk.Button(frm, text="Empty", command=clean_text)
btn_clean.grid(row=3, column=2)

root.mainloop()
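The widgets above reference three callbacks (get_files, get_img_type, clean_text) that the post doesn't show. A minimal sketch of what they might look like, assuming tkinter's filedialog and an `ocr_model` object that holds the selected paths and type (all of this is my reconstruction, not the original code):

```python
import tkinter as tk
from tkinter import filedialog

def get_files():
    # let the user pick one or more images and remember their paths
    paths = filedialog.askopenfilenames(
        filetypes=[('Images', '*.jpg *.jpeg *.png')])
    ocr_model.img_paths = list(paths)
    text1.insert(tk.END, '\n'.join(paths))

def get_img_type(event):
    # map the combobox selection to the numeric type code used by the API layer
    ocr_model.img_type = combox.current()

def clean_text():
    # clear both the file list and the result box
    text1.delete('1.0', tk.END)
    text_result.delete('1.0', tk.END)
```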

The ocr_files method

The ocr_files() callback bound to btn_sure passes the file paths and recognition type to the ocrProcesser:

def ocr_files():
    if ocr_model.img_paths:
        ocr_result = ocr_model.ocr_files()
        text_result.insert(tk.END, ocr_result)
    else:
        tk.messagebox.showinfo("Tip", "No file")

The core method in ocrProcesser is ocr_files(), which base64-encodes each image and calls the wrapped API:

def ocr_files(self):
    # collect one result per uploaded image so batch mode works
    ocr_results = []
    for img_path in self.img_paths:
        img_file_name = os.path.basename(img_path).split('.')[0]
        # print('===========' + img_file_name + '===========')
        with open(img_path, 'rb') as f:
            # the Youdao APIs expect the image as a base64 string
            img_code = base64.b64encode(f.read()).decode('utf-8')
        ocr_result = self.ocr_by_netease(img_code, self.img_type)
        print(ocr_result)
        ocr_results.append(ocr_result)
    return ocr_results
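The snippet calls self.ocr_by_netease(...), which the post never shows; presumably it simply forwards the base64 image and type code to the ocrtools dispatcher described next. A minimal sketch of the surrounding class, with the dispatcher injected so the flow is clear (the constructor signature is my assumption, not the original code):

```python
class OcrProcesser:
    """Holds the UI's state (selected files and type) and runs recognition."""

    def __init__(self, ocr_dispatch):
        self.img_paths = []            # filled in by the UI's file picker
        self.img_type = 0              # 0=handwriting ... 5=whole question
        self._dispatch = ocr_dispatch  # e.g. ocrtools.get_ocr_result

    def ocr_by_netease(self, img_code, img_type):
        # forward the base64 image and type code to the API layer
        return self._dispatch(img_code, img_type)
```

Wired up, this would look like `ocr_model = OcrProcesser(ocrtools.get_ocr_result)`.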

(2) get_ocr_result

After reading through the API documentation, the OCR endpoints can be roughly divided into four groups: handwriting/print recognition, ID card/business card recognition, form recognition, and whole-question recognition. Each group has its own URL, and the request parameters are not all the same, so the demo first dispatches on the recognition type:

# 0 - handwriting
# 1 - print
# 2 - ID card
# 3 - business card
# 4 - table/form
# 5 - whole question
def get_ocr_result(img_code, img_type):
    if img_type == 0 or img_type == 1:
        return ocr_common(img_code)
    elif img_type == 2 or img_type == 3:
        return ocr_card(img_code, img_type)
    elif img_type == 4:
        return ocr_table(img_code)
    elif img_type == 5:
        return ocr_problem(img_code)
    else:
        return "error:undefined type!"

(3) Developing ordinary text recognition

Then populate data with the fields each interface requires, do some simple parsing of the different interfaces' return values, and return the result:

def ocr_common(img_code):
    YOUDAO_URL = 'https://openapi.youdao.com/ocrapi'
    data = {}
    data['detectType'] = '10012'
    data['imageType'] = '1'
    data['langType'] = 'auto'
    data['img'] = img_code
    data['docType'] = 'json'
    data = get_sign_and_salt(data, img_code)
    response = do_request(YOUDAO_URL, data)['regions']
    result = []
    for r in response:
        for line in r['lines']:
            result.append(line['text'])
    return result
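To make the parsing loop concrete, here is a tiny worked example against a made-up response fragment shaped like the `{'regions': [{'lines': [{'text': ...}]}]}` structure the code above indexes into (the sample data is invented, not a real API response):

```python
sample_response = {
    'regions': [
        {'lines': [{'text': 'first line'}, {'text': 'second line'}]},
        {'lines': [{'text': 'another region'}]},
    ]
}

# same flattening as ocr_common: one entry per recognized line of text
result = []
for r in sample_response['regions']:
    for line in r['lines']:
        result.append(line['text'])

print(result)  # ['first line', 'second line', 'another region']
```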

(4) Developing ID card and business card recognition

def ocr_card(img_code, img_type):
    YOUDAO_URL = 'https://openapi.youdao.com/ocr_structure'
    data = {}
    if img_type == 2:
        data['structureType'] = 'idcard'
    elif img_type == 3:
        data['structureType'] = 'namecard'
    data['q'] = img_code
    data['docType'] = 'json'
    data = get_sign_and_salt(data, img_code)
    return do_request(YOUDAO_URL, data)

(5) Developing form recognition

def ocr_table(img_code):
    YOUDAO_URL = 'https://openapi.youdao.com/ocr_table'
    data = {}
    data['type'] = '1'
    data['q'] = img_code
    data['docType'] = 'json'
    data = get_sign_and_salt(data, img_code)
    return do_request(YOUDAO_URL, data)

(6) Developing whole-question recognition

def ocr_problem(img_code):
    YOUDAO_URL = 'https://openapi.youdao.com/ocr_formula'
    data = {}
    data['detectType'] = '10011'
    data['imageType'] = '1'
    data['img'] = img_code
    data['docType'] = 'json'
    data = get_sign_and_salt(data, img_code)
    response = do_request(YOUDAO_URL, data)['regions']
    result = []
    for r in response:
        for line in r['lines']:
            for l in line:
                result.append(l['text'])
    return result

(7) Signing the request: get_sign_and_salt

get_sign_and_salt() adds the necessary signature and other fields to data:

def get_sign_and_salt(data, img_code):
    data['signType'] = 'v3'
    curtime = str(int(time.time()))
    data['curtime'] = curtime
    salt = str(uuid.uuid1())
    signStr = APP_KEY + truncate(img_code) + salt + curtime + APP_SECRET
    sign = encrypt(signStr)
    data['appKey'] = APP_KEY
    data['salt'] = salt
    data['sign'] = sign
    return data
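truncate(), encrypt(), and do_request() are used throughout but never shown in the post. Below is a plausible set of implementations: the truncate rule and the SHA-256 digest follow Youdao's commonly documented v3 signing recipe, and do_request assumes a form-encoded POST that returns JSON. Treat all three as reconstructions rather than the author's original code.

```python
import hashlib
import json
from urllib import parse, request

def truncate(q):
    # signing "input": the whole string if short, otherwise
    # first 10 chars + total length + last 10 chars
    if q is None:
        return None
    size = len(q)
    return q if size <= 20 else q[:10] + str(size) + q[-10:]

def encrypt(sign_str):
    # v3 signatures are the SHA-256 hex digest of the concatenated string
    return hashlib.sha256(sign_str.encode('utf-8')).hexdigest()

def do_request(url, data):
    # every endpoint above takes a form-encoded POST and answers with JSON
    body = parse.urlencode(data).encode('utf-8')
    with request.urlopen(request.Request(url, data=body)) as resp:
        return json.loads(resp.read().decode('utf-8'))
```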

V. Summary

Overall, the service is quite powerful and supports all kinds of recognition. The one gap is that there is no automatic classification: each image type has to be sent to its own interface, and the interfaces cannot be mixed at all. For example, during development I submitted a business card image to the ID card API and got back "Items not found!". This makes the APIs a bit more troublesome for developers to call; of course, it also improves recognition accuracy to some extent, and I suspect it makes billing per interface easier too :P

Project address: github.com/LemonQH/Wor…


Come join my fan group: Fun Every Day