This is the 16th day of my participation in the August Text Challenge.

So far Xianyu has written many articles about JS reverse engineering, but they all have one thing in common: they analyze encrypted parameters or password encryption. Many readers have sent private messages asking for a tutorial on slider captchas and human-machine verification.

So Xianyu has summarized the types of captcha encountered so far and how each one can be handled. Let's go through them.

There are several types of captcha on the market.

Graphic verification code

A graphic captcha is usually made up of English letters and digits. Interference lines and image distortion are often added to make it harder to read, and the text in the picture may be lengthened to raise the cost of recognition.

There are many ways to handle this type of captcha; here is a brief summary.

Simple captchas made up of plain letters and digits can be recognized directly with tesserocr. If the captcha contains simple interference lines, converting it to grayscale and binarizing it can also improve the recognition rate.

Common example code:

import tesserocr
from PIL import Image

image = Image.open('code2.jpg')
image = image.convert('L')  # Convert to grayscale
threshold = 127
# Lookup table: pixels darker than the threshold become 0, the rest become 1
table = []
for i in range(256):
    if i < threshold:
        table.append(0)
    else:
        table.append(1)
image = image.point(table, '1')  # Binarize
result = tesserocr.image_to_text(image)
print(result)

Harder graphic captchas with more characters and heavier distortion, as well as the low and medium difficulty ones described above, can be recognized by training a model with TensorFlow.

I have a series of articles covering the entire training process that you can refer to.

Mp.weixin.qq.com/s/-BfjGC6KZ…

Mp.weixin.qq.com/s/qVZtKveH8…

Mp.weixin.qq.com/s/AfefH4b5H…

If you take this approach, remember to prepare enough captcha samples. As long as your model is not too bad, enough samples and continued tuning can give a fairly high recognition rate.

The best result I have seen so far is Cold Moon's, with a recognition success rate of 99.99% on four-character alphanumeric captchas. However, according to an insider, the training set reached 60 million samples, which takes a lot of time and energy.

Another solution is to use a captcha-solving service, but more on that later.

Rotation verification code

This type of captcha rotates the picture and requires the user to drag the slider below it until the picture is upright again to complete the verification.

At present no service provider in the domestic market has a good solution, but one vendor's captcha has a small flaw. Thanks to the wisdom of the working people, I found a nice project on GitHub.

Project address: github.com/scupte/xuan…

The weakness is the size of the image library: there is no large pool of images behind it, so if you scrape all the original images and compare against them, you can recover the rotation angle exactly.

Part of the comparison code:

# -*- coding: utf-8 -*-
import cv2
import numpy as np

imagepath = '9_1.png'
img = cv2.imread(imagepath)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# OpenCV 4 returns (contours, hierarchy); OpenCV 3 returns three values
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# cv2.drawContours(img, contours, 1, (0, 0, 255), 1)
for cnt in contours:
    # Minimum-area bounding rectangle: ((cx, cy), (width, height), angle)
    rect = cv2.minAreaRect(cnt)
    width, height = rect[1]

    if width * height > 100:
        # Get the four vertices of the minimum bounding rectangle
        box = cv2.boxPoints(rect)
        box = box.astype(int)  # np.int0 was removed in NumPy 2.0
        print(box)

        if 0 not in box.ravel():
            # Draw the minimum bounding rectangle
            for i in range(4):
                pt1 = tuple(int(v) for v in box[i])
                pt2 = tuple(int(v) for v in box[(i + 1) % 4])
                cv2.line(img, pt1, pt2, 0)
            theta = rect[2]
            if abs(theta) <= 45:
                print('Image rotation is %s.' % theta)
            # angle = theta
            print(theta)

cv2.imshow("img", img)
cv2.waitKey(0)

Sliding verification code

When it comes to slider captchas, one vendor has to be mentioned. There are many slider captcha products on the market, but this vendor's standing is like that of Naobaijin in the health supplement market ten years ago: it is the industry benchmark.

It really is impressive, and many websites use it for protection, such as the National Enterprise Credit Information Publicity System, Bilibili, JD.com and so on.

There are many solutions for this kind of captcha, but they all come down to much the same thing.

Selenium simulated sliding

You have all heard of Selenium. Basically, you compare the image containing the gap with the original image to get the x-coordinate of the gap, compute a movement track that imitates a human drag, and then use Selenium to drag the slider so that the piece fills the gap.
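The gap comparison described above can be sketched roughly as follows. This is only a minimal illustration: it assumes both images have the same size and are given as rows of (R, G, B) tuples, and the names `pixels_differ` and `find_gap_x`, the threshold of 60 and the `start_x` parameter are choices made for this example, not part of any library.

```python
def pixels_differ(p1, p2, threshold=60):
    """Return True if any RGB channel differs by more than `threshold`."""
    return any(abs(a - b) > threshold for a, b in zip(p1, p2))


def find_gap_x(full_img, gap_img, start_x=0, threshold=60):
    """Return the x-coordinate of the first column where the images differ.

    Both images are lists of rows, each row a list of (R, G, B) tuples.
    `start_x` lets the scan skip the columns occupied by the slider piece.
    Returns None if no differing column is found.
    """
    height = len(full_img)
    width = len(full_img[0])
    for x in range(start_x, width):
        for y in range(height):
            if pixels_differ(full_img[y][x], gap_img[y][x], threshold):
                return x
    return None


# Tiny synthetic example: a 4x10 white image with a dark "gap" at columns 6-7.
white, dark = (255, 255, 255), (40, 40, 40)
full = [[white] * 10 for _ in range(4)]
gapped = [row[:] for row in full]
for y in (1, 2):
    for x in (6, 7):
        gapped[y][x] = dark

print(find_gap_x(full, gapped))
```

With real screenshots the same scan can read pixels through Pillow's `Image.load()`, which gives access to individual pixels as `pixels[x, y]`.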

The advantages of this approach are a low barrier to entry and a simple principle. The disadvantages are that the slide takes a long time to complete and the success rate is unpredictable (it drops quickly once the same track calculation has been used many times).

Common track generation code:

import numpy as np


def ease_out_expo(x):
    """
    Exponential ease-out curve
    :param x: progress in [0, 1]
    :return: eased progress in [0, 1]
    """
    if x == 1:
        return 1
    else:
        return 1 - pow(2, -10 * x)


def get_tracks(distance, seconds):
    """
    Track-generating function
    :param distance: total distance to slide
    :param seconds: total sliding time
    :return: (offsets, tracks)
    """
    tracks = [0]  # An array of per-step movements
    offsets = [0]  # Record of the total distance slid after each step
    for t in np.arange(0.0, seconds, 0.1):  # Generates a sequence like [0.0, 0.1, 0.2, 0.3, ...]
        offset = round(ease_out_expo(t / seconds) * distance)  # Distance on the curve at time t
        tracks.append(offset - offsets[-1])  # This step = distance now minus distance after the previous step
        offsets.append(offset)  # The total distance up to this step
    return offsets, tracks


a, b = get_tracks(138, 3)
print(a, b)


def get_tracksb(distance):
    """
    Generate a track that accelerates first and then decelerates
    :param distance: distance to the gap
    :return: dict with the forward and backward tracks
    """
    distance += 20  # Add 20 so the slider overshoots the gap and then slides back
    v = 0  # Initial velocity
    t = 0.2  # Each calculation cycle lasts 0.2 seconds
    forward_tracks = []  # Forward track record
    current = 0  # Distance moved so far
    mid = distance * 3 / 5  # Accelerate for the first three fifths, decelerate for the rest
    while current < distance:  # Stop once the total distance moved reaches the target
        if current < mid:  # Acceleration phase
            a = 2  # Acceleration is +2
        else:  # Deceleration phase
            a = -3  # Acceleration is -3

        s = v * t + 0.5 * a * (t ** 2)  # Displacement within this 0.2-second cycle
        v = v + a * t  # Velocity after this cycle
        current += s  # Add this cycle's displacement to the total
        forward_tracks.append(round(s))  # Record this cycle's displacement as one track step

    # Hand-written backward track that offsets the extra 20 added at the start (these steps sum to -19)
    back_tracks = [-3, -3, -2, -2, -2, -2, -2, -1, -1, -1]
    return {'forward_tracks': forward_tracks, 'back_tracks': back_tracks}
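A generated track still has to be replayed in the browser. A minimal sketch using Selenium's `ActionChains` might look like the following. The function name `drag_slider`, the `pause` value and the `chain_cls` parameter are assumptions made for this illustration (the last one exists only so the function can be exercised without a browser), and `driver` and `slider` are assumed to be a WebDriver instance and the slider element you have already located.

```python
import time


def drag_slider(driver, slider, tracks, pause=0.02, chain_cls=None):
    """Hold the slider, move it by each x offset in `tracks`, then release."""
    if chain_cls is None:
        # Imported lazily so the sketch can be read and tested without Selenium.
        from selenium.webdriver import ActionChains as chain_cls

    chain_cls(driver).click_and_hold(slider).perform()
    for step in tracks:
        # Move horizontally only; one small step per track entry.
        chain_cls(driver).move_by_offset(xoffset=step, yoffset=0).perform()
        time.sleep(pause)
    chain_cls(driver).release().perform()
```

With the track functions above, a call could look like `t = get_tracksb(138)` followed by `drag_slider(driver, slider, t['forward_tracks'] + t['back_tracks'])`.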

Reversing key parameters in JS

This approach has a higher barrier to entry. By setting breakpoints and debugging the JS, you reverse engineer the logic that generates the parameters submitted after sliding, reproduce that generation yourself, and then construct the request to submit them. You still need to locate the gap in the picture and simulate a track along the way, but because no browser simulation is involved, it is fast and has a high success rate.

The disadvantages are high risk and high maintenance cost: whenever a new version is released, the analysis has to be redone. Reverse engineering the code of such products also carries a certain legal risk, and "free room and board" is no joke. That is why many people in this business make their money quietly and do not advertise it.

Use existing services

Both methods above have their pros and cons, and many people would rather offload this workload and risk, which means using a third-party service provider.

However, there is no such service in the domestic market at present. Xianyu currently uses a Russian service provider, 2Captcha.

The service solves a variety of captcha types, including GeeTest, which is the one we care about.

Here is a brief introduction to using the service. (Don't ask why it costs money; service providers need to eat too, and the price really is cheap.)

First, sign up for an account. The website is 2captcha.com/zh

After completing the registration, you will jump to the console interface, where the most important thing is to get your API Key.

OK, once you have the API key, you can use the service to solve the slider captcha.

According to the official API documentation, we only need to build GET requests.

The first GET request looks like this:

https://2captcha.com/in.php
?key=API key obtained above
&method=geetest
&gt=GeeTest parameter
&challenge=GeeTest parameter
&api_server=api-na.geetest.com (optional)
&pageurl=address of the web page where the slider captcha is located

Parameter list:

Parameter    Description
key          API key
method       Captcha type (geetest)
gt           GeeTest parameter 1
challenge    GeeTest parameter 2
api_server   api-na.geetest.com (optional)
pageurl      Address of the web page where the slider captcha is located
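As a rough sketch, the first request can be assembled with Python's standard library. The function name `build_submit_url` and all the values passed to it are placeholders for illustration; only the endpoint and parameter names come from the list above.

```python
from urllib.parse import urlencode

IN_URL = "https://2captcha.com/in.php"


def build_submit_url(api_key, gt, challenge, pageurl, api_server=None):
    """Build the GET URL that submits a GeeTest task to 2Captcha."""
    params = {
        "key": api_key,
        "method": "geetest",
        "gt": gt,
        "challenge": challenge,
        "pageurl": pageurl,
    }
    if api_server:  # Optional, e.g. "api-na.geetest.com"
        params["api_server"] = api_server
    return IN_URL + "?" + urlencode(params)


# Placeholder values only; substitute your own key and the site's parameters.
url = build_submit_url(
    api_key="YOUR_API_KEY",
    gt="GT_VALUE",
    challenge="CHALLENGE_VALUE",
    pageurl="https://example.com/login",
)
print(url)
```

Using `urlencode` rather than string concatenation takes care of escaping the page URL.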

Here is how to get gt and challenge.

The gt parameter is fixed for a given site. You can find it by inspecting the requests of a website that uses GeeTest.

The challenge parameter is returned by a GET request. Once you have found that request you can send it again to get a new value; if it is an XHR, you can simply replay the XHR.
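As an illustration only: many GeeTest integrations return both values from a single JSON endpoint. The sketch below assumes a response body with `gt` and `challenge` keys; the sample body and the function name `extract_geetest_params` are hypothetical, and the real URL and response shape depend on the target site.

```python
import json


def extract_geetest_params(body):
    """Pull gt and challenge out of a JSON response body (assumed shape)."""
    data = json.loads(body)
    return data["gt"], data["challenge"]


# Hypothetical response body, for illustration only.
sample = '{"success": 1, "gt": "GT_VALUE", "challenge": "CHALLENGE_VALUE"}'
gt, challenge = extract_geetest_params(sample)
print(gt, challenge)
```

Since the challenge value comes from a fresh request each time, it is safest to fetch it right before submitting the task.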

After submitting the first request, a result similar to the following is returned.

OK|2122988149 or as JSON {"status":1,"request":"2122988149"}

This string of numbers is the session ID.
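Both response formats can be handled by one small helper. This is a sketch under the assumption that anything other than the two success shapes shown above is an error; the function name `parse_submit_response` is made up for this example.

```python
import json


def parse_submit_response(text):
    """Return the session ID from either response format, or raise ValueError."""
    text = text.strip()
    if text.startswith("{"):
        data = json.loads(text)
        if data.get("status") == 1:
            return str(data["request"])
        raise ValueError("2Captcha error: %s" % data.get("request"))
    if text.startswith("OK|"):
        return text.split("|", 1)[1]
    raise ValueError("2Captcha error: %s" % text)


print(parse_submit_response("OK|2122988149"))
print(parse_submit_response('{"status":1,"request":"2122988149"}'))
```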

Once we have this session ID, we can build the next request. Solving takes some time, so the result is not available immediately.

https://2captcha.com/res.php
?key=API KEY
&action=get
&id=2122988149

Parameter list:

Parameter    Description
key          API key
action       get
id           The session ID returned by the previous request

The result of this request contains the encrypted parameters we need.

{
    "challenge":"1a2b3456cd67890e12345fab678901c2de",
    "validate":"09fe8d7c6ba54f32e1dcb0a9fedc8765",
    "seccode":"12fe3d4c56789ba01f2e345d6789c012|jordan"
}
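Because solving takes a while, the second request usually has to be repeated until the answer is ready. The sketch below polls `res.php` and returns the parsed result. It assumes the service answers `CAPCHA_NOT_READY` (the spelling used by 2Captcha) while the task is pending; the names `http_get` and `poll_result`, the 5-second interval and the retry limit are arbitrary choices for this example, and the `fetch` parameter exists so the loop can be tried without network access.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

RES_URL = "https://2captcha.com/res.php"


def http_get(url):
    """Default fetcher: plain GET returning the body as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def poll_result(api_key, session_id, fetch=http_get, interval=5, max_tries=24):
    """Poll res.php until the answer is ready and return it as a dict."""
    query = urlencode({"key": api_key, "action": "get", "id": session_id})
    for _ in range(max_tries):
        body = fetch(RES_URL + "?" + query).strip()
        if body == "CAPCHA_NOT_READY":  # Still being solved; wait and retry
            time.sleep(interval)
            continue
        if body.startswith("OK|"):  # Tolerate an "OK|" prefix before the JSON
            body = body[3:]
        if body.startswith("{"):
            return json.loads(body)
        raise RuntimeError("2Captcha error: %s" % body)
    raise TimeoutError("captcha was not solved in time")
```

The returned dict then holds `challenge`, `validate` and `seccode`, ready to be placed into whatever request the target site expects.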

That covers the common types of captcha.

I'm sure some people will ask why Google's reCAPTCHA and the similar hCaptcha were not mentioned.

For those two types, the service provider mentioned above also offers solving APIs. Xianyu has not tried any other solution yet. After all, Xianyu cannot even pass these two manually by clicking, so for now the only option is to rely on a service provider.