I wrote a post earlier about how to get around Google ReCAPTCHA captchas using 2Captcha, but some friends told me that 2Captcha is expensive and complicated to use.

Today I’m going to show you another way to crack Google ReCAPTCHA.

Introducing ReCAPTCHA

You may not have heard of ReCAPTCHA by name. For various reasons it does not appear very often in China, but you have probably seen or used it at some point. It looks like this:

When we click the checkbox, the captcha uses its “risk analysis engine” to perform a security check. If the check succeeds, we get the following result:

If the algorithm detects a risk in the current environment, such as an unfamiliar network or an automated program, it performs a second check and pops up something like this:

As in the picture above, the captcha shows nine images with a prompt word such as “tree” above them. We need to click every image among the nine that contains a tree. After we click, a few new images may appear, and we have to keep clicking until none are left. Finally, we click the “validate” button to complete the verification. Alternatively, we can click the “headphones” icon at the bottom, which switches to audio mode, and the captcha will look like this:

If we type in what the audio says, we also pass the verification. Either method works, and once verified we can go on to submit the form, for example to log in or register.

So what is this captcha called? It is the Google ReCAPTCHA V2 captcha, a kind of behavioral captcha: it is passed only by performing actions such as ticking a checkbox, selecting images, or transcribing audio. Compared with ordinary image captchas, it offers a better interactive experience, is more secure, and is harder to crack.

The captcha I just described is only one form of ReCAPTCHA: the explicit version of V2. V2 also has an invisible version, which no longer shows an explicit verification widget. Instead, JavaScript binds the captcha to the submit button, and verification happens automatically when the form is submitted. Besides V2 there is the newer ReCAPTCHA V3, which calculates a score from the user's behavior indicating how likely the request is to come from a human rather than a robot; that score is used to decide whether the verification passes. It is more secure and gives a better experience.

Trying it out

So where can we try ReCAPTCHA? We can open www.google.com/recaptcha/a… . It is best to go through a proxy and use an incognito window, so that the test is not affected by existing cookies, as shown in the figure:

At this point you can see a ReCAPTCHA widget at the bottom of the page. Click it and a verification challenge appears.

Of course a human can solve it manually, but a crawler cannot. How can it be solved automatically?

Let’s take a look at a simple platform to use.

The solution

Here we introduce a ReCAPTCHA solving service called YesCaptcha. Its homepage is yescaptcha.365world.com.cn/, and it currently supports both V2 and V3.

Let's use it to solve the ReCAPTCHA V2 captcha we just saw at www.google.com/recaptcha/a… .

After simple registration, you can find a Token on the home page. We can copy it for later use, as shown below:

It has two key APIs: one creates a captcha-solving task, and the other queries the status of a task. They are as follows:

  • Create a task: api.yescaptcha.365world.com.cn/v3/recaptch…

  • Query the task status: api.yescaptcha.365world.com.cn/v3/recaptch…

The API documentation is available here: docs.yescaptcha.365world.com.cn/

It can be seen from the API documentation that the following parameters can be configured:

Parameter     Required   Description
token         yes        Obtain your Token from the personal center
siteKey       yes        ReCaptcha SiteKey (fixed parameter)
siteReferer   yes        ReCaptcha Referer (usually also a fixed parameter)
captchaType   no         ReCaptchaV2 (default) / ReCaptchaV3
siteAction    no         ReCaptchaV3 action; defaults to Verify
minScore      no         ReCaptchaV3 minimum score (0.1-0.9)

There are three key parameters here:

  • token: the Token we copied from the YesCaptcha home page.

  • siteKey: the string that identifies the ReCAPTCHA; we'll show how to find it below.

  • siteReferer: usually the URL of the site that hosts the ReCAPTCHA. For the current case, the value is www.google.com/recaptcha/a…
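To make the parameter list more concrete, here is a minimal sketch that assembles the parameters into a dictionary. The parameter names come from the documentation table; the helper name build_params and the range check on minScore are my own assumptions for illustration and are not part of the YesCaptcha API.

```python
def build_params(token, site_key, site_referer,
                 captcha_type=None, site_action=None, min_score=None):
    """Assemble query parameters (build_params is a hypothetical helper)."""
    # The three required parameters are always present
    params = {"token": token, "siteKey": site_key, "siteReferer": site_referer}
    # Optional parameters are added only when the caller supplies them
    if captcha_type is not None:
        params["captchaType"] = captcha_type
    if site_action is not None:
        params["siteAction"] = site_action
    if min_score is not None:
        # Assumed client-side check based on the documented 0.1-0.9 range
        if not 0.1 <= min_score <= 0.9:
            raise ValueError("minScore is documented as 0.1-0.9")
        params["minScore"] = min_score
    return params


v2_params = build_params("my-token", "my-site-key", "https://example.com/login")
v3_params = build_params("my-token", "my-site-key", "https://example.com/login",
                         captcha_type="ReCaptchaV3", min_score=0.7)
print(v2_params)
print(v3_params)
```

The token, site key, and example.com URL in the usage lines are placeholders only.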

How do we find the siteKey? It's really easy: just look at the HTML source of the page that contains the ReCAPTCHA:

You can see that each ReCAPTCHA corresponds to a div, and the div has an attribute called data-sitekey. Its value here is:

6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-
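If you would rather not read the source by hand, the lookup can be sketched with Python's standard html.parser module. This is only an illustration: the class and function names are made up, and the sample HTML is a simplified stand-in for the real page, which may be structured differently.

```python
from html.parser import HTMLParser


class SiteKeyParser(HTMLParser):
    """Collects the value of every data-sitekey attribute in a document."""

    def __init__(self):
        super().__init__()
        self.site_keys = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased
        for name, value in attrs:
            if name == "data-sitekey" and value:
                self.site_keys.append(value)


def find_site_keys(html):
    parser = SiteKeyParser()
    parser.feed(html)
    return parser.site_keys


# Simplified sample markup (an assumption, not the real demo page source)
sample_html = (
    '<form><div class="g-recaptcha" '
    'data-sitekey="6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-"></div></form>'
)
found_keys = find_site_keys(sample_html)
print(found_keys)
```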

Okay, we’re all set, all we need is the code!

Getting started

Let's use the requests library for this. First we define some constants:

import time
import requests

TOKEN = '50a07xxxxxxxxxxxxxxxxxxxxxxxxxf78'  # Please replace with your own TOKEN
REFERER = 'https://www.google.com/recaptcha/api2/demo'
BASE_URL = 'http://api.yescaptcha.365world.com.cn'
SITE_KEY = '6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-'  # Please replace with your own SITE_KEY


Here we define several constants:

  • TOKEN: the Token copied from the YesCaptcha website

  • REFERER: the link to the demo site

  • BASE_URL: the API address of YesCaptcha

  • SITE_KEY: the data-sitekey we found just now

Then we define a method to create a task:

def create_task():
    # Build the "create task" URL from the constants defined above
    url = f"{BASE_URL}/v3/recaptcha/create?token={TOKEN}&siteKey={SITE_KEY}&siteReferer={REFERER}"
    try:
        response = requests.get(url)
        if response.status_code == 200:
            data = response.json()
            print('response data:', data)
            # The task ID is needed later to query the task status
            return data.get('data', {}).get('taskId')
    except requests.RequestException as e:
        print('create task failed', e)

This simply calls the API to create a task and returns the task ID, so there is not much to explain.
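One detail worth sketching: the referer is itself a URL, so it is safer to percent-encode the query string than to paste the values in directly. The snippet below uses urllib.parse.urlencode from the standard library; the function name build_create_url and the placeholder token are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_create_url(base_url, token, site_key, referer):
    """Return the create-task URL with every query value percent-encoded."""
    query = urlencode({
        "token": token,
        "siteKey": site_key,
        "siteReferer": referer,
    })
    return f"{base_url}/v3/recaptcha/create?{query}"


create_url = build_create_url(
    "http://api.yescaptcha.365world.com.cn",
    "my-token",  # placeholder, not a real Token
    "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
    "https://www.google.com/recaptcha/api2/demo",
)
print(create_url)
```

With requests, passing the same dictionary through the params argument of requests.get achieves the same encoding.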

Next, we define a method that polls the task status using the task_id:

def polling_task(task_id):
    url = f"{BASE_URL}/v3/recaptcha/status?token={TOKEN}&taskId={task_id}"
    count = 0
    # Poll at most 120 times, roughly once per second
    while count < 120:
        try:
            response = requests.get(url)
            if response.status_code == 200:
                data = response.json()
                print('polling result', data)
                status = data.get('data', {}).get('status')
                print('status of task', status)
                if status == 'Success':
                    # 'response' holds the token produced by solving the captcha
                    return data.get('data', {}).get('response')
        except requests.RequestException as e:
            print('polling task failed', e)
        finally:
            # Runs after every attempt, including the successful one
            count += 1
            time.sleep(1)

This calls the API for querying the task status and gets back the current status of the task. If the status is Success, the task has succeeded, and the response field contains the token obtained by solving the captcha.
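The polling pattern itself can be separated from the HTTP call. The sketch below is a generic variant that stops at a time-based deadline instead of counting attempts; poll_until and its parameters are hypothetical names, and the fake status source only stands in for the real API so the example runs offline.

```python
import time


def poll_until(fetch_result, timeout=120.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch_result until it returns something other than None or time runs out."""
    deadline = clock() + timeout
    while clock() < deadline:
        result = fetch_result()
        if result is not None:
            return result
        sleep(interval)
    # The deadline passed without a result
    return None


# Fake status source: two "still working" answers, then a token
answers = iter([None, None, "fake-token"])
polled = poll_until(lambda: next(answers), timeout=5.0, interval=0.0,
                    sleep=lambda seconds: None)
print(polled)
```

In real use, fetch_result would wrap the status request and return the response field once the status is Success.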

Finally, we call the two methods:

if __name__ == '__main__':
    task_id = create_task()
    print('create task successfully', task_id)
    response = polling_task(task_id)
    print('get response:', response[0:40] +'... ')

The result is similar to the following:

response data: {'status': 0, 'msg': 'ok', 'data': {'taskId': '1479436991'}}
create task successfully 1479436991
polling result {'status': 0, 'msg': 'ok', 'data': {'status': 'Working'}}
status of task Working
polling result {'status': 0, 'msg': 'ok', 'data': {'status': 'Working'}}
status of task Working
polling result {'status': 0, 'msg': 'ok', 'data': {'status': 'Working'}}
status of task Working
polling result {'status': 0, 'msg': 'ok', 'data': {'status': 'Success', 'response': '03AGdBq27-ABqvNmgq96iuprN8Mvzfq6_8noknIe...'}}

If data comes back in this format, the captcha has been solved successfully. The content of the response field is the resulting token, and all we need to do is put that token into the form and submit it.
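Reading the token out of a payload shaped like the output above can be sketched as follows. The field names mirror that output; the helper name extract_token and the shortened sample values are made up for illustration.

```python
def extract_token(payload):
    """Return the token if the task succeeded, otherwise None."""
    data = payload.get("data") or {}
    if data.get("status") == "Success":
        return data.get("response")
    return None


# Sample payloads modeled on the output format shown above (values are fake)
working = {"status": 0, "msg": "ok", "data": {"status": "Working"}}
success = {"status": 0, "msg": "ok",
           "data": {"status": "Success", "response": "fake-token"}}
print(extract_token(working))
print(extract_token(success))
```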

How do we use this token? When we verify in a browser and then click submit, the captcha script assigns a value to a textarea in the form whose name is g-recaptcha-response. If the verification succeeded, its value is the verified token, and it is sent to the server for validation as part of the form submission. If the server validates this field successfully, everything is fine.

So the process above simulates clicking the captcha for us, and the resulting token is exactly the value that should be assigned to the textarea named g-recaptcha-response. How do we assign it? It's easy with JavaScript: we select the textarea and assign the value directly, as follows:

document.getElementById("g-recaptcha-response").innerHTML="TOKEN_FROM_YESCAPTCHA";

Note that TOKEN_FROM_YESCAPTCHA must be replaced with the token we just obtained. When simulating a login in a crawler with a tool such as Selenium or Puppeteer, we only need to execute this JavaScript to assign the value, and then submit the form directly. Let's look at the network request:
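With Selenium, this step might look roughly like the sketch below. It relies on execute_script, which passes extra arguments to the script as arguments[0], arguments[1], and so on. The helper name inject_token is hypothetical, and a small fake driver stands in for a real browser so that the example runs without Selenium installed.

```python
INJECT_JS = (
    'document.getElementById("g-recaptcha-response").innerHTML = arguments[0];'
)


def inject_token(driver, token):
    """Write the token into the textarea through the driver's execute_script."""
    driver.execute_script(INJECT_JS, token)


class FakeDriver:
    """Stand-in that records calls instead of driving a browser."""

    def __init__(self):
        self.calls = []

    def execute_script(self, script, *args):
        self.calls.append((script, args))


driver = FakeDriver()
inject_token(driver, "TOKEN_FROM_YESCAPTCHA")
print(driver.calls[0][1])
```

Passing the token as an argument rather than formatting it into the script string avoids quoting problems if the token contains unusual characters.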

You can see that it simply submits a form, and one of its fields is g-recaptcha-response. This field is sent to the server for validation, and if it passes, the submission succeeds. So if we get the token from YesCaptcha, assign it to the textarea, and submit the form, a valid token lets us pass the check without clicking the captcha at all. We end up with the following success page:

Of course we can also use Requests to simulate completing form submission:

def verify(response):
    url = "https://www.google.com/recaptcha/api2/demo"
    # Submit the token in the same form field the browser would use
    data = {"g-recaptcha-response": response}
    response = requests.post(url, data=data)
    if response.status_code == 200:
        return response.text

One final refinement of the call:

if __name__ == '__main__':
    task_id = create_task()
    print('create task successfully', task_id)
    response = polling_task(task_id)
    print('get response:', response[0:40] +'... ')
    result = verify(response)
    print(result)

The running results are as follows:

response data: {'status': 0, 'msg': 'ok', 'data': {'taskId': '1479436991'}}
create task successfully 1479436991
polling result {'status': 0, 'msg': 'ok', 'data': {'status': 'Working'}}
status of task Working
polling result {'status': 0, 'msg': 'ok', 'data': {'status': 'Working'}}
status of task Working
polling result {'status': 0, 'msg': 'ok', 'data': {'status': 'Working'}}
status of task Working
polling result {'status': 0, 'msg': 'ok', 'data': {'status': 'Success', 'response': '03AGdBq27-ABqvNmgq96iuprN8Mvzfq6_8noknIe...'}}
status of task Success
get response: 03AGdBq27-ABqvNmgq96iuprN8Mvzfq6_8noknIe...
<!DOCTYPE HTML><html dir="ltr"><head><meta http-equiv="content-type" content="text/html; charset=UTF-8"><meta name="viewport" content="width=device-width, user-scalable=yes"><title>ReCAPTCHA demo</title><link rel="stylesheet" href="https://www.gstatic.com/recaptcha/releases/TbD3vPFlUWKZD-9L4ZxB0HJI/demo__ltr.css" type="text/css"></head><body><div class="recaptcha-success">Verification Success... Hooray!</div></body></html>

Finally, you can see that after the simulated submission, the result contains the text “Verification Success… Hooray!”, which means the verification succeeded.
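Rather than reading the HTML by eye, a script can check for the success marker. The sketch below looks for the recaptcha-success class seen in the output above; the function name is made up, and this check is only known to fit the demo page, since other sites will signal success differently.

```python
def is_demo_verified(html):
    """Return True if the demo page's success marker is present."""
    return 'class="recaptcha-success"' in html


# Shortened sample pages (the failure page is an invented example)
success_page = '<body><div class="recaptcha-success">Verification Success</div></body>'
failure_page = '<body><div class="recaptcha-error">Please try again</div></body>'
print(is_demo_verified(success_page))
print(is_demo_verified(failure_page))
```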

At this point, we have successfully cracked ReCAPTCHA.

The implementation above uses Requests, but the same can of course be done with Selenium and other tools. Specific demos are given in the documentation, so please refer to it for details.

A small bonus

I think YesCaptcha is much more affordable than the 2Captcha service I introduced before. One solve costs 10 points, and 10 yuan buys 10,000 points, so each solve costs about 0.01 yuan. New users are given 1,000 points, which is enough for 100 solves. Personally, I think it's a bargain.
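The arithmetic behind those figures can be written out as a quick check. The numbers are the ones quoted above; the function and constant names are just for illustration.

```python
POINTS_PER_SOLVE = 10      # points charged per solve
POINTS_PER_PACK = 10_000   # points bought for one pack
YUAN_PER_PACK = 10         # price of that pack in yuan
FREE_POINTS = 1_000        # points given to new users


def cost_per_solve_yuan():
    # Solves covered by one pack, then the price spread over them
    solves_per_pack = POINTS_PER_PACK // POINTS_PER_SOLVE
    return YUAN_PER_PACK / solves_per_pack


def free_solves():
    return FREE_POINTS // POINTS_PER_SOLVE


print(cost_per_solve_yuan())
print(free_solves())
```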

You can have a try!

That's it, I'm off ~

For more content, please follow my public accounts “Attack Coder” and “Cui Qingcai | Jingmi”.