To be clear, this post is not really about cracking the 12306 captcha, even though the Dragon Boat Festival holiday is just around the corner (and you know why that matters).

Today we’ll take cracking the world’s most popular WordPress captcha plug-in as an example.


Everyone hates captchas, right? These annoying little images contain text that you have to type in before you can visit a site. Captcha systems are designed to verify that the user visiting a website is a real person. But with advances in deep learning and computer vision, many of them are now easy to beat. (Unless you run into 12306’s notorious picture captcha, where the despair is sometimes real.)

In his book Deep Learning for Computer Vision with Python, Adrian walks through breaking a website’s captcha. He didn’t have access to the source code of the tool that generated the captcha images, so he had to download hundreds of sample images and solve them by hand to train his machine learning system.

But what if we wanted to hack an open source captcha system where we had access to the source code?

I searched for “captcha” in the WordPress.org plugin directory, and the first result was “Really Simple CAPTCHA”, with over 1 million active installs:

Can we crack the whole captcha system in 15 minutes? Give it a try!

For the record: this is in no way intended to criticize the “Really Simple CAPTCHA” plug-in or its authors. The author of the plugin himself has said that this plugin is not very secure and suggests switching to another plugin. So it was a fun, quick technical challenge. However, if you’re a user of this plugin, you might want to switch.

The challenge

Before we “attack”, let’s see what kind of captcha images Really Simple CAPTCHA produces. On the demo site, we see this:

OK, so the captcha appears to be a combination of four letters and numbers. Let’s verify this in the plugin’s PHP source code:

public function __construct() {
	/* Characters available in images */
	$this->chars = 'ABCDEFGHJKLMNPQRSTUVWXYZ23456789';

	/* Length of a word in an image */
	$this->char_length = 4;

	/* Array of fonts. Randomly picked up per character */
	$this->fonts = array(
		dirname( __FILE__ ) . '/gentium/GenBkBasR.ttf',
		dirname( __FILE__ ) . '/gentium/GenBkBasI.ttf',
		dirname( __FILE__ ) . '/gentium/GenBkBasBI.ttf',
		dirname( __FILE__ ) . '/gentium/GenBkBasB.ttf',
	);

That’s right: it generates four-character captchas, drawing each character in one of four randomly chosen fonts. We can also see that it never uses the letters “O” and “I”, or the digits “0” and “1”, because they are easy for users to confuse. That leaves 24 letters and 8 digits, for a total of 32 possible characters to recognize. No problem!
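As a quick sanity check on the 32-character count, here is a small Python sketch that tallies the character set copied from the PHP constructor above. The variable names are ours, not the plugin’s.

```python
import string

# Character set copied verbatim from the plugin's PHP constructor.
CAPTCHA_CHARS = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"

letters = [c for c in CAPTCHA_CHARS if c.isalpha()]
digits = [c for c in CAPTCHA_CHARS if c.isdigit()]

# Characters from A-Z and 0-9 that the plugin leaves out.
omitted = sorted(set(string.ascii_uppercase + string.digits) - set(CAPTCHA_CHARS))

print(len(letters), len(digits), len(CAPTCHA_CHARS))
print(omitted)
```

The omitted list shows exactly which look-alike characters the plugin avoids.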

Time so far: 2 minutes

Our toolset

Before we start the challenge, let’s talk about the tools we’ll use:

Python 3

Python is an interesting programming language, with many good libraries for machine learning and computer vision.

OpenCV

OpenCV is a popular framework for computer vision and image processing. We’re going to use OpenCV to process captcha photos. It has a Python API, so we can use it directly from Python.

Keras

Keras is a deep learning framework written in Python. With Keras, it is easy to define, train, and use deep neural networks with very little code.

TensorFlow

TensorFlow is Google’s library for machine learning. We’ll write our code against Keras, but Keras doesn’t implement the neural network logic itself. Instead, it uses Google’s TensorFlow library behind the scenes to do the heavy lifting.

OK, let’s go back to the challenge!

Create our data set

Any machine learning system needs training data. To break a captcha system, we want training data that looks like this:

Given a captcha image as input, the model should learn to output the correct answer.

Because we had the source code for the WordPress plug-in, we were able to modify it and save 10,000 captcha photos, along with the expected answers for each photo.

After a few minutes of tinkering with the code and adding a simple for loop, I got a training data folder containing 10,000 PNG photos, each with the correct answer as the file name of the photo:

This is the only part of this article where I didn’t give examples of my working code, because on reflection I decided that this tutorial was purely a fun idea and I didn’t want anyone to actually spam the WordPress site. But I will provide the 10,000 photos I created above at the end of this post so you can use them.
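Since each image is named after its answer, reading the label back is a one-liner. A minimal sketch, assuming file names of the form `<answer>.png`; the folder name and the helper `label_from_filename` are hypothetical and only used for illustration.

```python
import os

def label_from_filename(path):
    """Return the captcha text encoded in a file name like '2A2X.png'."""
    filename = os.path.basename(path)
    return os.path.splitext(filename)[0]

# Hypothetical path, used only for illustration.
example = os.path.join("generated_captcha_images", "2A2X.png")
print(label_from_filename(example))
```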

Time so far: 5 minutes

Simplify things

Now that we have our training data, we could use it directly to train a neural network:

With enough training data, this should work, but we can make the problem easier. The simpler the problem, the less training data and computational power required to solve it. After all, we only have 15 minutes!

Fortunately, the captcha images generated by the plugin are always exactly four characters long. If we can somehow split each image so that every letter becomes its own small image, we only need to train the neural network to recognize a single letter at a time:

So we start with a raw captcha image:

We then convert it to a pure black-and-white image (this is called thresholding) so that the continuous regions in the image are easy to find:

But wait, I found a problem! Sometimes the letters in a captcha overlap, as in this case:

This means we can end up extracting two letters as a single region:

If we don’t fix this, the quality of our training data will be terrible.

Here’s a simple trick: if a contour region is much wider than it is tall, it probably contains two overlapping letters. In that case, we just split it down the middle and treat the halves as two separate letters:
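The width-versus-height trick could be sketched as follows. The cut-off ratio of 1.25 is an assumption chosen for illustration (the text only says “much wider”), and `split_if_wide` is a hypothetical helper name.

```python
def split_if_wide(box, max_ratio=1.25):
    """Split a bounding box in half when it is much wider than it is tall.

    box is (x, y, width, height). max_ratio is an assumed cut-off; tune it
    for your own data.
    """
    x, y, w, h = box
    if w / h > max_ratio:
        half = w // 2
        # Two side-by-side boxes covering the left and right halves.
        return [(x, y, half, h), (x + half, y, w - half, h)]
    return [box]

print(split_if_wide((10, 4, 40, 20)))  # wide region: split into two boxes
print(split_if_wide((10, 4, 14, 20)))  # narrow region: returned unchanged
```

Splitting down the middle is crude, but it only has to be good enough to give the network mostly clean single-letter samples.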

Now that we can reliably extract individual letters, let’s run the extraction over all the captcha images. The goal is to collect many variations of each letter. To keep things organized, we can save each letter in its own folder.
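One way to organize the output is a folder per character with a running counter for the file names. This is only a sketch: `next_letter_path` and the zero-padded naming scheme are assumptions, and the actual write (for example with `cv2.imwrite`) is left out.

```python
import tempfile
from pathlib import Path

def next_letter_path(output_root, letter, counts):
    """Return the next file path for `letter`, e.g. <root>/W/000001.png."""
    counts[letter] = counts.get(letter, 0) + 1
    folder = Path(output_root) / letter
    folder.mkdir(parents=True, exist_ok=True)
    return folder / f"{counts[letter]:06d}.png"

counts = {}
with tempfile.TemporaryDirectory() as root:
    first = next_letter_path(root, "W", counts)
    second = next_letter_path(root, "W", counts)
    other = next_letter_path(root, "2", counts)
    print(first.parent.name, first.name, second.name, other.parent.name)
```

Keeping a separate counter per character means every folder ends up with its own sequence of numbered samples.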

For example, here’s what the “W” folder looks like after I extract all the letters:

Time so far: 10 minutes

Build and train the neural network

Because we only need to recognize individual letters and numbers, we don’t need a very complex neural network architecture. Recognizing letters is much easier than recognizing complicated images such as photos of dogs and cats. We will use a simple convolutional neural network with two convolutional layers and two fully connected layers:

If you want to learn more about how convolutional neural networks work and why they are ideal for image recognition, check out Adrian’s book Deep Learning for Computer Vision with Python.

Using Keras, it takes only a few lines of code to define this neural network architecture:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Build the neural network!
model = Sequential()

# First convolutional layer with max pooling
model.add(Conv2D(20, (5, 5), padding="same", input_shape=(20, 20, 1), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# Second convolutional layer with max pooling
model.add(Conv2D(50, (5, 5), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# Hidden layer with 500 nodes
model.add(Flatten())
model.add(Dense(500, activation="relu"))

# Output layer with 32 nodes (one for each possible letter or number we predict)
model.add(Dense(32, activation="softmax"))

# Ask Keras to build the TensorFlow model behind the scenes
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

Now we can train it!

# Train the neural network
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), batch_size=32, epochs=10, verbose=1)
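The training call expects image and label arrays such as X_train and Y_train, but the text does not show how they were built. Here is one possible sketch, assuming the letter images have already been resized to 20x20 grayscale and that output column i corresponds to the i-th character of the plugin’s character set. That ordering and the helper `prepare_batch` are our assumptions, and the split into training and test sets is not shown.

```python
import numpy as np

CAPTCHA_CHARS = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"

def prepare_batch(letter_images, letter_labels):
    """Turn 20x20 uint8 letter images and their labels into network inputs."""
    # Scale pixels to [0, 1] and add the trailing channel axis: (N, 20, 20, 1).
    X = np.asarray(letter_images, dtype="float32") / 255.0
    X = X[..., np.newaxis]
    # One-hot encode: row i has a 1 in the column of label i's character.
    indices = [CAPTCHA_CHARS.index(ch) for ch in letter_labels]
    Y = np.eye(len(CAPTCHA_CHARS), dtype="float32")[indices]
    return X, Y

# Stand-in data: three blank 20x20 images labelled "A", "2" and "Z".
images = np.zeros((3, 20, 20), dtype=np.uint8)
X, Y = prepare_batch(images, ["A", "2", "Z"])
print(X.shape, Y.shape)
```

The shapes line up with the network definition: 20x20x1 inputs and 32 output classes.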

After 10 passes over the training data set, we reached nearly 100 percent accuracy. At this point, we should be able to bypass this captcha automatically whenever we want! Mission accomplished!

Time so far: 15 minutes (Bingo!)

Using the trained model to solve captchas

Now that we have a trained neural network, it’s very easy to crack real captchas:

  • Grab a real captcha image from a site that uses this WordPress plugin.

  • Split the captcha image into four separate letter images, using the same approach we used to create the training data set.

  • Ask our neural network to make a separate prediction for each letter image.

  • Use the four predicted letters as the answer to the captcha.

  • Celebrate!
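The prediction step in the list above can be sketched as follows. It assumes the network outputs one row of 32 class probabilities per letter, with columns in the same order as the plugin’s character set. That ordering is our assumption, and the probabilities below are faked so the sketch runs without a trained model.

```python
import numpy as np

CAPTCHA_CHARS = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"

def decode_predictions(probabilities):
    """Map one row of class probabilities per letter to the captcha text."""
    best = np.argmax(np.asarray(probabilities), axis=1)
    return "".join(CAPTCHA_CHARS[i] for i in best)

# Fake network output for four letters; a real run would instead use
# model.predict(letter_batch) on four 20x20 letter images.
fake = np.zeros((4, 32))
for row, ch in enumerate("2A2X"):
    fake[row, CAPTCHA_CHARS.index(ch)] = 0.9
print(decode_predictions(fake))
```

Because each letter is predicted independently, a single wrong letter makes the whole answer wrong, which is why the per-letter accuracy needs to be very high.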

Here’s what our model looks like when cracking a real captcha system:

Try it yourself!

If you want to try it yourself, you can click here to get the code, which includes the 10,000 sample images used in this article and all the code for each step. See the readme.md file in the folder for instructions on running the code.

But if you really want to learn what every line of code does, I highly recommend Deep Learning for Computer Vision with Python, which is full of details and worked examples. It’s the only book I’ve seen that explains both the principles and how to apply them to real problems. Check it out!

