Yesterday on Juejin I came across the recommended article "From zero to developing a lightweight sliding verification code plugin," which covered some concepts related to captchas. Coincidentally, just two weeks ago my company held a crawler attack-and-defense contest that required cracking several kinds of machine verification, one of which was slider verification.

In this article I'll show you how to crack a slider captcha with code. There is a link to the source code at the end of the article; feel free to grab it, but remember to like, favorite, and share first!

The approach

Open the page

A slider captcha is a type of sliding captcha. Generating one takes at least three steps:

  1. Fetch the captcha image from the backend according to the user id
  2. Listen for mouse events on the client and send them back to the backend
  3. The backend judges whether the events look genuine and returns the verification result
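The three steps above can be sketched as a toy model (a minimal illustration only, not any real captcha vendor's API; `generateCaptcha`, `verifyEvents`, and the tolerance value are all made up for this sketch):

```javascript
// A toy model of the three-step flow: the backend generates a notch
// position per user, the client reports mouse events, and the backend
// checks whether the reported end position matches the notch.
const sessions = new Map()

// Step 1: the backend picks a random notch position for this user id.
function generateCaptcha (userId) {
  const notchLeft = 45 + Math.floor(Math.random() * 200) // px
  sessions.set(userId, notchLeft)
  return { imageUrl: `/captcha/${userId}.jpg` } // notchLeft stays server-side
}

// Step 2 happens on the client: it records mouse events and posts them back.

// Step 3: the backend verifies the trajectory against the stored notch.
function verifyEvents (userId, events, tolerance = 5) {
  const notchLeft = sessions.get(userId)
  if (notchLeft === undefined || events.length === 0) return false
  const finalX = events[events.length - 1].x
  return Math.abs(finalX - notchLeft) <= tolerance
}
```

The crucial detail is that the notch coordinate never leaves the server, which is exactly what makes the cracking below necessary.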

Whatever the details of the generation mechanism, a slider has to be displayed on a page, so we use Puppeteer to open one. Take the react-Slider-Vertify website as an example.

Puppeteer is a library that controls a browser through the DevTools protocol. With very little code, it can drive a real browser to do things like crawling, automated testing, web-page performance analysis, and browser-extension testing. It is easy to use and quite well documented. According to the docs, before opening a page you need to launch a browser instance and then call the newPage method to create a new page. The core code is as follows.

const puppeteer = require('puppeteer')

puppeteer.launch().then(async browser => {
  const page = await browser.newPage()
  await page.goto('http://h5.dooring.cn/slider-vertify/vertify')
})

In normal web browsing you rarely run into slider verification, because it is a long path and costs the user real time. Even if the copy says "Congratulations, 0.9s, faster than 99% of users," counting the up-front script requests, the image loading, the user's slide, and the postback validation, the whole round trip probably costs you more than nine seconds. It is defensive by nature: it usually pops up when the server is rate limiting, or when the server already suspects you are a crawler and wants further proof. Generally speaking, crawler code can assume slider verification is absent by default, but in this article we assume the captcha is always present.

Next, analyze the user's behavior. Solving a slider captcha comes down to determining the location of the notch and moving the mouse, which can be further broken down into mouse click events and mouse move events. The logic is roughly as follows.

  1. Wait until the verification code image is loaded
  2. Move the mouse to the slider position
  3. Press the mouse
  4. Move the mouse to the notch position
  5. Release the mouse
  6. Wait for results to return
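The six steps can be strung together in one driver function. Here is a sketch against Puppeteer-style `mouse` methods; `selectors` and `findNotchLeft` are placeholders I made up, the latter standing for whatever notch-detection strategy you plug in, and the naive one-shot `mouse.move` here is exactly what the later sections refine:

```javascript
// Drives the six steps: wait for the image, press on the slider handle,
// drag to the notch, release, then wait for the verdict.
async function solveSlider (page, selectors, findNotchLeft) {
  // 1. Wait until the captcha image has loaded.
  await page.waitForSelector(selectors.image)

  // 2–3. Move to the center of the slider handle and press the mouse.
  const handle = await page.$(selectors.handle)
  const box = await handle.boundingBox()
  const startX = box.x + box.width / 2
  const startY = box.y + box.height / 2
  await page.mouse.move(startX, startY)
  await page.mouse.down()

  // 4. Move to the notch position (x offset supplied by the detector).
  const notchLeft = await findNotchLeft(page)
  await page.mouse.move(startX + notchLeft, startY)

  // 5. Release the mouse.
  await page.mouse.up()

  // 6. Wait for the verification result to come back.
  await page.waitForSelector(selectors.successFlag)
}
```

Only `waitForSelector`, `$`, `boundingBox`, and the `mouse` methods are real Puppeteer APIs; everything passed in is an assumption of this sketch.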

OK, so we know the process, but the problem is: how do we know where to move the mouse? The server returns an image that already contains the notch, and the notch's coordinates are not exposed through any interface.

Some of you might also ask: how exactly should the mouse move? If I jump the slider to the target position in one go, won't the server flag me as a crawler immediately? Let's answer these two questions one by one.

Determine the position

To work out the location of the gap, we first have to know how the gap is drawn. Opening the console for a quick look, we find that the page first fetches a complete image from the server, and the gap position is then generated randomly by JS. Well... that's because our test page is a documentation demo, so don't read too much into these small security details.

Continuing the line of thought: since the notch is a hole cut out of the original image, we just need to locate that hole in the image. A simpler option is to use a third party's image recognition (or related) service: upload the image and get back the relative coordinates of the notch directly. The figure below shows the image-segmentation output of Ali Cloud's alimagic.

If you want a more stable captcha-cracking tool, I suggest training your own model or writing your own algorithm. Third-party image recognition is not specially trained on captchas, so it is not mature enough for crawler use: once the background image gets complex, the recognition rate drops quickly.

Here is another idea. We already have the complete image, so we can just keep sliding and comparing the current screenshot with the original. In theory, as long as the slide lands roughly on "that spot" and nothing looks off to the naked eye, we are done. Take the picture below as an example.

For the screenshots, use Puppeteer's screenshot function, which provides an API to capture a specific element precisely.

As for the image comparison: after some initial preprocessing, the two images are compared pixel by pixel, and two pixels whose color difference exceeds a threshold are considered different. For simplicity, we go straight to the open-source library Rembrandt, which returns the number of differences between two images.
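Under the hood the comparison can be as simple as the following (a hand-rolled sketch of the idea, not Rembrandt's actual implementation; it assumes the images are already decoded into flat RGB arrays):

```javascript
// Count how many pixels differ between two decoded RGB images.
// Each image is a flat array [r, g, b, r, g, b, ...] of equal length.
// Two pixels count as different when their summed channel delta
// exceeds the threshold.
function countDifferences (pixelsA, pixelsB, threshold = 30) {
  let differences = 0
  for (let i = 0; i < pixelsA.length; i += 3) {
    const delta =
      Math.abs(pixelsA[i] - pixelsB[i]) +
      Math.abs(pixelsA[i + 1] - pixelsB[i + 1]) +
      Math.abs(pixelsA[i + 2] - pixelsB[i + 2])
    if (delta > threshold) differences++
  }
  return differences
}
```

The threshold absorbs JPEG compression noise, so tiny encoding artifacts don't register as "real" differences.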

Finally, the slide itself. Since we are simulating a human operation, we manipulate CSS: with absolute positioning, move the slider to the right pixel by pixel, and record the image-comparison result after each move.

The core code for these three steps is very short, just the following lines:

while (left <= maxOffset) {

  /* Use the CSS left property to control the offset of the floating slider */

  await page.evaluate(async ($sliderFloat, left) => {
    $sliderFloat.setAttribute('style', `left: ${left}px`)
  }, $sliderFloat, left)

  /* Take a screenshot, compare with the original image, and save the result to the results array */

  const $panel = await page.$('#Vertify-demo-4 .canvasArea')
  const panelImgBase64 = await $panel.screenshot({
    type: 'jpeg'
  })
  const compareRes = await rembrandt({
    imageA: panelImgBase64,
    imageB: rawImage
  })
  results.push({
    left,
    diff: compareRes.differences
  })

  left += 1
}

Finally, throw results into a chart (here is an ECharts line chart as an example) and you get something like the figure below. The answer is obvious: as the slider moves from left to right and approaches the gap, the screenshot looks more and more like the original, and the pixel difference between the two shrinks. As the slider keeps moving right and leaves the gap behind, the pixel difference between screenshot and original grows again. We just need to find the point with the smallest difference and slide the slider to the corresponding left offset.

As an aside, why is the biggest difference around 3000? Let's do a quick calculation.

The slider is 45 by 45, and its outline covers roughly 2100 pixels; that means the gap plus the slider differ from the original by at most about 4200 pixels in theory. But the slider may overlap the gap's hidden pixels, so say there are 350 pixels of overlap, and the difference at our lowest point is 351; subtract those errors and you get 3499. Well, 3499 is roughly 3000, so let's call the estimate a success (insert dog-head emoji).
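The back-of-the-envelope estimate above, spelled out in code (every pixel count here is the article's rough guess, not a measured value):

```javascript
// Rough estimate of the maximum pixel difference on the chart.
const sliderArea = 2100          // pixels covered by the 45x45 slider's outline (guess)
const maxDiff = sliderArea * 2   // slider shape + notch hole, worst case
const overlap = 350              // assumed overlap between slider and notch
const bestCaseDiff = 351         // residual difference at the lowest point
const estimate = maxDiff - overlap - bestCaseDiff // 3499, same ballpark as ~3000
```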

But that's not the end of it. If you run the code, you'll find it is far too slow! A normal person slides through a captcha in about two seconds, but stepping frame by frame it takes us 40 seconds just to find the bottom of the valley.

Here are several ideas for optimizing efficiency:

  1. Shrink the element and render several copies side by side, so a single screenshot captures many offsets; then just crop once and align.
  2. Take bigger steps, e.g. move 15px at a time to find a local optimum, then search around that local optimum in 1px steps for the true optimum.
  3. Because the comparison curve is roughly V-shaped, the right side of the "V" need not be computed at all.

Combining 1+2+3, I reckon the optimum could be found within 3s, but the code complexity would shoot up. For simplicity, this article only implements scheme 2 for now.

First, move 15px at a time to find a local optimum.

// The notch is never cut right next to the slider's starting position,
// so starting left at 45 pixels saves a lot of computation.
let left = 45;
const max15Offset = 265;
const res15px = [];
while (left <= max15Offset) {
  await setLeft(left);
  const compareRes = await compare();
  res15px.push({
    left,
    diff: compareRes.differences,
  });
  left += 15;
}

Then move 2px at a time to find the optimum, searching within a 20-pixel window around the best left offset found with the 15px step.

const min15pxDiff = Math.min(...res15px.map((x) => x.diff));
const min15pxLeft = res15px.find((x) => x.diff === min15pxDiff).left;

left = min15pxLeft - 12;
const max2Offset = min15pxLeft + 8;
const res2px = [];
while (left <= max2Offset) {
  await setLeft(left);
  const compareRes = await compare();
  res2px.push({
    left,
    diff: compareRes.differences,
  });
  left += 2;
}

This solution will be approximately equal to the optimum. Of course, if that feels unstable, you can search with 1px steps instead.
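Scheme 2 is really just a coarse-to-fine search. Stripped of the screenshots, the logic looks like this (the V-shaped `diffAt` in the test is a synthetic stand-in I made up for the real screenshot comparison):

```javascript
// Coarse-to-fine minimisation: scan in 15px steps, then refine with
// 2px steps in a 20px window around the coarse minimum.
async function findMinimum (diffAt, start = 45, end = 265) {
  const coarse = []
  for (let left = start; left <= end; left += 15) {
    coarse.push({ left, diff: await diffAt(left) })
  }
  const best15 = coarse.reduce((a, b) => (b.diff < a.diff ? b : a))

  const fine = []
  for (let left = best15.left - 12; left <= best15.left + 8; left += 2) {
    fine.push({ left, diff: await diffAt(left) })
  }
  return fine.reduce((a, b) => (b.diff < a.diff ? b : a)).left
}
```

Because the 2px refinement only visits even offsets relative to the coarse minimum, the answer can still be off by a pixel; that is the "approximately equal" caveat above.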

By a rough count, the number of screenshots needed drops from 245 to 23, about a tenth. Don't celebrate too soon, though: testing shows that even with optimization 2, it still takes about 7s to solve the captcha...

Move the mouse

With the notch position solved, moving the mouse should be simple, right?

Puppeteer provides four mouse-related interfaces: mouse.click, mouse.down, mouse.move, and mouse.up. mouse.move moves the pointer straight to a given coordinate. Suppose we want to take about 1s to move the mouse from (100, 100) to (200, 200), using a loop.

const now = {
  x: 100,
  y: 100
}
const target = {
  time: 1000,
  x: 200,
  y: 200,
}
const steps = 10
const step = {
  x: Math.floor((target.x - now.x) / steps),
  y: Math.floor((target.y - now.y) / steps),
  time: target.time / steps
}
while (now.x < target.x) {
  await sleep(step.time)
  now.x += step.x
  now.y += step.y
  await page.mouse.move(now.x, now.y)
}

If my aim in games were as precise as this code, I'd hit whatever I pointed at.

But machines don't tremble, and this code is nothing like a real-world slide! Look at a recording of me sliding by hand, and pay particular attention to where the mouse is during the slide.

  • The Y-axis position of the mouse changes constantly
  • The mouse overshoots on the X-axis before settling back

So let's optimize a little to incorporate these two details.

// Get a random offset
const getRandOffset = (enableNegative = true, max = 3) => {
  const negative = enableNegative
    ? (Math.random() < 0.5 ? -1 : 1)
    : 1
  return Math.floor(Math.random() * max) * negative
}

// Overshoot past the target by a few pixels first,
// then take 100 milliseconds to slide back to the correct position
const targets = [
  {
    time: 1000,
    x: 200 + getRandOffset(false, 15),
    y: 200 + getRandOffset(false, 15),
    steps: 10
  },
  {
    time: 100,
    x: 200,
    y: 200,
    steps: 3
  }
]

// Note that the targets are chained together with a for await loop
for await (const target of targets) {
  const step = {
    x: Math.floor((target.x - now.x) / target.steps),
    y: Math.floor((target.y - now.y) / target.steps),
    time: target.time / target.steps,
  }
  let gap
  while (gap = Math.abs(target.x - now.x), gap > 0) {
    await sleep(step.time)
    // On the last step, slide directly into place with no random offset
    const inOneStep = Math.abs(target.x - now.x) <= Math.abs(step.x);
    if (inOneStep) {
      now.x = target.x;
      now.y = target.y;
    } else {
      now.x += step.x + getRandOffset();
      now.y += step.y + getRandOffset();
    }
    moveMouseTo(now)
  }
}

That solves how to move the mouse. If you also want to model acceleration, user habits, and similar factors, the code gets considerably more complicated; I won't go into that here, and interested readers can explore it on their own.
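If you do want to model acceleration, one common trick is to sample an easing curve instead of stepping linearly. Here is a sketch of that idea (ease-out-quad is just one plausible curve; `humanTrajectory` and its jitter parameter are inventions of this sketch, not part of the article's source):

```javascript
// Generate a humanised trajectory from start to target: fast at first,
// decelerating near the end (ease-out), with small random jitter on y
// for every point except the final one.
function humanTrajectory (start, target, steps = 20, jitter = 2) {
  const points = []
  for (let i = 1; i <= steps; i++) {
    const t = i / steps
    const eased = 1 - (1 - t) * (1 - t) // ease-out-quad
    points.push({
      x: Math.round(start.x + (target.x - start.x) * eased),
      y: Math.round(start.y + (target.y - start.y) * eased +
        (i < steps ? (Math.random() - 0.5) * jitter * 2 : 0))
    })
  }
  return points
}
```

You would then feed each point to `page.mouse.move(p.x, p.y)` with a short sleep in between; the decreasing x deltas read as deceleration in the server's event log.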

The end result looks like this (this GIF forgot to loop, so you may need to open it in a new tab to watch it).

Further reading

Source address: crack-the-slider

I hope this article helps you. I am Bionic Lion. See you next time!

Curious how this article was made? You can find out in my blog project. Stars and follows are welcome, and do visit my online blog too, it's super nice!


  • In "Python crawler: cracking sliding captchas," the author uses edge detection to determine where the gap is.
  • In "Shocking: sliding captchas can be cracked like this," the author uses the trick of "seeing whose face is darkest" to determine where the gap is.

  1. Address of the test page ↩