Background

I worked on a crawler application that scraped the tender notices published by a bidding website and sent them to sales staff and senior executives, filtered by keywords for the products being purchased.

The crawler was written in 2018. Since then the website has been upgraded three times. I dealt with the first two upgrades successfully, but this time the site's anti-crawling measures are too advanced and beyond my ability. After searching Baidu all the way down, I finally found a public account dedicated to crawlers that offers a JS reverse-engineering course for nearly two thousand. I looked at my remaining credit card limit, thought it over quietly, and decided to research it myself.

I have to admit, technology is money. Until today I did not really know the browser's developer tools, and there are many features I had never used. JavaScript is so broad and deep!

Sources panel

Take Chrome as an example: press F12 to open the developer tools. The Sources panel contains three tools commonly used for JS reverse engineering. First, look at the panel itself:

Custom JS fragments

The tab marked ③ in the panel provides a way to add script fragments. Click New snippet and an editing tab appears on the right. After entering the JS snippet, click the button to the right of the red ④ to run the JS directly.

This is much more convenient than writing JS in the Console tab. Anything you would otherwise type into the Console can be written here instead.
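As a small illustration of what a snippet can hold, here is a sketch that lists the external script files a page loads, which is often the first thing to look at when reverse engineering a site. The function name `listScriptSources` is made up for this example; it relies only on the standard `querySelectorAll` call.

```javascript
// Hypothetical helper for a DevTools snippet: collect the URLs of all
// external <script> tags found in the given document.
function listScriptSources(doc) {
  var nodes = doc.querySelectorAll('script[src]');
  var sources = [];
  for (var i = 0; i < nodes.length; i++) {
    // Skip entries whose src is empty.
    if (nodes[i].src) {
      sources.push(nodes[i].src);
    }
  }
  return sources;
}

// Inside a browser snippet `document` exists, so print the list directly.
if (typeof document !== 'undefined') {
  console.log(listScriptSources(document));
}
```

Because the function takes the document as an argument, it can be saved once as a snippet and re-run on any page of the target site.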

Code formatting

The Page tab (②) is the most basic one: it shows the resource files loaded by all of the page's network requests. Taking a jQuery file as an example, a resource file can be formatted directly in this tab.

Filesystem and Overrides

These two features are also very useful for JS reverse engineering. I have not tested them yet, so for now I only know that they exist. I will add this content once I have tried them out properly.

Override Ajax events

The site's latest upgrade hooks its Ajax requests and adds anti-crawling parameters to each request before it is sent. Although the parameter keys are fixed, the process that generates the values is quite complicated. I have tested forged values, but the backend rejects them and returns 400 Bad Request.

Open the target site, add a new snippet, and use the prototype to override the two Ajax methods as follows:

    (function () {
      // Wrap open() so that every call logs its arguments (method, URL, async flag).
      var openoverride = window.XMLHttpRequest.prototype.open;
      window.XMLHttpRequest.prototype.open = function () {
        console.log(arguments);
        return openoverride.apply(this, [].slice.call(arguments));
      };

      // Wrap send() so that every call logs the request body.
      var sendoverride = window.XMLHttpRequest.prototype.send;
      window.XMLHttpRequest.prototype.send = function () {
        console.log(arguments);
        return sendoverride.apply(this, [].slice.call(arguments));
      };
    })();

    // Test request to confirm that both hooks fire.
    var xhr = new window.XMLHttpRequest();
    xhr.open('POST', '/hello', true);
    xhr.send('abc');

    gotoPage(20); // the page's own function, called to trigger its data request

Click Run. Console output:

Both Ajax requests carry extra parameters with fixed keys and unpredictable values, and this is what blocks the crawler. The problem now is that forged parameter values do not get through either. Since these two parameters are generated in the front end, how does the backend use them? Their values also change with every Ajax request. Some sources online say this is an anti-replay measure. I tried simulating it with random strings of the same length as the two parameters, but that did not fool the backend either.
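If the parameters really are an anti-replay measure, one common design is that the front end sends a timestamp plus a signature computed from the timestamp, the request body, and a secret buried in the JS, and the backend recomputes the signature. This is an assumption for illustration, not something confirmed for this site. The sketch below uses a toy FNV-1a hash as a stand-in for whatever the site actually does; `SECRET`, `sign`, and `createVerifier` are all hypothetical names.

```javascript
// Toy 32-bit FNV-1a hash, standing in for the site's unknown signing function.
function fnv1a(text) {
  var hash = 0x811c9dc5;
  for (var i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  // Always return 8 hex characters.
  return ('0000000' + hash.toString(16)).slice(-8);
}

var SECRET = 'hidden-in-obfuscated-js'; // hypothetical secret shared with the backend

// Front end (assumed): derive the per-request parameters.
function sign(body, timestamp) {
  return { ts: timestamp, sig: fnv1a(SECRET + '|' + timestamp + '|' + body) };
}

// Backend (assumed): reject stale, reused, or incorrectly signed requests.
function createVerifier(maxAgeMs) {
  var seen = Object.create(null);
  return function verify(body, params, now) {
    if (Math.abs(now - params.ts) > maxAgeMs) return false; // timestamp too old
    if (seen[params.sig]) return false;                     // signature already used
    var expected = fnv1a(SECRET + '|' + params.ts + '|' + body);
    if (params.sig !== expected) return false;              // forged value
    seen[params.sig] = true;
    return true;
  };
}

// Usage: a genuine request, the same request replayed, and a random forgery.
var verify = createVerifier(60000);
var now = 1700000000000; // fixed time, for illustration only
var genuine = sign('page=20', now);
console.log(verify('page=20', genuine, now));
console.log(verify('page=20', genuine, now));
console.log(verify('page=20', { ts: now, sig: 'deadbeef' }, now));
```

Under these assumptions the genuine request is accepted once and refused when replayed, and a random string of the right length passes only if it happens to equal the real signature. That would match the 400 Bad Request seen here, and it suggests that the generation code itself has to be located rather than imitated.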

This is a big obstacle!
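One possible way to start narrowing this down is to extend the hook so that it records the method, URL, and body of every request and prints a stack trace when `send` is called. The trace shows which code issued the request, which is a starting point for finding where the parameters are generated. This is only a sketch: `installXhrTrace` and its `log` argument are names invented for this example, and the XHR constructor is passed in as an argument so the function can also be tried outside the browser.

```javascript
// Hypothetical helper: wrap open/send on the given XHR constructor and
// report each request through the `log` callback.
function installXhrTrace(XHR, log) {
  var originalOpen = XHR.prototype.open;
  var originalSend = XHR.prototype.send;

  XHR.prototype.open = function (method, url) {
    // Remember method and URL on the instance so send() can report them.
    this.__trace = { method: method, url: url };
    return originalOpen.apply(this, arguments);
  };

  XHR.prototype.send = function (body) {
    var info = this.__trace || {};
    log({ method: info.method, url: info.url, body: body });
    return originalSend.apply(this, arguments);
  };

  // Return a function that removes the hooks again.
  return function restore() {
    XHR.prototype.open = originalOpen;
    XHR.prototype.send = originalSend;
  };
}

// In a browser snippet, log each request together with a stack trace.
if (typeof window !== 'undefined' && window.XMLHttpRequest) {
  installXhrTrace(window.XMLHttpRequest, function (entry) {
    console.log(entry);
    console.trace('XHR send called from');
  });
}
```

After running the snippet, calling the page's own `gotoPage(20)` should print one entry and one stack trace per request. Clicking a frame in the trace opens that file in the Sources panel, where it can be formatted and stepped through with breakpoints.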