Follow the WeChat official account "cloud crawler technology research notes" for more!

Background

I recently noticed that Sogou WeChat quietly took a feature offline on October 29, 2019: it is no longer possible to search by official account name in Sogou search.

As a result, there is no longer a way to search precisely for the latest articles of a given official account, so the Sogou channel can no longer be used to track an account's new articles in real time. I therefore sorted out the approaches currently available for crawling WeChat official accounts:

  1. Sogou WeChat channel (abandoned)
  2. AnyProxy+Appium
  3. X-weike-key (universal key)
  4. Hook WeChat to capture official account pushes

Comparing these approaches:
  • The second approach uses automation tools to simulate a human operating the app. AnyProxy can be swapped for MitmProxy; in other words, it is a man-in-the-middle interception. See Chen Wenguan's blog for how it works in detail. Intercepting an app's traffic as a middleman always carries hidden risks, though.
  • The third is known in the industry as the WeChat universal key: operations such as liking an official account article or fetching its read count all have to go through this key, so black-market operations (fake likes, fake reads) like to use it. Obtaining the key means either reverse engineering the source logic or hooking the app to extract it. At the moment there are no public write-ups on this.
  • The last is what we will do today: hook WeChat's official account push. Logically, a push is just WeChat's servers sending us a message, so we can hook that process and run our own handling logic each time a push arrives. By design it is the most "real-time" of the four.

Hands-on

The hands-on part is based on Brother Four's article, with some improvements. Let's analyze as we go.

First, some common knowledge: in social apps such as WeChat, the chat records we exchange with others are stored locally, which is why we can view them and also clear them. So to intercept WeChat messages, we need to hook WeChat's insert method, that is, the method it uses to write into the database. Where do we start? If you search Baidu for "Hook WeChat", you will find a whole category of content about decrypting WeChat's local database, and one keyword keeps appearing in connection with where WeChat stores data on the phone: EnMicroMsg.db. That is our entry point. We search for this string across the decompiled WeChat source. We use WeChat 6.5.3, and the tool is Jadx. The actual steps are as follows.

[Screenshots omitted. Keywords highlighted in the original walkthrough, in order: EnMicroMsg.db, onSQLExecuted, Help, Insert, Sql, Hook, Apk, Xposed, Content, 1]
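
The idea of hooking the insert method can be illustrated without WeChat or Xposed. The following is a minimal sketch in plain Java that uses a dynamic proxy to observe every insert before passing it on to the real object. The interface MessageStore, the class InsertHookDemo and the table name "message" are hypothetical stand-ins, not WeChat's real classes; in an actual Xposed module the interception would be done by the framework on the method found with Jadx.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the app's database wrapper (an assumption, not WeChat's class).
interface MessageStore {
    long insert(String table, Map<String, String> values);
}

class InsertHookDemo {
    // Rows seen by the hook, in arrival order.
    static final List<Map<String, String>> captured = new ArrayList<>();

    // Wrap a store so that inserts into watchedTable are recorded before they run.
    static MessageStore hookInsert(MessageStore target, String watchedTable) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("insert") && watchedTable.equals(args[0])) {
                @SuppressWarnings("unchecked")
                Map<String, String> row = (Map<String, String>) args[1];
                captured.add(new LinkedHashMap<>(row)); // keep a copy of the row
            }
            // Always forward to the original so the app behaves as before.
            return method.invoke(target, args);
        };
        return (MessageStore) Proxy.newProxyInstance(
                MessageStore.class.getClassLoader(),
                new Class<?>[] {MessageStore.class},
                handler);
    }

    public static void main(String[] args) {
        MessageStore real = (table, values) -> 1L; // pretend the row was written
        MessageStore hooked = hookInsert(real, "message");

        Map<String, String> row = new LinkedHashMap<>();
        row.put("content", "hello");
        hooked.insert("message", row); // recorded
        hooked.insert("other", row);   // ignored by the hook, still forwarded

        System.out.println("captured rows: " + captured);
    }
}
```

The hook only observes and then forwards the call, which mirrors the goal here: read each message as it is written without changing what the app stores.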

In the Reversed field we can see a large block of data made up of garbled characters mixed with field names and values. My guess is that WeChat implements its own decoder internally. Judging from how WeChat has transmitted data in the past, this data is probably XML, and since XML is involved, it can be represented as key-value pairs. Besides the garbled squares, the data also contains useful-looking strings such as "msg.appmsg.mmreader.category.item". Let's do a full-text search for it:

msg.appmsg.mmreader.category.item

[Screenshots omitted. Keywords highlighted in the original walkthrough, in order: Yd, az.Yd, Hook]
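
To make the key-value idea concrete, here is a minimal sketch of how XML could be flattened into dotted keys such as msg.appmsg.mmreader.category.item. It is not WeChat's own decoder. The class name XmlFlattener and the item, item1, item2 suffix for repeated elements are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XmlFlattener {
    // Parse an XML string and return the text of each leaf element keyed by its dotted path.
    static Map<String, String> flatten(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations, since the input comes from outside.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            Map<String, String> out = new LinkedHashMap<>();
            walk(root, root.getTagName(), out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
    }

    private static void walk(Element el, String path, Map<String, String> out) {
        Map<String, Integer> seen = new LinkedHashMap<>(); // how often each child tag occurred
        boolean hasChildElement = false;
        NodeList children = el.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() != Node.ELEMENT_NODE) continue;
            hasChildElement = true;
            String tag = ((Element) n).getTagName();
            int index = seen.merge(tag, 1, Integer::sum) - 1;
            // Assumed convention: repeated siblings become item, item1, item2, ...
            String key = path + "." + tag + (index == 0 ? "" : String.valueOf(index));
            walk((Element) n, key, out);
        }
        if (!hasChildElement) out.put(path, el.getTextContent().trim());
    }

    public static void main(String[] args) {
        String xml = "<msg><appmsg><mmreader><category>"
                + "<item><title>First</title></item>"
                + "<item><title>Second</title></item>"
                + "</category></mmreader></appmsg></msg>";
        flatten(xml).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```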

Conclusion

This is a relatively simple hook; the main value is in the approach to finding what to hook. We treat an official account push as an ordinary message, hook the database insert for messages to get the raw data, then look for where that raw data is decoded, and hook the decoding method to obtain the correctly decoded data. That gives us real-time access to WeChat official account pushes.
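
As a rough sketch of the last step, suppose the hooked decoding method hands us a map with dotted keys like the msg.appmsg.mmreader.category.item string searched for earlier. The articles in one push could then be collected as below. The class PushExtractor, the .title and .url sub-keys and the item, item1, item2 numbering are assumptions for illustration, not confirmed details of WeChat's data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PushExtractor {
    static final String PREFIX = "msg.appmsg.mmreader.category.item";

    // One article of a push; the field set is an assumption.
    static final class Article {
        final String title;
        final String url;

        Article(String title, String url) {
            this.title = title;
            this.url = url;
        }

        @Override
        public String toString() {
            return title + " -> " + url;
        }
    }

    // Collect item, item1, item2, ... until the next index has no title.
    static List<Article> extract(Map<String, String> decoded) {
        List<Article> articles = new ArrayList<>();
        for (int i = 0; ; i++) {
            String base = PREFIX + (i == 0 ? "" : String.valueOf(i));
            String title = decoded.get(base + ".title");
            if (title == null) break;
            articles.add(new Article(title, decoded.get(base + ".url")));
        }
        return articles;
    }

    public static void main(String[] args) {
        Map<String, String> decoded = new LinkedHashMap<>();
        decoded.put(PREFIX + ".title", "First article");
        decoded.put(PREFIX + ".url", "https://example.com/1");
        decoded.put(PREFIX + "1.title", "Second article");
        decoded.put(PREFIX + "1.url", "https://example.com/2");
        for (Article a : extract(decoded)) {
            System.out.println(a);
        }
    }
}
```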

Pitfalls

  1. The code only implements the core function. Dealing with WeChat's risk control and with Xposed detection will probably require extra work.
  2. The project is built on the mobile WeChat client, which must be kept running normally, so stability needs extra consideration.

Note: the project is complete. To get the source code, follow the WeChat account below and reply "Hook wechat public Account" to get the project address as well as a ready-made Apk.

About the author

  • I worked at a second-tier tech company for the past two years and am now grinding away at a start-up

  • My main areas are crawlers and cloud native architecture

  • I have extensive experience in getting past anti-crawling measures and in secondary development on cloud native platforms

  • I have also dabbled in other areas, such as data analytics and growth hacking

  • I have given business talks to audiences of more than 100 people and run training courses many times

  • I am currently also a CSDN blog expert and a Huawei Cloud sharing expert
