Youdao Translate: reverse-engineering the JS signature

Crawl target

Website: Youdao Translate

Tools used

Development tool: PyCharm

Development environment: python3.7, Windows10

Libraries used: requests, random, hashlib, time

Key learning content

  • Sending network requests
  • Debugging JS code
  • Reverse-engineering JS code

Project idea analysis

First, note how the data is loaded: the URL does not change when a translation is requested, so the result is loaded dynamically and has to be found by capturing the network traffic.

Find the corresponding data interface among the captured requests. The request to this interface is a POST request.
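As a rough illustration of what "a POST request with form data" means here, the sketch below builds (but does not send) such a request with the standard library. The URL is the one used in the full script later in this post; the helper name build_post_request and the two-field form are placeholders for illustration only, and the full script uses requests rather than urllib.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the full script later in this post.
URL = "http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule"


def build_post_request(form):
    """Return an unsent POST request carrying `form` as a URL-encoded body."""
    body = urlencode(form).encode("utf-8")
    # Supplying `data` is what makes urllib use POST instead of GET.
    return Request(
        URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
    )


# Placeholder form: only two of the fields that the real request submits.
req = build_post_request({"i": "hello", "doctype": "json"})
print(req.get_method())   # POST
print(req.data.decode())  # i=hello&doctype=json
```

Nothing is sent over the network here, so the sketch is safe to run while experimenting with the form fields.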

The POST data that needs to be submitted

Comparing the data from several requests, it is clear that salt is based on the timestamp, and that only the sign value is encrypted data that changes on every request. To reproduce it, find the encryption rule for this value, which means finding the JS file that does the encryption.
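The comparison described above can also be done in code. This is a minimal sketch, assuming the two captured forms have been copied into dictionaries; the values shown are made up for illustration and are not real captures.

```python
def changed_fields(first, second):
    """Return the sorted names of fields whose values differ between two payloads."""
    keys = set(first) | set(second)
    return sorted(k for k in keys if first.get(k) != second.get(k))


# Hypothetical captures of the same text submitted twice (values are made up).
capture_a = {"i": "hello", "client": "fanyideskweb", "lts": "1602406917270",
             "salt": "16024069172703", "sign": "aaaa"}
capture_b = {"i": "hello", "client": "fanyideskweb", "lts": "1602406921845",
             "salt": "16024069218457", "sign": "bbbb"}

print(changed_fields(capture_a, capture_b))  # ['lts', 'salt', 'sign']
```

Fields that stay the same can be hard-coded; the ones that change are the ones whose generation has to be traced in the JS.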

Set a breakpoint at the place where the encrypted data is generated, then debug the code until execution stops at the breakpoint.

The sign value comes from r.sign, so look for where r is generated.

Locate the place where the data is finally encrypted.

ts is the timestamp.

salt is the timestamp with a random number appended. sign is generated with MD5, so it changes whenever the timestamp changes.
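The rules above can be sketched as one small function. The client name and the secret string are the constants that appear in the full script later in this post; the function name build_sign_fields, the millisecond timestamp, and the 0 to 9 range for the random number are assumptions made for this sketch, and the site may change any of these values at any time.

```python
import hashlib
import random
import time

CLIENT = "fanyideskweb"
SECRET = "]BjuETDhU)zqSxf-=B#7m"  # constant used in this post's script; may be outdated


def build_sign_fields(text, now_ms=None, digit=None):
    """Return the ts, salt and sign fields for `text`.

    `now_ms` and `digit` can be fixed to make the result reproducible.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # millisecond timestamp (assumed)
    if digit is None:
        digit = random.randint(0, 9)      # single random digit (assumed range)
    ts = str(now_ms)
    salt = ts + str(digit)
    # sign = MD5 of client + text + salt + secret, as a hex string
    sign = hashlib.md5((CLIENT + text + salt + SECRET).encode("utf-8")).hexdigest()
    return {"ts": ts, "salt": salt, "sign": sign}


fields = build_sign_fields("hello", now_ms=1602406917270, digit=3)
print(fields["salt"])       # 16024069172703
print(len(fields["sign"]))  # 32
```

Passing now_ms and digit explicitly makes it easy to check the function against a value observed in the browser debugger.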

All that’s left is to write the code

Simple source code analysis

import time
import random
import hashlib

import requests


def main():
    """The main program."""
    url = "http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule"
    headers = {
        'X-Requested-With': 'XMLHttpRequest',
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36',
        'Origin': 'http://fanyi.youdao.com',
        'Referer': 'http://fanyi.youdao.com/',
        'Cookie': 'OUTFOX_SEARCH_USER_ID=-1808168645@10.108.160.208; JSESSIONID=aaaRyVJv8oEwg7dPaWrux; OUTFOX_SEARCH_USER_ID_NCOO=704285648.1294403; ___rl__test__cookies=1602406917270',
    }
    i = input("Please enter the text to translate: ")
    # Millisecond timestamp, as a string
    lts = str(int(time.time() * 1000))
    # salt is the timestamp with one random digit appended
    salt = lts + str(random.randint(0, 9))
    # sign is the MD5 hex digest of client + text + salt + secret string
    content = "fanyideskweb" + i + salt + "]BjuETDhU)zqSxf-=B#7m"
    sign = hashlib.md5(content.encode("utf-8")).hexdigest()
    data = {
        'action': 'FY_BY_CLICKBUTTION',
        'bv': '9caf244986fe6d1de38207408302e500',
        'client': 'fanyideskweb',
        'doctype': 'json',
        'from': 'AUTO',
        'i': i,
        'keyfrom': 'fanyi.web',
        'lts': lts,
        'salt': salt,
        'sign': sign,
        'smartresult': 'dict',
        'to': 'AUTO',
        'version': '2.1',
    }
    response = requests.post(url=url, headers=headers, data=data)
    # The translated text is in the first segment of the first paragraph
    print(response.json()["translateResult"][0][0]["tgt"])


if __name__ == '__main__':
    main()
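The script reads only translateResult[0][0]["tgt"], which is the first segment of the first paragraph. The sketch below is one possible way to collect every segment instead. The helper name extract_translation and the sample response are hypothetical; the sample is shaped after the lookup used in the script and is not a captured response.

```python
def extract_translation(payload):
    """Join the `tgt` text of every segment, one output line per paragraph."""
    lines = []
    for paragraph in payload.get("translateResult", []):
        lines.append("".join(segment.get("tgt", "") for segment in paragraph))
    return "\n".join(lines)


# Hypothetical decoded response (other fields omitted).
sample = {"translateResult": [[{"tgt": "hello"}]]}
print(extract_translation(sample))  # hello
```

If the response has no translateResult key, for example because the sign was rejected, the function returns an empty string instead of raising an exception.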

I am **white and white I**, a programmer who likes to share knowledge ❤️

If you are not familiar with programming, or you want to learn Python, you can leave a comment on this blog.