Go Movies

A movie site built with Golang and populated by a web crawler. Demo site: go-movies.hezhizheng.com/

Source code:
github.com/hezhizheng/…

Installation and usage

    # Download
    git clone https://github.com/hezhizheng/go-movies

    # Enter the directory
    cd go-movies

    # Start
    go run main.go

    # Or, if bee is installed
    bee run

    # Visit
    http://127.0.0.1:8899

Start the crawler

  • Visit http://127.0.0.1:8899/movies-spider directly to start a crawl
  • Resource usage: on Windows, about 10% CPU and 30 MB of memory
  • With a normal network connection, a full crawl takes about 21 minutes (some records fail to be crawled)

Tools

  • Crawler framework: github.com/gocolly/col…
  • Template engine: html/template
  • Database: Redis for caching and persistence, github.com/Go-redis/re…
  • Routing: github.com/julienschmi…
  • JSON: jsoniter, github.com/json-iterat…

The directory structure is modeled on beego's.

TODO

  • [] Cross-platform packaging: the template path is resolved incorrectly
  • [] Goroutine concurrency control
  • [] Completeness of the crawled data
  • [] Redis query problems?

Other

There is still a lot about how Go works that I don't understand, and I'll keep studying it as time and energy allow. The code is a little rough, so please bear with it.



hezhizheng