Contents

  1. Resources
  2. Conclusion

Do you want to collect every Zhihu Daily answer on a given topic? Do you want to see how Zhihu users ranked those answers by likes? Do you want to read the comments users left under them?

This week, in his spare time, the author built a retrieval tool for a Zhihu Daily database. The database contains everything Zhihu Daily published from its launch (2013.05.24) to the present (2016.12.01), including article bodies, popularity figures, and comments. The tool displays the database structure, the contents of each data table, and the details of individual entries, and, most importantly, supports text search and popularity ranking.
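As a rough illustration of what text search combined with popularity ranking can look like in SQLite, here is a minimal Python sketch. The actual schema of zhihudaily.db is not described in this post, so the `answers` table, its `title`, `body`, and `likes` columns, and the `search_ranked` helper are all assumptions made up for the example; it is not the tool's own code.

```python
import sqlite3


def search_ranked(conn, keyword, limit=10):
    """Return (title, likes) rows whose title or body contains `keyword`,
    ordered from most liked to least liked.

    NOTE: the `answers` table and its columns are hypothetical.
    """
    pattern = f"%{keyword}%"
    cur = conn.execute(
        "SELECT title, likes FROM answers "
        "WHERE title LIKE ? OR body LIKE ? "
        "ORDER BY likes DESC LIMIT ?",
        (pattern, pattern, limit),
    )
    return cur.fetchall()


# Build a tiny in-memory database with made-up rows for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE answers ("
    "id INTEGER PRIMARY KEY, title TEXT, body TEXT, likes INTEGER)"
)
conn.executemany(
    "INSERT INTO answers (title, body, likes) VALUES (?, ?, ?)",
    [
        ("How to brew coffee", "Grind the beans first.", 120),
        ("Tea or coffee?", "Both are fine.", 340),
        ("Learning SQL", "Start with SELECT.", 75),
        ("Morning habits", "I drink coffee every day.", 50),
    ],
)
conn.commit()

results = search_ranked(conn, "coffee")
for title, likes in results:
    print(likes, title)
```

The keyword is passed as a bound parameter rather than pasted into the SQL string, which keeps the statement format fixed no matter what the user types.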

To run the tool, copy the database file zhihudaily.db into the folder that contains the executable file, or into the Eclipse project directory.
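One reason the file location matters: SQLite normally creates a new, empty database when the requested file does not exist, so a misplaced zhihudaily.db can look like a database with no tables. The sketch below shows one way a program might guard against that in Python. The `open_database` helper is hypothetical and is not taken from the tool itself.

```python
import sqlite3
from pathlib import Path

DB_NAME = "zhihudaily.db"  # file name given in the text


def open_database(folder="."):
    """Open zhihudaily.db from `folder` in read-only mode.

    Raises FileNotFoundError if the file is missing, instead of letting
    sqlite3 silently create an empty database. (Hypothetical helper.)
    """
    db_path = (Path(folder) / DB_NAME).resolve()
    if not db_path.is_file():
        raise FileNotFoundError(f"{DB_NAME} not found in {db_path.parent}")
    # as_uri() yields a percent-encoded file: URI; mode=ro opens it read-only.
    return sqlite3.connect(db_path.as_uri() + "?mode=ro", uri=True)
```

Calling `open_database()` with no argument looks in the current working directory, which is usually the folder the executable was started from.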

Further data mining will follow…

Without further ado, here are the screenshots.

Database structure diagram

Data table structure diagram

Query result table

Detailed table of data entries

Resources
  • Zhihu Daily database retrieval tool code and executable file: Chao’s GitHub
  • Zhihu Daily database acquisition: Baidu Netdisk, Python crawler
  • SQLite database operation guide: online course
Conclusion
  • Be sure to use search engines;
  • The structure of the software must be clear: have the whole design in mind, then build it step by step;
  • Learn new knowledge by analogy with what you already know;
  • The exact format of database statements is important;
  • In SQLite, DELETE only marks the freed space as reusable and does not shrink the database file, so VACUUM is required to reclaim disk space;
  • Be sure to catch the exceptions your program is able to handle, and handle them appropriately.