This article is participating in Python Theme Month. See the link to the event for more details.

It’s not for nothing that Python is called the most convenient language, and its third-party libraries are incredibly powerful (just kidding).

Here are a few particularly useful third-party libraries to point you in a direction. Readers who want to dig deeper can start from each library's official website.

Web crawler

• Requests [1] The easiest-to-use and simplest HTTP library for fetching pages in a web crawler

• BeautifulSoup [2] The simplest web page parsing library

• PyQuery [3] The most concise web page parsing library

• Scrapy [4] The most popular crawler framework

• PySpider [5] A crawler framework developed in China

• Selenium [6] A browser automation framework that can be used to get around anti-crawling measures

• Scylla [7] An intelligent IP proxy pool for getting around anti-crawling measures

• ShReport [8] Downloads the periodic reports of companies listed on the Shanghai Stock Exchange

• Newspaper [9] A news crawler library that extracts the headline, authors, keywords, and summary from a given URL; some functions support Chinese
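
As a rough illustration of what "web page parsing" means for the libraries above, the sketch below pulls headline links out of a small page with BeautifulSoup. The HTML string, the `headline` class name, and the link paths are made up for this example; in a real crawler the markup would typically come from something like `requests.get(url).text`. It assumes the `beautifulsoup4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical page content; a real crawler would download this instead.
html = """
<html><body>
  <h1>Daily news</h1>
  <ul>
    <li><a class="headline" href="/news/1">First story</a></li>
    <li><a class="headline" href="/news/2">Second story</a></li>
    <li><a href="/about">About us</a></li>
  </ul>
</body></html>
"""

# Parse with Python's built-in HTML parser (no extra parser package needed).
soup = BeautifulSoup(html, "html.parser")

# Keep only the links marked with the (made-up) "headline" class.
headlines = [
    (a.get_text(strip=True), a["href"])
    for a in soup.find_all("a", class_="headline")
]
page_title = soup.h1.get_text(strip=True)

print(page_title)
for text, href in headlines:
    print(text, href)
```

The "About us" link is skipped because it lacks the class used as the filter, which is the usual way to separate content links from navigation.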

Databases

• PyMySQL [10] MySQL client library

• sqlite3 [11] Lightweight SQL database (built into the Python standard library)

• PyMongo [12] Library for MongoDB, a non-relational database

• Redis cache database
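
Because sqlite3 ships with Python, it is the easiest of these to try out. The following is a minimal sketch using an in-memory database; the `libs` table and its rows are invented purely for illustration.

```python
import sqlite3

# ":memory:" creates a temporary database that is never written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE libs (name TEXT PRIMARY KEY, category TEXT)")

# Parameterized inserts ("?" placeholders) avoid building SQL by hand.
conn.executemany(
    "INSERT INTO libs (name, category) VALUES (?, ?)",
    [("requests", "crawler"), ("pandas", "analysis"), ("scrapy", "crawler")],
)
conn.commit()

# Query one category and flatten the single-column rows into a list.
rows = conn.execute(
    "SELECT name FROM libs WHERE category = ? ORDER BY name", ("crawler",)
).fetchall()
crawler_libs = [name for (name,) in rows]
print(crawler_libs)

conn.close()
```

Replacing `":memory:"` with a file path would persist the data between runs.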

Data analysis

• Pandas [13] The essential Python data analysis library: reads files, preprocesses data, analyzes it, and stores the results

• Modin [14] A library whose interface syntax is the same as that of pandas

• Dask [15] A library whose interface syntax is the same as that of pandas

• plydata [16] A pipeline syntax library for pandas

• NetworkX [17] Social network analysis library
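
The read, preprocess, analyze, and store workflow mentioned for pandas can be sketched roughly as follows. The CSV content, the column names, and the `score` values are invented for this example, and an in-memory buffer stands in for real files. It assumes pandas is installed.

```python
import io

import pandas as pd

# Made-up CSV data; note the missing score on the second row.
csv_text = (
    "library,category,score\n"
    "requests,crawler,5\n"
    "scrapy,crawler,\n"
    "pandas,analysis,4\n"
)

# Read: parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Preprocess: fill the missing score with 0 and use an integer column.
df["score"] = df["score"].fillna(0).astype(int)

# Analyze: total score per category.
summary = df.groupby("category")["score"].sum().to_dict()
print(summary)

# Store: write the cleaned table back out as CSV text.
out = io.StringIO()
df.to_csv(out, index=False)
```

In practice `pd.read_csv` and `DataFrame.to_csv` would be given file paths rather than `StringIO` buffers.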

Machine learning

• Scikit-Learn [18] The essential machine learning library; supports both supervised and unsupervised algorithms and can be used for text analysis

• Orange3 [19] Point-and-click machine learning software that can be used for text analysis

• Doccano [20] Text data annotation tool

• Label-Studio [21] An excellent text data annotation tool
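
To make the "supervised learning for text analysis" use of Scikit-Learn a little more concrete, here is a small sketch that trains a bag-of-words Naive Bayes classifier. The training sentences and the `pos`/`neg` labels are toy data invented for this example, far too small for a real model, and the choice of classifier is just one reasonable option. It assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labeled data, invented for illustration only.
train_texts = [
    "great library easy to use",
    "love this tool",
    "slow and buggy",
    "hard to use and confusing",
]
train_labels = ["pos", "pos", "neg", "neg"]

# CountVectorizer turns each text into word counts;
# MultinomialNB learns which words go with which label.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Predict labels for two unseen sentences.
predictions = model.predict(["easy and great", "slow and confusing"])
print(list(predictions))
```

The same pipeline shape works with other vectorizers and classifiers from the library, so it is a convenient starting point for experiments.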

Visualization

• Matplotlib [22] The most versatile plotting library in Python, but its syntax is difficult; static figures

• Seaborn [23] A simplified visualization library built on Matplotlib. It can draw common charts; heavy style customization still has to be done together with Matplotlib; static figures

• Plotnine [24] A Python visualization library with ggplot2 syntax, which can be used in conjunction with plydata [25]

• Pyecharts [26] A dynamic visualization library developed and packaged by Chinese developers; documentation in Chinese

• Plotly [27] Dynamic visualization library

• Bokeh [28] Dynamic visualization and chart rendering library

• SciencePlots [29] Plotting for scientific papers, based on Matplotlib

• Datapane [30] Data analysis report generation

• Superset [31] Open-source business intelligence analysis and visualization tool

Text analysis

• NLTK [32] Natural language analysis toolkit; not well suited to Chinese

• spaCy [33] Industrial-strength natural language processing library with Chinese support

• Pattern [34] Natural language processing, network analysis, and visualization library

• Jieba [35] Chinese word segmentation library

• SnowNLP [36]

• Gensim [37] The easiest-to-use and most comprehensive topic modeling library

• cnsenti [38] Chinese sentiment analysis library for analyzing the sentiment of text

• Label-Studio [39] An excellent text data annotation tool

• Doccano [40] Text data annotation tool

• textstat [41] Text readability calculation package (a full set of algorithms, but English only)

• Texthero [42] Text preprocessing, representation, and visualization library; English only

GUI software development

• Tkinter [43] Python’s built-in GUI library

• PySimpleGUI [44] The simplest GUI development library

• PyQt5, PySide [45] Excellent GUI software development libraries

Office automation

• Zmail [46] Library for automating sending, receiving, and managing email

• pywinauto [47] Python automation library for Windows computers

• WeasyPrint [48] Automated generation of PDF reports

• Selenium [49] Browser automation framework that clicks through the browser automatically to get work done

• mkdocx [50]

• python-docx [51] Library for creating and modifying docx files

• python-pptx [52] Library for creating and modifying PPT files

• OpenPyXL [53] Library for working with xlsx files