This article is participating in Python Theme Month. See the link to the event for more details

Hello everyone!! I recently put together a list of useful Python libraries. This is only a preliminary introduction, so if you want to use any of them, please check their official websites. There is too much to cover, and it would not be realistic for me to introduce every one in detail.

Hahaha then let’s get down to business and do it!!

Introduction

To avoid reinventing the wheel, we create objects from classes that libraries already provide wherever we can, often with as little as one line of code. As a result, libraries help us perform important tasks with very little code. I think this is one of the reasons why Python is so popular among us. Feel free to like this article and save it to study later.

Before we begin

I forgot to mention: when you use Python, I recommend installing Anaconda, a Python distribution that makes it easier to manage these third-party libraries. You only really appreciate the benefits of Anaconda once you use it. If you want to use it, the old rule applies: "One, look it up yourself. Two, ask someone else to help you Baidu it!!" That’s a real quote.

1. Python standard library

You might not expect how much functionality is available in the Python standard library. It covers text/binary data processing, mathematical operations, functional programming, file/directory access, data persistence, data compression/archiving, encryption, operating system services, concurrent programming, inter-process communication, network protocols, JSON/XML/other Internet data formats, multimedia, internationalization, GUI, debugging, profiling, and more. Some of the Python standard library modules are listed below.

  • difflib: Tools for computing differences between sequences.

  • collections: Enhanced data structures that build on lists, tuples, dictionaries, and sets.

  • csv: Handles files with comma-separated values.

  • datetime, time: Date and time operations.

  • decimal: Fixed-point and floating-point decimal arithmetic, including currency calculations.

  • doctest: Simple unit tests, with the test calls and expected results embedded in docstrings.

  • json: Handles JSON (JavaScript Object Notation) data for Web services and NoSQL document databases.

  • math: Common mathematical constants and operations.

  • os: Interacts with the operating system.

  • queue: Queue data structures, including a first-in, first-out queue.

  • random: Pseudo-random number generation.

  • re: Regular expressions for pattern matching.

  • sqlite3: SQLite relational database access.

  • statistics: Mathematical statistics functions, such as mean, median, mode, and variance.

  • sys: Command-line argument processing and access to the standard input, output, and error streams.

  • timeit: Timing the performance of small code snippets.

  • string: Common string operations.

  • textwrap: Text wrapping and filling.

  • unicodedata: Unicode character database.

  • stringprep: Internet string preparation.

  • readline: GNU readline interface.

  • rlcompleter: Completion function for GNU readline.
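To give a feel for a few of the modules listed above, here is a minimal sketch that uses collections, decimal, statistics, and difflib together. The sample words, prices, and scores are made up purely for illustration.

```python
from collections import Counter
from decimal import Decimal
import difflib
import statistics

# collections: count how often each word appears
words = "the quick brown fox jumps over the lazy dog the end".split()
word_counts = Counter(words)
top_word, top_count = word_counts.most_common(1)[0]
print(top_word, top_count)

# decimal: exact decimal arithmetic, useful for currency
float_total = 0.1 + 0.2                              # binary floats pick up rounding error
decimal_total = Decimal("0.10") + Decimal("0.20")    # stays exact
print(float_total, decimal_total)

# statistics: basic descriptive statistics
scores = [85, 90, 90, 70, 100]
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
print(mean_score, median_score)

# difflib: similarity ratio between two strings (1.0 means identical)
similarity = difflib.SequenceMatcher(None, "python", "pythons").ratio()
print(round(similarity, 3))
```

None of this needs anything installed beyond Python itself, which is the point of the standard library.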

Python has a large and still rapidly growing open source community of developers from many different fields. The large number of open source libraries in this community is one of the most important reasons for Python’s popularity.

It’s amazing how many tasks can be done with just a few lines of Python code. Some popular data science libraries are listed below.

2. Scientific calculation and statistics

  • NumPy (Numerical Python): Python's built-in list type is easy to use but slow for numerical work. NumPy provides the high-performance ndarray data structure to represent arrays and matrices, as well as operations to process them.
  • SciPy (Scientific Python): SciPy is built on NumPy and adds routines for scientific computing, such as integration, differential equations, additional matrix processing, etc. SciPy is hosted at scipy.org, and NumPy at numpy.org.

  • StatsModels: Provides support for statistical model estimation, statistical testing, and statistical data exploration.

  • IPython: Part of the standard Python toolkit for scientific computing. It ties a lot of things together, somewhat like an enhanced version of the Python shell. IPython is designed to speed up writing, testing, and debugging Python code.
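As a rough illustration of the difference between a plain list and NumPy's ndarray mentioned above, the sketch below doubles some values both ways and then multiplies two small matrices. It assumes NumPy is installed, and the numbers are made up.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]

# With a plain list we loop over the elements ourselves
doubled_list = [p * 2 for p in prices]

# With an ndarray the operation applies to every element at once
price_array = np.array(prices)
doubled_array = price_array * 2

# Two-dimensional arrays represent matrices; @ is matrix multiplication
matrix = np.array([[1, 2], [3, 4]])
identity = np.eye(2, dtype=int)
product = matrix @ identity   # multiplying by the identity leaves the matrix unchanged

print(doubled_list)
print(doubled_array)
print(product)
```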

3. Data processing and analysis

  • Pandas: A very popular data processing library. Pandas takes full advantage of NumPy’s ndarray type, and its two key data structures are Series (one-dimensional) and DataFrame (two-dimensional).
  • [14]: A library that accelerates pandas. Its interface syntax is highly consistent with that of pandas.
  • Dask [15]: Its interface syntax closely follows that of pandas.
  • Plydata [16]: A pipeline-syntax library for pandas.
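The two key pandas structures mentioned above can be sketched as follows. This assumes pandas is installed. The temperatures, cities, and unit counts are invented for the example.

```python
import pandas as pd

# Series: one-dimensional labeled data
temperatures = pd.Series([21.5, 23.0, 19.5], index=["Mon", "Tue", "Wed"])
print(temperatures.max())

# DataFrame: two-dimensional table with named columns
sales = pd.DataFrame(
    {
        "city": ["Beijing", "Shanghai", "Beijing"],
        "units": [3, 5, 4],
    }
)

# Group the rows by city and add up the units sold in each
units_by_city = sales.groupby("city")["units"].sum()
print(units_by_city)
```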

4. Visualization

  • Pyecharts: ECharts is an open source data visualization library from Baidu. With good interactivity and exquisite graphic design, it has been recognized by many developers. Python, on the other hand, is an expressive language that is well suited for data processing. Pyecharts was born where data analysis meets data visualization.
  • Matplotlib: A highly customizable visualization and plotting library. Matplotlib can draw standard charts, scatter plots, bar charts, contour plots, pie charts, vector field plots, grid plots, polar plots, and 3D plots, and can add text annotations.

  • Seaborn: A higher-level visualization library built on Matplotlib. Compared to Matplotlib, Seaborn improves the look and feel, adds more kinds of visualizations, and creates them with less code.
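Here is a minimal Matplotlib sketch of a bar chart with a text annotation, two of the features mentioned above. It assumes Matplotlib is installed and uses the non-interactive "Agg" backend, so the figure is rendered into memory instead of a window. The month names and visit counts are made up.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no display needed
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar"]
visits = [120, 150, 90]

fig, ax = plt.subplots()
bars = ax.bar(months, visits)                 # one bar per month
ax.set_title("Monthly visits (made-up data)")
ax.set_xlabel("Month")
ax.set_ylabel("Visits")
ax.text(1, 150, "peak", ha="center", va="bottom")  # label above the second bar

# Render the figure as PNG bytes in memory
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```

To write the chart to disk instead, pass a file name such as "visits.png" (a hypothetical name) to `fig.savefig`.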

5. Machine learning, deep learning and reinforcement learning

  • Scikit-learn: A top machine learning library. Machine learning is a subset of AI, and deep learning is a subset of machine learning, focusing on neural networks.

  • Keras: One of the easiest deep learning libraries to use. Keras runs on top of TensorFlow (Google), CNTK (Microsoft’s Cognitive Toolkit for deep learning) or Theano (University of Montreal).

  • TensorFlow: Developed by Google, it is one of the most widely used deep learning libraries. TensorFlow works best with GPUs (graphics processing units) or Google’s custom TPUs (Tensor Processing Units). TensorFlow plays a very important role in artificial intelligence and big data analysis, because both have a huge demand for data processing.

  • OpenAI Gym: A library and development environment for developing, testing, and comparing reinforcement learning algorithms.

  • PyTorch: PyTorch is the Python version of Torch, an open-source neural network framework created by Facebook specifically for GPU-accelerated deep neural network (DNN) programming. Torch is a classic tensor library for manipulating multi-dimensional matrix data and is widely used in machine learning and other math-intensive applications. Unlike the static graphs traditionally used by TensorFlow, the graph in PyTorch is dynamic and can change in real time as the computation requires. However, because Torch is built on the Lua language, it has had only a small audience in China and gradually lost users to Python-enabled TensorFlow. As a port of the classic machine learning library Torch, PyTorch gives Python users a comfortable way to write this kind of code.
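To show what working with Scikit-learn from the list above looks like, here is a small sketch that fits a linear regression to made-up data following score = 2 * hours + 1. It assumes scikit-learn and NumPy are installed, and it illustrates the usual fit/predict pattern rather than a realistic model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data: each row is one sample with a single feature
hours = np.array([[1.0], [2.0], [3.0], [4.0]])
scores = 2.0 * hours.ravel() + 1.0   # exact linear relationship, no noise

# Most scikit-learn estimators follow the same fit / predict pattern
model = LinearRegression()
model.fit(hours, scores)

# Predict the score for an unseen input of 5 hours
predicted = model.predict(np.array([[5.0]]))
print(model.coef_, model.intercept_, predicted)
```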

6. Natural language processing

  • Natural Language Toolkit (NLTK): Used to complete natural language processing (NLP) tasks.
  • TextBlob: An object-oriented NLP text processing library, built on the NLTK and Pattern NLP libraries, that simplifies many NLP tasks.
  • Gensim: Offers features similar to NLTK. Commonly used to index a collection of documents and then determine how similar another document is to each document in the index.