PostgreSQL in Practice: Syntax, Deployment, Benchmarking, and Maintenance

Statement

  • This article assumes a cloud-hosted database from the very beginning, so it does not cover building, maintaining, or backing up a production environment yourself

  • This article serves the code-generation project on GitHub/Gitee: sculptor-boot-generator

  • Thanks to the folks in the group for reviewing!

PostgreSQL resource pack

  • Follow the WeChat official account Cd-k8s and type "postgresql" in the chat window

  • You will get 1 set of videos and 9 PDF files

Conclusions

  • MySQL has a larger user base than PostgreSQL, especially in China, and is easier to get started with. But in my experience PostgreSQL's biggest advantage is its plug-ins and the possibilities they open up.

  • The two databases ship different stress-testing tools, so comparing their benchmark numbers head-to-head is not entirely fair. When the same CRUD operations are driven from Java code, there is little difference between the two in everyday (non-extreme) use.

  • PostgreSQL has an advantage in data analysis because it supports a rich set of statistical functions and syntax

  • PostgreSQL is a full-stack database: up to a considerable scale, one database can cover everything

    • So, as a small company, I feel I can rely on PostgreSQL, especially in the early survival phase

Outline of this article

  • About the PostgreSQL official website

  • PostgreSQL and MySQL syntax differences

  • PostgreSQL in Ali Cloud

  • PostgreSQL Docker local deployment

  • PostgreSQL benchmark

  • PostgreSQL indexes

  • PostgreSQL Json and JSONB

  • PostgreSQL execution process analysis

  • PostgreSQL Chinese word segmentation, full text search, custom Chinese word library

  • PostgreSQL routine maintenance

  • Recommended PostgreSQL plug-ins

  • The PostgreSQL-derived ecosystem

About the PostgreSQL official website

  • Website: www.postgresql.org/

  • Download website: www.postgresql.org/download/

  • Official document: www.postgresql.org/docs/

    • Chinese documentation for version 10: www.postgres.cn/docs/10/

    • Chinese documentation for version 11 (incomplete): www.postgres.cn/docs/11/ind…

  • Version history: www.postgresql.org/support/ver…

  • About PostgreSQL (including who uses it): www.postgresql.org/about/

  • pgAdmin: www.pgadmin.org/

GUI Tool

  • I don't recommend the official pgAdmin, simply because there is a better alternative: Navicat Premium

  • Navicat Premium is cross-platform and supports various relational and non-relational databases as well as the major cloud platforms

  • Its interface is friendlier than DBeaver's, and it supports converting between models and SQL

  • Supports import and export in various scenarios

  • Website: www.navicat.com.cn/products/na…

  • However, pgAdmin also has some advantages, such as its dashboard and statistics pages, which let you view server statistics directly.

PostgreSQL and MySQL syntax differences

Data types commonly shared by PostgreSQL and MySQL

  • If you plan to stay compatible with multiple databases, it is best to account for their differences from the start of your design

  • Numeric types

    • smallint

    • integer

    • bigint

    • double

    • decimal

  • Character types

    • char(100)

    • varchar(100)

    • text

  • Date/time types (I personally prefer storing epoch timestamps, so I use bigint)

    • timestamp

    • date

    • time

  • JSON types (in PostgreSQL, jsonb is recommended because it supports GIN indexes)

    • json
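As a sketch of how the differences above can be handled mechanically (my own illustration, not the actual code of the generator project mentioned in this article), a MySQL-to-PostgreSQL type converter can be a simple lookup table:

```python
# Hypothetical MySQL -> PostgreSQL column-type mapping built from the shared
# types listed above; names like MYSQL_TO_POSTGRES are mine, for illustration.
MYSQL_TO_POSTGRES = {
    "tinyint": "smallint",         # PostgreSQL has no tinyint
    "smallint": "smallint",
    "int": "integer",
    "bigint": "bigint",
    "double": "double precision",  # PostgreSQL's spelling of double
    "decimal": "decimal",
    "datetime": "timestamp",
    "json": "jsonb",               # jsonb supports GIN indexes
}

def convert_type(mysql_type: str) -> str:
    """Map a MySQL type name to its PostgreSQL equivalent (fallback: unchanged)."""
    return MYSQL_TO_POSTGRES.get(mysql_type.lower(), mysql_type)

print(convert_type("TINYINT"))  # → smallint
print(convert_type("double"))   # → double precision
```

Types with identical names on both sides (varchar, text, bigint, …) pass through unchanged.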

What PostgreSQL and MySQL syntax have in common

  • Insert data

INSERT INTO sys_demo (id, demo_name, demo_num, demo_type_enum, ranking, description, state_enum, delete_enum, create_date, create_user_id, update_date, update_user_id, delete_date, delete_user_id)
VALUES (111111111111111101, 'Demo name 01', '101', 1, 100, 'Demo description 01', 1, 1, 1574663780828, 11111111111111111111111111, 1574663780828, 1111111111111111111111, NULL, NULL);
  • Add a foreign key

alter table sys_login_log add constraint FK_ID1 foreign key (user_id) REFERENCES sys_user (id);
  • Create an index

CREATE INDEX rel_permission_role_index_most ON rel_permission_role (role_id, permission_id);
CREATE UNIQUE INDEX unique_username ON sys_user (username);

Where PostgreSQL and MySQL syntax differ

  • Quoting

    • PostgreSQL does not support backtick-quoting table names, field names, etc. (it uses double quotes instead)

  • Naming

    • In MySQL an index name only has to be unique within its table; in PostgreSQL an index name must be unique within the whole schema

  • The sculptor-boot-generator project has a strategy class, GeneratorPostgresql.java, that generates PostgreSQL statements directly from the MySQL ones, though only for our own agreed-upon field types

  • MySQL table creation

DROP TABLE IF EXISTS sys_user;
CREATE TABLE sys_user (
  id bigint NOT NULL COMMENT 'ID',
  username varchar(50) NOT NULL COMMENT 'username:oneParam',
  real_name varchar(50) NULL COMMENT 'real name',
  user_password varchar(50) NOT NULL COMMENT 'password',
  password_salt varchar(10) NOT NULL COMMENT 'password salt',
  user_email varchar(50) NULL COMMENT 'email address:oneParam',
  telephone varchar(20) NULL COMMENT 'telephone',
  mobile_phone varchar(20) NULL COMMENT 'mobile phone:oneParam',
  gender_enum tinyint NOT NULL DEFAULT '1' COMMENT 'gender:[1=privacy=PRIVACY, 2=male=MALE, 3=female=FEMALE, 4=neutral=NEUTRAL] max=4',
  register_type_enum tinyint NOT NULL DEFAULT '1' COMMENT 'register type:[1=preset=SYSTEM_INIT, 2=added by admin=MANAGEMENT_ADD, 3=self registration=REGISTER, 4=invited=INVITE] max=4',
  register_origin_enum tinyint NOT NULL DEFAULT '1' COMMENT 'register origin:[1=web=WEB, 2=Android app=ANDROID, 3=iOS app=IOS, 4=H5=H5, 5=WeChat mini program=WECHAT_MINI_PROGRAM, 6=WeChat official account=WECHAT_OFFICIAL_ACCOUNT] max=6',
  state_enum tinyint NOT NULL DEFAULT '1' COMMENT 'state:[1=enable=ENABLE, 2=disable=DISABLE] max=2',
  delete_enum tinyint NOT NULL DEFAULT '1' COMMENT 'delete state:[1=not deleted=NOT_DELETED, 2=deleted=DELETED] max=2',
  create_date bigint NOT NULL COMMENT 'create time',
  create_user_id bigint NOT NULL COMMENT 'creator id',
  update_date bigint NOT NULL COMMENT 'update time',
  update_user_id bigint NOT NULL COMMENT 'updater id',
  delete_date bigint NULL COMMENT 'delete time',
  delete_user_id bigint NULL COMMENT 'deleter id',
  PRIMARY KEY (id)
) COMMENT = 'user table';
  • PostgreSQL table creation

DROP TABLE IF EXISTS sys_user;
CREATE TABLE sys_user (
  id bigint NOT NULL,
  username varchar(50) NOT NULL,
  real_name varchar(50) NULL,
  user_password varchar(50) NOT NULL,
  password_salt varchar(10) NOT NULL,
  user_email varchar(50) NULL,
  telephone varchar(20) NULL,
  mobile_phone varchar(20) NULL,
  gender_enum smallint NOT NULL,
  register_type_enum smallint NOT NULL,
  register_origin_enum smallint NOT NULL,
  state_enum smallint NOT NULL,
  delete_enum smallint NOT NULL,
  create_date bigint NOT NULL,
  create_user_id bigint NOT NULL,
  update_date bigint NOT NULL,
  update_user_id bigint NOT NULL,
  delete_date bigint NULL,
  delete_user_id bigint NULL,
  PRIMARY KEY (id)
) WITH (OIDS = FALSE);
COMMENT ON TABLE sys_user IS 'user table';
COMMENT ON COLUMN sys_user.id IS 'ID';
COMMENT ON COLUMN sys_user.username IS 'username';
COMMENT ON COLUMN sys_user.real_name IS 'real name';
-- other column comments omitted

Transaction differences between PostgreSQL and MySQL

  • In addition to regular DML rollback, PostgreSQL supports rolling back DDL

  • Wrapping scripts in BEGIN ... COMMIT/ROLLBACK makes them safe to run

  • DDL statements such as CREATE TABLE and TRUNCATE can also be rolled back in PostgreSQL

  • Testing:

select count(*) from sys_user;   -- suppose 9 rows are returned
BEGIN;
TRUNCATE sys_user;
select count(*) from sys_user;   -- returns 0
ROLLBACK;
select count(*) from sys_user;   -- returns 9 rows again

Using PostgreSQL and MySQL from Java

  • There's not much difference; the ORM framework takes care of it for you (assuming you use the common types)

  • PostgreSQL Configuration information

spring:
  datasource:
    driver-class-name: org.postgresql.Driver
    url: jdbc:postgresql://postgresql.cdk8s.com:5432/cdk8s_sculptor_boot?useSSL=false
    username: root
    password: 123456
  • MySQL configuration information

spring:
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://mysql.cdk8s.com:3306/cdk8s_sculptor_boot?useSSL=false
    username: root
    password: 123456

PostgreSQL in Ali Cloud

  • Alibaba Cloud PostgreSQL product page: www.aliyun.com/product/rds…

  • Alibaba Cloud PostgreSQL help: help.aliyun.com/document_de…

  • Alibaba Cloud PostgreSQL version comparison: help.aliyun.com/document_de…

  • The PostgreSQL advanced road: a collection of RDS PG + HDB PG applications and cases on the Aliyun community: yq.aliyun.com/topic/118

  • Plug-in support per Alibaba Cloud PostgreSQL version: www.alibabacloud.com/help/zh/doc…

    • You can also connect to the database and run: SELECT * FROM pg_available_extensions ORDER BY name

  • List of plug-ins supported by Aliyun PostgreSQL 12 (currently 12 does not support the high availability version)

  • Note: Alibaba Cloud updated the documentation after my support-ticket feedback, but some information was still missing, so I tested after purchasing; the complete data is below

(Aliyun Record)
  • List of PostgreSQL 11 plug-ins queried in 2019-12 (Currently, PostgreSQL 11 supports high availability versions)

(Aliyun Record)

PostgreSQL 11 Docker deployment with zhParser plugin (unofficial)

  • Hub.docker.com/r/davidlauh…

  • Dictionary files in the container: /usr/share/postgresql/11/tsearch_data

  • Configuration file in the container: /var/lib/postgresql/data/postgresql.conf

mkdir -p /Users/youmeek/docker_data/pgsql11/data /Users/youmeek/docker_data/pgsql11/conf
sudo chmod -R 777 /Users/youmeek/docker_data/pgsql11
# Start a throwaway container just to get the config file:
docker run -d --name pgsql11 davidlauhn/postgres-11-with-zhparser
# Copy the config file out to the host:
docker cp pgsql11:/var/lib/postgresql/data/postgresql.conf /Users/youmeek/docker_data/pgsql11/conf
# Stop and remove the old container:
docker stop pgsql11 && docker rm pgsql11
sudo chmod -R 777 /Users/youmeek/docker_data/pgsql11
# Start the new container:
docker run -d --name pgsql11 \
  -e POSTGRES_PASSWORD=123456 \
  -v /Users/youmeek/docker_data/pgsql11/conf/postgresql.conf:/etc/postgresql/postgresql.conf \
  -v /Users/youmeek/docker_data/pgsql11/data:/var/lib/postgresql/data \
  -p 5432:5432 \
  davidlauhn/postgres-11-with-zhparser \
  -c 'config_file=/etc/postgresql/postgresql.conf'
# Default database: postgres, default user: postgres

Connection configuration

  • This file configures login authentication: /var/lib/postgresql/data/pg_hba.conf

  • By default, only 127.0.0.1 can log in over IPv4; I added 192.168.0.101, and you can extend this yourself, one line per entry

  • The METHOD column is used to configure the password connection encryption mode

  • After the modification, restart the service

  • There is one other special point:

    • /etc/postgresql/postgresql.conf has a parameter whose default is usually: listen_addresses = '*'

    • However, some builds default it to localhost, so you need to change it to '*', otherwise remote connections will fail

    • It is also recommended to raise the connection limit, e.g.: max_connections = 1000

# TYPE  DATABASE     USER  ADDRESS            METHOD
# "local" is for Unix domain socket connections only
local   all          all                      trust
# IPv4 local connections:
host    all          all   127.0.0.1/32       trust
host    all          all   192.168.0.101/32   trust
# IPv6 local connections:
host    all          all   ::1/128            trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication  all                      trust
host    replication  all   127.0.0.1/32       trust
host    replication  all   ::1/128            trust
host    all          all   all                md5

PostgreSQL Database and Schema description

  • I remember seeing this explained on Alibaba Cloud or somewhere similar; let's go with that for now.

  • Unlike MySQL, a PostgreSQL database has an extra layer called a schema between the database and its tables. By default, every database has a public schema

  • PostgreSQL supports multiple schemas, but I don't recommend using the others: since public is the default, it can be omitted when writing queries, which makes later database migrations easier

Common management SQL

-- Check the version
select version();
-- Check build configuration
select * from pg_config;
-- Check runtime configuration
select * from pg_settings;
-- More system catalogs:
--   https://www.postgresql.org/docs/11/catalogs.html
--   http://postgres.cn/docs/10/catalogs.html
-- Create a user
CREATE USER root WITH PASSWORD '123456';
-- Create a database and grant it to root
CREATE DATABASE cdk8s_sculptor_boot OWNER root;
GRANT ALL PRIVILEGES ON DATABASE newDbName TO root;
-- Back up a database from inside Docker
docker exec -it containerId pg_dump -h localhost -U postgres dbName > /data/backup.sql

Pgbench benchmarking

  • PostgreSQL comes with a benchmarking tool: pgbench

  • It can run tests from custom script files; we won't go into that here and will only use the built-in scripts

  • Install on CentOS 7 (part of contrib): yum install -y postgresql-contrib

  • Check the version: pgbench -V

Initializing the database

  • pgbench -i -s 2 -F 100 -h 127.0.0.1 -p 5432 -U myUserName myDBName

    • -i stands for initialization

    • -s 2 initializes data for two branches. By default one branch has 100,000 accounts (pgbench_accounts) and 10 tellers (pgbench_tellers); with 2 here, twice that amount of data is generated

    • -F is the fill factor used when creating tables, ranging from 10 to 100; the smaller the value, the more free space is left per page, which improves UPDATE performance.

NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
creating tables...
10000 tuples done.
20000 tuples done.
30000 tuples done.
40000 tuples done.
50000 tuples done.
60000 tuples done.
70000 tuples done.
80000 tuples done.
90000 tuples done.
100000 tuples done.
set primary key...
vacuum... done.

The test command

  • Description of test statement parameters

-c  number of concurrent clients
-j  number of worker threads
-M  protocol used to submit queries: simple | extended | prepared
-n  do not vacuum before running the test
-T  total run time in seconds, e.g. 60
-r  report the average per-statement latency
  • Read-only test, controlled by the -S (select-only) parameter

nohup pgbench -c 5 -j 5 -M prepared -n -S -T 60 -h 127.0.0.1 -p 5432 -U myUserName myDBName > /opt/pgbenchtest-readonly.out 2>&1 &
  • Update, query, and insert test

nohup pgbench -c 5 -j 5 -M prepared -n -T 60 -h 127.0.0.1 -p 5432 -U myUserName myDBName > /opt/pgbenchtest-all.out 2>&1 &
  • Skip-update test, controlled by the -N parameter

nohup pgbench -c 5 -j 5 -M prepared -n -N -T 60 -h 127.0.0.1 -p 5432 -U myUserName myDBName > /opt/pgbenchtest-noupdate.out 2>&1 &

The test report

pghost: 127.0.0.1 pgport: 5432 nclients: 5 duration: 10 dbname: myDBName
transaction type: TPC-B (sort of)
scaling factor: 2
query mode: prepared
number of clients: 5
number of threads: 5
duration: 10 s
number of transactions actually processed: 7237
tps = 723.210820 (including connections establishing)
tps = 723.959626 (excluding connections establishing)
  • Parameter description:

  • number of clients: the number of client connections specified for the test

  • number of threads: the number of worker threads specified for the test

  • number of transactions actually processed: the number of transactions actually completed by the end of the run

  • TPS = 723.210820 (Including Connections)

    • Contains the TPS value for the cost of establishing a network connection

  • TPS = 723.959626 (excluding connections)

    • The TPS value does not include the cost of establishing a network connection
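As a sanity check, the headline number is simply transactions divided by wall-clock duration; the report above shows 7237 transactions and tps ≈ 723, which implies a 10-second run:

```python
def tps(transactions: int, duration_s: float) -> float:
    """Raw transactions-per-second, the way pgbench's summary computes it."""
    return transactions / duration_s

print(round(tps(7237, 10), 1))  # → 723.7
```

The small gap between the two reported values is the connection-setup overhead.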

PostgreSQL indexes

  • Index types: B-tree, Hash, GiST, SP-GiST, GIN

  • For ordinary tables we use B-tree

  • Full-text search uses GIN and involves two types:

    • tsvector: represents a document, stored in a form optimized for searching

    • tsquery: a text query, supporting the Boolean operators & | !

    • Used together: select * from tableName where to_tsvector('zhparser', fieldName) @@ to_tsquery('search content 1 | other content 2');

    • GIN indexes support the operators: @>, ?, ?&, ?|

PostgreSQL Json and JSONB

  • PostgreSQL has two JSON field types: json and jsonb. json is fast to write but slow to read; jsonb is slower to write but fast to read, because jsonb is stored in a decomposed binary format while json is stored as plain text.

  • jsonb also supports GIN indexes, so jsonb is recommended in practice

  • Suppose you have a piece of JSON:

{
  "id": 11111111111111,
  "username": "admin"
}
  • Create the index: CREATE INDEX ON tableName USING GIN (fieldName);

  • Query operation:

select * from tableName where jsonFieldName @> '{"username":"admin"}';
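The @> containment query above can be pictured like this for flat objects (a sketch of the operator's semantics, not of how PostgreSQL implements jsonb):

```python
def jsonb_contains(doc: dict, query: dict) -> bool:
    """Illustration of jsonb's @> operator for flat JSON objects:
    every key/value pair of `query` must be present in `doc`."""
    return all(doc.get(k) == v for k, v in query.items())

doc = {"id": 11111111111111, "username": "admin"}
print(jsonb_contains(doc, {"username": "admin"}))  # → True
print(jsonb_contains(doc, {"username": "root"}))   # → False
```

The real operator also recurses into nested objects and arrays.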

PostgreSQL execution process analysis

explain (analyze, verbose, costs, buffers, timing)
select * from sys_user
where to_tsvector('zhparser', real_name) @@ to_tsquery('lisi | admin');
  • The output content is in the following format:

Seq Scan on public.sys_user  (cost=0.00..66.38 rows=1 width=692) (actual time=0.837..1.622 rows=10 loops=1)
  Output: id, username, real_name, user_password, password_salt, user_email, telephone, mobile_phone, ... omitted ... delete_date, delete_user_id
  Filter: (to_tsvector('zhparser'::regconfig, (sys_user.real_name)::text) @@ to_tsquery('lisi | admin'::text))
  Rows Removed by Filter: 82
  Buffers: shared hit=3
Planning Time: 1.640 ms
Execution Time: 1.863 ms
  • The output includes something called scan nodes; common ones are:

    • Seq Scan: Scans all tables in sequence

    • Index Scan: a scan based on an index that returns more than just the indexed columns

    • Index Only Scan: an index-based scan that returns only the values of indexed columns

    • Bitmap Index Scan: scans the index using a Bitmap structure

    • Bitmap Heap Scan: converts the Bitmap returned by a Bitmap Index Scan into tuples

    • Tid Scan, used to Scan a tuple Tid array

    • Subquery Scan: Scans a Subquery

    • Function Scan, which handles scans containing functions

    • TableFunc Scan, handles TableFunc related scans

    • Values Scan, used to Scan the Values linked list

    • CTE Scan, used to scan the result set of a WITH clause

    • NamedTuplestore Scan, which is used to Scan some named result sets

    • WorkTable Scan is used to Scan intermediate data from Recursive Union

    • Foreign Scan: used to scan foreign tables (foreign data wrappers)

    • Custom Scan is used for user-defined Scan

  • Cost estimate: (cost=0.00..66.38 rows=1 width=692)

  • Actual execution: (actual time=0.837..1.622 rows=10 loops=1)

    • The number before the ".." is the node's actual startup time: the time needed to find the first row that satisfies the node's condition

    • The number after it is the node's total execution time

    • Both are in ms

    • rows is the number of rows actually returned, not the number of rows scanned (unlike MySQL)

    • loops is the number of times the node was actually executed; if a plan node's related parameter values (such as bound variables) change, the node has to be re-executed.

  • Beyond these three core items, for more fields see: mysql.taobao.org/monthly/201…

PostgreSQL Chinese word segmentation, full text search, custom Chinese word library

  • PostgreSQL full-text search is implemented with GIN indexes (a common inverted-index structure). Built-in text search configurations cover English, Spanish, and other languages, but not Chinese.

  • We need the plug-in: zhParser

  • Ali cloud help documentation: help.aliyun.com/knowledge_d…

Test the segmentation effect

-- Install the extension at the database level (not per-application):
CREATE EXTENSION zhparser;
-- Create a text search CONFIGURATION, here also named "zhparser" (any name you like):
CREATE TEXT SEARCH CONFIGURATION zhparser (PARSER = zhparser);
-- Modify the configuration to map nouns (n), verbs (v), adjectives (a),
-- idioms (i), interjections (e) and letters (l) to the built-in simple
-- dictionary, which only lowercases tokens. A well-written introduction:
--   https://cloud.tencent.com/developer/article/1430039
ALTER TEXT SEARCH CONFIGURATION zhparser ADD MAPPING FOR n,v,a,i,e,l WITH simple;
-- Test segmentation:
select ts_debug('zhparser', 'The Cretaceous was a period of shifting continental distribution and biological change, with intense volcanic activity');
SELECT * from ts_parse('zhparser', 'hello world! In 2010, affordable-housing construction was launched nationwide, and central and local governments increased construction and investment. In 2011 affordable housing entered an even larger construction phase. Jiang Weixin, minister and party secretary of the Ministry of Housing and Urban-Rural Development, said at the national housing and urban-rural development work conference late last year that low-income housing construction would continue.');
SELECT to_tsvector('zhparser', 'The number of affordable-housing starts will be lower this year, but the actual annual construction and completion scale will exceed previous years, and the corresponding demand for funds will set a record, Chen Guoqiang said. In his view, compared with 2011, the funding pressure on affordable-housing construction will be even more severe in 2012.');
SELECT to_tsquery('zhparser', 'affordable-housing funds');
SELECT plainto_tsquery('zhparser', 'affordable-housing funds');

Extended thesaurus – Method 1

  • A user-defined dictionary has higher priority than the built-in one

  • Requires zhparser version 2.1 or later

-- Check segmentation before adding the custom word:
SELECT * FROM ts_parse('zhparser', 'affordable-housing funding pressure');
-- Add a custom word:
insert into zhparser.zhprs_custom_word values('funding pressure');
-- Synchronize the custom dictionary (then reconnect):
select sync_zhprs_custom_word();
-- Test again:
SELECT * FROM ts_parse('zhparser', 'affordable-housing funding pressure');

-- Special note: Alibaba Cloud's zhparser reports version 1.0, but we learned
-- through support-ticket communication that they modified the plug-in, so the
-- version number is meaningless. Their custom thesaurus works like this:
--   https://help.aliyun.com/knowledge_detail/44451.html
-- Both the table name and the sync function are renamed:
insert into pg_ts_custom_word values ('affordable housing');
select zhprs_sync_dict_xdb();
SELECT to_tsquery('zhparser', 'affordable-housing funds');

Extended thesaurus – Approach 2

Custom dictionary files must be placed in /usr/share/postgresql/11/tsearch_data. A .txt extension means the dictionary is in text format; .xdb means xdb format; separate multiple files with commas. Add this parameter to postgresql.conf: zhparser.extra_dicts = 'dict_extra.txt,mydict.xdb'
  • dict_extra.txt content format

; dict_extra.txt
; new entries: word TF IDF attribute(@)
I am a new word 2.0 1.0 @
try another 1.0 1.0 @
; the entries below are deletions (!)
this is deleted 1.0 1.0 !
-- Test before loading the dictionary:
SELECT * FROM ts_parse('zhparser', 'I am a new word, you can try another');
-- Result: the phrase comes back split into individual tokens

-- Copy the dictionary into the container:
docker cp /Users/youmeek/dict_extra.txt pgsql11:/usr/share/postgresql/11/tsearch_data
-- Restart the container:
docker restart pgsql11
-- Reconnect and test again:
SELECT * FROM ts_parse('zhparser', 'I am a new word, you can try another');
-- Result: the entries defined in dict_extra.txt now come back as single tokens

Independent full-text search field

  • If you need a lot of full-text search, it is recommended to add a separate column that stores the segmented (tsvector) content of the source field

create index idx_name on tableName using gin(to_tsvector('zhparser', fieldName));
select * from tableName where to_tsvector('zhparser', fieldName) @@ 'search content';
select * from tableName where to_tsvector('zhparser', fieldName) @@ 'search content 11 | search content 22';
select * from tableName where to_tsvector('zhparser', fieldName) @@ to_tsquery('search content 1 | other content 2');
-- Add a dedicated tsvector column and index it:
ALTER TABLE tableName ADD COLUMN tsv_columnName tsvector;
UPDATE tableName SET tsv_columnName = to_tsvector('zhparser', coalesce(fieldName,''));
CREATE INDEX idx_gin_zhcn ON tableName USING GIN(tsv_columnName);
-- Keep the column up to date with a trigger:
create trigger trigger_name before insert or update
on tableName for each row execute procedure
tsvector_update_trigger(tsv_columnName, 'zhparser', fieldName);

PostgreSQL routine maintenance

Enable the pg_stat_statements module

  • This module collects SQL runtime statistics, such as total execution time and number of calls; it is commonly used to monitor SQL performance

Modify the configuration file /etc/postgresql/postgresql.conf. The default is:
  shared_preload_libraries = ''
Change it to:
  shared_preload_libraries = 'pg_stat_statements'
  pg_stat_statements.max = 5000
  pg_stat_statements.track = all
  pg_stat_statements.track_utility = on
  pg_stat_statements.save = on
Restart the server, then log in as the postgres superuser and run:
  create extension pg_stat_statements;
This creates 1 view and 2 functions: pg_stat_statements, pg_stat_statements_reset()
-- Top 10 statements by number of calls
select * from pg_stat_statements order by calls desc limit 10;
-- Top 10 statements by mean execution time
select * from pg_stat_statements order by mean_time desc limit 10;
-- Reset the statistics
select pg_stat_statements_reset();
  • Core fields; when querying, sort and filter by the following fields to surface the relevant SQL

total_time: total execution time, ms
min_time: minimum execution time, ms
max_time: maximum execution time, ms
mean_time: mean execution time, ms
rows: total number of rows retrieved or affected
shared_blks_read: number of shared-memory blocks the SQL read
shared_blks_written: number of shared-memory blocks the SQL wrote
shared_blks_dirtied: number of shared-memory blocks the SQL dirtied
  • For local deployments, pg_web_stats can display the above information in a web UI

  • pg_web_stats: github.com/kirs/pg_web…

Log inspection

  • Set the log output format to csvlog

  • Rotate daily, with a maximum file size of 100 MB

  • Modify postgresql.conf and add the following (uncomment the default parameters)

  • Restart the server; under the data directory you will see a new directory: pg_log

log_destination = 'csvlog'                 # Valid values are combinations of
logging_collector = on                     # Enable capturing of stderr and csvlog
log_directory = 'pg_log'                   # directory where log files are written,
log_filename = 'postgresql-%d_%H%M%S.log'  # log file name pattern,
log_file_mode = 0600                       # creation mode for log files,
log_truncate_on_rotation = on              # If on, an existing log file with the
log_rotation_age = 1d                      # Automatic rotation of logfiles will
log_rotation_size = 100MB                  # Automatic rotation of logfiles will
  • The CSV files can be opened directly in Excel

  • If there is a lot of data, create a table matching the CSV headers, import the CSV into it, and filter with SQL

  • Error codes appearing in the log are described here: www.postgresql.org/docs/11/err…
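Since csvlog is plain CSV, it can also be post-processed with a small script instead of Excel. A sketch (the sample line below is made up and abbreviated; real csvlog rows follow the column order in the PostgreSQL docs, with error_severity, sql_state_code, and message at positions 12-14):

```python
import csv
import io

# Made-up, abbreviated csvlog line for illustration; note how a quote inside
# the message field is doubled ("") per CSV quoting rules.
sample = ('2019-12-01 10:00:00.000 CST,"postgres","mydb",123,"127.0.0.1:5000",'
          '5de0,1,"SELECT",2019-12-01 09:59:59 CST,,,ERROR,42P01,'
          '"relation ""missing_table"" does not exist"\n')

rows = list(csv.reader(io.StringIO(sample)))
log_time, user, database = rows[0][0], rows[0][1], rows[0][2]
severity, sqlstate, message = rows[0][11], rows[0][12], rows[0][13]
print(severity, sqlstate, message)
```

From here it is easy to count errors per SQLSTATE or filter by severity.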

Connection count check

  • View information about the current number of connections

SELECT * from pg_stat_activity;

Capacity check

  • The SQL below shows each database's current storage size

  • Checking regularly lets you track growth as the business evolves and spot sudden jumps or oversized tables

  • pg_size_pretty displays sizes in human-readable units

select datname, pg_size_pretty(pg_database_size(oid)) from pg_database;
  • Find all tables sorted by size and separate data and index storage size

SELECT
  table_name,
  pg_size_pretty(table_size) AS table_size,
  pg_size_pretty(indexes_size) AS indexes_size,
  pg_size_pretty(total_size) AS total_size
FROM (
  SELECT
    table_name,
    pg_table_size(table_name) AS table_size,
    pg_indexes_size(table_name) AS indexes_size,
    pg_total_relation_size(table_name) AS total_size
  FROM (
    SELECT ('"' || table_schema || '"."' || table_name || '"') AS table_name
    FROM information_schema.tables
  ) AS all_tables
  ORDER BY total_size DESC
) AS pretty_sizes
WHERE table_name LIKE '"public"."%'
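pg_size_pretty just scales raw byte counts into readable units. Roughly (my own sketch; the server function has its own exact rounding and unit-switchover rules):

```python
def size_pretty(num_bytes: int) -> str:
    """Rough analog of pg_size_pretty(): scale bytes into kB/MB/GB/TB."""
    size = float(num_bytes)
    for unit in ("bytes", "kB", "MB", "GB"):
        if size < 1024:
            return f"{size:.0f} {unit}"
        size /= 1024
    return f"{size:.0f} TB"

print(size_pretty(8192))      # → 8 kB
print(size_pretty(73400320))  # → 70 MB
```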

View commit and rollback numbers

  • If the ratio of rollbacks to commits is too high, something in the application is causing frequent rollbacks

select datname, xact_commit, xact_rollback from pg_stat_database;
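What counts as "too high" is a judgment call, but the number to watch is the rollback fraction computed from these two counters:

```python
def rollback_ratio(xact_commit: int, xact_rollback: int) -> float:
    """Fraction of finished transactions that were rolled back,
    from the pg_stat_database counters queried above."""
    total = xact_commit + xact_rollback
    return xact_rollback / total if total else 0.0

# e.g. 95,000 commits vs 5,000 rollbacks → 5% rollback rate
print(rollback_ratio(95_000, 5_000))
```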

Regular vacuum

  • PostgreSQL has automatic vacuum, but under heavy data I/O autovacuum can fall behind, so you may also need to clean up manually or with your own scripts

  • Open the configuration file and check that automatic vacuuming is enabled: autovacuum = on

  • If you run it manually, use the postgres superuser (the highest-privilege account), preferably at night during off-peak hours

vacuumdb -d dbName -f -z -v
  • For checking the transaction ID age threshold, refer to: blog.csdn.net/u011598529/…

Recommended PostgreSQL plug-ins

PostGIS plug-in

  • Supports spatial objects, spatial indexes, spatial manipulation functions, and spatial operators

  • Common service scenarios:

    • Radius search: from the current or a specified center point, find the businesses or people within, say, a 1 km radius

    • Geofencing: detect which geographic fence a given point falls into, e.g. whether a shared bike is parked in a no-parking zone or a drone is inside a no-fly zone

  • If you need to use spatial data, this plugin is the best support

  • Alibaba Cloud currently supports this plug-in on PostgreSQL 11; the latest PostgreSQL 12 does not have it yet

  • The Geometry type is recommended for spatial data types

  • PostgreSQL also has several built-in geometric types, but they use a different mechanism from PostGIS; the PostGIS objects and functions starting with ST (ST = Spatial Type) are recommended.

pg_pathman

  • Official site: github.com/postgrespro…

  • According to Digoal's articles it outperforms the built-in partitioning; I haven't tried it myself, but you can. The plug-in is already built into Alibaba Cloud's PostgreSQL 11

PgRouting plug-in

  • Geospatial routing based on PostgreSQL and PostGIS

  • Core functions:

    • All shortest path combinations (Johnson algorithm)

    • All shortest path combinations (Floyd-Warshall algorithm)

    • Shortest A* path

    • Bidirectional Dijkstra shortest path

    • Bidirectional A* shortest path

    • Dijkstra Shortest path

    • Driving distance

    • K-shortest paths (multiple candidate paths)

    • K-Dijkstra, one-to-many shortest path

    • Traveling Salesperson

    • TRSP (Turn Restriction Shortest Path)

TimescaleDB plug-in

  • For time-series data

  • It supports automatic sharding of time-series data, efficient writes, retrieval, and near-real-time aggregation

  • Alibaba Cloud: help.aliyun.com/document_de…

PASE plug-in

  • Vector retrieval, based on IVFFlat or HNSW algorithm

  • Alibaba Cloud: help.aliyun.com/document_de…

Other plug-ins

  • For example, FDWs (Foreign Data Wrappers) can read and write external data sources such as Redis, Hadoop, and MySQL directly, but I'm not a big fan of this, so no recommendation here

  • Use the dblink and postgres_fdw plug-ins for cross-database operations

  • Use the mysql_fdw plugin to read and write MySQL data

The PostgreSQL-derived ecosystem

  • Amazon Aurora PostgreSQL compatible version: aws.amazon.com/cn/rds/auro…

  • Real-time data warehouse, massively parallel processing, BI, AI

    • Greenplum: greenplum.org/

    • Alibaba Cloud version: www.aliyun.com/product/gpd…

    • Amazon Redshift:aws.amazon.com/cn/redshift…

  • NewSQL (with a PostgreSQL-compatible interface)

    • CockroachDB:github.com/cockroachdb…

    • YugabyteDB:www.yugabyte.com/yugabytedb/

  • Massively parallel processing

    • Postgres-XL:www.postgres-xl.org/

  • High availability solution

    • Stolon, cloud-native high availability: github.com/sorintlab/s…

    • Patroni, traditional high availability: github.com/zalando/pat…

References outside the official website

  • www.cnblogs.com/zhenbianshu…

  • Cloud.tencent.com/developer/a…

  • 57km.cc/post/postgr…

  • Mysql.taobao.org/monthly/201…

  • blog.windrunner.me/postgres/

  • www.zhihu.com/question/20…

  • Blog.csdn.net/u011598529/…


We will try our best

Focusing on R&D solutions for Internet SMEs

Disclaimer:

Some of the images and audio resources on this WeChat account come from the Internet and other public platforms; copyright belongs to the original authors.

If any infringement is involved, please let us know and we will remove the relevant content within 24 hours. Thank you!