In the process of using Greenplum, you may encounter a database PANIC problem caused by running some SQL, or the execution plan of the database is not that reasonable and the SQL performance is too slow. In such cases, business users can contact VMware technical support for help. Greenplum SQL problem recurrence tool Minirepo can help us determine whether the problem is an execution plan problem or a data level problem.

What is MinirePro?

The Minirepro tool is a lighter tool than GPSD. It collects namespace information and statistics for the corresponding SQL so that the problem can be reproduced on another set of clusters. GPSD tools collect metadata and statistics for the entire database compared to Minirepro tools. Minirepro’s lightweight advantage over GPSD (it only collects information about the SQL, not the full library) makes it a preferred optimizer troubleshooting tool for database administrators and for Pivotal’s aftermarket support. For example, run the -version command to view the version information and the -help command to view the command usage mode:

[gpadmin@mdw]$ /usr/local/GP-4.3.7.2/bin/minirepro --version
minirepro 1.0
[gpadmin@mdw]$ /usr/local/GP-4.3.7.2/bin/minirepro --help
Usage: minirepro  [options]
Copy the code
Options: --version show program's version number and exit -? , --help Show this help message and exit -h HOST, --host=HOST Specify a remote host -p PORT, --port=PORT Specify a port other than 5432 -U USER, --user=USER Database user to connect with.This is mandatory as of now -q QUERY_FILE file name that contains the query -f  OUTPUT_FILE minirepro output file name. Must be absolute pathCopy the code
Example: minirepro database-name -q sql-file-name -f /tmp/output-file-name -U gpadmin
Copy the code

Note: Minirepro is only available on version 4.3.7.

A more detailed example is as follows: for example, prod_db is a production library, and we run the statements in query.sql and PANIC or SQL is running too slow. We can collect mini-repro using the following statement:

gpadmin@mdw]$ minirepro prod_db -q query.sql -f /data/oufile_msq -U gpadmin Connecting to database: Host = MDW, port=5432, user=gpadmin, db=prod_db... Metadata from Query file query. SQL... Invoking PG_dump to dump DDL... pg_dump -h mdw -p 5432 -U gpadmin -sxO prod_db -t 'ravedw.v_f_dtd_modular_scenario_fdsf|fdsf_scenario_map|mt_position|mt_scenario_surf_dim_msr|mt_modular_scenario' -f /tmp/20160419101152/ravedw.dp.sql pg_dump -h mdw -p 5432 -U gpadmin -sxO prod_db -t 'dwuser.u_modular_scenarios_fdsf' -f / TMP / 20160419101152 / dwuser. Dp. Writing SQL schema DDLs... Writing Table & View DDLs... Writing table statistics... Writing the column statistics... Attaching Raw Query text... --- MiniRepro completed! ---Copy the code

/data/oufile_msq is the file generated by mini-repro. We can use this file to import metadata and statistics using PSQL -f /data/ oufile_MSq in another Greenplum database cluster and reproduce the optimizer problems.

Second, we may encounter some problems when using MinirePro:

Question 1:

Does Minirepro collect the data in my table?

A: No, minirePro only collects DDL and statistics for tables, not data from tables, and you can safely hand the results over to Greenplum’s support staff

Question 2:

Will Minirepro run when there are other jobs:

A: Yes, MinirePro does not invoke exclusive locks and can run with jobs that do not have exclusive locks

Question 3:

Error while running gp_toolkit.gp_dump_query_oids(text).
​Please make sure the function is installed and the query file contains single valid query.
​
error 'ERROR:  Cannot parse query. Please make sure the input contains a single valid query. (gpoptutils.c:53)
Copy the code

This is actually a SQL file error. There may be some formatting or indentation errors in the input SQL file. We need to open query.

Question 4:

Error while running gp_toolkit.gp_dump_query_oids(text).
Please make sure the function is installed and the query file contains single valid query.
Copy the code

Gp_toolkit_2. SQL: Delete the “begin;” at the beginning of the file. “And then run the SQL in PSQL

prod_db=# \i gp_toolkit_2.sql ( or ) 
psql -f gp_toolkit_2.sql
Copy the code

When running this SQL, ignore the following problems:

relation "gp_pgdatabase_invalid" already exists
function "gp_skew_idle_fraction" already exists with same argument types
Copy the code

After installation, run Mini-RePro again.

Question 5:

raise ValueError(errmsg("Extra data", s, end, len(s))) 
​ValueError: Extra data: line 2 column 1 - line 3 column 1 (char 134 - 150)
Copy the code

PSQLRCZ > timing > timing > timing > PSQLRCZ > timing > timing > PSQLRCZ > timing > timing

Third, summary

Minirepro is a tool that can reproduce problems related to SQL statistics so that you can quickly reproduce problems in another set of clusters. For most software problems, we can get to the root of the problem relatively quickly if it can be repeated continuously. Hope everyone can skillfully use MiniREPro, so as to accurately locate the problem!