Databases are foundational software. One characteristic of foundational software is that, as it matures, it tends toward oligopoly, as Linux has among operating systems. Domestic database development currently looks very promising, and to capture market share quickly, build an ecosystem, and for other reasons, more and more vendors are choosing to build on open source projects. As practitioners, we can learn a lot from these open source projects.

PG-based open source projects

Here are some PG-based open source projects. The goal is partly to get to know these projects, and partly to look at the changes each one made and the purpose or need behind them. In their position, would we have the same needs, and should we make the same changes?

openGauss

openGauss is Huawei's open source relational database management system. It features multi-core high performance, full-link security, and intelligent operation and maintenance (O&M). The openGauss kernel originated from the open source database PostgreSQL. It integrates Huawei's years of kernel experience in the database field, and adapts and optimizes the architecture, transactions, storage engine, optimizer, and support for the ARM architecture.

Gitee: gitee.com/opengauss/o…
openGauss community: opengauss.org/zh/

openGauss vs. PG: openGauss is based on the PostgreSQL 9.2.4 kernel and includes some features of PG 9.4, but many features of later versions are not included. Here are some of the major changes:

  • Multi-process model changed to multi-threaded
  • Transaction ID widened from 32-bit to 64-bit (fixes the transaction ID wraparound issue)
  • Added a column-store engine and an in-memory engine
  • Other kernel enhancements (not all listed)
  • Other features (AI tuning, transparent encryption, etc.)
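The transaction ID item in the list above (ID wraparound) can be illustrated with a small model. The following Python sketch is not taken from the openGauss or PostgreSQL source. The function names are hypothetical, and it assumes that 32-bit IDs are compared circularly (modulo 2^32) and ignores any reserved ID values. It shows how an old transaction can appear to be "in the future" once the counter has advanced by more than 2^31, and why a 64-bit counter with plain comparison avoids this in practice.

```python
XID32_MOD = 1 << 32  # a 32-bit counter has 2**32 distinct values


def precedes_32(a: int, b: int) -> bool:
    """Circular comparison of two 32-bit IDs (assumed model).

    Returns True if a is considered older than b.
    """
    diff = (a - b) % XID32_MOD
    if diff >= XID32_MOD // 2:  # reinterpret the difference as signed 32-bit
        diff -= XID32_MOD
    return diff < 0


def precedes_64(a: int, b: int) -> bool:
    """With a counter wide enough never to wrap, plain comparison is enough."""
    return a < b


old_xid = 100
newer_xid = old_xid + 1_000        # a little later
far_xid = old_xid + (1 << 31) + 5  # more than 2**31 transactions later

print(precedes_32(old_xid, newer_xid))            # True
print(precedes_32(old_xid, far_xid % XID32_MOD))  # False: old_xid now looks newer
print(precedes_64(old_xid, far_xid))              # True
```

In this model, the 32-bit scheme only stays correct while the two IDs are less than 2^31 apart, so old IDs have to be dealt with before the counter gets that far ahead. The 64-bit scheme removes that constraint at the cost of wider IDs.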

Disadvantages:

  • Parallel query is not supported (PG has supported it since 9.6)
  • Compilation is complicated

For more comparisons, check out the following two blog posts:

  • openGauss data comparison with PostgreSQL
  • openGauss and PostgreSQL source directory structure comparison

AntDB

AntDB is a general-purpose distributed transactional relational database derived from the PG kernel. It is a domestic, independent, safe, reliable, and high-performance enterprise-level product aimed at finance, telecommunications, government affairs, security, energy, and other industries. Its core enterprise capabilities include continuous automatic cluster high availability, online scaling within seconds, strong Oracle compatibility (80), remote disaster recovery, SQL statement-level custom sharding, distributed transactions and MVCC, adaptive switching among maximum protection, maximum performance, and maximum availability, and recovery to a globally consistent point.

Gitee: gitee.com/adbsql/antd…

AntDB was first implemented on top of Postgres-XC in 2014. Many other distributed databases also use the PG-XC architecture. For more information, see "AntDB, a distributed transactional relational database".

Its PG kernel version is newer than the one openGauss is based on.

Greenplum

Greenplum is the world's leading open source big data platform. It is a big data engine that provides real-time processing, elastic scaling, mixed workloads, cloud-native deployment, and integrated data analytics. Greenplum is built on an MPP (massively parallel processing) architecture, with good elasticity and linear scalability, and has built-in parallel storage, parallel communication, parallel computing, and optimization technologies.

GitHub: github.com/greenplum-d…

Postgres-XC

Postgres-XC is a distributed database based on PostgreSQL. Many distributed databases are built on Postgres-XC, such as AntDB and TBase.

GitHub: github.com/postgres-x2…

For more information on distributed databases, see "The main distributed databases".

TBase

TBase is an enterprise-level distributed HTAP database management system developed by the Tencent Data Platform team on top of open source PostgreSQL. It offers high-performance, scalable distributed transactions and supports the RC (read committed) and RR (repeatable read) isolation levels.

GitHub: github.com/Tencent/TBa…

PolarDB-for-PG

PolarDB is a next-generation cloud-native relational database independently developed by Alibaba.

Other PG-based projects (including closed source)

EnterpriseDB

EnterpriseDB provides PostgreSQL-based enterprise products and services. Its database is a fork of PostgreSQL that is specially optimized for enterprise applications and adds a series of advanced features such as dynamic performance tuning (DynaTune), EDB Loader, and efficient batch SQL processing. Among these features, EnterpriseDB's Oracle compatibility technology stands out.

PipelineDB

PipelineDB is a PostgreSQL-based streaming database.

PipelineDB is a high-performance PostgreSQL extension built to run SQL queries continuously on time-series data.

GitHub: github.com/pipelinedb/…

TimescaleDB

TimescaleDB is a time-series database built on PostgreSQL. It is packaged as a PostgreSQL extension, so it can be upgraded along with PostgreSQL versions.

GitHub: github.com/timescale/t…

AgensGraph

AgensGraph is a PostgreSQL-based graph database. It is a next-generation multi-model graph database for modern, complex data environments, supporting both the relational and the graph data model. As a result, developers can integrate legacy relational data models and flexible graph data models in one database.

GitHub: github.com/bitnine-oss…
Gitee: gitee.com/mirrors/Age…

CitusDB

Citus is an extension of PostgreSQL (not a fork). It uses a shared-nothing architecture: no data is shared between nodes. A Citus database cluster consists of coordinator nodes and worker nodes. Compared with a single PostgreSQL instance, a Citus cluster can use more CPU cores and more memory and can hold more data. You can scale the database by adding nodes to the cluster.

Citus supports the features of new PostgreSQL versions and remains compatible with existing tools. Citus extends PostgreSQL horizontally across multiple machines using sharding and replication. Its query engine executes SQL on these servers in parallel to give real-time (sub-second) responses on large data sets.
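To make the sharding idea concrete, here is a minimal Python sketch of hash-based distribution across worker nodes, with a coordinator that merges partial results. It is not Citus code and does not use the Citus API. `WORKERS`, `SHARD_COUNT`, `shard_for`, `worker_for`, and `ToyCluster` are hypothetical names, the CRC32-modulo placement is an assumption chosen for brevity, and replication is not modeled.

```python
import zlib
from collections import defaultdict

WORKERS = ["worker-1", "worker-2", "worker-3"]  # hypothetical worker node names
SHARD_COUNT = 8                                 # assumed number of shards


def shard_for(key: str) -> int:
    """Map a distribution-column value to a shard with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % SHARD_COUNT


def worker_for(shard_id: int) -> str:
    """Place shards on workers round-robin (an assumption of this sketch)."""
    return WORKERS[shard_id % len(WORKERS)]


class ToyCluster:
    """Coordinator view: routes rows to workers and merges partial results."""

    def __init__(self) -> None:
        # worker name -> shard id -> rows; workers share no data with each other
        self.storage = {w: defaultdict(list) for w in WORKERS}

    def insert(self, key: str, row: dict) -> None:
        shard_id = shard_for(key)
        self.storage[worker_for(shard_id)][shard_id].append(row)

    def count_rows(self) -> int:
        # Each worker counts its own shards; the coordinator sums the partials.
        partial_counts = [
            sum(len(rows) for rows in shards.values())
            for shards in self.storage.values()
        ]
        return sum(partial_counts)


cluster = ToyCluster()
for i in range(100):
    cluster.insert(f"user-{i}", {"user_id": i})

print(cluster.count_rows())  # 100
```

Because every row lives on exactly one worker, an aggregate such as a count can be computed per worker and then combined. That is the sense in which adding nodes adds both capacity and parallelism.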

GitHub: github.com/citusdata/c…

Why PG?

See the article "Why AntDB is based on PostgreSQL". Its main points are summarized below:

  1. Independence (an advantage in a domestic environment that emphasizes autonomy and control)
  2. Open source license (a BSD-like license: users face no legal risk and may distribute the software freely, whether closed source or open source)
  3. Strong development momentum
  4. Rich product ecosystem

PostgreSQL, like Linux, is an open source community product that has been developed for nearly 30 years. It is not controlled by any commercial company or any single country. Its main contributors are companies and individuals from Japan, the United States, Germany, France, Austria, Sweden, the United Kingdom, and Russia. The main companies include CitusData, VMware, EnterpriseDB, Pivotal, NTT Group, Fujitsu, Google, AWS, IBM, Zalando, and Yandex. In total, 24 companies account for 63%, and individual contributors account for the remaining 37%.

Other open source projects (not PG-based projects)

There are many other open source projects, such as OceanBase, but only one more is listed here: TiDB. TiDB is a converged database product aimed at both online transaction processing and online analytical processing. Its key features include one-click horizontal scaling, strong consistency with multi-replica data safety, distributed transactions, and real-time OLAP. It is also compatible with the MySQL protocol and ecosystem, which makes migration easy and keeps operation and maintenance costs low. As for why TiDB is singled out, look at its 29.8K stars on GitHub.

TiDB: pingcap.com/zh/
GitHub: github.com/pingcap/tid…