Which databases are currently used on the back end to store data?

Let me describe the current situation, based on my research and work experience over the past two or three years.

1. Oracle: Traditional industries, especially government, healthcare, education and large enterprises, still mostly run Oracle, followed by DB2. By contrast, middleware such as WebLogic and WebSphere has largely fallen out of fashion along with classic Java EE, replaced by a lightweight combination of rich front ends and microservice frameworks.

2. MySQL: Many new projects in traditional industries have also started to adopt MySQL in large numbers. The up-front cost of a lightweight database is very low, which leaves more of the project budget for other work. Most of these systems are new projects, and most are Internet-facing. They generally do not carry the business-critical data that Oracle does, so the choice of database is left to the company.

A large number of enterprises have now begun moving to the cloud. The cloud service we buy is mainly Alibaba Cloud ECS, which has generally been stable. Enterprises that need a stable cloud database therefore tend to choose Alibaba Cloud's RDS series. Most of them pick MySQL, and PostgreSQL is gradually gaining acceptance.

3. PostgreSQL: PostgreSQL (PG) has indeed been very popular over the past two years. In a previous article I mentioned that I used PostgreSQL in the architecture of an Internet medical product I worked on, mainly because PostgreSQL is highly stable in production and inexpensive. It is especially easy to work with for architects who are proficient in Linux services.

More specifically, the main reason for choosing PostgreSQL was the business data: we were hosting Internet medical data, and medical data is critical by nature. Stability and security were therefore hard requirements. At the same time, we had to balance cost against the flexibility that an Internet product needs. So we rejected MySQL and committed to PostgreSQL.

4. Hadoop HDFS: The main data sets of big data projects still use Hadoop HDFS as the underlying storage. Despite the heated discussion that Hadoop is dead, and although faster NoSQL storage alternatives exist, big data engineers still choose Hadoop when it comes to the actual implementation, because its maturity and stability are the deciding criteria.

5. Elasticsearch: Elasticsearch, as part of the ELK stack, is currently used heavily as the main data store for log monitoring and analysis, to the point that people overlook the fact that it is a search engine. It is still the first-choice search engine for e-commerce sites, content publishing sites and social media sites.

6. Real-time/time-series databases: In industrial, energy and other IoT sectors, real-time and time-series databases are gradually moving to open source solutions such as Druid.io, InfluxDB and OpenTSDB, which are currently the best open source options for storing IoT data. Druid.io is a complete suite covering both real-time and historical data. InfluxDB is a very popular time-series database that implements its own native clustered storage. OpenTSDB relies on the HBase distributed database and HDFS. In addition, IoTDB, the open source time-series database launched by Tsinghua University, has become an Apache top-level project.

7. Hadoop HBase: As a column-family store and a key-value store with millisecond latency, Hadoop HBase is increasingly well suited to real-time data analysis in general scenarios. It can be used in almost any field to support real-time online analysis and small-batch workloads. Its distributed consistency, together with the stable storage of HDFS, makes it an excellent solution for real-time analysis of critical business data.
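To make the column-family and key-value idea more concrete, here is a minimal in-memory sketch. It is not the HBase API: the class `TinyColumnFamilyStore`, the `meter42#...` row keys and the `reading:kwh` column are all hypothetical names chosen for illustration. It only mimics two properties of the HBase data model: rows are kept sorted by row key, and each cell is addressed by row key, column family and qualifier.

```python
import bisect


class TinyColumnFamilyStore:
    """Toy sorted key-value store, loosely modelled on a column-family layout."""

    def __init__(self):
        self._row_keys = []  # row keys kept in lexicographic order
        self._rows = {}      # row_key -> {(family, qualifier): value}

    def put(self, row_key, family, qualifier, value):
        if row_key not in self._rows:
            bisect.insort(self._row_keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][(family, qualifier)] = value

    def get(self, row_key, family, qualifier):
        # Point lookup by full cell address; returns None if the cell is absent.
        return self._rows.get(row_key, {}).get((family, qualifier))

    def scan_prefix(self, prefix):
        # Sorted row keys make all keys sharing a prefix contiguous,
        # so a scan starts at the first candidate and stops at the first miss.
        start = bisect.bisect_left(self._row_keys, prefix)
        for key in self._row_keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._rows[key]


store = TinyColumnFamilyStore()
store.put("meter42#2020-01-01T00:00", "reading", "kwh", 1.2)
store.put("meter42#2020-01-01T00:15", "reading", "kwh", 1.4)
store.put("meter77#2020-01-01T00:00", "reading", "kwh", 9.9)

meter42_rows = list(store.scan_prefix("meter42#"))
print([key for key, _ in meter42_rows])
```

Putting the device identifier first and the timestamp second in the row key is one common way to keep a device's readings adjacent, so that a range scan only touches that device's rows. Other key designs are possible, and the right one depends on the query pattern.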

8. TiDB: In an era of massive Internet data, two approaches have emerged for guaranteeing transactional consistency while sustaining high-throughput parallel writes. The first is to replace the relational database with NewSQL. In previous articles I have repeatedly argued for replacing relational databases with TiDB. This kind of replacement usually happens when the complex business logic built on top of a relational database keeps being upgraded and changed, until the operations staff are left in despair. NewSQL, with distributed consistency, ACID transactions and horizontally scalable key-value storage, is a very good fit here, and there is no need to struggle in the mire of splitting relational databases and tables into shards.
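To illustrate the manual splitting of databases and tables that the paragraph above refers to, here is a minimal sketch of application-level shard routing. The function names, the `orders_NN` table naming scheme and the `user-N` identifiers are hypothetical. The sketch shows one reason this approach is painful: with simple modulo routing, changing the number of shards changes where many existing rows belong, so data has to be migrated.

```python
import zlib


def shard_table(user_id: str, shard_count: int) -> str:
    """Return the hypothetical table name that holds rows for user_id."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    index = zlib.crc32(user_id.encode("utf-8")) % shard_count
    return f"orders_{index:02d}"


def moved_fraction(user_ids, old_count: int, new_count: int) -> float:
    """Fraction of users whose table changes when the shard count changes."""
    moved = sum(
        1
        for user_id in user_ids
        if shard_table(user_id, old_count) != shard_table(user_id, new_count)
    )
    return moved / len(user_ids)


user_ids = [f"user-{i}" for i in range(10_000)]
fraction = moved_fraction(user_ids, old_count=4, new_count=5)
print(f"{fraction:.0%} of users map to a different table after resharding")
```

A distributed NewSQL database moves this routing and rebalancing into the storage layer, so the application keeps issuing ordinary SQL against a single logical table.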

9. MongoDB: The other approach is to improve the relational database itself, or to introduce MongoDB as a partial replacement. E-commerce order data, Internet medical health records and published articles, for example, can all be stored as MongoDB documents. This matches the document-shaped nature of the business model and, provided transactions are guaranteed, supports massive volumes of data.
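As a rough illustration of why order data fits a document model, the sketch below represents one order as a single nested document using a plain Python dict. All field names and values are made up for the example, and no MongoDB driver is involved. The point is only that the order, its customer snapshot and its line items travel together as one unit instead of being spread across several joined tables.

```python
import json

# One hypothetical e-commerce order stored as a single nested document.
order_document = {
    "_id": "order-1001",
    "customer": {"name": "Alice", "phone": "000-0000"},
    "items": [
        {"sku": "A-1", "qty": 2, "unit_price_cents": 1500},
        {"sku": "B-7", "qty": 1, "unit_price_cents": 4200},
    ],
    "status": "paid",
}


def order_total_cents(order: dict) -> int:
    """Sum the line items without any join: everything is inside the document."""
    return sum(item["qty"] * item["unit_price_cents"] for item in order["items"])


total = order_total_cents(order_document)
serialized = json.dumps(order_document)  # the whole order serializes as one unit
print(total)
```

In a relational schema the same order would typically be split across an orders table, an order-items table and a customers table, and reading it back would require joins.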

10. Relational database parallelism: Relational databases are constantly improving, especially the lightweight ones. Clustering in MySQL 8 and parallelism in PostgreSQL 11 are different routes to the same goal: relational databases are doing everything they can to keep users from abandoning them for NoSQL, by supporting parallel processing of massive amounts of data while sparing users the huge upgrade costs that a replacement would bring.

Note: the architecture diagrams above come from the official websites of the databases or from authoritative websites for the related technologies.
