
1. Basic Concepts

1.1 Data Writing Process

In a database, the data files hold the actual data, but an update does not write straight to the data files on disk: that would involve a great deal of disk I/O and hurt performance badly. When a user submits a DML request such as update emp set ename='test' where id=1;

The listener receives the request and creates a server process for it. The server process first scans the database buffer cache for the block containing the row with id=1. If the block is found, the cache hit lets it update the block directly, and the block becomes dirty. If not, the server process copies the corresponding block from disk into the buffer cache and then updates it.

When a block in the buffer cache no longer matches its copy on disk, it is called a “dirty buffer”. Dirty buffers are eventually written to the data files on disk by the database writer (DBWn). DBWn does not write in real time; it writes as lazily as possible, and only in the following four cases:

  • A. There are no available buffers
  • B. Too many dirty buffers
  • C. A 3-second timeout (a write is performed at least once every 3 seconds)
  • D. A checkpoint occurs. A checkpoint is an Oracle event, triggered for example by a clean database shutdown; it can also be triggered manually
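
As a rough illustration (a sketch only, assuming you have privileges on the V$ views), you can count dirty buffers and trigger a checkpoint manually from SQL*Plus:

```sql
-- Count buffers in the cache that are dirty (changed but not yet written to disk).
SELECT COUNT(*) AS dirty_buffers FROM v$bh WHERE dirty = 'Y';

-- Trigger a checkpoint manually, which makes DBWn flush dirty buffers to the data files.
ALTER SYSTEM CHECKPOINT;
```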

Since DBWn does not write in real time, a power failure before the dirty buffers reach disk would mean the changes in the buffer cache are lost. Data, however, must not be lost, and that is where the redo log comes in.

1.2 What is a Redo Log?

When a user performs a DML operation and a data block changes, the resulting change vector is written to the redo log file. If the system fails (for example, due to a power outage) while a large amount of dirty data in the database buffer cache has not yet been written to the data files, the redo log makes it possible to recover those changes.

Log writes likewise go through an area of memory, the log buffer, and the log writer (LGWR) writes the contents of the log buffer to the redo log files on disk. LGWR writes far more frequently than the database writer (DBWn). LGWR performs writes in three cases:

  • A. When a transaction commits
  • B. When the log buffer is one third full
  • C. Before DBWn writes dirty buffers to disk

When a user performs DML and commits, a DBWn write is not necessarily triggered, but an LGWR write is. So even if, after the DML and COMMIT, the change exists only in the database buffer cache and that buffer is lost, the data can still be recovered from the redo log.
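
One rough way to see LGWR at work (a sketch, assuming access to V$SYSSTAT) is to watch the redo statistics grow as transactions commit:

```sql
-- 'redo size' is the total amount of redo generated;
-- 'redo writes' counts physical writes by LGWR to the online redo log files.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('redo size', 'redo writes');
```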

1.3 What is an Archive Redo Log?

Archiving is the permanent copying of redo log files into archive redo log files. An archive redo log has the same content as a redo log; the difference is that the online redo log files are reused and overwritten over time, while the archive redo logs keep a complete history of data changes. The ARCn process copies the redo logs into the archive redo logs.
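
A quick way to check and enable archiving (a sketch; enabling it requires restarting the database into the MOUNT state first):

```sql
-- Is the database generating archive redo logs?
SELECT log_mode FROM v$database;

-- If not, enable archiving while the database is mounted but not open:
-- SHUTDOWN IMMEDIATE;  STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
```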

The above is the general process of data writing. Database synchronization relies mainly on the redo log and the archive redo log.

2. DR Synchronization Methods

2.1 Oracle DataGuard

Principle

Redo logs are shipped from the source database to the target database and applied there, keeping the target database in sync with the source (a configuration sketch follows the steps below).

Steps

  1. The client connects to the primary database and initiates a data update
  2. The change is made to the data in the primary database's memory and redo is generated
  3. The client executes COMMIT
  4. The primary database writes the redo to its online redo log file group and ships it to the standby database
  5. The standby database applies the redo and acknowledges the primary database
  6. The primary database answers the client
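
A minimal configuration sketch of this redo shipping and apply loop (the service name standby_tns and the DB_UNIQUE_NAME standby are placeholders for your own environment):

```sql
-- On the primary: ship redo to the standby database.
ALTER SYSTEM SET log_archive_dest_2 =
  'SERVICE=standby_tns ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=standby';

-- On the standby: start redo apply in the background.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```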

Through this mechanism every update made on the primary database is also applied on the standby database, achieving synchronization. The way the primary waits for the standby's response can be set to Maximum Performance, Maximum Availability, or Maximum Protection mode. In Maximum Protection mode the primary ships the redo to the standby synchronously and waits for confirmation, guaranteeing consistency.
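
For example, the protection mode is switched on the primary and can be verified afterwards (a sketch; Maximum Availability shown):

```sql
-- Set the protection mode on the primary database.
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;

-- Verify the current mode.
SELECT protection_mode, protection_level FROM v$database;
```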

Active Data Guard (ADG), which is commonly used in DR systems, adds a queryable standby on top of DG, so that read-only workloads such as reporting can be offloaded from the primary database.
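
The usual ADG pattern on the standby (a sketch) is to open it read-only and then restart redo apply, so it keeps up with the primary while serving queries:

```sql
-- Stop managed recovery, open the standby for queries, then resume redo apply.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE OPEN READ ONLY;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```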

| Protection mode | Response mechanism |
| --- | --- |
| Maximum Performance | The primary answers the client without waiting for a response from the standby, so data between the primary and standby is asynchronous. If the primary fails, some data may be lost; if the standby fails, the primary stays available and performance is unaffected. |
| Maximum Protection | The primary must confirm that the standby has received and written the redo before answering the client, so the primary and standby stay synchronized in real time. No data is lost if the primary fails; if the standby fails, the whole DG configuration becomes unavailable, and the standby's network and I/O speed affect overall performance. |
| Maximum Availability | Behaves like Maximum Protection while the standby is healthy, and like Maximum Performance if the standby fails. |

Standby database types

| | Physical standby | Logical standby |
| --- | --- | --- |
| Principle | The primary and standby databases have identical physical and logical structures; the standby synchronizes data by applying the redo shipped from the primary | The primary and standby have the same logical structure but their physical structures may differ; the standby parses the redo shipped from the primary into SQL statements and executes them to synchronize data |
| Advantages | The standby's logical and physical structure match the primary exactly, giving stronger consistency | More flexible; the DBMS_LOGSTDBY package can be used to apply special settings to the standby |

A physical standby must stay consistent with the primary down to the SCN; a logical standby only needs the data to be consistent.
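
As an example of that extra flexibility (a sketch; SCOTT.EMP is a placeholder, and SQL Apply must be stopped before changing skip rules), DBMS_LOGSTDBY can exclude a table from replication on a logical standby:

```sql
-- Stop SQL Apply, register a skip rule for one table, then restart SQL Apply.
ALTER DATABASE STOP LOGICAL STANDBY APPLY;
BEGIN
  DBMS_LOGSTDBY.SKIP(stmt        => 'DML',
                     schema_name => 'SCOTT',
                     object_name => 'EMP');
END;
/
ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;
```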

2.2 Oracle GoldenGate

Principle

Incremental changes are captured by parsing the source database's redo logs or archive logs, transferred to the target over TCP/IP, and then converted back into SQL and applied on the target database, keeping the two databases in sync. The DSG replication tool works on a similar principle.

Steps

  1. Data synchronization is initialized using the Extract and Replicat processes
  2. The Extract process reads the online redo logs or archive redo logs on the source database and parses out only the data changes, such as DML operations
  3. The extracted changes are converted into OGG's own intermediate format and written to trail files
  4. The trail files are sent to the target over TCP/IP; a Data Pump process can be configured to ship the trail files to the target in blocks
  5. On the target server, the Collector process receives the change data from the source and queues it in local trail files
  6. The Replicat process reads the change data from the trail files, builds the corresponding SQL statements, and applies them to the target database; after a successful commit it updates its checkpoint to record how far replication has progressed, completing the data replication
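
Because Extract reads changes from the redo, the source database must record enough information in the redo to reconstruct each row change. A common prerequisite check looks like this (a sketch):

```sql
-- Minimal supplemental logging must be enabled for OGG to capture row changes.
SELECT supplemental_log_data_min FROM v$database;

-- Enable it if it is not already on.
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
```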

3. Summary

| | DataGuard | OGG |
| --- | --- | --- |
| Principle | Ships redo logs from the source database to the target database and applies them there | Captures incremental changes by parsing the source database's redo or archive logs, ships them to the target over TCP/IP, then converts them back into SQL and applies them on the target |
| Stability | Highly stable as a disaster-recovery solution | Because replication is independent of the database management system, zero data loss cannot be guaranteed |
| Maintenance | Simple to maintain, few problems | Managed through a command-line interface; maintenance is more complex |
| Object support | Full support | Some objects must be created and maintained manually |
| Availability of the target | The target stays in recovery or read-only state | Both databases are live; the target can serve real-time queries, and both ends can be written at the same time |
| Replication mode | Real-time replication | Provides second-level capture and delivery of large data volumes, but its asynchronous replication cannot achieve fully synchronous replication |
| Resource usage | Replication is handled by the database's own LGWR or ARCn processes and uses few database resources | Data extraction and transformation consume more system resources at peak times and fewer at off-peak times |
| Heterogeneous database support | Runs only between Oracle databases and has strict requirements on the source and target operating systems | Can replicate between different database types and versions, and between different operating systems |

DataGuard is a good way to implement Oracle database disaster recovery. In our project we also tried using OGG for disaster recovery, and mainly ran into the following problems:

1. Ensuring data consistency between the databases at both ends

DataGuard replicates in real time and its consistency is guaranteed by the database's own mechanisms. OGG's replication, by contrast, runs independently of the database management system: it can only guarantee that the changes it extracts are applied to the target, not that the two databases end up identical. Although OGG now rarely has problems replicating DML, we wrote a number of additional audit scripts to verify data consistency at both ends, and those audits did find cases where OGG reported no errors yet the table data was inconsistent.
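
A consistency check of this kind can be as simple as comparing row counts over a database link (a sketch; the link target_db and the table scott.emp are placeholders, and matching counts alone do not prove full consistency):

```sql
-- Compare row counts for one replicated table on the source and the target.
SELECT 'SCOTT.EMP' AS table_name,
       (SELECT COUNT(*) FROM scott.emp)           AS source_rows,
       (SELECT COUNT(*) FROM scott.emp@target_db) AS target_rows
FROM dual;
```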

2. Object maintenance

Some objects have to be created and maintained manually. To keep objects consistent, additional scripts have to be written to audit the database objects.
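
One way to audit objects (a sketch; the database link target_db and the schema SCOTT are placeholders) is to list objects that exist on the source but are missing on the target:

```sql
-- Objects present in the source schema but absent from the target schema.
SELECT object_type, object_name
FROM   dba_objects
WHERE  owner = 'SCOTT'
MINUS
SELECT object_type, object_name
FROM   dba_objects@target_db
WHERE  owner = 'SCOTT';
```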

3. Efficiency

If a table at the source has no index and a large number of insert, delete, or update operations are run against it, the efficiency of the OGG replication process suffers badly. For example, we had not built an index on one table and a batch job deleted 480,000 rows; the replication process fell 6 hours behind, having applied only 150,000 rows at the target, and it ultimately took 19 hours for OGG to catch up. Similarly, a TRUNCATE at the source can also produce a large volume of data changes and hurt OGG's efficiency.

In general, if both ends are Oracle and the operating systems are the same, DataGuard is the better choice. If you need to synchronize data between heterogeneous databases, or the two operating systems differ, OGG is the only option, and using it requires extra effort to ensure data consistency and keep the efficiency acceptable.