This document is based on the official MongoDB documentation, version 4.4.

A replica set is a group of mongod processes consisting of:

Primary

The primary receives all write operations.

Secondaries

Secondaries replicate all operations from the primary to maintain an identical data set. Secondaries can have additional configurations for special usage profiles; for example, a secondary can be non-voting or priority 0.

Arbiter

An arbiter participates in elections but does not hold data and cannot become primary. An arbiter has exactly 1 vote.

Members

The Primary node of the replica set

MongoDB applies all write operations on the primary node and records them in the primary's oplog. The secondary members copy this log and apply the operations to their own data sets.
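The flow described above can be illustrated with a toy model. This is only a sketch of the idea, not how mongod is implemented: the `Node` class, the `primary_write` and `secondary_sync` helpers, and the entry layout are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy replica set member: a data set plus a count of oplog entries applied."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # how many oplog entries this node has applied so far


def primary_write(primary: Node, oplog: list, key: str, value) -> None:
    """Apply a write on the primary and record it in the primary's oplog."""
    primary.data[key] = value
    oplog.append({"key": key, "value": value})
    primary.applied = len(oplog)


def secondary_sync(secondary: Node, oplog: list) -> int:
    """Copy and apply every oplog entry the secondary has not applied yet."""
    pending = oplog[secondary.applied:]
    for entry in pending:
        secondary.data[entry["key"]] = entry["value"]
    secondary.applied = len(oplog)
    return len(pending)


oplog = []
primary, secondary = Node(), Node()
primary_write(primary, oplog, "a", 1)
primary_write(primary, oplog, "b", 2)
synced = secondary_sync(secondary, oplog)  # the secondary catches up on both writes
```

After the sync, the secondary's data set matches the primary's, and a second sync finds nothing left to apply.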

By default, read operations go to the primary, but all members of the replica set can accept read operations.

The Secondary node of the replica set

Priority 0 members

A member with priority 0 cannot become primary and cannot trigger an election. It maintains a copy of the data set, can accept read operations, and can vote in elections.

Hidden replica set members

A hidden member holds a copy of the primary's data set but is not visible to client applications. A hidden member must always have priority 0, but it can vote. Hidden members are used primarily for backups.

Delayed replica set members

A delayed member's data set reflects an earlier, delayed state of the primary's data set. A delayed member must have priority 0 and should be a hidden member.
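The rules for these member types can be written down as a small checker. This is a sketch under assumptions: the field names `priority`, `votes`, `hidden`, `arbiterOnly` and `slaveDelay` follow the 4.4 member configuration document as I understand it, with `priority` and `votes` defaulting to 1, while the helper functions themselves are hypothetical.

```python
def validate_member(member: dict) -> list:
    """Return the rule violations for one member configuration document."""
    problems = []
    priority = member.get("priority", 1)
    if member.get("hidden", False) and priority != 0:
        problems.append("hidden members must have priority 0")
    if member.get("slaveDelay", 0) > 0 and priority != 0:
        problems.append("delayed members must have priority 0")
    return problems


def can_become_primary(member: dict) -> bool:
    """Priority 0 members and arbiters can never be elected primary."""
    return member.get("priority", 1) > 0 and not member.get("arbiterOnly", False)


def can_vote(member: dict) -> bool:
    """A member votes in elections unless it is configured as non-voting."""
    return member.get("votes", 1) > 0


# A delayed, hidden member: valid, votes, but can never become primary.
delayed = {"_id": 2, "priority": 0, "hidden": True, "slaveDelay": 3600}
```

The checker only covers the "must" rules from the text; the recommendation that a delayed member should also be hidden is left to the operator.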

Arbiter members

Oplog

All members of the replica set have a copy of the oplog in the local.oplog.rs collection. The oplog records all changes to the data.

To facilitate replication, all members send heartbeats (pings) to all other members. Any secondary member can import oplog entries from any other member.

Every operation in the oplog is idempotent: an oplog operation yields the same result whether it is applied to the target data set once or multiple times.
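A minimal sketch of why this matters, assuming a simplified entry that states the resulting value rather than a delta (as I understand it, this is roughly how an increment ends up being recorded). The entry layout and both helper functions are hypothetical.

```python
def apply_entry(doc: dict, entry: dict) -> dict:
    """Apply an idempotent entry: it states the final field values, not a change."""
    updated = dict(doc)
    updated.update(entry["set"])
    return updated


def apply_increment(doc: dict, field_name: str, amount: int) -> dict:
    """Non-idempotent counterpart, shown only for contrast."""
    updated = dict(doc)
    updated[field_name] = updated.get(field_name, 0) + amount
    return updated


doc = {"_id": 1, "views": 10}
entry = {"set": {"views": 11}}  # the increment recorded as its result

once = apply_entry(doc, entry)
twice = apply_entry(once, entry)  # replaying the entry changes nothing

inc_twice = apply_increment(apply_increment(doc, "views", 1), "views", 1)
```

Replaying the "set" entry leaves the document unchanged, whereas replaying the raw increment would count the same write twice.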

Oplog Size

If the oplog size is not specified when a replica set member first starts, MongoDB creates an oplog of a default size. The default size depends on the operating system.

Starting with MongoDB 4.4, you can specify a minimum retention period for oplog entries. mongod then removes an oplog entry only if both of the following conditions are met:

  • The oplog has reached its maximum configured size.
  • The oplog entry is older than the configured retention period.

By default, MongoDB does not set a minimum oplog retention period. Instead, it automatically truncates the oplog, starting with the oldest entries, to maintain the configured maximum oplog size.
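The two truncation rules can be sketched as a small simulation. This is an illustration only, not mongod's actual algorithm: the `truncate_oplog` function and the representation of entries as `(created_at_hours, size_bytes)` tuples are assumptions made for the example.

```python
from collections import deque


def truncate_oplog(entries: deque, max_size: int,
                   min_retention_hours: float, now_hours: float) -> int:
    """Remove the oldest entries while the oplog is over its maximum size
    and the oldest entry is past the retention period.

    entries: deque of (created_at_hours, size_bytes), oldest first.
    Returns the number of entries removed.
    """
    removed = 0
    total = sum(size for _, size in entries)
    while entries and total > max_size:
        created_at, size = entries[0]
        if now_hours - created_at < min_retention_hours:
            break  # oldest entry is still inside the retention window
        entries.popleft()
        total -= size
        removed += 1
    return removed


# Default behaviour (no retention period): truncate by size alone.
by_size = deque([(0, 40), (5, 40), (9, 40)])
removed_default = truncate_oplog(by_size, 100, 0, 10)

# 24-hour retention: nothing is removed even though the oplog is over size.
retained = deque([(0, 40), (5, 40), (9, 40)])
removed_retained = truncate_oplog(retained, 100, 24, 10)
```

With a retention period set, the oplog can therefore grow beyond its configured maximum size until the oldest entries age out.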

Minimum Oplog Retention Period

Starting with version 4.4, you can specify the minimum number of hours for which MongoDB retains an oplog entry.

To configure a minimum oplog retention period when starting mongod, do one of the following:

  • In the mongod configuration file, add the storage.oplogMinRetentionHours setting;
  • When starting from the command line, use the --oplogMinRetentionHours argument.

Use replSetResizeOplog to update the minimum oplog retention period of a running mongod process.
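As a sketch, the command document for replSetResizeOplog can be built like this. The `size` (in megabytes) and `minRetentionHours` fields are the command's options as I understand them from the 4.4 documentation; the `build_resize_oplog_command` helper is hypothetical.

```python
def build_resize_oplog_command(size_mb=None, min_retention_hours=None) -> dict:
    """Build a replSetResizeOplog command document.

    The command name is inserted first because drivers expect it to be
    the first key of the command document.
    """
    if size_mb is None and min_retention_hours is None:
        raise ValueError("specify size_mb, min_retention_hours, or both")
    command = {"replSetResizeOplog": 1}
    if size_mb is not None:
        command["size"] = float(size_mb)
    if min_retention_hours is not None:
        if min_retention_hours < 0:
            raise ValueError("min_retention_hours must not be negative")
        command["minRetentionHours"] = float(min_retention_hours)
    return command


command = build_resize_oplog_command(min_retention_hours=24)
```

Assuming a PyMongo client connected to the member in question, the document would be sent to the admin database with something like `client.admin.command(command)`.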

Workloads that Might Require a Larger Oplog Size

For workloads like the following, consider creating an oplog that is larger than the default. Conversely, if your application mainly performs reads with only a small number of writes, a smaller oplog may be sufficient.

Updates to Multiple Documents at Once

To ensure idempotency, the oplog must translate a multi-document update into individual operations, one per affected document. This can take up a lot of oplog space without a corresponding increase in data size or disk usage.
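A toy model shows how a single multi-document update fans out in the log. The `update_many` helper is hypothetical, and the entry shape is a simplified stand-in for real oplog entries.

```python
def update_many(collection: list, predicate, new_fields: dict, oplog: list) -> int:
    """Update every matching document and log one entry per document."""
    matched = 0
    for doc in collection:
        if predicate(doc):
            doc.update(new_fields)
            # One log entry per affected document, addressed by its _id.
            oplog.append({"op": "u", "_id": doc["_id"], "set": dict(new_fields)})
            matched += 1
    return matched


collection = [
    {"_id": 1, "status": "new"},
    {"_id": 2, "status": "new"},
    {"_id": 3, "status": "done"},
]
oplog = []
matched = update_many(collection, lambda d: d["status"] == "new",
                      {"status": "seen"}, oplog)
```

One logical update produced one log entry per matching document, while the collection itself did not grow at all.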

Deletions Equal the Same Amount of Data as Inserts

If you delete roughly as much data as you insert, the database size will not change significantly, but the oplog can become quite large.

Significant Number of in-place Updates

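If a significant portion of the workload consists of updates that do not increase the size of the documents, the database records a large number of operations in the oplog while the amount of data on disk stays about the same.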

Oplog Status

You can use the rs.printReplicationInfo() method to view the status of the oplog, including its size and the time range of the operations it contains.
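The time range is simply the span between the first and last oplog entries. A minimal sketch of that arithmetic, where the result keys `timeDiff` and `timeDiffHours` mirror the field names I recall from the db.getReplicationInfo() output, and the `oplog_window` helper is hypothetical:

```python
from datetime import datetime


def oplog_window(t_first: datetime, t_last: datetime) -> dict:
    """Compute the time span covered by the oplog from its first and last entries."""
    seconds = (t_last - t_first).total_seconds()
    return {"timeDiff": seconds, "timeDiffHours": round(seconds / 3600, 2)}


window = oplog_window(datetime(2021, 1, 1, 0, 0), datetime(2021, 1, 2, 6, 0))
```

A secondary that stays offline for longer than this window can no longer catch up from the oplog alone, which is why the span is worth monitoring.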

Replication Lag and Flow Control

Under various exceptional situations, updates to a secondary's oplog may lag behind the desired time. Run db.getReplicationInfo() on a secondary member and use it together with the replication status output to assess the current state of replication and determine whether there is any unintended replication delay.
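One way to quantify the delay is to compare each secondary's last applied operation time with the primary's. The sketch below assumes member documents shaped like the `members` array of the replication status output (`name`, `stateStr`, `optimeDate`); the `replication_lag_seconds` helper and the sample host names are hypothetical.

```python
from datetime import datetime


def replication_lag_seconds(members: list) -> dict:
    """Return, per secondary, how far its last applied operation trails the primary's."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


members = [
    {"name": "db1:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2021, 1, 1, 12, 0, 30)},
    {"name": "db2:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2021, 1, 1, 12, 0, 28)},
    {"name": "db3:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2021, 1, 1, 11, 58, 30)},
]
lag = replication_lag_seconds(members)
```

In this sample, the third member trails the primary by two minutes, which would be the one to investigate.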

(The part of the official documentation that follows, on Flow Control, is not yet understood and is omitted here.)