26 Nov

How Replication in MongoDB Works (Replica Sets)

Replication is the process of synchronizing data from one server, known as the primary, to one or more replica servers. This ensures high availability and redundancy, since multiple copies of the data are spread across several database servers. If one server crashes, the data can still be retrieved from the secondary nodes.

Replication offers fault tolerance and improves data accessibility, since the same data is copied to multiple servers in near real time. This helps with disaster recovery and business continuity when a failure on the primary server would otherwise cause service interruption and downtime.

Follow this guide to explore how replication in MongoDB works (replica sets).

The Concept of Replication

In MongoDB replication, entire data sets from a single MongoDB database server are copied and synchronized across multiple server instances. This provides redundancy, which in turn results in high availability of data. It also eliminates the single point of failure that would otherwise cause service downtime should a server develop an issue.

Replication is achieved through replica sets. A replica set is a group of mongod instances that host the same data set. It comprises one primary node that accepts all write operations and one or more secondary nodes to which the data is copied and synchronized. Because secondaries apply changes after the primary does, they may lag slightly behind, but they converge on the same data as the primary.

Importantly, if the primary server goes down, for example after a system crash, one of the secondary servers takes over and becomes the new primary node through an election. If the failed server later comes back online, it rejoins as a secondary node.

Here are a few key points to keep in mind when working with replicas:

In a replica set, one node is designated as the primary node whilst the rest are secondary nodes. Usually a minimum of two secondary nodes is recommended.
All data replicates from the primary node to the secondary nodes.
In case of failure of the primary node, automatic failover kicks in and a new primary node is elected among the secondary nodes.
After the recovery of the failed primary node, it rejoins the replica set, but this time as a secondary node.
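
The key points above can be sketched as a small JavaScript model. This is only an illustration of the role changes, not how MongoDB implements failover; the function names (createReplicaSet, failNode, recoverNode) are made up for this sketch, and the "election" simply promotes the first reachable secondary.

```javascript
// Toy model of replica set roles; not MongoDB's actual implementation.
function createReplicaSet(hosts) {
  // The first host starts as primary, the rest as secondaries.
  return hosts.map((host, i) => ({
    host,
    role: i === 0 ? "PRIMARY" : "SECONDARY",
    up: true,
  }));
}

function failNode(members, host) {
  const node = members.find((m) => m.host === host);
  const wasPrimary = node.role === "PRIMARY";
  node.up = false;
  node.role = "DOWN";
  if (wasPrimary) {
    // Automatic failover: promote the first reachable secondary.
    // (A real election also weighs priority and data freshness.)
    const successor = members.find((m) => m.up && m.role === "SECONDARY");
    if (successor) successor.role = "PRIMARY";
  }
}

function recoverNode(members, host) {
  const node = members.find((m) => m.host === host);
  node.up = true;
  // In this model, a recovered node rejoins as a secondary.
  node.role = "SECONDARY";
}

const members = createReplicaSet(["mongo-master-node", "mongo-node-1", "mongo-node-2"]);
failNode(members, "mongo-master-node");
recoverNode(members, "mongo-master-node");
console.log(members.map((m) => `${m.host}: ${m.role}`).join("\n"));
```

After the failure and recovery, the original primary is listed as a secondary and mongo-node-1 holds the primary role.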

Benefits of Replication

In summary, here are some of the benefits of implementing database replication on your MongoDB setup.

1. High availability of data

Keeping data synchronized across multiple servers helps ensure round-the-clock availability. Thanks to redundancy, any single point of failure is eliminated, thus improving access to your data.

2. Disaster Recovery

High availability supports disaster recovery. A failure can be resolved by retrieving data from the other nodes, which contain copies of the same data.

3. Horizontal Scaling

The replication architecture also supports a form of horizontal scaling: because the same data is synchronized across multiple nodes, read operations can be spread across them when the read preference allows it.

4. Data Safety

Because data resides on multiple servers, access to data and its safety is guaranteed should the primary server fail for whatever reason.

In the following sections, we demonstrate replication in MongoDB with examples.

How MongoDB Replication Works

In MongoDB, replication is made possible by a replica set, which comprises two or more MongoDB nodes working together as a unit. Best practice calls for a minimum of three nodes in a MongoDB replica set:

1. The first node is the primary node. This node accepts all write operations and, by default, read operations as well.

2. The remaining nodes are known as secondary nodes. They replicate data from the primary node in real-time.

Below is a basic MongoDB replication setup.

Image Source: www.mongodb.com

The primary node accepts both read and write operations, while the remaining nodes can serve read operations when the client's read preference allows it.
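
To make the flow of data concrete, here is a minimal sketch in which the primary records each write in an ordered log and a secondary replays the entries it has not applied yet. MongoDB uses an operation log for this purpose, but the classes below (PrimaryNode, SecondaryNode) are invented for illustration and greatly simplify the real mechanism.

```javascript
// Toy model: the primary records every write in an ordered log, and each
// secondary replays the entries it has not applied yet.
class PrimaryNode {
  constructor() {
    this.data = new Map();
    this.log = []; // ordered list of write operations
  }
  write(key, value) {
    this.data.set(key, value);
    this.log.push({ key, value });
  }
}

class SecondaryNode {
  constructor() {
    this.data = new Map();
    this.applied = 0; // number of log entries already replayed
  }
  syncFrom(primary) {
    for (const op of primary.log.slice(this.applied)) {
      this.data.set(op.key, op.value);
      this.applied += 1;
    }
  }
}

const primary = new PrimaryNode();
const secondary = new SecondaryNode();
primary.write("alex", { age: 21 });
primary.write("alex", { age: 22 });
// Before syncing, the secondary lags behind the primary.
const sizeBeforeSync = secondary.data.size;
secondary.syncFrom(primary);
console.log(sizeBeforeSync, secondary.data.get("alex"));
```

The secondary holds nothing until it syncs, which is why a read from a secondary can briefly return older data than a read from the primary.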

If the primary node experiences downtime or becomes unavailable, one of the secondary nodes takes over the role of the primary node. The most suitable secondary node to take over the primary role is selected through a process known as a replica set election.

MongoDB Heartbeat Process

Heartbeat is the process of assessing the current status of nodes in a replica set. The nodes ping each other every two seconds, and should a node fail to send a ping within 10 seconds, then other nodes in the replica set mark it as unreachable.

This is critical for automatic failover, where the primary node is unreachable and the secondary nodes do not receive a heartbeat from it within the allowed time frame. In such a case, MongoDB automatically promotes one of the secondary servers to act as the primary server.
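
The timeout rule described above can be sketched as follows. The function name findUnreachable and the timestamps are hypothetical; only the 10-second threshold comes from the text.

```javascript
// No heartbeat for more than 10 seconds => the node is marked unreachable.
// (Nodes normally ping each other every 2 seconds.)
const ELECTION_TIMEOUT_MS = 10000;

// lastHeartbeat maps a host name to the time (ms) its last ping was received.
function findUnreachable(lastHeartbeat, nowMs) {
  return Object.entries(lastHeartbeat)
    .filter(([, seenAt]) => nowMs - seenAt > ELECTION_TIMEOUT_MS)
    .map(([host]) => host);
}

// Hypothetical timestamps, in milliseconds since some starting point.
const lastSeen = {
  "mongo-master-node": 1000,
  "mongo-node-1": 11000,
  "mongo-node-2": 12000,
};
const unreachable = findUnreachable(lastSeen, 13000);
console.log(unreachable);
```

Here only mongo-master-node has been silent for more than 10 seconds, so it is the only host reported as unreachable.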

Image Source: www.mongodb.com

MongoDB ReplicaSet Elections

A replica set conducts ‘elections’ to determine which node is to become the primary node. An election is generally triggered in response to the following events.

A. Loss of connection between the secondary nodes and the primary node beyond the configured timeout which, by default, is 10 seconds.

B. Addition of a new node to the replica set.

C. Initialization of a replica set.

D. Replica set maintenance using methods such as rs.stepDown() or rs.reconfig().

For the replica set to process write operations, the election has to complete successfully. With the default configuration settings, the median time before a cluster elects a new primary should not typically exceed 12 seconds.

One factor that affects the election time is network latency, which can extend the time the replica set needs to get back to full operation with a new primary node. The replica set does not process any write operations until the election of the new primary is complete. Nonetheless, read operations can still be served if they are configured to be processed on secondary nodes.
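
As background not spelled out above, a candidate needs votes from a majority of the voting members to win an election. The sketch below shows that arithmetic; the function names are our own and the model ignores priorities and other election details.

```javascript
// Votes needed to win an election in a set with `votingMembers` voters.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// An election can succeed only if enough voters can reach each other.
function canElectPrimary(votingMembers, reachableVoters) {
  return reachableVoters >= majority(votingMembers);
}

console.log(canElectPrimary(3, 2)); // one of three nodes down: true
console.log(canElectPrimary(3, 1)); // two of three nodes down: false
```

This is one reason a three-node set is the usual minimum: it can lose one node and still elect a primary.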

In the diagram shown below, an automatic failover process was triggered after the primary node experienced downtime that was longer than the configured timeout. One of the remaining secondary nodes initiates an election to select a new primary node.

Image Source: www.mongodb.com

MongoDB Arbiter in a ReplicaSet

In situations where cost constraints prohibit you from adding another secondary node to a replica set (such as one with only a primary and one secondary node), you may opt to add a mongod instance to the replica set as an arbiter.

The arbiter participates in elections but holds no data, so it does not provide redundancy the way a standard secondary node does. An arbiter always remains an arbiter, whereas the primary node can step down and become a secondary node, and a secondary can assume the role of the primary during an election.
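
The trade-off can be illustrated with a small model of a set made up of two data-bearing nodes and one arbiter. The host name mongo-arbiter and the summarize function are hypothetical, and the majority rule used for votesNeeded is an assumption of this sketch rather than something stated above.

```javascript
// Hypothetical member list: two data-bearing nodes plus one arbiter.
const setMembers = [
  { host: "mongo-master-node", arbiter: false, up: true },
  { host: "mongo-node-1", arbiter: false, up: true },
  { host: "mongo-arbiter", arbiter: true, up: true }, // hypothetical host name
];

function summarize(members) {
  const reachable = members.filter((m) => m.up);
  return {
    votes: reachable.length, // arbiters vote like any other member
    dataCopies: reachable.filter((m) => !m.arbiter).length, // arbiters hold no data
    votesNeeded: Math.floor(members.length / 2) + 1,
  };
}

const healthy = summarize(setMembers);
setMembers[0].up = false; // the primary fails
const afterFailure = summarize(setMembers);
console.log(healthy, afterFailure);
```

After the failure there are still enough votes to elect the surviving data node, but only one copy of the data remains. In mongosh, an arbiter is added with rs.addArb().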

Image Source: www.mongodb.com

Configuring a MongoDB Replica Set

So far, we have covered the basics of replication and how a replica set works in order to ensure high availability of data. In this section we shift gears and configure a 3-node replica set.

Replica Set Lab environment

To demonstrate the concept of replication in MongoDB, we will use a 3-node setup on Ubuntu 20.04 with one primary node and 2 secondary nodes as shown.

139.144.171.196 mongo-master-node
139.144.171.229 mongo-node-1
139.144.171.250 mongo-node-2

NOTE:

Each node needs to have MongoDB server installed and running. By default, the MongoDB service listens on port 27017. In addition, every node needs a hostname that the other nodes can resolve.

Check out our comprehensive guide on how to install MongoDB on Ubuntu 20.04.

Step 1: Edit the hosts file

To enable seamless communication across all nodes, you need to edit the /etc/hosts file in each node with the IP addresses and corresponding hostnames as shown in our lab setup. Therefore, log into each node and access the hosts file as shown.

				
					sudo nano /etc/hosts

Update the file with the following entries.

				
					139.144.171.196  mongo-master-node
139.144.171.229  mongo-node-1
139.144.171.250  mongo-node-2

Save the changes and exit.
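
As an optional sanity check, the entries can be parsed and verified programmatically. The sketch below works on an inline copy of the entries rather than reading /etc/hosts, and parseHosts is a helper written for this example.

```javascript
// Parse /etc/hosts-style text into a map of hostname -> IP address.
function parseHosts(text) {
  const table = new Map();
  for (const line of text.split("\n")) {
    const entry = line.split("#")[0].trim(); // drop comments and padding
    if (!entry) continue;
    const [ip, ...names] = entry.split(/\s+/);
    for (const name of names) table.set(name, ip);
  }
  return table;
}

const hostsText = `
139.144.171.196  mongo-master-node
139.144.171.229  mongo-node-1
139.144.171.250  mongo-node-2
`;
const table = parseHosts(hostsText);
const missing = ["mongo-master-node", "mongo-node-1", "mongo-node-2"]
  .filter((name) => !table.has(name));
console.log(missing.length === 0 ? "all hosts mapped" : `missing: ${missing}`);
```

To check a real node, the same function could be given the contents of its hosts file instead of the inline text.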

Step 2: Create an Administrative user on Primary Node

The creation of a replica set requires you to configure an administrative user on the primary node. To do so, access the MongoDB shell as shown.

				
					$ mongosh

Next, switch to the admin database as shown.

				
					use admin

Use the following function to create an admin user. When prompted for the password, provide a strong password and hit ENTER.

				
					db.createUser(
  {
    user: "admin-user",
    pwd: passwordPrompt(),
    roles: [ { role: "root", db: "admin" }, "readWriteAnyDatabase" ]
  }
)

Once you have created the administrative user, exit the MongoDB shell.

				
					exit

Step 3: Configure the Primary Node

The nodes in the replica set need to communicate with authentication enabled. Instead of using a hand-typed passphrase, we are going to generate a cryptographic keyfile that the nodes use to authenticate to each other.

We are going to use the openssl command to generate a base64-encoded key. But first, we will create a directory in which we will place the key.

				
					$ sudo mkdir -p /mnt/keys/

Next, switch to the root user and generate the base64-encoded key as follows.

				
					$ sudo -s

				
					# openssl rand -base64 756 > /mnt/keys/mongo-key

Once you have successfully generated the key, set permissions such that only the owner of the key file has read-only access to the file while the group and the rest of the users have no read or write access.

				
					# chmod 400 /mnt/keys/mongo-key

In addition, set the owner and group of the key directory and the cryptographic key to the ‘mongodb’ user and group.

				
					# chown -R mongodb:mongodb /mnt/keys
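
For readers who prefer to script this step, the following is a rough Node.js equivalent of the openssl and chmod commands. It is a sketch under assumptions: it writes to a scratch directory rather than /mnt/keys, it does not change ownership, and generateKey and writeKeyFile are names chosen for this example. Unlike openssl, it writes the key as a single unwrapped line.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const os = require("os");
const path = require("path");

// 756 random bytes encode to 1008 base64 characters.
function generateKey() {
  return crypto.randomBytes(756).toString("base64");
}

// Create the key file with mode 0400 so that only the owner can read it.
function writeKeyFile(filePath, key) {
  fs.writeFileSync(filePath, key, { mode: 0o400 });
}

// A scratch directory stands in for /mnt/keys in this sketch.
const scratchDir = fs.mkdtempSync(path.join(os.tmpdir(), "mongo-keys-"));
const keyPath = path.join(scratchDir, "mongo-key");
const key = generateKey();
writeKeyFile(keyPath, key);
console.log(`wrote ${key.length} characters to ${keyPath}`);
```

The same key file must still be copied to every member of the replica set, as described next.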

Next, copy the generated key to the secondary nodes using the scp command as shown.

				
					# scp /mnt/keys/mongo-key root@139.144.171.229:/mnt/
# scp /mnt/keys/mongo-key root@139.144.171.250:/mnt/

Once that is done, you need to make a few changes to the default MongoDB configuration file on the primary node. So, access it as shown.

				
					# vim /etc/mongod.conf

Edit the configuration file as shown below. The name of the replica set provided is replica01. This is just an arbitrary value, so feel free to choose your preferred name.

				
					net:
   port: 27017
   bindIp: 127.0.0.1,mongo-master-node

security:
   keyFile: /mnt/keys/mongo-key

replication:
   replSetName: "replica01"

Save the changes and exit the file. Then restart MongoDB to apply the changes.

				
					# systemctl restart mongod

Step 4: Configure The Secondary Nodes

To configure the secondary nodes for replication, log into each of them and access the default MongoDB configuration file.

				
					$ sudo vim /etc/mongod.conf

Update the configuration file as shown.

For Secondary Node-1

				
					net:
   port: 27017
   bindIp: 127.0.0.1,mongo-node-1

security:
   keyFile: /mnt/mongo-key

replication:
   replSetName: "replica01"

Save the changes and exit the configuration file.

Once again, set the following file permissions and ownership rights to the copied key.

				
					# chmod 400 /mnt/mongo-key
# chown mongodb:mongodb /mnt/mongo-key

Then restart MongoDB service for the changes to apply.

				
					$ sudo systemctl restart mongod

For Secondary Node 2

As with the first secondary node, update the default configuration file as follows.

				
					net:
   port: 27017
   bindIp: 127.0.0.1,mongo-node-2

security:
   keyFile: /mnt/mongo-key

replication:
   replSetName: "replica01"

Once done, save the changes and exit the configuration file.

As before, assign the following file permissions and ownership rights.

				
					# chmod 400 /mnt/mongo-key
# chown mongodb:mongodb /mnt/mongo-key

Be sure to restart MongoDB service to enforce the changes made.

				
					$ sudo systemctl restart mongod
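
The three configuration files differ only in the bindIp hostname and the key file path. As a sketch, the fragment could be generated per node as follows; renderMongodConf is a helper invented for this example, and its output is only the part of mongod.conf edited in this guide, not a complete file.

```javascript
// Render the mongod.conf fragment used in this guide for one node.
function renderMongodConf({ hostname, keyFile, replSetName = "replica01" }) {
  return [
    "net:",
    "   port: 27017",
    `   bindIp: 127.0.0.1,${hostname}`,
    "",
    "security:",
    `   keyFile: ${keyFile}`,
    "",
    "replication:",
    `   replSetName: "${replSetName}"`,
    "",
  ].join("\n");
}

const nodes = [
  { hostname: "mongo-master-node", keyFile: "/mnt/keys/mongo-key" },
  { hostname: "mongo-node-1", keyFile: "/mnt/mongo-key" },
  { hostname: "mongo-node-2", keyFile: "/mnt/mongo-key" },
];
for (const node of nodes) {
  console.log(`# ${node.hostname}\n${renderMongodConf(node)}`);
}
```

Whatever method is used, the replSetName value must be identical on all three nodes.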

Step 5: Create the Replica Set

All the nodes are now fully configured for replication. In this section, we initialize a replica set and add the secondary nodes to it.

So, on the primary node, log in to the MongoDB shell as the admin user, using admin as the authentication database.

				
					mongosh -u admin-user -p --authenticationDatabase admin

Next, bootstrap the replica set as shown.

				
					rs.initiate()

On success, the command returns a document that includes ok: 1.

Next, add the first secondary node to the replica set.

				
					rs.add("mongo-node-1")

Then add the second node.

				
					rs.add("mongo-node-2")
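
As an alternative to calling rs.initiate() and then rs.add() for each node, rs.initiate() also accepts a configuration document that lists all members up front. The helper below, buildReplicaSetConfig, is our own and merely builds such a document; it assumes every node listens on the default port 27017.

```javascript
// Build a configuration document in the shape rs.initiate() accepts.
function buildReplicaSetConfig(name, hosts, port = 27017) {
  return {
    _id: name,
    members: hosts.map((host, index) => ({ _id: index, host: `${host}:${port}` })),
  };
}

const config = buildReplicaSetConfig("replica01", [
  "mongo-master-node",
  "mongo-node-1",
  "mongo-node-2",
]);
console.log(JSON.stringify(config, null, 2));
// In mongosh on the primary, this document could be passed as: rs.initiate(config)
```

The _id of the document must match the replSetName set in each node's configuration file.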

To confirm the status of the replica set, run the following command:

				
					rs.status()

The output lists each member of the replica set along with its state. One member should report PRIMARY and the other two SECONDARY.
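
Because the full rs.status() output is long, it can help to reduce it to one line per member. The sketch below does this with the name, health and stateStr fields of each member; the sample document is hypothetical and heavily trimmed, and summarizeStatus is a helper written for this example.

```javascript
// Reduce an rs.status()-style document to one line per member.
function summarizeStatus(status) {
  return status.members.map((m) => {
    const health = m.health === 1 ? "up" : "down";
    return `${m.name} ${m.stateStr} ${health}`;
  });
}

// Hypothetical, heavily trimmed sample; real output has many more fields.
const sampleStatus = {
  set: "replica01",
  members: [
    { name: "mongo-master-node:27017", health: 1, stateStr: "PRIMARY" },
    { name: "mongo-node-1:27017", health: 1, stateStr: "SECONDARY" },
    { name: "mongo-node-2:27017", health: 1, stateStr: "SECONDARY" },
  ],
};
console.log(summarizeStatus(sampleStatus).join("\n"));
```

In mongosh, the function could be defined in the shell and called as summarizeStatus(rs.status()).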

Step 6: Confirm Replication on Secondary Nodes

To verify that the replica set is functional, we create a sample database on the primary node and add a record. Then we confirm whether the database has been replicated to the secondary nodes.

On the primary node, log in as the administrative user.

				
					# mongosh -u admin-user -p --authenticationDatabase admin

Create a sample database for testing if the replica set is working. Here, we are creating a database called my-database.

				
					use my-database

Next, create a collection and add a record. Here, we are creating a collection called students with details of a student.

				
					
db.students.insertOne({Name: "Alex", Age: 21, Residence: "London", Level: "Senior", Status: "Married"})

To confirm the presence of the newly created database, list the databases as shown.

				
					show dbs

Finally, head over to any of the secondary nodes and access the MongoDB shell as the administrative user, in the same way you did on the primary node.

				
					# mongosh -u admin-user -p --authenticationDatabase admin

Next, run the following command to allow read operations on this connection to the secondary node. By default, reads are directed to the primary.

				
					db.getMongo().setReadPref("primaryPreferred")

Then switch to the test database you created on the primary and list the documents in the collection.

				
					use my-database
db.students.find()

As you can observe, the database alongside its collection and records has been replicated.

This is confirmation that the replication is working as expected! Perfect!

Thank you for reading How Replication in MongoDB Works (Replica Sets).