MongoDB Cluster Architecture Components (Shard, Replica Sets, Mongos, Config)

MongoDB is an open source, non-relational database management system written in C++. It does not enforce a predefined structure on the data it stores. Instead, data is stored as JSON-style documents. As a NoSQL database, it is designed to store large amounts of data.

In MongoDB, the word “cluster” refers to either a replica set or a sharded cluster. In this article, we introduce what MongoDB is, how it works, and its main features. Then we look at its operating architecture and main components.

Shall we start with MongoDB Cluster Architecture Components (Shard, Replica Sets, Mongos, Config)?

What is MongoDB?

MongoDB is a document-oriented NoSQL database used to store large amounts of data. Instead of using tables and rows like traditional relational databases, MongoDB uses collections and documents. A document is made up of key-value pairs, which are the fundamental data units in MongoDB. A collection holds a set of documents and plays the same role as a table in a relational database.

Key Components of MongoDB Cluster Architecture

1. _id – Every document must have this field. The _id field holds a unique value and serves as the document’s primary key. If a document is inserted without an _id, MongoDB generates an ObjectId for it, a 12-byte value usually displayed as a 24-character hexadecimal string.

2. Collection – A collection is a group of MongoDB documents. Collections are similar to tables in relational database management systems (RDBMS) such as Oracle.

3. Database – A database is a container for collections, much like a database in an RDBMS. On the file system, each database has its own set of files.

4. Cursor – A cursor is a pointer to the result set of a query. Clients use cursors to iterate through the results.

5. Field – Name-value pairs in a document are called fields. A document usually contains more than one field. A field is similar to a column in a relational database.

6. Document – This is essentially a record in a MongoDB collection. It consists of field names and their values.
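
To make these terms concrete, here is a minimal sketch in plain Python that imitates them without a MongoDB server. The helper names (new_object_id, insert, find) and the customers data are assumptions made for illustration, not MongoDB or driver APIs. The generated identifier only mimics the ObjectId layout of a 4-byte timestamp, 5 random bytes, and a 3-byte counter.

```python
import os
import struct
import time
from itertools import count

# Counter starts at a random value, as in the ObjectId layout being imitated.
_counter = count(int.from_bytes(os.urandom(3), "big"))


def new_object_id() -> str:
    """Return a 24-character hex string laid out like an ObjectId (imitation only)."""
    timestamp = struct.pack(">I", int(time.time()))          # 4 bytes
    random_part = os.urandom(5)                              # 5 bytes
    counter = (next(_counter) % 0x1000000).to_bytes(3, "big")  # 3 bytes
    return (timestamp + random_part + counter).hex()


def insert(collection: list, document: dict) -> dict:
    """Store a copy of the document, adding an _id if the caller did not supply one."""
    stored = dict(document)
    stored.setdefault("_id", new_object_id())
    collection.append(stored)
    return stored


def find(collection: list, **criteria):
    """Yield matching documents one at a time, similar to iterating a cursor."""
    for document in collection:
        if all(document.get(field) == value for field, value in criteria.items()):
            yield document


# A hypothetical "customers" collection: documents are dicts, fields are keys.
customers = []
insert(customers, {"name": "Ada", "city": "London"})
insert(customers, {"name": "Linus", "city": "Helsinki", "email": "linus@example.com"})

cursor = find(customers, city="London")
first = next(cursor)
print(first["name"], len(first["_id"]))
```

Note that the two documents have different fields, which a list of dicts allows just as a collection does.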

Features of MongoDB

Document oriented

Unlike a relational DBMS, MongoDB stores all data in documents rather than tables. Because these documents store data in fields (key-value pairs) rather than rows and columns, they allow much more flexibility in working with data than relational database management systems. Each document also has a unique object ID.

Scalability

Sharding refers to distributing data across multiple servers. A large data set is split into chunks according to a shard key, and these chunks are distributed evenly across multiple physical servers. Capacity can be increased by adding more shards to an existing cluster.
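
As a rough illustration of how a shard key value maps to a chunk and then to a shard, the sketch below uses a sorted list of chunk boundaries and a binary search. The boundaries, shard names, and the shard_for function are hypothetical, and the sketch assumes non-negative integer keys. Real chunk ranges span from a minimum key to a maximum key.

```python
from bisect import bisect_right

# Hypothetical chunk table: chunk i covers [LOWER_BOUNDS[i], LOWER_BOUNDS[i + 1]).
# The last chunk is open-ended.
LOWER_BOUNDS = [0, 1000, 2000, 3000]
CHUNK_OWNERS = ["shardA", "shardB", "shardC", "shardA"]


def shard_for(shard_key_value: int) -> str:
    """Return the shard that owns the chunk containing the given shard key value."""
    if shard_key_value < LOWER_BOUNDS[0]:
        raise ValueError("key is below the first chunk boundary in this sketch")
    # Index of the last lower bound that is <= the key.
    chunk_index = bisect_right(LOWER_BOUNDS, shard_key_value) - 1
    return CHUNK_OWNERS[chunk_index]


for key in (0, 999, 1000, 2500, 99999):
    print(key, shard_for(key))
```

Because only the boundaries are stored, finding the owning shard takes a logarithmic number of comparisons no matter how many documents each chunk holds.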

Aggregation

Aggregation operates on grouped data and returns a single or computed result. This is similar to the GROUP BY clause in SQL. Three types of aggregation are available: aggregation pipelines, map-reduce functions, and single-purpose aggregation methods.
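
The following sketch shows what a simple pipeline might look like and what it computes. The pipeline list uses the $match, $group, and $sum operators, but it is only shown as data here and is not sent to a server. The orders data and the total_paid_per_customer function are assumptions that reproduce the same grouping in plain Python.

```python
from collections import defaultdict

# Hypothetical order documents.
orders = [
    {"customer": "ada", "status": "paid", "amount": 30},
    {"customer": "ada", "status": "paid", "amount": 20},
    {"customer": "linus", "status": "paid", "amount": 15},
    {"customer": "linus", "status": "open", "amount": 99},
]

# The pipeline as it could be written for MongoDB (shown for comparison only).
pipeline = [
    {"$match": {"status": "paid"}},
    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
]


def total_paid_per_customer(documents):
    """Plain-Python equivalent of the pipeline above: filter, then group and sum."""
    totals = defaultdict(int)
    for doc in documents:
        if doc["status"] == "paid":                   # the $match stage
            totals[doc["customer"]] += doc["amount"]  # the $group stage with $sum
    # Sorted here only to make the output order predictable.
    return [{"_id": name, "total": total} for name, total in sorted(totals.items())]


result = total_paid_per_customer(orders)
print(result)
```

In SQL terms this corresponds to selecting the sum of amount, filtered by status and grouped by customer.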

Schema less database

This means that documents of different types can be stored in the same collection. That is, a single collection in a MongoDB database can contain multiple documents, each with different fields, content, and size.

Unlike rows in a relational table, one document does not have to resemble another. This gives MongoDB databases a great deal of flexibility.

Indexing

Any field in a document can be indexed, using the default index on _id and secondary indexes on other fields. This reduces the time it takes to find data in the database: the database engine uses an index to locate matching documents rather than examining every document one by one. Indexing is therefore one of MongoDB’s most useful features for reducing query time.
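
The difference between scanning a collection and using an index can be sketched as follows. This is a simplification under stated assumptions: the index here is a Python dictionary, whereas MongoDB indexes are B-tree structures, and the function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_index(collection, field):
    """Map each value of `field` to the positions of the documents that contain it."""
    index = defaultdict(list)
    for position, doc in enumerate(collection):
        if field in doc:
            index[doc[field]].append(position)
    return index


def find_by_scan(collection, field, value):
    """Without an index: examine every document one by one."""
    return [doc for doc in collection if doc.get(field) == value]


def find_by_index(collection, index, value):
    """With an index: jump straight to the matching positions."""
    return [collection[position] for position in index.get(value, [])]


users = [
    {"_id": 1, "email": "ada@example.com"},
    {"_id": 2, "email": "linus@example.com"},
    {"_id": 3, "email": "grace@example.com"},
]
email_index = build_index(users, "email")  # a secondary index on "email"

print(find_by_index(users, email_index, "linus@example.com"))
```

Both lookups return the same documents. The indexed version simply avoids touching documents that cannot match.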

Load balancing

Load balancing is fundamental to database management in large organizations, where client requests number in the thousands or millions. This traffic should be well distributed across different servers to maximize performance and reduce congestion. MongoDB handles read and write requests by spreading the load across multiple servers while keeping data consistent. In many deployments, this means you do not need an additional load balancer.

Basic Architecture of the Sharded Cluster

Shard

Mongos

  • Routing, distributing and merging requests.
  • Deploy multiple mongos to ensure high availability.
  • Access to sharded cluster instances.

Config Server

  • Deployed as a replica set to ensure high availability.
  • Stores metadata and cluster configuration.

MongoDB Clusters

1. Shard

Each database in a sharded cluster has its own primary shard, which holds all unsharded collections for that database. The primary shard is unrelated to the primary member of a replica set.

The mongos router chooses the primary shard when creating a new database, picking the shard with the least amount of data in the cluster. In addition, mongos uses the totalSize field returned by the listDatabases command as part of its selection criteria.
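
The selection rule described above can be sketched in a few lines. This is a simplified illustration rather than the actual mongos logic: it assumes the per-shard totalSize values have already been collected into a dictionary, and the function name, the shard names, and the name-based tie-break are hypothetical.

```python
def pick_primary_shard(total_size_by_shard: dict) -> str:
    """Return the shard holding the least data; ties are broken by shard name."""
    if not total_size_by_shard:
        raise ValueError("the cluster has no shards")
    name, _size = min(total_size_by_shard.items(), key=lambda item: (item[1], item[0]))
    return name


# Hypothetical totalSize values, in bytes, for three shards.
sizes = {"shardA": 500_000, "shardB": 120_000, "shardC": 300_000}
print(pick_primary_shard(sizes))
```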

To change the primary shard of a database, use the movePrimary command. Migrating the primary shard can take a significant amount of time, and you should not access the database and its collections until the migration is complete. Depending on the amount of data being migrated, the migration can affect the overall operation of the cluster. Before changing the primary shard, consider the impact on cluster activity and network load.

When you deploy a new sharded cluster using shards that previously served as replica sets, all existing databases remain on their original replica sets. Databases created later can reside on any shard in the cluster.

2. Mongos

Mongos instances route queries and write operations to the shards in a sharded cluster. From the application’s perspective, mongos provides the only interface to the sharded cluster. Applications do not connect to or interact with the shards directly.

Mongos keeps track of what data is on which shard by caching metadata from the config servers. It uses this metadata to route operations from applications and clients to the mongod instances. Mongos instances keep no persistent state and use minimal system resources.

The most common practice is to run mongos instances on the same machines as the application servers, but you can also run mongos instances on the shards or on other dedicated resources.
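
The routing role described in this section can be sketched as a small class. This is an illustration under assumptions, not the mongos implementation: the Router class, its methods, the user_id shard key, and the shard names are all hypothetical. It shows the two cases a router has to handle: a query that includes the shard key goes to a single shard, while a query on any other field is sent to every shard and the results are merged.

```python
from bisect import bisect_right


class Router:
    """Toy router: shard key is the non-negative integer field "user_id"."""

    def __init__(self, lower_bounds, owners, shards):
        self.lower_bounds = lower_bounds  # sorted chunk lower bounds
        self.owners = owners              # owners[i] owns the chunk starting at lower_bounds[i]
        self.shards = shards              # shard name -> list of documents
        self.contacted = []               # shards touched by the most recent query

    def _owner(self, user_id):
        if user_id < self.lower_bounds[0]:
            raise ValueError("key is below the first chunk boundary in this sketch")
        return self.owners[bisect_right(self.lower_bounds, user_id) - 1]

    def insert(self, doc):
        self.shards[self._owner(doc["user_id"])].append(doc)

    def find_by_key(self, user_id):
        """Targeted query: only the shard that owns the key is contacted."""
        shard = self._owner(user_id)
        self.contacted = [shard]
        return [d for d in self.shards[shard] if d["user_id"] == user_id]

    def find_by_field(self, field, value):
        """Broadcast query: every shard is contacted and the results are merged."""
        self.contacted = sorted(self.shards)
        merged = []
        for name in self.contacted:
            merged.extend(d for d in self.shards[name] if d.get(field) == value)
        return sorted(merged, key=lambda d: d["user_id"])


router = Router([0, 100], ["shardA", "shardB"], {"shardA": [], "shardB": []})
for doc in ({"user_id": 7, "city": "Oslo"},
            {"user_id": 150, "city": "Oslo"},
            {"user_id": 42, "city": "Rome"}):
    router.insert(doc)

print(router.find_by_key(150), router.contacted)
print(router.find_by_field("city", "Oslo"), router.contacted)
```

The application only ever talks to the router object, which mirrors how applications talk to mongos rather than to individual shards.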

3. Config servers

The config servers store the metadata for a sharded cluster. This metadata reflects the state and organization of all data and components in the cluster. It includes the list of chunks on each shard and the ranges that define those chunks.

The mongos instances cache this data and use it to route reads and writes to the correct shards. Mongos updates its cache when the cluster’s metadata changes, such as when a chunk is split or a shard is added. Shards also read chunk metadata from the config servers.
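
The caching behaviour can be pictured with the sketch below. The version counter is a hypothetical stand-in for MongoDB’s own metadata versioning, and the class names, chunk names, and the move_chunk method are assumptions made for illustration. The point is only that the router works from a local copy and reloads it when the metadata has changed.

```python
class ConfigServer:
    """Holds the authoritative chunk-to-shard mapping (toy version)."""

    def __init__(self, chunk_owners):
        self.version = 1
        self.chunk_owners = dict(chunk_owners)

    def move_chunk(self, chunk, new_shard):
        # Any metadata change bumps the version so stale caches can be detected.
        self.chunk_owners[chunk] = new_shard
        self.version += 1


class CachingRouter:
    """Routes from a cached copy of the metadata, refreshing it when stale."""

    def __init__(self, config):
        self.config = config
        self.cached_version = 0
        self.cached_owners = {}
        self.refreshes = 0

    def owner_of(self, chunk):
        if self.cached_version != self.config.version:
            self.cached_owners = dict(self.config.chunk_owners)
            self.cached_version = self.config.version
            self.refreshes += 1
        return self.cached_owners[chunk]


config = ConfigServer({"chunk-1": "shardA", "chunk-2": "shardB"})
router = CachingRouter(config)

before = router.owner_of("chunk-1")   # first call loads the cache
again = router.owner_of("chunk-2")    # served from the cache
config.move_chunk("chunk-1", "shardC")
after = router.owner_of("chunk-1")    # cache is stale, so it is reloaded
print(before, again, after, router.refreshes)
```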

The config servers also store authentication configuration information, such as role-based access control or internal authentication settings for the cluster. MongoDB also uses the config servers to manage distributed locks. Each sharded cluster must have its own config servers. Do not use the same config servers for different sharded clusters.

4. Replica sets

A replica set is a group of mongod instances that maintain the same data set. A replica set contains several data-bearing nodes and optionally one arbiter node. Of the data-bearing nodes, only one is the primary, and the rest are secondaries. The primary receives all write operations.

A replica set can have only one primary that accepts writes. In some circumstances, another mongod instance may temporarily believe that it is also primary. The primary records all changes to its data set in its operation log, or oplog.

The secondaries then copy the primary’s oplog and apply the operations to their own data sets, so that each secondary’s data set mirrors the primary’s.
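
The oplog mechanism can be sketched as follows. This is a toy model under assumptions: the Primary and Secondary classes, the tuple format of the log entries, and the sync_from method are hypothetical, and real replication also deals with elections, timestamps, and network failures, which are left out here.

```python
class Primary:
    """Accepts writes and records each one in an ordered operation log."""

    def __init__(self):
        self.data = {}
        self.oplog = []

    def set(self, key, value):
        self.data[key] = value
        self.oplog.append(("set", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.oplog.append(("delete", key, None))


class Secondary:
    """Copies oplog entries it has not seen yet and replays them in order."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of oplog entries already applied

    def sync_from(self, primary):
        for operation, key, value in primary.oplog[self.applied:]:
            if operation == "set":
                self.data[key] = value
            else:
                self.data.pop(key, None)
        self.applied = len(primary.oplog)


primary = Primary()
secondary = Secondary()

primary.set("a", 1)
primary.set("b", 2)
primary.delete("a")
lagging = dict(secondary.data)  # the secondary has not synced yet

secondary.sync_from(primary)
print(lagging, secondary.data, primary.data)
```

Because the secondary remembers how many entries it has applied, calling sync_from again only replays operations written since the last sync.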

In some cases (for example, if you have a primary and a secondary, but cost constraints prevent you from adding another secondary), you may choose to add a mongod instance to the replica set as an arbiter. An arbiter votes in elections but does not hold data.

Thank you for reading MongoDB Cluster Architecture Components (Shard, Replica Sets, Mongos, Config). We shall conclude. 

MongoDB Cluster Architecture Components (Shard, Replica Sets, Mongos, Config) Conclusion

MongoDB is a general purpose database that provides many benefits to the application development process. Its scalability and flexible schema help you build more future-proof applications. It provides a good developer experience, with drivers for most major programming languages and a large user community. It is also available from all major cloud providers.

In this article, we have learned about MongoDB clusters: what MongoDB is, the key elements of its data model, and the components of its cluster architecture, namely shards, replica sets, mongos, and config servers.

Kamil Wisniowski

I love technology. I have been working with Cloud and Security technology for 5 years. I love writing about new IT tools.
