Sharding is the process of storing data records across multiple machines and it is MongoDB's approach to meeting the demands of data growth. As the size of the data increases, a single machine may not be sufficient to store the data nor provide an acceptable read and write throughput. Sharding solves the problem with horizontal scaling. With sharding, you add more                                                       machines to support data growth and the demands of                                               read and write operations.


Why Sharding?

  • Latency sensitive queries still go to master
  • Memory can't be large enough when active dataset is big
  • Local disk is not big enough
  • Vertical scaling is too expensive

The following diagram shows the Sharding in MongoDB using sharded cluster.


ShardsShards are used to store data.

Config Servers - Config servers store the cluster's metadata.

Query Routers - Query routers are basically mongo instances, interface with client applications and direct operations to the appropriate shard. The query router processes and targets the operations to shards and then returns results to the clients.