We have some open questions we were hoping others might help us with.
What size of VM would be needed for each type of shard machine configuration? Do we need 3 GB of RAM, 14 GB, and so on? Is there any documentation that explains how each shard machine operates, something that would indicate what size of virtual machine we should test with?
Is it advantageous to mix Windows and Linux machines to accomplish sharding, or should we go with all Linux or all Windows?
It might be helpful to look at Production Cluster Architecture from the MongoDB Manual. A sharded cluster has three types of servers: mongos, config servers, and shards. The mongos are lightweight query routing processes. We generally advise colocating them with application servers because they do not require a lot of resources and colocation cuts out one network hop.

The config servers, of which there must be exactly 3, are special mongod instances that hold cluster metadata. This makes them lighter weight than the shards, which hold the actual data, but more resource-intensive than the mongos processes.

The shards, each of which should be a replica set, hold the data. The key thing is to size the shard machines so that the working set of your data fits in memory. You might need to test with realistic workloads to understand the general size of the working set, or you can estimate it and divide by the number of shards to get the approximate working set size per shard, assuming a good shard key that distributes documents and work evenly.
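To make the "estimate it and divide by the number of shards" step concrete, here is a rough sketch in Python. The function names, the 25% headroom factor, and the example numbers are my own assumptions for illustration, not MongoDB recommendations, and the even split only holds if the shard key distributes data and load well.

```python
import math


def per_shard_ram_gb(total_working_set_gb, num_shards, headroom=1.25):
    """Approximate RAM per shard so its slice of the working set fits in memory.

    Assumes the shard key spreads the working set evenly across shards.
    `headroom` is a hypothetical safety factor for the OS, connections,
    and growth; tune it from your own load tests.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return (total_working_set_gb / num_shards) * headroom


def shards_needed(total_working_set_gb, ram_per_shard_gb, headroom=1.25):
    """Smallest shard count whose combined usable RAM covers the working set."""
    usable_gb = ram_per_shard_gb / headroom
    return math.ceil(total_working_set_gb / usable_gb)


if __name__ == "__main__":
    # Illustrative numbers only: a 120 GB working set over 4 shards.
    print(per_shard_ram_gb(120, 4))
    # How many shards of 16 GB machines would the same working set need?
    print(shards_needed(120, 16))
```

Treat the result as a starting point for choosing test VMs. Measuring the working set under a realistic workload is still the more reliable way to confirm it.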
You can mix Windows and Linux machines in the same cluster, but it's not advantageous. Linux and Windows have different performance characteristics and configurations - running them together introduces additional complications with no benefits. I would use all Linux machines.
If I have to run config servers separately, are there any guidelines on their hardware requirements? In particular, I am looking for the disk and RAM requirements.
What is the recommended setup for deploying the config servers (configdb), considering that they are "resource-intensive"?
As Will said, the config servers are more resource-intensive than the mongos daemons, but they aren't terribly resource-intensive. You can run a config server on a smaller machine than the ones you use for the shards.
Thanks. Is there any requirement that the config database working set + indexes should entirely be in RAM? I guess it will always be preferred :). What are the consequences if it is not the case?
I am trying to size up machines where I will run my config servers. We are required to run them on separate machines. Hence my question.
Will could probably give a much better answer, but I would guess that, since all the config server holds is metadata about the cluster's state and configuration, the working set shouldn't be very big. A few gigs should cover it, I believe. It depends, of course, on how big your cluster is.
Here is some more info on what is being stored on the config server.