This caught us off guard when testing 2.6, because when we run a 2.6 binary against our existing data files, the index is there and is still used. Only when we create a new database and attempt to index its namespace collection do we run into problems. Further, if a 2.6 secondary is syncing to a 2.4 primary, an index build on system.namespaces on the primary is happily picked up and built on the secondary without error. Given this, the restriction seems arbitrary. I tested one of our sample workloads against 2.6.5 where all of the existing indexes on system.namespaces were removed, and saw an 11% performance degradation. Is indexing system.namespaces harmful in some way? Are there any workarounds for doing the kind of lookups we're doing now?
You shouldn't be querying against a system collection, since it's an implementation detail. It will not be available with other storage engines anyway.
Why do you need to query to see if a collection exists? You can just run a count against it - if it's not there you'll get 0...
Didn't know that system.namespaces might be unavailable in other storage engines. That's definitely good to know. :)
Turns out our biggest use cases are actually checking if indexes exist already, and also checking the total # of indexes on a collection. Some parts of our application logic check if indexes already exist before building them. We also need to know the total # of indexes so we don't exceed the max indexes per collection. We use caching to avoid repeatedly querying for this information, but it's still important that it remains fast when we do go to the db. Without an index on system.namespaces we are left without an efficient way to check if an index already exists.
We don't want to make unnecessary calls to ensureIndex either. From what we've been able to tell, mongo has to do a collection scan when determining if an index already exists: https://github.com/mongodb/mongo/blob/r2.6.5/src/mongo/db/catalog/index_catalog.cpp#L1044-L1052
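For illustration, here is a rough sketch of the kind of cached check described above, done without touching system.namespaces directly. It assumes a PyMongo-style collection object that exposes `full_name` and `index_information()`; the `IndexCache` class and its method names are made up for this example and are not our actual code. The limit of 64 indexes per collection is the documented MongoDB limit as far as I know, so verify it for your version.

```python
# Hypothetical helper: caches index key patterns per collection so that
# "does this index exist?" and "how many indexes are there?" only hit the
# database on a cache miss.

MAX_INDEXES_PER_COLLECTION = 64  # assumed limit; check the docs for your version


class IndexCache:
    def __init__(self):
        # Maps "db.collection" -> set of key patterns, each a tuple of
        # (field, direction) pairs.
        self._by_collection = {}

    def _load(self, coll):
        # index_information() returns a dict shaped like
        # {index_name: {"key": [(field, direction), ...], ...}}
        info = coll.index_information()
        patterns = {
            tuple((field, direction) for field, direction in spec["key"])
            for spec in info.values()
        }
        self._by_collection[coll.full_name] = patterns
        return patterns

    def key_patterns(self, coll):
        cached = self._by_collection.get(coll.full_name)
        return cached if cached is not None else self._load(coll)

    def has_index(self, coll, key_pattern):
        # key_pattern is a list of (field, direction) pairs, e.g. [("email", 1)]
        wanted = tuple((field, direction) for field, direction in key_pattern)
        return wanted in self.key_patterns(coll)

    def can_add_index(self, coll):
        return len(self.key_patterns(coll)) < MAX_INDEXES_PER_COLLECTION

    def invalidate(self, coll):
        # Call this after building or dropping an index on the collection.
        self._by_collection.pop(coll.full_name, None)
```

With something like this, application code would call `has_index(coll, [("email", 1)])` before deciding to build an index, and `invalidate(coll)` afterwards. The cost of a cache miss still depends on how the server answers the index listing, which is the part this thread is about.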
There might be ways around this - remind me again, how big are your larger system.namespaces collections (i.e. how many indexes and collections across a single database)?
Our largest system.namespaces collection is around 32000.