SETUP
SERVER: Windows2012
VERSION: 2.6.4
DRIVER: C#
NUMBER OF UPDATED OBJECTS: 8000
AVERAGE TIME: 20 seconds
DURING BUG TIME: 5 minutes!!
BACKGROUND
Every 5 minutes or so we have a process that kicks off an 8k-object bulk upsert operation. We have noticed that after some time, the average processing time goes from 20 seconds to over 5 minutes for no apparent reason.
FACTS
In the logs, there don't appear to be any errors.
Once the spike occurs, every subsequent batch upsert takes just as long.
The dataset is nearly identical every time.
I checked the index used in the update and it's a proper b-tree before and after the spike.
Restarting mongod.exe fixes the issue.
Server resources are not capped. 63/96 GB available.
Additionally, the logs go from the typical "query too big to record" to showing each individual insert. This has to be related to, or an indicator of, what is going on.
Another fact (wish I could edit the original post, btw):
If I step down the primary, the new primary runs fine (20 second bulk times). When I go back to the original primary, the issue still exists.
AHA!
I cleared the cached query plans and performance returned to normal.
db.<collection>.getPlanCache().clear()
and.... the problem returns.
At this point we have to script clearing the query plan cache before each bulk write. Ugh.
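For what it's worth, here is a rough mongo-shell sketch of that workaround: clear the collection's plan cache, then run the batch and time it. The function name, `collectionName`, and `runBatch` are made up for illustration; only `getCollection()`, `getPlanCache()`, and `clear()` are real shell calls. It assumes the batch can be kicked off from a shell-style callback, which is not how our C# process actually works.

```javascript
// Sketch only: `db` is the mongo shell's database handle; `collectionName`
// and `runBatch` are hypothetical names introduced for this example.
function clearPlanCacheThenRun(db, collectionName, runBatch) {
    // Drop every cached query plan for this collection
    // (same call as db.<collection>.getPlanCache().clear()).
    db.getCollection(collectionName).getPlanCache().clear();

    // Time the batch so a recurrence of the slowdown is easy to spot.
    var start = Date.now();
    var result = runBatch();
    return { elapsedMs: Date.now() - start, result: result };
}
```

From the C# side the equivalent would presumably be sending the `planCacheClear` database command for the collection before each bulk upsert, but I have not shown that here.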
You might consider upgrading to 2.6.5 (well, 2.6.6 is now out).
There were some index-related bugs fixed in 2.6.5.