Saturday, December 27, 2014

MongoDB response slows down drastically after 200,000 records

Our current task is to fetch 1 million records from an external server, process them, and save them in the database. We are using Node.js to fetch the records and MongoDB as the database.
We decided to split the work into two tasks: fetching the records and processing them. We can now fetch all the records and dump them into Mongo, but when we try to process them (by processing I mean changing a few attribute values, doing some simple calculations, and updating those attributes), MongoDB update responses become drastically slower at around 200,000 records.
To process the data, we take batches of 1,000 records, process them, update the records individually, and then move on to the next batch. How can the performance be improved?
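For reference, here is a minimal sketch of the kind of batch-and-update loop described above, assuming a current version of the official mongodb Node.js driver. The connection string, database and collection names, and the price/priceWithTax fields are placeholders, not details from the question.

    // Sketch of the described workflow: read in batches of 1000, do a simple
    // calculation, and update each record individually.
    const { MongoClient } = require('mongodb');

    const BATCH_SIZE = 1000;

    async function processAll() {
      const client = await MongoClient.connect('mongodb://localhost:27017');
      const records = client.db('mydb').collection('records');

      // Stream the collection in batches instead of loading everything at once.
      const cursor = records.find({ processed: { $ne: true } }).batchSize(BATCH_SIZE);

      for await (const doc of cursor) {
        // "Processing": change a few attribute values and do a simple calculation.
        const priceWithTax = doc.price * 1.2;

        // One round trip per record -- this is the part that gets expensive
        // once the collection grows past a few hundred thousand documents.
        await records.updateOne(
          { _id: doc._id },
          { $set: { priceWithTax, processed: true } }
        );
      }

      await client.close();
    }

    processAll().catch(console.error);

A common first optimization here is to collect each batch of 1,000 updates into a single bulkWrite call (or use updateMany when every document gets the same change), which turns 1,000 round trips into one.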



I am no expert, but from what I've learned, when Mongo starts to slow down, it is usually a sign of a lack of RAM. 
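One quick way to sanity-check that is to compare the server's resident memory against the size of the data and indexes being touched. A sketch using the same Node.js driver, with a placeholder database and collection name:

    // Rough memory check: resident/virtual memory vs. data and index size.
    const { MongoClient } = require('mongodb');

    async function memoryCheck() {
      const client = await MongoClient.connect('mongodb://localhost:27017');
      const db = client.db('mydb');

      const status = await db.command({ serverStatus: 1 });
      console.log('resident MB:', status.mem.resident, 'virtual MB:', status.mem.virtual);

      const stats = await db.command({ collStats: 'records' });
      console.log('data MB:', Math.round(stats.size / 1048576),
                  'index MB:', Math.round(stats.totalIndexSize / 1048576));

      await client.close();
    }

    memoryCheck().catch(console.error);

If the data plus indexes you are actively updating are much larger than the resident memory, the slowdown around 200,000 records is consistent with the working set no longer fitting in RAM.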



Splitting into fetching and processing the way you are is probably slowing you down, since you have to store each record twice.  What sort of processing is it?  Can you process the records before saving them in the db?
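If the calculation only needs the record itself, one option along those lines is to transform each record as it arrives from the external server and insert the finished documents in bulk, so each one is written only once. A sketch, where fetchBatchFromExternalServer is a hypothetical stand-in for however the external records are retrieved:

    // Process each record before it ever hits the database, then insert the
    // finished documents in bulk.
    const { MongoClient } = require('mongodb');

    async function importAll(fetchBatchFromExternalServer) {
      const client = await MongoClient.connect('mongodb://localhost:27017');
      const records = client.db('mydb').collection('records');

      let batch;
      while ((batch = await fetchBatchFromExternalServer()).length > 0) {
        const processed = batch.map((doc) => ({
          ...doc,
          priceWithTax: doc.price * 1.2,   // the "simple calculation" placeholder
          processed: true,
        }));
        // One bulk insert per batch instead of insert-then-update per record.
        await records.insertMany(processed, { ordered: false });
      }

      await client.close();
    }

This removes the second pass over the data entirely, at the cost of coupling the fetch and the processing steps back together.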

