2014년 12월 4일 목요일

How to rename all fields in a collection

I have a case where I have a database about 200GB with around 30 collections. I found out recently that shortening field names allows me to save on disk space and RAM. I did an experiment with a test db with single collection of 10M documents and found savings upto 50%. So I want to go ahead and rename fields in my production db. My question is what choices do I have considering following.
- Any option with mongodump and mongorestore where I can rename fields on the fly. I searched around for this but didn't find anything.
- $rename operator: I can do db.collection.update() for one field at a time. Can I rename multiple fields in the same update command? Which one is more efficient?
- what happens to existing indexes? I guess I need to drop all of them first and then recreate them based on new field names.
- what happens when the database/collection is sharded?



Check this blog post out.


It is strange the MongoDB documentation, where $update is referenced, doesn't mention anything about the "multi" or "upsert" options. It is only mentioned in the part about CRUD



Thanks for the link but I know about the approach mentioned in the blog post. I was wondering if there is any other way (more efficient)? 



Each document would need to be touched on, so I can't imagine anything else being faster than the update with the multi:true option. It is the name change in all documents in one command, which you asked for.



Thanks. Is there a way to rename multiple fields in the same update command? I couldn't find a way. 
I want to rename several fields which are present in all documents of a collection. 



Below is the  update command to change multiple fields :-

db.test.update({},{$rename:{"uname":"name","uid":"id"}},false,true)

Fields are comma separated in rename block.

Hope that solves your problem



You only have to run a single rename update since $rename is just like
any other operator, you use the syntax
db.collection.update({},{$rename:{old:"new",old2:"new2"}},{multi:true})

But I'm curious, how did you measure the space saving?   Did your test
collection have the same indexes as your actual production?   50%
savings of total size is pretty huge - unless you had *really* long
field names and very few indexes, I wouldn't expect the space savings
to be anywhere near that large.



> It is strange the MongoDB documentation, where $update is referenced,
> doesn't mention anything about the "multi" or "upsert" options. It is only
> mentioned in the part about CRUD.
You linked to the page that lists update operators.   Neither "multi"
nor "upsert" are update operators.



I will try out rename with all fields. Thanks.

I have yet to run the rename on my production database. But just as a test (which is not representative of my production database), I did a simple database which just has 10M following documents.
{
    "_id" : {
        "f" : ObjectId("547cac618fe940ef24602c1c"),
        "a" : ObjectId("547cac618fe940ef24602c1d")
    }
}

After using long field names, it has 10M of following type of documents.
{
    "_id" : {
        "fileForFeature" : ObjectId("547cb51c38cbb3e4d0cdedb2"),
        "attributeForFile" : ObjectId("547cb51c38cbb3e4d0cdedb3")
    }
}

Here's my output of running stats(1024) on the collection and you can see all the improvements.
With short names:
mongos> db.test.stats(1024)
{
    "sharded" : false,
    "primary" : "shard0001",
    "ns" : "fftest1.test",
    "count" : 10000000,
    "size" : 468750,
    "avgObjSize" : 48,
    "storageSize" : 837344,
    "numExtents" : 17,
    "nindexes" : 1,
    "lastExtentSize" : 222464,
    "paddingFactor" : 1,
    "systemFlags" : 1,
    "userFlags" : 1,
    "totalIndexSize" : 643915,
    "indexSizes" : {
        "_id_" : 643915
    },
    "ok" : 1
}

With long names:
mongos> db.test.stats(1024)
{
    "sharded" : false,
    "primary" : "shard0001",
    "ns" : "fftest2.test",
    "count" : 10000000,
    "size" : 1093750,
    "avgObjSize" : 112,
    "storageSize" : 1543116,
    "numExtents" : 19,
    "nindexes" : 1,
    "lastExtentSize" : 405444,
    "paddingFactor" : 1,
    "systemFlags" : 1,
    "userFlags" : 1,
    "totalIndexSize" : 962739,
    "indexSizes" : {
        "_id_" : 962739
    },
    "ok" : 1



You linked to the page that lists update operators.   Neither "multi"
nor "upsert" are update operators. 

Oh Duh! 

Found what I should have found in the reference. Thanks Asya.



댓글 없음:

댓글 쓰기