Friday, January 9, 2015

Parsing error when converting a BsonDocument to a JsonReader

I'm working with the MongoDB C# Driver and Json.NET.

Both were installed from NuGet.

I wrote a deserializer extension method. It is as follows:

public static T ConvertTo<T>(this JsonSerializer serializer, BsonDocument bson)
{
    // Render the BsonDocument as JSON text, parse it into a JToken,
    // and let Json.NET deserialize that token tree into T.
    using (var jsonReader = new JTokenReader(JToken.Parse(bson.ToJson())))
    {
        jsonReader.DateTimeZoneHandling = DateTimeZoneHandling.Local;
        var result = serializer.Deserialize<T>(jsonReader);
        return result;
    }
}

But a parsing error occurs when converting the BsonDocument to a JsonReader:

"_id" : ObjectId("54ab8b06d05efe1e5063941f") => String
"date" : ISODate("2015-01-06T07:13:09.993Z") => DateTime

ObjectId and ISODate types are not converted.

I would like to know a solution for this.
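
For what it's worth, one common workaround (my suggestion, not something from this thread) is to have the driver emit strict JSON rather than the shell syntax (ObjectId(...), ISODate(...)) that JToken.Parse rejects. A minimal sketch, assuming the 1.x driver's JsonWriterSettings:

using MongoDB.Bson;
using MongoDB.Bson.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class BsonSerializerExtensions
{
    public static T ConvertTo<T>(this JsonSerializer serializer, BsonDocument bson)
    {
        // Strict mode emits {"$oid": "..."} and {"$date": ...} instead of
        // ObjectId(...) and ISODate(...), so JToken.Parse accepts the output.
        var settings = new JsonWriterSettings { OutputMode = JsonOutputMode.Strict };
        using (var jsonReader = new JTokenReader(JToken.Parse(bson.ToJson(settings))))
        {
            jsonReader.DateTimeZoneHandling = DateTimeZoneHandling.Local;
            return serializer.Deserialize<T>(jsonReader);
        }
    }
}

Note that strict mode wraps these values in $oid/$date objects, so custom JsonConverters may still be needed to map them onto plain string and DateTime properties.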


How to measure the performance of a MongoDB database?

Could anyone kindly help me understand the performance metrics for MongoDB? I am unable to measure them. Please give your feedback if you have any ideas regarding this.



It would help to know what you're trying to measure and why you think you can't.

MongoDB ships with some tools (mongostat and mongotop) for viewing real-time metrics. You can use MMS monitoring to collect and view historical data.


Wednesday, January 7, 2015

passing lambda expression to collection.Find(...)

I'm following the instructions on the GitHub page and am using what I understand to be the latest version (1.9.2.235):


var list = await collection.Find(x => x.Name == "Jack").ToListAsync();

This does not compile for me. The only overloads of Find I can see take an IMongoQuery. Are the docs out of date, or is something else wrong?



The GitHub page is referring to what is in master. To use this code, you'll need to build the driver from master or pull it from our build feed.

However, the 1.x branch (which includes 1.9.2.235) has a quickstart for it as well: https://github.com/mongodb/mongo-csharp-driver/tree/v1.x.
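
For reference, this is roughly what the equivalent query looks like on the 1.x line, where Find takes an IMongoQuery built with the Query builders (a minimal sketch; the Person class and collection variable are assumed):

using MongoDB.Driver;
using MongoDB.Driver.Builders;

// 1.x style: build an IMongoQuery rather than passing a lambda to Find.
var query = Query<Person>.EQ(x => x.Name, "Jack");
var list = collection.Find(query).ToList();

// LINQ is also available in 1.x:
// var list2 = collection.AsQueryable().Where(x => x.Name == "Jack").ToList();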



Thank you. So to confirm: the code in master (and the docs) is *ahead* of the 1.9.2 (NuGet) build? Do you typically update NuGet after a stable version is published?



The code in master is ahead of 1.9.2. You can find a tag for 1.9.2 which gives the exact source snapshot of it.

There are some RCs that are also published, 1.10.0-rc1 for example. The code in master is ahead of those as well. You can get master build feeds here: https://www.myget.org/gallery/mongodb. Any 2.0.0 build is from master.


duplicate key error index when doing an upsert

I have some (C#) code that upserts a bunch of documents as follows:

public void UpsertByProductID(IEnumerable<Product> documents)
{
    if (documents.Any())
    {
        var query = Query.In("Attributes.Product ID", documents.Select(x => BsonValue.Create(x["Product ID"])));
        productsCollection.Remove(query, RemoveFlags.None, WriteConcern.Acknowledged);
        productsCollection.InsertBatch(documents);
    }
}

And I'm regularly getting the following error:

WriteConcern detected an error ''. (Response was { "err" : "E11000 duplicate key error index: skubrain.Product.$Attributes.Product ID_1  dup key: { : null }", "code" : 11000, "n" : 0, "connectionId" : 11, "ok" : 1.0 }).

There is (obviously) a unique index defined on the Attributes.Product ID field. However, as far as I can tell from the above logic, it shouldn't be possible for this code to cause this write concern error: it specifically deletes any documents matching the Attributes.Product ID of the documents being inserted before performing the insert.

Any ideas?




The duplicate key error indicates that the issue is with "null" values for "Attributes.Product ID".

If a document doesn't have a value for the field in a unique index, the index entry will be null unless the index is both unique and sparse:

In a non-sparse unique index, that means that only one document in the collection can have a null/missing value.

I would check that your documents either always include the "Attributes.Product ID" field, or drop and recreate the unique index as unique & sparse if the "Product ID" is not always present.
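
If the index should indeed be sparse as well as unique, a hedged sketch with the 1.x C# driver (reusing productsCollection from the original post) might look like this:

using MongoDB.Driver.Builders;

// Drop the existing index and recreate it as unique + sparse so documents
// missing "Attributes.Product ID" don't collide on a shared null entry.
productsCollection.DropIndex(IndexKeys.Ascending("Attributes.Product ID"));
productsCollection.CreateIndex(
    IndexKeys.Ascending("Attributes.Product ID"),
    IndexOptions.SetUnique(true).SetSparse(true));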



OK, so I've got no idea why the bulk insert didn't work, but I've basically worked around it for the time being with a loop that upserts documents individually:


public void UpsertByProductID(IEnumerable<Product> documents)
{
    if (documents.Any())
    {
        foreach (var doc in documents)
        {
            try
            {
                var query = Query.EQ("Attributes.Product ID", BsonValue.Create(doc["Product ID"]));
                var existing = FindOne(query);
                if (existing != null)
                    doc.Id = existing.Id;
                Update(
                    query,
                    Update<Product>.Replace(doc),
                    UpdateFlags.Upsert
                    );
            }
            catch (Exception ex)
            {
                ex.Data["Json"] = doc.ToJson();
                throw;  // rethrow without resetting the stack trace
            }
        }
    }
}

It's a bit more wordy than my original code, but it has the incomparable advantage that it works...



I noticed that as well... however the error message was erroneous in that respect since:

a) the index was defined as sparse and
b) none of the documents being inserted had a null or otherwise blank (e.g. whitespace) value for Attributes.Product ID. 

I initially tried replacing the bulk insert with a loop that used FindAndModify to individually upsert each of the documents and I was still getting the error... so I knew exactly which document had caused the error. Something was really odd about it. If I took the JSON for that document and simply did a find in MongoVue for any documents matching the index fields, I was getting nothing back. Also, if I performed the Upsert from within MongoVue everything went swimmingly.

I suspect then that it might be an issue with the C# driver. In any event, using Update (with upsert = true) instead of FindAndModify appears to have resolved the issue for the time being.


Course M102 starts tomorrow?

Newbie signs up for the DBA course that is slated to begin tomorrow. I see no entry point into the class. I've not received any emails with directions. How does one begin the course? The about page doesn't mention anything about the entry point or a how-to: https://university.mongodb.com/courses/M102/about



If you go to university.mongodb.com and sign in using the button in the upper right, then hit the My Courses button that should appear where the Sign In button used to be: do you see M102 listed under your courses? If you click on the course title, it should take you to the released course content once the course begins.

If you don't see M102 listed there, can you re-register for it on https://university.mongodb.com/courses/M102/about?



If you click on the course title, it should take you to the released course content once the course begins.

Thanks, Will.

Yes, I get to the course title that I'm registered for, but when I click on it I'm sent to the "about" page you posted. You seem to indicate that information about how to attend the class doesn't get posted until the class starts. The FAQ on the "about" page should probably have a bullet indicating how to take the class, even if it's just something general like "Don't look for how to take the class until the class starts" or "Come back here after the class start time to get the YouTube URL to enter the class."

The "about" pages says this:

Next Session

06 Jan 2015 at 17:00 UTC
24 Feb 2015 at 17:00 UTC
You are registered for this course.
It just doesn't say how to attend the class anywhere.



Hi, Raymond. Great idea to include a statement telling those who register to return to the course page once the course begins in order to get started. We will do that. Once the course opens you will see a "View Courseware" button on the course page that will enable you to enter the course.

I do want to clarify, however. As the FAQ on the course page states, you do not need to be available at a particular time of day to attend the course. The course is composed of pre-recorded video segments and weekly homework assignments. You may watch the videos and complete the homework at any time during the week that material is released.


storing xml files in mongodb

I have a use case that I need your help with. I have roughly 50 terabytes of XML data to start with, supplied by an external source, and then roughly 1-2 additional terabytes of XML data will be stored in MongoDB every day. The ultimate aim is to query the XML data from a web app and drive an analytics dashboard: top 10 products, top 10 routes, etc. As MongoDB's native format is JSON, is it viable to convert this much XML data to JSON, or can I just store the whole XML data as it is, without transformation?

Any suggestions on the storing part and the querying part? Also, please let me know: if there are, say, 50 million such documents, how long would queries take to do joins?



You will want to convert the XML to JSON/BSON; otherwise, it won't be possible to do any reasonable queries. It's difficult to say much more than that about your questions because we have so little information. Can you be more specific about:

- what the XML looks like
- what the use case is for the dashboard
- what specific analytics you want?

You will want to design the transformed JSON/BSON documents to fit your use case, especially given the large quantity of data that you have. MongoDB doesn't do joins; they must be done application-side. You should design the documents to make large-scale joins unnecessary, or necessary only for rare queries that you're willing to wait for.
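
If you do go the conversion route, Json.NET (used elsewhere in these threads) can translate XML to JSON directly. A minimal sketch, with the sample XML invented for illustration:

using System.Xml;
using MongoDB.Bson;
using Newtonsoft.Json;

var xml = new XmlDocument();
xml.LoadXml("<product><name>Widget</name><route>A-12</route></product>");

// Json.NET serializes an XmlNode straight to a JSON string...
string json = JsonConvert.SerializeXmlNode(xml);

// ...which the driver can parse into a BsonDocument ready to insert.
BsonDocument doc = BsonDocument.Parse(json);
// collection.Insert(doc);  // hypothetical collection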


renaming a collection using the c++ driver

Could anyone provide an example of how I can rename a MongoDB collection using the C++ driver?



The C++ driver does not currently offer an explicit helper method to invoke the renameCollection command. However, you can issue all database commands honored by the server (see http://docs.mongodb.org/manual/reference/command/) by using the DBClientWithCommands::runCommand method (declared in mongo/client/dbclientinterface.h).

For the specific parameters for the renameCollection command, see http://docs.mongodb.org/manual/reference/command/renameCollection/
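
A hedged sketch of what that might look like with the legacy C++ driver (host, database, and collection names are assumptions; the command must be run against the admin database):

#include "mongo/client/dbclient.h"
#include <iostream>

int main() {
    mongo::client::initialize();  // required by recent legacy-driver releases
    mongo::DBClientConnection conn;
    conn.connect("localhost:27017");

    // renameCollection takes full namespaces: "<db>.<old>" and "<db>.<new>".
    mongo::BSONObj info;
    bool ok = conn.runCommand(
        "admin",
        BSON("renameCollection" << "mydb.oldName" << "to" << "mydb.newName"),
        info);
    std::cout << (ok ? "renamed" : info.toString()) << std::endl;
    return 0;
}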