Monday, December 22, 2014

Get all the matching embedded documents from an array in MongoDB

"MongoDB"
The sample structure is as follows:

{
    "_id": 1,
    "College_name": "abcdef",
    "Students": [
        {
            "name": "a",
            "Grade": "First"
        },
        {
            "name": "b",
            "Grade": "Second"
        },
        {
            "name": "c",
            "Grade": "First"
        },
        {
            "name": "d",
            "Grade": "First"
        }
    ]
}

The query I need is this:

I want to get all the embedded documents

                     where "College_name" : "abcdef"
                     and    "Students.Grade": "First"

For the structure above, that should return 3 embedded documents.

So far I can get only the first matching element of the array, not all of them (3 documents in the example above). I do not want the "Grade" : "Second" document.

Please help me solve this problem.



AFAIK (and I don't know much, TBH), this isn't possible with a plain query in MongoDB, at least not if you expect subdocuments to be returned as if they were separate documents. A normal query matches and returns whole documents, subdocuments included. There might be a way to do it with the aggregation framework, but that is well beyond my current knowledge.
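
For what it's worth, the aggregation framework mentioned above does look able to do this by unwinding the array and filtering the unwound elements. Below is a minimal sketch in Python that only builds the pipeline. The helper name `build_pipeline` and the collection name `colleges` in the note after it are assumptions, not something from the original post.

```python
def build_pipeline(college_name, grade):
    """Build an aggregation pipeline that returns one result per matching student."""
    return [
        # Keep only colleges with this name that have at least one such student.
        {"$match": {"College_name": college_name, "Students.Grade": grade}},
        # Emit one document per element of the Students array.
        {"$unwind": "$Students"},
        # Drop the unwound students whose grade does not match.
        {"$match": {"Students.Grade": grade}},
        # Return just the embedded student fields.
        {"$project": {"_id": 0, "name": "$Students.name", "Grade": "$Students.Grade"}},
    ]


pipeline = build_pipeline("abcdef", "First")
```

With a driver such as PyMongo this list would be passed to something like `db.colleges.aggregate(pipeline)`. For the sample document it should yield the three "First" students (a, c and d) and leave out b.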

I think you have two choices. 

Create a "students" collection and query that instead, with each student document holding a reference field that points back to the _id of its document in the "schools" collection (I am assuming that is the name of the collection holding your example data above). If you need more data about the school, a second query against that collection would be very fast. Yes, this is going back to normalized data, but that isn't necessarily a bad thing in Mongo.

Or fetch the whole school document and "pick out" the data you want to show in your application.
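
The second choice, picking the students out in the application, could look roughly like this in Python. This is only a sketch: the document is written out as a plain dictionary as if it had already been fetched, and the function name `students_with_grade` is made up for illustration.

```python
# The sample document, as the application would receive it from the database.
college = {
    "_id": 1,
    "College_name": "abcdef",
    "Students": [
        {"name": "a", "Grade": "First"},
        {"name": "b", "Grade": "Second"},
        {"name": "c", "Grade": "First"},
        {"name": "d", "Grade": "First"},
    ],
}


def students_with_grade(doc, grade):
    """Return the embedded student documents whose Grade equals `grade`."""
    return [s for s in doc.get("Students", []) if s.get("Grade") == grade]


first_grade = students_with_grade(college, "First")
print([s["name"] for s in first_grade])  # ['a', 'c', 'd']
```

This always transfers the whole document, so it is simplest when the Students array stays small.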

Seeing that you will probably have thousands and thousands of students, that a single school could have thousands of students and gain many more over the years, and that you will most likely want to run a lot more queries on them, I'd say it is better to use two separate collections with references between them.
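
To make the two-collection layout concrete, here is a rough in-memory sketch in Python. The field name `school_id`, the student `_id` values, and the tiny `find` helper are all assumptions for illustration. The helper only mimics an equality filter; with a real driver the same filter dictionaries would be passed to the collection's own find method.

```python
# Hypothetical contents of the two collections, held in memory for illustration.
schools = [{"_id": 1, "College_name": "abcdef"}]
students = [
    {"_id": 101, "school_id": 1, "name": "a", "Grade": "First"},
    {"_id": 102, "school_id": 1, "name": "b", "Grade": "Second"},
    {"_id": 103, "school_id": 1, "name": "c", "Grade": "First"},
    {"_id": 104, "school_id": 1, "name": "d", "Grade": "First"},
]


def find(collection, query):
    """Stand-in for an equality-only find(): every key/value pair must match."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in query.items())
    ]


# First query: look up the school to get its _id.
school = find(schools, {"College_name": "abcdef"})[0]
# Second query: each student is now a top-level document, so all matches come back.
matched = find(students, {"school_id": school["_id"], "Grade": "First"})
```

Because every student is its own document here, the filter on Grade returns all matching students rather than only the first one in an array.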



Thanks for the information :) :) 

I have a few more questions.

How far can we go with aggregation? Does it work fine for around, say, 2 lakh documents?

When is the right time to go for MongoDB-Hadoop?

Please give me some good insights.



As I mentioned above, I am not very knowledgeable about the aggregation framework or the MR (map/reduce) system in Mongo. But I'll try to answer your questions.

I had to look up what a lakh is, and if I understood correctly, one lakh is 100,000? So you are asking about a database of ~200,000 documents?

If so, Mongo's aggregation framework or MR system should have no problem working with this number of documents.

As for when to go for Hadoop: I think the decision to use Hadoop over Mongo's own systems depends on how complex the analysis of your data needs to be and not necessarily on the size of your data. I could be wrong about that, though. Maybe someone more knowledgeable can answer.

Also, as a general rule in a community, if you have questions on another subject, it is better to start a new thread. You will have a better chance of getting proper answers. :)

