For the case study in my thesis, I really need an original MongoDB database, a sample of one, or something similar (the data could be anonymized).
My project is to propose an adapted methodology for analysing data quality in NoSQL databases. But to give my thesis some value, it is important that the database exists in the real world.
If someone can help, it would be great!
A sample of a database? There is a plethora of public datasets available. You can also just import the zip code JSON file that MongoDB provides. However, I am guessing you want a messy live database? If that is the case, I would suggest this ecommerce dataset
I need to start collecting data by Dec 7. My thesis involves search results in structured (SQL) and unstructured (NoSQL) environments using a financial ontology for Chinese students (I live and teach at a Chinese university).
Maybe you are looking for 'dirtier' data that you can improve upon.
I just learned how to import csv files into mongo and I'm going to download as many of these datasets as possible. If you think you can use these datasets I can tell you how to load them into mongo from csv.
I'm not really sure if there is a more direct way of doing this. I think we must use CSV or JSON files that were exported from established datasets such as these.
XAMPP is easy to set up, and you access it through the browser.
Northwind is a very famous database and should give your research the validity it needs. You can add complexity to it by creating some table joins etc., but I'm not sure how it will export to a JSON file. I am also new to this MongoDB unstructured data environment. I am going to try it.
If you have any questions about setting up and using XAMPP and phpMyAdmin, feel free to ask me.
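For anyone following along, here is a minimal sketch of the CSV route in Python. It only assumes a CSV with a header row; the sample rows, the function names, and the database and collection names ("thesis", "prices") are made up for illustration. The insert step needs the pymongo package and a running mongod, so it is kept in a separate function.

```python
import csv
import io


def csv_to_documents(csv_text):
    """Turn CSV text into a list of dicts (one per row), converting numeric strings."""
    def convert(value):
        # Short rows give None for missing columns; pass that through unchanged.
        if value is None:
            return None
        value = value.strip()
        # Try int first, then float; otherwise keep the stripped string.
        for cast in (int, float):
            try:
                return cast(value)
            except ValueError:
                pass
        return value

    reader = csv.DictReader(io.StringIO(csv_text))
    return [{key: convert(val) for key, val in row.items()} for row in reader]


def insert_documents(docs, uri="mongodb://localhost:27017", db="thesis", coll="prices"):
    # Hypothetical names; requires pymongo and a reachable MongoDB server.
    from pymongo import MongoClient
    client = MongoClient(uri)
    return client[db][coll].insert_many(docs).inserted_ids


# Invented sample data, standing in for a downloaded CSV file.
sample = "date,ticker,close\n2014-11-03,ABC,12.5\n2014-11-04,ABC,13\n"
docs = csv_to_documents(sample)
print(docs)
```

The mongoimport command-line tool can also load a CSV directly with --type csv and --headerline, which may be simpler if no type conversion is needed.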
Might be what you need.
Hi, I am also doing a dissertation involving unstructured vs. structured data, i.e., NoSQL compared to SQL. Can I ask what methodology you will use and what kind of interface you are using to access MongoDB? Or are you just using the shell? My research involves user queries made on each system, and I wish to use a search algorithm (I haven't decided which yet), but I am looking at something involving Machine Learning or Statistical Machine Learning.
Thank you for the datasets, but they don't seem to be what I'm looking for. This data is in CSV, and I'm looking for original datasets produced with MongoDB. Maybe you have another idea? And MongoDB provides a "zip code JSON file"? I didn't find it.
I'm using the methodology of Isabelle Boydens. She developed a cycle of improvement that involves data profiling, data standardisation and matching, and data monitoring.
And I'm still using the shell, but I should also find a better interface.
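To make the data profiling step mentioned above a bit more concrete, here is a rough sketch in Python. It is not Boydens' method, just one simple way to profile schemaless documents: count how often each field is missing and which value types it holds. The function name and the sample documents are invented for illustration.

```python
from collections import Counter, defaultdict


def profile(documents):
    """Per field: number of documents missing it, and a count of value types seen."""
    presence = Counter()
    types = defaultdict(Counter)
    for doc in documents:
        for field, value in doc.items():
            presence[field] += 1
            types[field][type(value).__name__] += 1
    total = len(documents)
    return {
        field: {
            "missing": total - presence[field],
            "types": dict(types[field]),
        }
        for field in presence
    }


# Invented documents with the kind of inconsistencies a real collection may have.
docs = [
    {"name": "Ann", "zip": "01001", "age": 31},
    {"name": "Bob", "zip": 1002},
    {"name": None, "zip": "01005", "age": "unknown"},
]
report = profile(docs)
for field, stats in report.items():
    print(field, stats)
```

A field that is stored under more than one type, or that is absent from many documents, is a natural candidate for the standardisation step.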
I will check into Isabelle's methodology, thanks. And I'd still like to code a PHP interface but it looks like my search for a driver is futile. So, going to look into Java or Python to connect to Mongo, if successful I'll let you know right away.
When is your thesis due? I have made a search engine for PHP/MySQL in XAMPP; you are welcome to the code if you need it.
Have you looked at
What do you mean by:
> And I'd still like to code a PHP interface but it looks like my search for a driver is futile.
There is a supported PHP driver for MongoDB. See here:
http://docs.mongodb.org/ecosystem/drivers/php/
To make a PHP interface for MongoDB (through a web server such as XAMPP), a driver is required. I want to make an HTML/PHP form so that a user can interface with MongoDB (do searches etc.) instead of going directly through the mongo shell.
I've tried all the , tutorials, blogs and did all the steps many times over and nothing has worked. But thanks for everyone's suggestions.
And about my question: is there a MongoDB user here who would be willing to share their database for my thesis? Or someone who knows where to find free MongoDB datasets?
I will look around and if I find one I'll send it to you. I am also looking for a dataset in the business domain.
Thank you Bruce, I'll do the same for you!
Martin, I found this site:
https://www.quandl.com/collections
These are established datasets that can be downloaded in CSV format.
Anyway, I am starting the process of importing these into my database.
Thank you for the link!
So, you think it is impossible to find free datasets produced with MongoDB on the internet?
Yes, I would prefer to find data with more complex and varied data and metadata.
But the way you will import these files into MongoDB certainly interests me!
Hi Martin,
Sorry for the slow reply. I've attached the MySQL version of Microsoft's Northwind database. You can convert it into a JSON file by using phpMyAdmin. phpMyAdmin comes with the XAMPP download and has an export feature that will convert it to a JSON file; after that you can import it into MongoDB. But first you have to create a database in phpMyAdmin, then paste the file into the SQL query analyzer.
Surely a relational dataset designed as an SQL schema is the worst-case example for MongoDB, a non-relational DB system with no server-side join capability.
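The mismatch described above can be sketched in Python. One common way to reshape relational rows for a document store is to nest the child rows inside their parent, so that no join is needed at query time. This is only an illustration: the function name is hypothetical, and the rows are invented samples that borrow Northwind-style column names (OrderID, CustomerID, ProductID, Quantity).

```python
from collections import defaultdict


def embed_order_lines(orders, order_details):
    """Return one document per order with its detail rows nested under 'lines'."""
    lines_by_order = defaultdict(list)
    for detail in order_details:
        # Drop the foreign key; it is implied by the parent document.
        line = {k: v for k, v in detail.items() if k != "OrderID"}
        lines_by_order[detail["OrderID"]].append(line)
    # Orders without detail rows get an empty list.
    return [
        dict(order, lines=lines_by_order.get(order["OrderID"], []))
        for order in orders
    ]


# Invented rows standing in for two exported relational tables.
orders = [
    {"OrderID": 1, "CustomerID": "ALFKI"},
    {"OrderID": 2, "CustomerID": "ANATR"},
]
details = [
    {"OrderID": 1, "ProductID": 11, "Quantity": 12},
    {"OrderID": 1, "ProductID": 42, "Quantity": 10},
]
embedded = embed_order_lines(orders, details)
print(embedded)
```

Whether embedding is the right choice depends on how the data is queried, so for a data quality study it may be worth profiling both the flat and the embedded form.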
Perhaps something more like
No problem. If I have a problem during the import, I won't hesitate to ask!
I had already seen this website, which offers different datasets, but I'm having trouble choosing the right dataset for my research.
In my thesis, I would like to use the dataset with the data quality tool OpenRefine (formerly Google Refine), which supports data profiling and matching. It is used for data from SQL databases, and I would like to analyse how it works with a NoSQL dataset.
Maybe you would have an idea of a well-suited dataset for my case? I was thinking about the "US patent data" you suggested.
And I would also like to develop the dataset on MongoDB.