Tuesday, April 22, 2014

MongoDB MapReduce Notes

A few things that stumped me for a while and took me some time to figure out:

  1. Each iteration of the Map step applies to exactly one document. However, there is a lot of flexibility in what gets emitted in each iteration: it can emit multiple items, or none at all if certain conditions are not met.
  2. It is also completely up to the user to construct the appropriate key in the Map step. The important point is that all items with the same key (regardless of which document they originated from) eventually funnel into the same Reduce result. It is possible for e.g. 200 values to be reduced to 1 value in multiple steps, with the reduce output from previous steps appearing in the list as one of the items.
  3. The Reduce function is not called for keys that have only a single value; that value goes straight to the output. This means the value emitted by Map must have the same structure as the value returned by Reduce.
  4. There is a "scope" option which can be used to pass global parameters into the Map and Reduce functions.
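
To make the four points above concrete, here is a rough in-memory sketch in plain JavaScript. It is not MongoDB's implementation: the sample documents, `runMapReduce`, `batchSize` and the `allKey` parameter are all made up for illustration. In the real shell the map function reads the document from `this` and calls a global `emit`, and scope variables appear as globals; here they are passed in as arguments so the sketch can run on its own.

```javascript
// Hypothetical sample documents; the field names are made up for this sketch.
const docs = [
  { city: "New York", active: true },
  { city: "New York", active: true },
  { city: "New York", active: false },
  { city: "Boston", active: true },
];

// Stand-in for the "scope" option: parameters visible to map and reduce.
const scope = { allKey: "ALL" };

// Map: called once per document. It may emit nothing, one pair, or several.
function mapFn(doc, emit, scope) {
  if (!doc.active) return;             // condition not met: emit nothing
  emit(doc.city, { count: 1 });        // key chosen by the user
  emit(scope.allKey, { count: 1 });    // second emit from the same document
}

// Reduce: returns the same structure that map emits, because its own
// output can show up again in `values` during a later reduce step.
function reduceFn(key, values) {
  let total = 0;
  for (const v of values) total += v.count;
  return { count: total };
}

// Toy driver (an assumption, not the server's algorithm): groups emitted
// values by key and reduces them in batches of `batchSize` (must be >= 2).
function runMapReduce(docs, mapFn, reduceFn, scope, batchSize) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) mapFn(doc, emit, scope);

  const results = {};
  const reduceCalls = {};
  for (const [key, values] of groups) {
    let pending = values;
    reduceCalls[key] = 0;
    // Keys with a single value never enter this loop, so reduce is skipped.
    while (pending.length > 1) {
      const partial = reduceFn(key, pending.slice(0, batchSize));
      reduceCalls[key] += 1;
      // The partial result goes back into the list as one of the items.
      pending = [partial, ...pending.slice(batchSize)];
    }
    results[key] = pending[0];
  }
  return { results, reduceCalls };
}

const { results, reduceCalls } = runMapReduce(docs, mapFn, reduceFn, scope, 2);
console.log(results);      // counts per key
console.log(reduceCalls);  // how many times reduce ran for each key
```

With these four documents, "Boston" has a single emitted value and never reaches `reduceFn`, while "ALL" has three values and is reduced in two steps. Both cases only work because map emits `{ count: 1 }` in the same shape that reduce returns.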

Saturday, April 19, 2014

MongoDB Collection Counting

This is done via the "count()" method:

db.userAccount.count()

This can be extended to perform conditional counting:

e.g. the number of users in New York City:

db.userAccount.count({"city":"New York"})

More examples and references: http://docs.mongodb.org/manual/reference/method/db.collection.count/

MongoDB Limit Fields to Return

This is done using the projection feature. Some examples:

To find all entries in userAccount but only have the email returned:

db.userAccount.find({},{"email":1});

To find all entries in userAccount from city "New York" and only have the email returned:

db.userAccount.find({"city":"New York"},{"email":1});

Use 0 instead of 1 to exclude a field. The _id field is returned by default unless it is explicitly excluded with 0.
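
The include/exclude rules can be mimicked on a plain object to see what comes back. This is only a rough sketch of the behaviour for flat fields, not how MongoDB does it internally; the `project` helper and the sample document are made up for illustration.

```javascript
// Hypothetical helper: applies a flat projection spec (1 = include,
// 0 = exclude) to a single plain object.
function project(doc, spec) {
  const fields = Object.keys(spec).filter((f) => f !== "_id");
  const including = fields.some((f) => spec[f] === 1);
  const out = {};
  if (including) {
    // Inclusion mode: _id comes along unless it is explicitly set to 0.
    if (spec._id !== 0 && "_id" in doc) out._id = doc._id;
    for (const f of fields) {
      if (spec[f] === 1 && f in doc) out[f] = doc[f];
    }
  } else {
    // Exclusion mode: keep everything except the fields set to 0.
    for (const f of Object.keys(doc)) {
      if (spec[f] !== 0) out[f] = doc[f];
    }
  }
  return out;
}

// Made-up sample document.
const user = { _id: 7, email: "a@example.com", city: "New York" };

console.log(project(user, { email: 1 }));          // _id and email
console.log(project(user, { email: 1, _id: 0 }));  // email only
console.log(project(user, { city: 0 }));           // everything except city
```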

More examples and references: http://docs.mongodb.org/manual/tutorial/project-fields-from-query-results/