{ name: 'Old Novice', rating : [ {user: 'ngsiolei', point: 3}, {user: 'lei', point: 4} ] }
The rating distribution (the number of ratings at each point value) is something we commonly need to know. Here I compute the rating distribution grouped by business. The map function walks through all documents; for each rating it creates a count object and emits it with the business name as the key. The reduce function accumulates the counts associated with each key.
db.runCommand({
  mapreduce: 'business',
  map: function() {
    var rating = this.rating;
    for (var i = 0; i < rating.length; i++) {
      var count = { '1' : 0, '2' : 0, '3' : 0, '4' : 0, '5' : 0, 'all' : 0 };
      count[rating[i]['point'].toString()] = 1;
      // reduce is not called for a key with a single value,
      // so the emitted object must already carry a correct total
      count['all'] = 1;
      emit(this.name, count);
    }
  },
  reduce: function(key, values) {
    var count = { '1' : 0, '2' : 0, '3' : 0, '4' : 0, '5' : 0, 'all' : 0 };
    for (var i = 0; i < values.length; i++) {
      count['1'] += values[i]['1'];
      count['2'] += values[i]['2'];
      count['3'] += values[i]['3'];
      count['4'] += values[i]['4'];
      count['5'] += values[i]['5'];
    }
    count['all'] = count['1'] + count['2'] + count['3'] + count['4'] + count['5'];
    return count;
  },
  out: 'res20101031'
});
So I can find a business's rating distribution with a simple query:
db.res20101031.find({ '_id' : 'Old Novice' });
{ "_id" : "Old Novice", "value" : { "1" : 0, "2" : 0, "3" : 1, "4" : 1, "5" : 0, "all" : 2 } }
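To check the map and reduce logic without a MongoDB server, the job can be imitated in plain JavaScript. This is only a rough sketch: runMapReduce, mapFn, reduceFn and emptyCount are hypothetical helper names, not MongoDB APIs, and the grouping step is a simplified stand-in for what the server does. The map step here sets 'all' to 1 for each emitted rating, and the runner skips reduce for keys with a single value, as MongoDB does.

```javascript
// Hypothetical local stand-in for the mapreduce command; not a MongoDB API.
function emptyCount() {
  return { '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, 'all': 0 };
}

// Emits one count object per rating, keyed by business name.
function mapFn(doc, emit) {
  for (var i = 0; i < doc.rating.length; i++) {
    var count = emptyCount();
    count[String(doc.rating[i].point)] = 1;
    count['all'] = 1;
    emit(doc.name, count);
  }
}

// Sums the per-point counts and recomputes the total.
function reduceFn(key, values) {
  var count = emptyCount();
  for (var i = 0; i < values.length; i++) {
    for (var p = 1; p <= 5; p++) {
      count[String(p)] += values[i][String(p)];
    }
  }
  count['all'] = count['1'] + count['2'] + count['3'] + count['4'] + count['5'];
  return count;
}

// Groups emitted values by key, then reduces each group.
function runMapReduce(docs) {
  var groups = {};
  docs.forEach(function (doc) {
    mapFn(doc, function (key, value) {
      (groups[key] = groups[key] || []).push(value);
    });
  });
  return Object.keys(groups).map(function (key) {
    var values = groups[key];
    // A key with a single value is passed through without calling reduce.
    return { _id: key, value: values.length > 1 ? reduceFn(key, values) : values[0] };
  });
}

var sampleDocs = [
  { name: 'Old Novice', rating: [{ user: 'ngsiolei', point: 3 }, { user: 'lei', point: 4 }] }
];
var results = runMapReduce(sampleDocs);
console.log(JSON.stringify(results));
// prints [{"_id":"Old Novice","value":{"1":0,"2":0,"3":1,"4":1,"5":0,"all":2}}]
```

The printed document has the same _id and value shape as the query result above.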
I learned a lesson about reading documentation carelessly: I spent hours debugging an inconsistency between the data formats used by map and reduce, and finally found that the MongoDB documentation says it explicitly:
The output of the map function's emit (the second argument) and the value returned by reduce should be the same format to make iterative reduce possible. If not, there will be weird bugs that are hard to debug.
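A small plain-JavaScript check can make this rule concrete. The sketch below is an illustration only: reduceCounts, emitted and badReduce are hypothetical names, and the third rating is made-up sample data. It reduces three emitted values in one step and in two steps (feeding an earlier reduce result back in), then compares the results. The counter-example pairs a reduce that returns a count object with a map that emits bare point numbers, which is the kind of format mismatch the quote warns about.

```javascript
function zeros() {
  return { '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, 'all': 0 };
}

// Consistent version: emitted values and the reduce result share one shape.
function emitted(point) {
  var count = zeros();
  count[String(point)] = 1;
  count['all'] = 1;
  return count;
}

function reduceCounts(key, values) {
  var count = zeros();
  values.forEach(function (v) {
    ['1', '2', '3', '4', '5'].forEach(function (p) { count[p] += v[p]; });
  });
  count['all'] = count['1'] + count['2'] + count['3'] + count['4'] + count['5'];
  return count;
}

var a = emitted(3), b = emitted(4), c = emitted(4); // third rating is hypothetical
var oneStep = reduceCounts('Old Novice', [a, b, c]);
var twoSteps = reduceCounts('Old Novice', [reduceCounts('Old Novice', [a, b]), c]);
var consistent = JSON.stringify(oneStep) === JSON.stringify(twoSteps);

// Inconsistent version: values are bare numbers, but reduce returns an object,
// so a second reduce pass receives a mix of objects and numbers.
function badReduce(key, values) {
  var count = zeros();
  values.forEach(function (v) {
    count[String(v)] += 1;
    count['all'] += 1;
  });
  return count;
}

var badOneStep = badReduce('Old Novice', [3, 4, 4]);
var badTwoSteps = badReduce('Old Novice', [badReduce('Old Novice', [3, 4]), 4]);
var badConsistent = JSON.stringify(badOneStep) === JSON.stringify(badTwoSteps);

console.log(consistent, badConsistent); // prints true false
```

In the inconsistent version the two-step result silently loses counts instead of raising an error, which is why this kind of bug is hard to track down.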
References
http://www.mongodb.org/display/DOCS/MapReduce
http://kylebanker.com/blog/2009/12/mongodb-map-reduce-basics/
http://rickosborne.org/blog/index.php/2010/02/08/playing-around-with-mongodb-and-mapreduce-functions/