Sunday, August 12, 2012

Map/Reduce Input and Output



The Map/Reduce framework operates exclusively on <key, value> pairs: it views the input to a job as a set of <key, value> pairs and produces a set of <key, value> pairs as the job's output, conceivably of different types.

The key and value classes have to be serializable by the framework and hence need to implement the Writable interface. Additionally, the key classes have to implement the WritableComparable interface to facilitate sorting by the framework.
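To make the two requirements concrete, here is a minimal sketch of a custom key type. It is an illustration under stated assumptions, not Hadoop code: `SketchWritable` and `SketchWritableComparable` are local stand-ins that mirror the shape of Hadoop's `Writable` and `WritableComparable` so the example compiles without Hadoop on the classpath, and `YearStationKey` and its fields are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Stand-in for Hadoop's Writable (assumption: same two-method shape).
interface SketchWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Stand-in for Hadoop's WritableComparable: serializable and sortable.
interface SketchWritableComparable<T> extends SketchWritable, Comparable<T> {}

// Hypothetical composite key made of a year and a station id.
final class YearStationKey implements SketchWritableComparable<YearStationKey> {
    private int year;
    private String stationId = "";

    // No-arg constructor so an empty instance can be filled by readFields.
    YearStationKey() {}

    YearStationKey(int year, String stationId) {
        this.year = year;
        this.stationId = stationId;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(year);
        out.writeUTF(stationId);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        year = in.readInt();
        stationId = in.readUTF();
    }

    @Override
    public int compareTo(YearStationKey other) {
        // Sort by year first, then by station id.
        int byYear = Integer.compare(year, other.year);
        return byYear != 0 ? byYear : stationId.compareTo(other.stationId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof YearStationKey)) return false;
        YearStationKey k = (YearStationKey) o;
        return year == k.year && stationId.equals(k.stationId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(year, stationId);
    }

    // Serializes a key to bytes and reads it back into a fresh instance.
    static YearStationKey roundTrip(YearStationKey key) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            key.write(new DataOutputStream(buffer));
            YearStationKey copy = new YearStationKey();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real job the class would implement `org.apache.hadoop.io.WritableComparable` instead of the stand-in. A value-only type would need just the two serialization methods, since values are not sorted.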

Input and Output types of a Map/Reduce job:

(input) <k1, v1> -> map -> <k2, v2> -> combine -> <k2, v2> -> reduce -> <k3, v3> (output)
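The type flow through map, combine and reduce can be sketched with a small in-memory word count. This is plain Java with no Hadoop dependency, so it only imitates what the framework does; the class `TypeFlowSketch`, its method names, and the choice of types (line offset and line text as input, word and count as intermediate and output pairs) are assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TypeFlowSketch {

    // map: one input pair <Long offset, String line> -> many <String word, Integer 1>
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // combine: local pre-aggregation; input and output pair types are the same.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> partial = new TreeMap<>();
        for (Map.Entry<String, Integer> e : pairs) {
            partial.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        return partial;
    }

    // reduce: <String word, list of Integer> -> <String word, Long total>
    static long reduce(String word, List<Integer> counts) {
        long total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    // Drives the three stages, treating each line as its own input split.
    static Map<String, Long> run(List<String> lines) {
        // TreeMap keeps keys sorted, imitating the sort between map and reduce.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            for (Map.Entry<String, Integer> e : combine(map(offset, line)).entrySet()) {
                grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                       .add(e.getValue());
            }
            offset += line.length() + 1;
        }
        Map<String, Long> output = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            output.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return output;
    }
}
```

Calling `TypeFlowSketch.run` on the two lines "a b a" and "b c" yields the counts a=2, b=2, c=1. The input pairs, the intermediate pairs and the output pairs each have their own types, while the combine step leaves the intermediate types unchanged.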