Hadoop::Streaming::* provides a simple Perl interface to the Streaming interface of Hadoop. Hadoop is a system for "reliable, scalable, distributed computing." Hadoop was developed at Yahoo! and is now maintained by the Apache Software Foundation. Hadoop...
Hadoop::Streaming::Reducer - Simplify writing Hadoop Streaming jobs. Write a reduce() function and let this role handle the Streaming interface. The Reducer role provides an iterator over the multiple values for a given key.
Hadoop::Streaming::Combiner - Simplify writing Hadoop Streaming jobs. Combiner follows the same interface as Reducer. Requires a combine() function, which will be called for each line of combiner data. Combiners run on the same machine as the mapper, as a local pre-reduce step.
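
To illustrate the bookkeeping that the Reducer and Combiner roles take off your hands, here is a minimal sketch in plain Perl of the Streaming contract itself: input arrives as sorted "key<TAB>value" lines, and the values for each key are handed to a reduce callback through an iterator. It does not use Hadoop::Streaming and is not its implementation; run_reducer and sum_counts are hypothetical names chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Hadoop::Streaming): groups sorted
# "key<TAB>value" lines by key and calls $reduce once per key.
sub run_reducer {
    my ($lines, $reduce) = @_;
    my @out;
    my ($current, @values);

    # Hand the values collected for the current key to the callback.
    my $flush = sub {
        return unless defined $current;
        my @pending = @values;
        # Iterator: returns the next value, or undef when exhausted.
        my $next = sub { shift @pending };
        push @out, $reduce->($current, $next);
    };

    for my $line (@$lines) {
        chomp(my $copy = $line);
        my ($key, $value) = split /\t/, $copy, 2;
        $value = '' unless defined $value;    # line without a tab
        if (defined $current && $key ne $current) {
            $flush->();                        # key changed: emit previous group
            @values = ();
        }
        $current = $key;
        push @values, $value;
    }
    $flush->();                                # emit the final group
    return @out;
}

# Word-count style callback: sums the values seen for one key.
sub sum_counts {
    my ($key, $next) = @_;
    my $total = 0;
    while (defined(my $v = $next->())) {
        $total += $v;
    }
    return "$key\t$total";
}

my @input  = ("apple\t1\n", "apple\t1\n", "pear\t1\n");
my @result = run_reducer(\@input, \&sum_counts);
print "$_\n" for @result;
```

Because grouping relies on the key changing between consecutive lines, the input must already be sorted by key, which Hadoop guarantees for reducer input. The same callback could serve as a combiner, since summing partial counts on the mapper's machine and summing them again in the reducer gives the same total.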