MapReduce

Tagged

A collection of 2 posts

Going Beyond MapReduce for Hadoop ETL Pt.2 : Introducing Apache YARN and Apache Tez

In the first post [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-1-why-mapreduce-is-only-for-batch-processing/] in this three part series on going beyond MapReduce for Hadoop ETL, I looked at how a typical Apache Pig script gets compiled into a series of MapReduce jobs, and those MapReduce jobs pass data between themselves by

Technical

Going Beyond MapReduce for Hadoop ETL Pt.1 : Why MapReduce Is Only for Batch Processing

Over the previous few months I’ve been looking at the various ways you can load data into Hadoop, process it and then report on it using Oracle tools [https://www.rittmanmead.com/blog/2013/11/why-odi-dw-and-obiee-developers-should-be-interested-in-hadoop/] . We’ve looked at Apache Hive and how it provides a SQL layer