Rittman Mead
Tagged

emr

A collection of 2 posts

emr

ETL Offload with Spark and Amazon EMR - Part 3 - Running pySpark on EMR

In the previous articles (here [https://www.rittmanmead.com/blog/2016/12/etl-offload-with-spark-and-amazon-emr-part-1] and here [https://www.rittmanmead.com/blog/2016/12/etl-offload-with-spark-and-amazon-emr-part-2-code-development-with-notebooks-and-docker/]) I gave the background to a project we did for a client, exploring the benefits of Spark-based ETL processing running on Amazon's Elastic Map Reduce

Robin Moffatt Dec 19, 2016 • 11 min read
obiee

ETL Offload with Spark and Amazon EMR - Part 1 - Introduction

We recently undertook a two-week Proof of Concept exercise for a client, evaluating whether their existing ETL processing could be done faster and more cheaply using Spark. They were also interested in whether something like Redshift [http://docs.aws.amazon.com/redshift/latest/mgmt/welcome.html] would provide a suitable

Robin Moffatt Dec 15, 2016 • 3 min read