Rittman Mead
Tagged

emr

A collection of 2 posts

emr

ETL Offload with Spark and Amazon EMR - Part 3 - Running pySpark on EMR

In the previous articles (here and here) I gave the background to a project we did for a client, exploring the benefits of Spark-based ETL processing running on Amazon's Elastic MapReduce (EMR) Hadoop platform. The proof of concept we ran was on a very simple requirement, taking inbound files

Robin Moffatt Dec 19, 2016 • 11 min read
obiee

ETL Offload with Spark and Amazon EMR - Part 1 - Introduction

We recently undertook a two-week Proof of Concept exercise for a client, evaluating whether their existing ETL processing could be done faster and more cheaply using Spark. They were also interested in whether something like Redshift would provide a suitable data warehouse platform for them. In this series of blog

Robin Moffatt Dec 15, 2016 • 3 min read
Rittman Mead © 2022