ETL Offload with Spark and Amazon EMR - Part 3 - Running pySpark on EMR

In the previous articles I gave the background to a project we did for a client, exploring the benefits of Spark-based ETL processing running on Amazon's Elastic MapReduce (EMR) Hadoop


Streaming data from Oracle using Oracle GoldenGate and Kafka Connect

This article was also posted on the Confluent blog; head over there for more great Kafka-related content!

Kafka Connect is part of the Confluent Platform, providing a set

Oracle GoldenGate

Using logdump to Troubleshoot the Oracle GoldenGate for Big Data Kafka Handler

Oracle GoldenGate for Big Data (OGG BD) supports sending transactions as messages to Kafka topics, both through the native Oracle handler and through a connector into Confluent's Kafka