
Getting Started with Spark Streaming, Python, and Kafka

Last month I wrote a series of articles [https://www.rittmanmead.com/blog/2016/12/etl-offload-with-spark-and-amazon-emr-part-5/] in which I looked at using Spark for data transformation and manipulation. This was in the context of replatforming an existing Oracle-based ETL and data warehouse solution onto cheaper and more elastic