Tagged

Hive

A collection of 11 posts

Big Data

Using SparkSQL and Pandas to Import Data into Hive and Big Data Discovery

Big Data Discovery [https://www.oracle.com/big-data/big-data-discovery/index.html] (BDD) is a great tool for exploring, transforming, and visualising data stored in your organisation’s Data Reservoir. I presented a workshop on it at a recent conference [https://speakerdeck.com/rmoff/unlock-the-value-in-your-big-data-reservoir-using-oracle-big-data-discovery-and-oracle-big-data-spatial-and-graph], and got an interesting question from

Big Data

OBIEE 11.1.1.9 Now Supports HiveServer2 and Cloudera Impala

As you all probably know, I’m a big fan of Oracle’s BI and Big Data products [http://www.rittmanmead.com/biforum2015/optional-masterclass-delivering-the-oracle-information-management-big-data-reference-architecture/], but something I’ve been critical of [https://www.rittmanmead.com/blog/2014/12/connecting-obiee11g-on-windows-to-a-kerberos-secured-cdh5-hadoop-cluster-using-cloudera-hiveserver2-odbc-drivers/] is OBIEE11g’s lack of support for HiveServer2 connections to Hadoop

Technical

OBIEE and ODI on Hadoop : Next-Generation Initiatives To Improve Hive Performance

The other week I posted a three-part series (part 1 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-1-why-mapreduce-is-only-for-batch-processing/], part 2 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-2-introducing-apache-yarn-and-apache-tez/] and part 3 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-3-introducing-apache-spark/]) on going beyond MapReduce for Hadoop-based ETL, where I l

Technical

Analytics with Kibana and Elasticsearch through Hadoop - part 2 - Getting data into Elasticsearch

Introduction: In the first part of this series [https://www.rittmanmead.com/blog/2014/11/analytics-with-kibana-and-elasticsearch-through-hadoop-part-1-introduction/] I described how I made several sets of data relating to the Rittman Mead blog from various sources available through Hive. This included blog hits from the Apache webserver log, tweets, and metadata from