Hive - Rittman Mead

Using SparkSQL and Pandas to Import Data into Hive and Big Data Discovery

Big Data Discovery [https://www.oracle.com/big-data/big-data-discovery/index.html] (BDD) is a great tool for exploring, transforming, and visualising data stored in your organisation’s Data Reservoir. I presented a workshop on it at a recent conference [https://speakerdeck.com/rmoff/unlock-the-value-in-your-big-data-reservoir-using-oracle-big-data-discovery-and-oracle-big-data-spatial-and-graph], and got an interesting question from

Big Data

Replicating Hive Data Into Oracle BI Cloud Service for Visual Analyzer using BICS Data Sync

In yesterday’s post on using Oracle Big Data Discovery with Oracle Visual Analyzer in Oracle BI Cloud Service [https://www.rittmanmead.com/blog/2015/06/combining-oracle-big-data-discovery-and-oracle-visual-analyzer-on-bics/], I said mid-way through the article that I had to copy the Hadoop data into BI Cloud Service so that Visual Analyzer could

Big Data

Using HBase and Impala to Add Update and Delete Capability to Hive DW Tables, and Improve Query Response Times

One of our customers is looking to offload part of their data warehouse platform to Hadoop, extracting data out of a source system and loading it into Apache Hive tables for subsequent querying using OBIEE11g. One of the challenges that the project faces, though, is how to handle updates to

Big Data

OBIEE 11.1.1.9 Now Supports HiveServer2 and Cloudera Impala

As you all probably know, I’m a big fan of Oracle’s BI and Big Data products [http://www.rittmanmead.com/biforum2015/optional-masterclass-delivering-the-oracle-information-management-big-data-reference-architecture/], but something I’ve been critical of [https://www.rittmanmead.com/blog/2014/12/connecting-obiee11g-on-windows-to-a-kerberos-secured-cdh5-hadoop-cluster-using-cloudera-hiveserver2-odbc-drivers/] is OBIEE11g’s lack of support for HiveServer2 connections to Hadoop

Big Data

Oracle Data Integrator Enterprise Edition Advanced Big Data Option Part 1 - Overview and 12.1.3.0.1 install

Oracle recently announced Oracle Data Integrator Enterprise Edition Advanced Big Data Options [https://www.oracle.com/middleware/data-integration/enterprise-edition-big-data/index.html] as part of the new 12.1.3.0.1 release of ODI. It includes a range of new features for working with the Hadoop ecosystem. Let's have

Technical

OBIEE and ODI on Hadoop : Next-Generation Initiatives To Improve Hive Performance

The other week I posted a three-part series (part 1 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-1-why-mapreduce-is-only-for-batch-processing/], part 2 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-2-introducing-apache-yarn-and-apache-tez/] and part 3 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-3-introducing-apache-spark/]) on going beyond MapReduce for Hadoop-based ETL, where I l

Technical

Analytics with Kibana and Elasticsearch through Hadoop - part 3 - Visualising the data in Kibana

In this post we will see how Kibana can be used to create visualisations over various sets of data that we have combined together. Kibana is a graphical front end for data held in Elasticsearch, which also provides the analytic capabilities. Previously we looked at where the data came from

Technical

Analytics with Kibana and Elasticsearch through Hadoop - part 2 - Getting data into Elasticsearch

In the first part of this series [https://www.rittmanmead.com/blog/2014/11/analytics-with-kibana-and-elasticsearch-through-hadoop-part-1-introduction/] I described how I made several sets of data relating to the Rittman Mead blog from various sources available through Hive. This included blog hits from the Apache webserver log, tweets, and metadata from

Technical

Analytics with Kibana and Elasticsearch through Hadoop - part 1 - Introduction

I’ve recently started learning more about the tools and technologies that fall under the loose umbrella term of Big Data [http://cdn.meme.am/instances/500x/47510205.jpg], following a lot of the blogs that Mark Rittman has written, including getting Apache log data into Hadoop [https://www.

Technical

Using rlwrap with Apache Hive beeline for improved readline functionality

rlwrap is a nice little wrapper in which you can invoke command-line utilities and get them to behave with full readline [http://www.gnu.org/software/bash/manual/html_node/Readline-Interaction.html#Readline-Interaction] functionality, just like you’d get at the bash prompt. For example, up/down arrow keys to

Technical

Adding Oracle Big Data SQL to ODI12c to Enhance Hive Data Transformations

An updated version of the Oracle BigDataLite VM [http://www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html] came out a couple of weeks ago, and as well as updating the core Cloudera CDH software to the latest release it also included Oracle Big Data SQL [http://www.oracle.com/us/