Tagged

Big Data

A collection of 112 posts

Technical

Connecting OBIEE11g on Windows to a Kerberos-Secured CDH5 Hadoop Cluster using Cloudera HiveServer2 ODBC Drivers

In a few previous posts and magazine articles [http://www.oracle.com/technetwork/issue-archive/2014/14-sep/o54ba-2279189.html] I’ve covered connecting OBIEE11g to a Hadoop cluster [https://www.rittmanmead.com/blog/2014/01/obiee-11-1-1-7-cloudera-hadoop-hiveimpala-part-2-load-data-into-hivehcatalog-analyze-using-impala/], using OBIEE 11.1.1.7 and Cloudera CDH4 and CDH5 as the examples. Things

Technical

OBIEE and ODI on Hadoop : Next-Generation Initiatives To Improve Hive Performance

The other week I posted a three-part series (part 1 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-1-why-mapreduce-is-only-for-batch-processing/], part 2 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-2-introducing-apache-yarn-and-apache-tez/] and part 3 [https://www.rittmanmead.com/blog/2014/12/going-beyond-mapreduce-for-hadoop-etl-pt-3-introducing-apache-spark/]) on going beyond MapReduce for Hadoop-based ETL, where I l

Technical

Analytics with Kibana and Elasticsearch through Hadoop - part 2 - Getting data into Elasticsearch

Introduction: In the first part of this series [https://www.rittmanmead.com/blog/2014/11/analytics-with-kibana-and-elasticsearch-through-hadoop-part-1-introduction/] I described how I made several sets of data relating to the Rittman Mead blog from various sources available through Hive. This included blog hits from the Apache webserver log, tweets, and metadata from

Technical

Analyzing Twitter Data using Datasift, MongoDB and Pig

If you followed our recent postings on the updated Oracle Information Management Reference Architecture [https://www.rittmanmead.com/blog/2014/06/introducing-the-updated-oracle-rittman-mead-information-management-reference-architecture-pt1-information-architecture-and-the-data-factory/], one of the key concepts we talk about is the “data reservoir”. This is a pool of additional data that you can add to your data warehouse, typically

Technical

Introducing the Updated Oracle / Rittman Mead Information Management Reference Architecture Pt2. - Delivering the Data Factory

In my previous post on our updated Oracle Information Management Reference Architecture [https://www.rittmanmead.com/blog/2014/06/introducing-the-updated-oracle-rittman-mead-information-management-reference-architecture-pt1-information-architecture-and-the-data-factory/], jointly developed with Oracle’s Enterprise Architecture team, we went through a conceptual and logical view of the information architecture, introducing new concepts like the Raw Data Reservoir, the Data Factory

Technical

Introducing the Updated Oracle / Rittman Mead Information Management Reference Architecture Pt1. - Information Architecture and the "Data Factory"

One of the things we’re really interested in at Rittman Mead is the architecture of “information management” systems and how these change over time as thinking and product capabilities evolve. In fact, we often collaborate with the Enterprise Architecture team within Oracle, giving input into the architecture designs