Oracle "BigDataLite" VM Now Available for Download on OTN

Oracle released a new developer VM for download on OTN yesterday called “bigdatalite” - if you’re interested in big data, Hadoop and some of the SQL-on-Hadoop technologies I’ve been looking at recently on the blog, this is something you’ll want to download as soon as possible and play around with. I’ve had access to an earlier version of this VM back from 2012 because of some development work I did with ODI and these technologies, but up until now there’s not been a publicly downloadable version I could point people to. Now there is, so I just wanted to walk through what’s in it, and how you can start to play around with some of the features.

Once you’ve downloaded the various archive files and imported the VM into VirtualBox, log in as oracle/welcome1 and you’ll see a (strangely militaristic-looking) desktop with links to start an Oracle database, open a browser and so on.

Give the various services a few seconds to start up, and then click on the “Start Here” link on the desktop to open your browser.
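If you’d rather not guess how long “a few seconds” is, a small polling helper can tell you when a service is accepting connections. This is only a minimal sketch: the function name wait_for_port and its timeout values are my own, and the only detail taken from this post is that Cloudera Manager listens on port 7180 on localhost.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Returns True if the port accepted a connection, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (assumes you run this inside the VM):
# wait_for_port("localhost", 7180, timeout=120)
```

A successful TCP connection only shows that something is listening on the port, not that the web UI has finished initialising, so treat a True result as “worth trying the browser now”.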

The getting started page lists out the various products that are installed on the VM, which you can group as:

  • Hadoop and big data products from Cloudera - Cloudera Manager, their equivalent to Enterprise Manager; Cloudera’s distribution of Hadoop (similar to how Red Hat and SuSE distribute their own versions of Linux); and Cloudera Impala and Search, their add-ons to Hadoop that make querying and searching faster
  • Oracle’s Big Data Connectors, a set of technologies that link the Oracle database to Hadoop, allowing you to query Hadoop from Oracle, and load and unload data between the two platforms
  • Oracle Data Integrator 12c, with a couple of Hadoop integration examples pre-created
  • Oracle Database 12c, to use with the Big Data Connectors and ODI
  • Oracle NoSQL Database, a key/value database similar to Apache HBase
  • A bunch of other related Oracle tools such as JDeveloper, SQL Developer, and Oracle’s R Distribution - with RStudio and additional R packages separately installable

So it’s a great place to start playing around with Hadoop in general, a way to get some experience with Impala and Hive if you’re an OBIEE developer, and also a great way to try out the integration pieces between the Oracle Database and Hadoop, including ODI’s capabilities in this area.

If you click on the Cloudera Manager link (http://localhost:7180/cmf/login) you’ll be taken to Cloudera Manager. This web UI allows you to see the state of the various services it manages, including:

  • HDFS (the distributed filesystem that holds the data files that are then typically analysed using Hive and Impala)
  • Hive and Impala (two technologies for issuing SQL-type queries over HDFS files)
  • MapReduce (the core data-processing technology within Hadoop, which splits operations into mapping, shuffling and reducing, i.e. aggregating, data and automatically parallelises the work over nodes in the Hadoop cluster)
  • Sqoop (for loading data into and out of Hadoop from relational databases)
  • Hue (a web UI for all of the above, that we’ll look at in a moment)
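To make the map, shuffle and reduce steps a bit more concrete, here’s a toy word count in plain Python. It’s a single-process sketch of the idea only, not Hadoop’s actual API: the function names are made up for illustration, and there’s none of the parallelisation across nodes that a real cluster gives you.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: aggregate the list of values for a single key."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling word_count(["big data on Hadoop", "SQL on Hadoop"]) counts “on” and “hadoop” twice and everything else once. On a real cluster the map and reduce calls would run on different nodes, with the shuffle moving data between them.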

Hue is the other main web interface you’ll want to look at, and this is more of a developer-focused web app that allows you to create and view HDFS files, create Hive tables and then query them using Hive and Impala.

I covered Hue and the process of uploading files to create Hive tables in two blog posts the other week (linked below), and once you’ve done that you can query the tables from tools such as OBIEE using the 11.1.1.7 release’s Hive connectivity:

If you’re more from the database side, there are some tutorials available on the Big Data Connectors and so forth - there don’t appear to be any separate tutorials for ODI though, so you’ll need to “reverse-engineer” the two examples in ODI Studio to work through how they’ve been created. I’ll try and do this soon and post it on the blog, if anyone’s interested.

Anyway, the VM is downloadable now with supporting materials available on OTN here. I’ve added some links below to earlier posts on our blog that might be of interest to you if you’re looking to try OBIEE and ODI with this platform: