News and Updates from Oracle Openworld 2014

October 4th, 2014 by

It’s the Saturday after Oracle Openworld 2014, and I’m now home from San Francisco and back in the UK. It’s been a great week as usual, with lots of product announcements and updates to the BI, DW and Big Data products we use on current projects. Here’s my take on what was announced this last week.

New Products Announced

From a BI and DW perspective, the most significant product announcements were around Hadoop and Big Data. Up to this point most parts of an analytics-focused big data project required you to code the solution yourself, with the diagram below showing the typical three steps in a big data project – data ingestion, analysis and sharing the results.

NewImage

At the moment, all of these steps are typically performed from the command-line using languages such as Python, R, Pig, Hive and so on, with tools like Apache Flume and Apache Sqoop used to bring data into and out of the Hadoop cluster. Under the covers, these tools use technologies such as MapReduce or Spark to do their work, automatically running jobs in parallel across the cluster and making use of the easy scalability of Hadoop and NoSQL databases.

You can also neatly divide the work up on a big data project into two phases; the “discovery” phase typically performed by a data scientist where data is loaded, analysed, correlated and otherwise “understood” to provide the initial insights, and then an “exploitation” phase where we apply governance, provide the output data in a format usable by BI tools and otherwise share the results with the wider corporate audience. The updated Information Management Reference Architecture we collaborated on with Oracle and launched by in June this year had distinct discovery and exploitation phases, and the architecture itself made a clear distinction between the Innovation part that enabled the discovery phase of a project and the Execution part that delivered the insights and data in a more governed, production setting.

NewImage

This was the theme of the product announcements around analytics, BI, data warehousing and big data during Openworld 2014, with Oracle’s Omri Traub in the photo below taking us through Oracle’s big data product strategy. What Oracle are doing here is productising and “democratising” big data, putting it clearly in context of their existing database, engineered systems and BI products and linking them all together into an overall information management architecture and delivery process.

NewImage

So working through from ingestion through to data analysis, these steps have typically been performed by data scientists using scripting tools and rudimentary data visualisation engines, making them labour-intensive and reliant on a small set of people conversant with these tools and process. Oracle Big Data Discovery is aimed squarely at these steps, and combines Apache Spark-based data preparation and transformation capabilities with an analysis and visualisation engine based on Endeca Server.

NewImage

Key features of Big Data Discovery include:

  • Ability to analyse, parse, explore and “wrangle” data using graphical tools and a Spark-based transformation engine
  • Create a catalog of the data on your Hadoop cluster, and then search that catalog using Endeca Server search technologies
  • Create recommendations of other datasets that might interest you, based on what you’re looking at now
  • Visualize your datasets to help understand what they contain, and discover new insights

Under the covers it comprises two parts; the data loading, transformation and profiling part that uses Apache Spark to do its work in parallel across all the nodes in the cluster, and the analysis part, which takes data prepared by Apache Spark and loads into the Endeca Server in-memory engine to perform the analysis, aggregation and data visualisation. Unlike the Spark part the Endeca server element runs just on one node and limits the size of the analysis dataset to what can run in-memory in the Endeca Server engine, but in practice you’re going to work with a sample of the data rather than the entire dataset at that stage (in time the assumption is that the Endeca Server engine will be unbundled and run natively on YARN, giving it the same scalability as the Spark-based data ingestion and transformation part). Initially Big Data Discovery will run on-premise with a cloud version later on, and it’s not dependent on Big Data Appliance – expect to see something later this year / early next year.

Another new product that addresses the discovery phase and discovery lab part of a big data project is Oracle Data Enrichment Cloud Service, from the Oracle Data Integration team and designed to complement ODI and Oracle EDQ. Whilst Oracle positioned ODECS as something you’d use as well as Big Data Discovery and typically upstream from BDD, to me there seemed to be a fair bit of overlap between the products, with both tools doing data profiling and transformation but BDD being more focused on the exploration and discovery part, and ODECS being more focused on early-stage data profiling and transformation.

NewImage

ODECS is clearly more of an ETL tool complement and runs natively in the cloud, right from the start. It’s most probably aimed at customers with their Hadoop dataset already in the cloud, maybe using Amazon Elastic MapReduce or Oracle’s new Hadoop-as-a-Service and has more in common with the old Data Quality Option for Oracle Warehouse Builder than Endeca’s search-first analytic interface. It’s got a very nice interface including a mobile-enabled website and the ability to include and merge in external datasets, including Oracle’s own Data as a Service platform offering. Along with the new Metadata Management tool Oracle also launched at Openworld it’s a great addition to the Oracle Data Integration product suite, but I can’t help thinking that its initial availability only on Oracle’s public cloud platform is going to limit its use with Oracle’s typical customers – we’ll have to just wait and see.

The other major product that addresses big data projects was Oracle Big Data SQL. Partly addressing the discovery phase of big data projects but mostly (to my mind) addressing the exploitation phase, and the execution part of the information management architecture, Big Data SQL gives Oracle Exadata the ability to return data from Hive and NoSQL on the Big Data Appliance as well as data from its normal relational store. I covered Big Data SQL on the blog a few weeks ago and I’ll be posting some more in-depth articles on it next week, but the other main technical innovation with the product is its bringing of Exadata’s SmartScan feature to Hadoop, projecting and filtering data at the Hadoop storage node level and also giving Hadoop the ability to understand regular Oracle SQL, rather than the cut-down version you get with HiveQL.

NewImage

Where this then leaves us is with the ability to do most of a big data project using (Oracle) tools, bringing big data analysis within reach of organisations with Oracle-style budgets but without access to rare data scientist-type resources. Going back to my diagram earlier, a post-OOW big data project using the new products launched in this last week could look something like this:

NewImage

Big Data SQL is out now and depends on BDA and Exadata for its use; Big Data Discovery should be out in a few months time, runs on-premise but doesn’t require BDA, whilst ODECS is cloud-only and runs on a BDA in the background. Expect more news and more integration/alignment from the products as 2014 ends and 2015 starts, and we’re looking forward to using them on Oracle-centric Hadoop projects in the near future. 

Product Updates for BI, Data Integration, Exalytics, BI Applications and OBIEE

Other news announced over the week for products we more commonly use on projects include:

Finally, something that we were particularly pleased to see was the updated Oracle Information Management Architecture I mentioned earlier referenced in most of the analytics sessions, with Oracle’s Balaji Yelamanchili for example introducing it in his big data and business analytics general session mid-way through the week. 

NewImage
 

We love the way this brings together the big data components and puts them in the context of the wider data warehouse and analytic processes, and compared to a few years ago when Hadoop and big data was considered completely separate to data warehousing and BI and done by staff completely different to the core business analytics team, this new reference architecture puts it squarely within the world of BI and analytics we work in. It also emphasises the new abilities Hadoop, NoSQL databases and big data can bring us – support for wider sets of data sources with dynamic schemas, the ability to economically work with and analyse much larger datasets, and support discovery-type upfront analysis work. Finally, it recognises that to get true value out of analysis you start on Hadoop, you eventually need to add proper data governance, make the results more widely available using full SQL tools, and use the right tools – relational databases, OLAP servers and the like – to analyse the data once its in a more structured form. 

If you missed our write-up on the updated Information Management Reference Architecture you can can read our two-part blog post here and here, read the Oracle white paper, or listen to the podcast with OTN Archbeat’s Bob Rhubart. For now though I’m looking forward to seeing the family after a week and a half away in San Francisco – thanks to OTN and the Oracle ACE Director Program for sponsoring my visit over to SF for Openworld, and we’ll post our conference presentation slides later next week when we’re back in the UK and US offices.

EPM and BI Meetup at Next Week’s Openworld (and details of our Oracle DI Speakeasy)

September 26th, 2014 by

Just a short note to help publicise the Oracle Openworld 2014 EPM and BI Meetup that’s running next week, organised by Cameron Lackpour and Tim Tow from the ODTUG board.

This is an excellent opportunity for EPM and BI developers and customers to get together and network over drinks and food, and chat with members of the ODTUG board and maybe some of the EPM and BI product management team. It’s running at Piattini, located at 2331 Mission St. (between 19th St & 20th St), San Francisco, CA 94110 from 7pm to late and there’s more details at this blog post by Cameron. The turnout should be pretty good, and if you’re an EPM or BI developer looking to meet up with others in your area this is a great opportunity to do so. Attendance is free and you just need to register using this form.

Similarly, if you’re into data warehousing and data integration you might be interested in our Rittman Mead / Oracle Data Integration’s Speakeasy event, running on the same evening (Tuesday September 30th 2014) from 7pm – 9pm at Local Edition, 691 Market St, San Francisco, CA. Aimed at ODI, OWB and data integration developers and customers and featuring members of the Rittman Mead team and Oracle’s Data Integration product team, again this is a great opportunity to meet with your peers and share stories and experiences. Registration is free and done through this registration form, with spaces still open at the time of posting.

Introduction to Oracle BI Cloud Service : Service Administration

September 26th, 2014 by

Earlier in the week we’ve looked at the developer features within Oracle BI Cloud Service (BICS), aimed at departmental users who want the power of OBIEE 11g without the need to stand-up their own infrastructure. We looked at the process of uploading spreadsheets and other data to the Oracle Database Schema Service that accompanies BICS, how you create the BI Repository that translates the tables and columns you upload into measures, attributes and hierarchies, and then took a brief look at how dashboards and reports are created and then shared with other users in your department. If you’re coming in late, here’s the links to the previous posts in the series:

One of the design goals for BICS was to reduce the amount of administration work an end-user has to perform, and to simplify and consolidate any tasks that they do have to do. Behind the scenes BICS actually comprises a BI environment, and a database environment, with most of the administration work being concerned with the BI one. Let’s start by looking at the service administration page that you see when you first log into the BICS environment as an administrator, with the screenshot below showing the overview page for the overall service.

NewImage

Oracle BI Cloud Service is part of Oracle’s overall Oracle Platform-as-a-Service (PaaS) offering, with BICS being made up of a database service and a BI service. The screenshot above shows the overall availability of these two services over the past two weeks, and you click on either the database service or the BI service to drill into more detail. Let’s click on the BI service first.

NewImage

The BI service dashboard page shows the same availability statuses again, along with a few graphs to show usage over that period. Also on this page are details of the start and end date for the service contract, details of the SFTP user account you’ll need to for some import/archive operations, and a link to Presentation Services for this instance, to launch the OBIEE Home Page.

The OBIEE home page, as we saw in previous posts in this series, has menu items for model editing, data uploading and creating reports and dashboards. What it also has though is a Manage menu item, as shown in the screenshot below, that takes you through to an administration function that lets you set up application roles and backup/restore the system.

NewImage

Application roles are the way that OBIEE groups permissions and privileges and then assigns them to sets of users. With on-premise OBIEE the only way to manage application roles is through Enterprise Manager Fusion Middleware Control, but with BICS this functionality has been moved into OBIEE proper so that non-system administrators can perform this task. The list of users you work with are the ones defined for your service (tenancy) and using this tool you can assign them to existing application roles, create new ones, or group one set of roles within another. Users themselves are created as part of the instance creation process, with the minimum (license) number of users for an instance being 10.

NewImage

The Snapshots tab on this same Service Console page provides access to a new, system-wide snapshot and restore function that provides the means to version your system, restore it from a backup and transport a dev/test environment to your production instance. As I mentioned in previous postings in the series, each tenant for BICS comes with two instances, once for dev/test and one for prod, and the snapshot facility gives you a means to copy everything from one environment into another, for when you’ve completed development and testing and want to put your dashboards into production.

NewImage

Taking a snapshot, as shown in the screenshot above, creates an archive file containing your RPD, the catalog and all the security settings, and you can store a number of snapshots within each environments, giving you a (very coarse-grained) versioning ability. What you can also do is download these snapshots as what are called “BI Archive” files as shown in the screenshot below, and its these archive files that you can then upload into your other instance to give you your code promotion process – note however that applying an archive file overwrites everything that was there before, so you’ll need to be careful doing this when users start creating reports in your production environment – really, it’s just a once-only code promotion facility followed then by a way of backing up and restoring your environments.

NewImage

Note also that you’ll separately need to backup and restore any database elements, as these aren’t automatically included in the BI archive process. Backup and restoration of database elements is done via the separate database instance service page shown below, where you can export the whole schema or just parts of it, and then retrieve the export file via an SFTP transfer.

NewImage

So that’s in in terms of BICS administration, and for our initial look at the BI Cloud Service platform. Rittman Mead are of course offering services around BICS and cloud BI in-general so contact us if you’d like to give BICS a spin, and keep an eye on the blog over the next few weeks where we’ll take you through the example BICS application we built, reporting against Salesforce.com data using their REST API.

Introduction to Oracle BI Cloud Service : Building Dashboards & Reports

September 25th, 2014 by

This week we’ve been looking at the new Oracle BI Cloud Service (BICS), the cloud version of OBIEE11g that went GA at the start of this week. Rittman Mead were part of the beta program for BICS and spend a couple of weeks building a sample BICS application to put the product through its paces, creating a reporting application for Salesforce.com that pulled in its data via the Salesforce REST API and staged it in the Oracle Database Schema Service that comes with BICS. Earlier in the week we looked at how data was uploaded or transferred into the accompanying database schema, and yesterday looked at how the repository was created using the new thin-client data modeller. Today, we’ll look at how you create the dashboards and reports that your users will use, using the Analysis and Dashboard Editors that are part of the service. If you’re arriving mid-way through the series, here’s the links to the other posts in the series:

In fact creating analyses and dashboards is the part of BICS that has least changed compared to the on-premise version. In keeping with the “self-service” theme for BICS there’s an introductory set of guidance notes when you first connect to BICS, like this:
 
NewImage
 
and the dashboard and analysis editors are available as menu options on the Home page, along with a link to the Catalog view, like this:
 
NewImage
 
From that point on though it’s standard Answers and Dashboards, with the normal four-tab editor view within Answers (the Analysis Editor) and the ability to create views, calculations, filters and so on. Anyone familiar with Answers will be at home within the cloud version, and there’s a new visualisation – the heat map view, as shown in the final screenshot later in this article – that hints at other visualisations that we’ll see featured first in the cloud version of OBIEE, expected to be updated more frequently than the on-premise version (one of the major selling points for customers looking to adopt new features as soon as possible with OBIEE).
 
NewImage
 
What’s missing from this environment though are features like Agents and alerts, scorecards and BI Publisher, or the ability to create actions other than links to other web pages or catalog content.
 
NewImage
 
These are features that Oracle are saying they’ll add-back in time though as the underlying infrastructure for BICS builds-out, and of course the whole UI is likely to go through a rev with the 12c release of OBIEE due sometime in 2015. Dashboards are also created in the same way as with on-premise OBIEE, with the same Dashboard Editor and access to features like conditional display of sections and support for presentation variables.

NewImage

So, that wraps-up our quick tour around the analysis and dashboard creation parts of Oracle BI Cloud Service; tomorrow, to finish-up the series we’ll look at the administration elements of BICS including new self-service application role provisioning, tools for administering and monitoring the instance and for backing-up and migrating content from one instance to another.

Introduction to Oracle BI Cloud Service : Creating the Repository

September 24th, 2014 by

Earlier in this series we’ve looked at the overall product proposition for Oracle BI Cloud Service (BICS), and how you upload data to the Database Schema Service that comes with it. Today, we’re going to look at what’s involved in creating the BI Repository that holds the metadata about your logical tables, calculations and dimension hierarchies, using the new thin-client data modeller that like the rest of BICS runs entirely within your web browser. For anyone coming into the series mid-way, here’s the links to the other posts in the series:

So anyone familiar with OBIEE will know that a central part of the product, and the part of it that makes it easy for users to work with their data, is the business-orientated semantic model that you create over your source data. Held within what’s called the “BI Repository” and made-up of physical, logical and presentation layers, the semantic model turns what can be a complex set of source tables, joins and cross-application links into a simple to understand set of subject areas made up of fact tables and dimensions. Regular on-premise OBIEE semantic models can get pretty complex, with joins across different database types, logical tables with several different ways you can provide their data – for example, at detail-level from an Oracle data warehouse whilst at summary level, from an Essbase cube, and to edit them you use a dedicated Windows development tool called BI Administration.

Allowing these complex data models, and having a dependency on a Windows-based development tool, poses two main issues for any consumer-style version of OBIEE; first, if the aim of the service is to attract customers who want to create their systems “self-service”, you’ve got to made the repository development process a lot simpler than it currently is – you can’t expect customers to go on a course or buy my excellent book when they just want to get a dashboard up and running with the minimum fuss. You also can’t realistically expect them to install a Windows-only development tool back at the office as most of their target customers won’t have admin privileges on their workstations, or they might even be using Macs or work out of a browser; and then, even if they get it installed you’ll need to ensure there’s a network connection available to the BI Server in the cloud through their corporate firewall. Clearly, a browser-based repository creation tool was needed, ideally one that did some of the basic work automatically for the user and didn’t need hours or days of training to understand. Of course, the risk to this is that you create a repository editing tool that’s too “dumbed-down” for most developers to find useful, and we’ll consider that possibility later in the article.

So following the data upload process that we covered in yesterday’s post, we’re now in a position where we’ve got a number of tables sitting in Oracle Database Schema Service, and we’re ready to build a repository to report against them. To access the thin-client data modeller you click on the Model menu item on the BICS homepage, as shown in the screenshot below.

NewImage

The modeller itself supports a simplified subset of what you can create with the full BI Administration tool. You’ve got a single source, the Oracle Database Schema Service, and a single business model. Business model tables have a logical table source as you’d normally expect, but just the one LTS is currently supported. Calculations within logical tables are supported, but they’re logical-level only (i.e. post-aggregation) with no current support for physical-level (pre-aggregation) at this point.

NewImage

Level-based hierarchies within the business model are supported, including skip-level and ragged ones, and there’s support for time-series dimensions including their own editor.

NewImage

Where possible, introspection is used when creating the business model components, with table joins and matching column names used to create candidate logical joins. Static and dynamic repository variables, along with session variables are supported, with the front-end also supporting presentation and request variables – so all good there.

NewImage

Under the covers, each tenant within BICS has their own RPD and their own catalog, and any edits to the repository that you perform are effectively “online” edits. To make edits to an existing model the developer therefore has to first “lock” the model, make their changes and add their new entries and then validate them, and then either revert the model or publish the changes. 

NewImage

In the background BICS updates the RPD using the metadata web service API for the BI Server, with the RPD it creates the same format as the ones we create on-premise, just with a smaller set of features supported through the thin-client admin tool.

As I mentioned in the first post in the series, each tenant install of BICS comes with two instances; one for development or pre-prod and one for production. To move a completed repository out of one environment into another a new feature called a “BI Archive” is used, a snapshot of your BICS system that includes both the repository, the catalog and any security objects you create. In this first version of BICS each import is total and overwrites everything that was in the instance beforehand, so there’s no incremental import or ability to selectively import just certain objects or certain reports into a new environment, meaning that you’ll lose any reports or dashboards created in production if you subsequently refresh it from dev/pre-prod – something to bear in-mind.

One other thing to be aware of is that there’s no ability to create alias tables or opaque views in the thin-client modeller, so if you want to create additional copies of dimension table for more than one dimension role, or you want to create a table using an arbitrary SELECT statement you’ll need to go into ApEx and create a database view instead – not a huge imposition as ApEx comes with tools for creating these pretty easily, but something that will lead to a more complex database model in-time. The screenshot below shows one such database view then exposed through the thin-client modeller, where you can see the SELECT statement behind it (but not alter or amend it except through ApEx).

NewImage

Finally, the thin-client modeller supports row-level and subject area security, using filters or object permissions to set up manually or create by reference to application roles granted to your users. We’ll look at what’s involved in setting up security and application roles in the final post in this series, where we look at administering your BICS instance.

So, that’s a high-level view of the repository creation process; in tomorrow’s post, we’ll look at what’s involved in creating reports and dashboards.

Website Design & Build: tymedia.co.uk