March 1st, 2013 by Jon Mead
Over the last 18 months we have seen a huge number of changes in the world of Oracle Business Intelligence and Analytics.
Everyone wants to talk about Big Data, but few know where to start in a corporate setting. Oracle bought Endeca, a reporting and information discovery tool. Cloud-delivered services are moving closer and closer, and core reporting systems are still fundamental to an organisation’s existence. To make sense of this I wanted to relate it to the Oracle Business Intelligence world as it was when I first started working in this industry, and to look at how the design process and its resultant models are changing.
There used to be a fairly straightforward graphic that was used to explain Business Intelligence.
The bottom of the pyramid was the data in the organisation. A transformation layer then turned it into information, which was consumed by end users as knowledge that could be used in their decision-making process.
For me this has now been replaced by the following graphic.
This model still starts with data, however the source, volume, variety and velocity (deliberately borrowed from standard Big Data definitions) of this data have increased. The organisation now looks at more internal sources of data, plus external sources such as social media and third-party market research.
The biggest change is the transformation layer. Depending on the source of data and the questions that are being asked of the organisation there are now two different approaches: schema on write and schema on read.
Schema on write
This is the traditional approach for Business Intelligence. A model, often dimensional, is built as part of the design process. This model is an abstraction of the complexity of the underlying systems, put in business terms. The purpose of the model is to allow the business users to interrogate the data in a way they understand.
The model is instantiated through physical database tables, and the data is loaded through an ETL (extract, transform and load) process that takes data from one or more source systems, transforms it to fit the model, then loads it into the model.
The key thing is that the model is determined before the data is finally written and the users are very much guided or driven by the model in how they query the data and what results they can get from the system. The designer must anticipate the queries and requests in advance of the user asking the questions.
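As a rough illustration of schema on write, the sketch below defines a tiny dimensional model first and only then transforms and loads source rows into it. The table names (dim_product, fact_sales), the sample rows and the helper functions are all assumptions made up for this example, and SQLite stands in for whatever warehouse database would really be used.

```python
import sqlite3

# Hypothetical source rows, as they might arrive from an operational system.
source_rows = [
    {"sku": "A1", "name": "Widget", "qty": "3", "sold_on": "2013-02-01"},
    {"sku": "A1", "name": "Widget", "qty": "2", "sold_on": "2013-02-02"},
    {"sku": "B7", "name": "Gadget", "qty": "5", "sold_on": "2013-02-02"},
]


def load_sales(rows):
    """Create the model first, then transform each row to fit it and load it."""
    conn = sqlite3.connect(":memory:")
    # The schema is fixed here, before any data is written.
    conn.execute(
        "CREATE TABLE dim_product ("
        "product_key INTEGER PRIMARY KEY, sku TEXT UNIQUE, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE fact_sales ("
        "product_key INTEGER REFERENCES dim_product, sold_on TEXT, units INTEGER)"
    )
    for row in rows:
        # Transform: look up (or create) the dimension key, cast qty to an integer.
        conn.execute(
            "INSERT OR IGNORE INTO dim_product (sku, name) VALUES (?, ?)",
            (row["sku"], row["name"]),
        )
        (product_key,) = conn.execute(
            "SELECT product_key FROM dim_product WHERE sku = ?", (row["sku"],)
        ).fetchone()
        # Load: write the row into the fact table in the shape the model demands.
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?)",
            (product_key, row["sold_on"], int(row["qty"])),
        )
    return conn


def units_sold_by_product(conn):
    """A question the designer anticipated: units sold per product."""
    return conn.execute(
        "SELECT p.name, SUM(f.units) FROM fact_sales f "
        "JOIN dim_product p ON p.product_key = f.product_key "
        "GROUP BY p.name ORDER BY p.name"
    ).fetchall()


conn = load_sales(source_rows)
print(units_sold_by_product(conn))
```

The query is easy to write because the model was designed for it. A question the model did not anticipate would mean changing the tables and the load process.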
Schema on read
Schema on read works on a different principle and is more common in the Big Data world. The data is not transformed in any way when it is stored; the data store acts as a big bucket.
The modelling of the data only occurs when the data is read. Map/Reduce is the clearest example: the mapping step is where the structure of the data is understood. Hadoop provides a large distributed file system (HDFS), which is very good at storing large volumes of data, but stored data is only potential. It is the mapping of this data that provides value, and this is done when the data is read, not written.
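To make the idea concrete, here is a minimal single-process sketch of schema on read in the Map/Reduce style. It is not Hadoop code: the raw_store list stands in for files in a distributed file system, and the record layout and function names (parse, map_sales, map_mentions, reduce_sum, run) are assumptions invented for this illustration.

```python
import json
from collections import defaultdict

# Raw records kept exactly as they arrived; nothing is validated or reshaped on write.
raw_store = [
    '{"type": "sale", "sku": "A1", "qty": 3}',
    '{"type": "tweet", "text": "love my new widget", "sku": "A1"}',
    '{"type": "sale", "sku": "B7", "qty": 5}',
    "not even json",
    '{"type": "sale", "sku": "A1", "qty": 2}',
]


def parse(line):
    """Try to read a line as a JSON object; return None if it is anything else."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    return record if isinstance(record, dict) else None


def map_sales(line):
    """One schema applied at read time: emit (sku, units) for sale records."""
    record = parse(line)
    if record and record.get("type") == "sale":
        yield record["sku"], int(record["qty"])


def map_mentions(line):
    """A different schema over the same bytes: emit (sku, 1) for each tweet."""
    record = parse(line)
    if record and record.get("type") == "tweet":
        yield record["sku"], 1


def reduce_sum(pairs):
    """Sum the mapped values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run(mapper, lines):
    """Apply a mapper to every raw line, then reduce the emitted pairs."""
    return reduce_sum(pair for line in lines for pair in mapper(line))


print(run(map_sales, raw_store))
print(run(map_mentions, raw_store))
```

The same stored lines answer two different questions, and the only thing that changes is the mapper chosen at read time. Lines that do not fit a given mapping are simply skipped rather than rejected when they were stored.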
New World Order
So whereas Business Intelligence used to always be driven by the model, the ETL process to populate the model and the reporting tool to query the model, there is now an approach where the data is collected in its raw form, and advanced statistical or analytical tools are used to interrogate the data. An example of one such tool is R.
The choice of approach is often driven by what the user wants to find out. If the question is clearly formed and the sources of data required to answer it are well understood, for example how many units of a product have we sold, then the traditional schema on write approach is best.
If the question is more open, for example what is causing our sales of a product to drop, why are customers churning, or even the clichéd unknown unknowns, then schema on read is more appropriate.
Decisions, Decisions, Decisions
Whichever approach is taken the end result is that the user, or business, wants to make a decision, or take some action based on making sense of some data. Organisations are becoming increasingly data driven, and despite the evolution of Business Intelligence, the ability of an organisation to derive value from data will be key to its success.