Data-centric BI trends

I managed to miss Mark Rittman's recent keynote addresses in two locations this week, but I did take a look at the slide deck and the blog post. Mark makes five very sound points on the Oracle BI stack. As most regular readers of this blog might expect, I am probably more data-driven than Mark (not that Mark isn't focused on data, it's just that we have skills that complement each other's); I am truly happy modelling data or making a physical database layout fly. So this piece is a slightly data-centric take on current BI trends.

Why is there a 'B' in BI?

Well, there needn't be. A lot of what we do is for businesses, so the B makes some sense, but organisations that are not businesses often use exactly the same techniques and tools to examine their data. The B could be a generic qualifier for the word intelligence, perhaps a way to distinguish what we are doing from Military Intelligence, Criminal Intelligence, Market Intelligence or any other such label. But this is a somewhat bogus division: BI (or any of the other intelligences) is not just a single discipline, and definitely not a discipline unique to any one of those domains.

BI is about finding information

Storing information in a database is pointless if you do not have a way to access it again, and to access it in a form that suits your business purpose. But the way we need to get at that information varies with purpose:
  • some people have a need to report on the historical; this often requires finding and reading a large amount of data, then sorting and aggregating it
  • others look at historical data in varying amounts of detail, that is, they drill up, down and through the data, and in so doing may exploit pre-built aggregations or OLAP cubes
  • then there are those with an interest in the here and now: the operational reports, perhaps from live (transactional) feeds
  • also in the historical data camp, there are the miners looking for relationships and connections between events
  • finally, others use the past to predict the future, combining current events with a knowledge of past patterns to assign probabilities to outcomes.
But if BI is all of the above, how can we build a single physical model that encompasses it all? The answer is that we probably can't. To an extent, the need to co-locate data to minimise bulk data reads conflicts with the needs of data mining; partitioning historic data flies in the face of live transactional feeds (though it can be done); and how do we make lightweight, speedy predictions that an agent using a CRM system can use in real time?
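
To make the contrast between these access patterns a little more concrete, here is a minimal sketch in Python. The `SALES` rows, the column layout and the `aggregate` helper are all invented for illustration; they are not taken from Mark's material or from any Oracle tool. The sketch only shows that a historical report, a drill-down and an operational lookup ask quite different things of the same facts.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, month, region, amount).
# Invented purely for illustration.
SALES = [
    (2023, 1, "North", 120.0),
    (2023, 1, "South", 80.0),
    (2023, 2, "North", 100.0),
    (2024, 1, "North", 150.0),
    (2024, 1, "South", 50.0),
]


def aggregate(rows, key):
    """Sum the amount column, grouped by whatever key(row) returns."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)


# Historical report: read every row and aggregate to year level.
by_year = aggregate(SALES, key=lambda r: r[0])

# Drill down: the same facts at a finer grain (year, month).
by_year_month = aggregate(SALES, key=lambda r: (r[0], r[1]))

# Operational view: a narrow, selective read of the most recent period.
latest_period = max((r[0], r[1]) for r in SALES)
current = [r for r in SALES if (r[0], r[1]) == latest_period]

print(by_year)
print(by_year_month)
print(current)
```

The first two queries touch every row, so they benefit from data that is stored together and pre-aggregated; the last one touches only a handful of recent rows and wants them available as soon as they arrive. That tension is the one a single physical model struggles to resolve.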