What I’m Working On

If you're a regular reader of this blog you'll probably have noticed that the frequency of postings has gone down a little over the last few months. Part of this is due to our second child coming along, but it's also because I've been working on some more in-depth articles that should surface over the next few months. If you're interested, here's what I'm currently working on:

  • An article taking an in-depth look at accessing analytic workspaces using regular SQL and the OLAP_TABLE function. The driver behind this is that when you access AWs using the OLAP API, the only use you can put dimension member attributes to is filtering down the list of dimension members (for example, returning those customers whose gender is 'Male'), whereas most customers want to use these attributes to build their measure crosstabs - to analyze sales, for example, by customer gender and customer income band. The OLAP API doesn't let you do this, but if you access the AW directly using OLAP_TABLE you get access to attributes in the same way as dimension members, and can build a crosstab based on them. The only problem is getting tools such as Discoverer to recognize the embedded totals within the AW rather than perform their own (GROUP BY-based) aggregation. There's a sketch of an OLAP_TABLE query at the end of this post.
  • A paper I've submitted for the Hotsos event on substituting Analytic Workspaces for Materialized Views. The thinking here is that the things you use MVs for - storing aggregates, and providing a way for queries to use those aggregates transparently - can be done better by AWs. AWs can (potentially) aggregate data faster and more flexibly/programmatically, and you can get the same effect as query rewrite (speeding up queries) using either (a) SQL over OLAP_TABLE or (b) SQL redirected to the AW using query equivalence - there's a sketch of the latter at the end of this post. The interesting bit is that I have no idea whether AWs will actually perform better, but it's an interesting question and it'll be fun to approach the subject with the rigour of a Hotsos Symposium presentation in mind. If it gets picked for the event I'll write the paper soon; otherwise it'll go on the list for later in 2006.
  • A paper I've put forward for the ODTUG Desktop 2006 conference on Oracle OLAP performance tuning, based on the DBAZine articles on this subject I've just had published.
  • An article I'm about half-way through for OTN on using OWB10gR2 for data profiling and automatic data correction. This one will have to wait until the product is launched - hopefully December 2005 - before it's published, so it'll probably be out in the new year.
  • An in-depth look at XML Publisher. It's available for download now from http://edelivery.oracle.com, and I've downloaded it, along with the sample data and report templates, with a view to doing a presentation on the product at the next UKOUG BIRT SIG on January 31st 2006. Once I get it up and running, expect a few articles, with a particular emphasis on addressing the issues - usability, mostly - that currently bedevil Oracle Reports.
  • An article for Oracle Magazine on the 10gR2 LOG ERRORS feature that I've just proposed to them. I was particularly keen to do this as LOG ERRORS is going to revolutionize the way we perform data warehouse ETL: there'll no longer be a need to process potentially dirty data using PL/SQL cursors - you'll just need the LOG ERRORS clause together with INSERT /*+ APPEND */ or MERGE, and data loads should just fly in. There's an example at the end of this post.
  • A paper for the ODTUG Live! 2006 conference on designing and building an OWB10gR2 data warehouse - basically, what sort of architecture you should use when you've got the ability to do both operational (ODS) and analytic (DW) layers, where you should use MOLAP as opposed to ROLAP, how you can now do change data capture and so on. It'll be partly based on our experiences with the beta, and with a bit of luck, if Paris has been out for long enough, I'll include some feedback on our initial OWB "Paris" implementations.
  • An article comparing the characteristics of a relational and a multidimensional OLAP cube - the time they take to roll up, how robust they are (i.e. how often the metadata falls apart), how fast they are to query and so on - and looking at the point where one architecture overtakes the other with regard to cube size, number of concurrent queries, the nature of the queries and so on.
  • And last but not least - a book proposal that I'm working on with someone you all probably know (well, at least within the Oracle data warehousing area) on Oracle data warehousing best practices - I'm doing the sections on the OLAP server and on ETL tools. More on this at a later date.
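
For the curious, here's roughly what the OLAP_TABLE approach in the first article looks like. This is a minimal sketch rather than working code: the AW name (SALESAW), the AW objects (CUSTOMER, CUSTOMER_PARENTREL, CUSTOMER_GENDER, SALES) and the limit map are all made up for illustration, and the exact limit-map syntax you'll need depends on how your own AW is structured.

    -- Hypothetical AW called SALESAW, with a SALES measure dimensioned
    -- by CUSTOMER and a CUSTOMER_GENDER attribute. The limit map (the
    -- fourth argument) exposes the attribute as an ordinary column,
    -- which is what lets you crosstab on it.
    SELECT customer, gender, sales
    FROM   TABLE(OLAP_TABLE(
             'salesaw DURATION SESSION',
             '',   -- table type; left blank so it's derived from the limit map
             '',   -- OLAP DML command to run first; none needed here
             'DIMENSION customer FROM customer
                WITH HIERARCHY customer_parentrel
                ATTRIBUTE gender FROM customer_gender
              MEASURE sales FROM sales'))
    WHERE  gender = 'Male';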
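
And here's the query equivalence route from the Hotsos paper, using the 10g DBMS_ADVANCED_REWRITE package. Again, a sketch under assumptions: SALES and TIMES are hypothetical relational tables, and SALES_AW_VIEW is a hypothetical relational view built over OLAP_TABLE. The point is just that a GROUP BY query against the detail tables gets transparently redirected to the AW's pre-computed totals, much as query rewrite redirects it to an MV.

    -- Declare that the aggregate query and the AW-backed view return
    -- equivalent results, so the optimizer can rewrite one to the other.
    BEGIN
      SYS.DBMS_ADVANCED_REWRITE.DECLARE_REWRITE_EQUIVALENCE(
        name             => 'SALES_TO_AW',
        source_stmt      => 'SELECT t.calendar_month, SUM(s.amount_sold)
                             FROM   sales s, times t
                             WHERE  s.time_id = t.time_id
                             GROUP  BY t.calendar_month',
        destination_stmt => 'SELECT calendar_month, amount_sold
                             FROM   sales_aw_view',
        validate         => FALSE,  -- don't check the two match at declare time
        rewrite_mode     => 'GENERAL');
    END;
    /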
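
Finally, the LOG ERRORS pattern I'll be writing up for Oracle Magazine. The table names here (SALES, SALES_STAGING) are just for illustration, but the mechanism is standard 10gR2: DBMS_ERRLOG creates the error-logging table, and the LOG ERRORS clause diverts bad rows into it instead of failing the whole direct-path load.

    -- One-off: create the default ERR$_SALES error-logging table.
    BEGIN
      DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'SALES');
    END;
    /

    -- The load itself: dirty rows land in ERR$_SALES, tagged with a
    -- label for this run, and the direct-path insert carries on.
    INSERT /*+ APPEND */ INTO sales
    SELECT *
    FROM   sales_staging
    LOG ERRORS INTO err$_sales ('nightly_load')
    REJECT LIMIT UNLIMITED;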