Indexing the unusual

Monday, March 10th, 2008 by Peter Scott

For many years I had an interest in non-standard indexing and exotic data types, that is, things that weren’t NUMBER or VARCHAR2. In fact, before I came into data warehousing I was involved in indexing free text such as conversation transcripts and narrative reports; some of this was pushing the technology of the […]

Oracle Data Integration Suite

Tuesday, February 5th, 2008 by Peter Scott

Keeping my head down building Oracle 11g OLAP cubes for research and self-education meant that I missed yesterday’s product announcement from Oracle, but with the wonders of blog aggregators (and in particular Beth’s) I spotted a mention on Vincent McBurney’s blog of the newly announced (and available) Oracle Data Integration Suite.
This is one of […]

Virtually there

Wednesday, November 21st, 2007 by Peter Scott

I have a couple of days between assignments so I am spending a little time doing research. Ultimately, there is a piece to write on Oracle 11g OLAP, and some course notes on aspects of data warehousing. But to do this sort of thing I need to get myself set up with a database (Enterprise Edition […]

Looking at the unstructured

Saturday, October 6th, 2007 by Peter Scott

Curt Monash has been talking about text mining a lot this week. He also notes that, from a text point of view, the four preeminent database vendors (data store, not query tool) are Oracle, Microsoft, Teradata and Netezza. This seems to reflect my experiences of what is going on in the BI space as […]

Super models

Wednesday, July 18th, 2007 by Peter Scott

No, not the size-zero (or below) fashion-sticks that pogo across the catwalks of the world enticing the somewhat-larger to buy, but the modeling of all of the information within an organisation in a single unified form. Said like that, it’s simple, but in reality there is a lot of complexity buried away that needs to be […]

More on fraud and money laundering

Saturday, February 24th, 2007 by Peter Scott

Yesterday I mentioned a small analytic project we are doing for a retail customer around money laundering. Joel Gary commented and linked to an article in his local press about the rise of anti-fraud analytic companies in his neck of California.
Money laundering is a big issue in the UK, not necessarily because it is a […]

Thoughts on extremely large databases and searching the unstructured

Monday, January 15th, 2007 by Peter Scott

Nuno Souto posts an interesting set of thoughts on Extremely Large Databases. As usual, it is a well thought through post from someone who is probably scarred for life from actually working with large databases. In a data warehouse (or even a very large transactional system) context the reader is led to an inevitability of […]

Is “Data Warehouse” a future-proof term?

Sunday, August 13th, 2006 by Peter Scott

A while back it was quite feasible to draw circles around discrete databases in an organisation’s IT structure and say ‘this is the data warehouse, here is the billing system and that is the blinkity-boo system’. But now those circles are pretty diffuse. It is harder to differentiate between where document storage diverges from data […]

Big, bad disk.

Saturday, April 22nd, 2006 by Peter Scott

Over on Doug Burns’s blog there is a link to an interesting piece on large disks. Some people would think that the data warehousing community would welcome large disks. But probably for the majority (those of us that use conventional relational databases) this is not the case. An exception may be for those people that […]

Token databases

Monday, October 17th, 2005 by Peter Scott

Mark Rittman recently posted on the subject of column-orientated databases such as Sybase IQ and various SAND Technology Inc products. One aspect Mark mentioned was the use of tokens to store attribute data.
In a previous role (before moving back into the mainstream world of Oracle data warehouses) I worked on 'free-text' information systems and in […]