Ralph Kimball On The 38 Subsystems Of ETL

Ralph Kimball, "The 38 Subsystems Of ETL" : "The extract-transform-load (ETL) system, or more informally, the "back room," is often estimated to consume 70 percent of the time and effort of building a data warehouse. But there hasn't been enough careful thinking about just why the ETL system is so complex and resource intensive. Everyone understands the three letters: You get the data out of its original source location (E), you do something to it (T), and then you load it (L) into a final set of tables for the users to query ... I have spent the last 18 months intensively studying ETL practices and ETL products. I have identified a list of 38 subsystems that are needed in almost every data warehouse back room. That's the bad news. No wonder the ETL system takes such a large fraction of the data warehouse resources. But the good news is that if you study the list, you'll recognize almost all of them, and you'll be on the way to leveraging your experience in each of these subsystems as you build successive data warehouses."