Taking a Look at the Oracle Database 12c In-Memory Option

The In-Memory Option for Oracle Database 12c became available a few weeks ago with the database patchset, adding column-store and in-memory capabilities to the Oracle Database. Unlike pure in-memory databases such as Oracle TimesTen, the in-memory option adds an in-memory column-store feature to the regular row-based storage in the Oracle database, creating in-memory copies of selected row-store tables in a compressed column-based storage format, with the whole process being automatic and and enabled by issuing just a couple of commands – to the point where my feedback at the end of beta testing was that it was “almost boring” – said slightly tongue-in-cheek…


But of course adding in-memory capabilities to the Oracle database is anything but boring – whilst TimesTen on Exalytics has given us the ability to store aggregate tables, and “hot data” tables  in a dedicated in-memory database co-located with the BI mid-tier, TimesTen is another set of skills to learn and another product to manage. Moreover, you’ve got to somehow get the data you want to aggregate or otherwise cache out of your data warehouse and into TimesTen, and then you’ve got the even more difficult task of keeping that TimesTen dataset in-sync with your main data warehouse data. Most importantly though, even with the 2TB and 4TB versions of Exalytics many data warehouses won’t fit entirely into memory, so you’re going to end-up with some of your data in TimesTen and some in your data warehouse, and with only the TimesTen dataset benefiting from in-memory speeds.

So what if you could enable the in-memory option on your data warehouse, and then just run it all in-memory there? To see how this worked, I thought it’d be interesting to take the flight delays dataset that comes with the latest OBIEE 11g v406 SampleApp, and copy that across to an Oracle database to test it out with the in-memory option. This dataset comes with a set of dashboard pages and analyses and contains around 6m rows of data, and in SampleApp is accelerated by an aggregate schema, also stored in Oracle, that generally reduces query times down to just a few seconds.


So what I thought would be interesting to check would be first, whether enabling the in-memory option sped-up the dashboards, and second, whether having the in-memory option removed the need for the aggregate tables altogether. I was also interested to see whether putting the aggregate tables into memory improved their access time significantly, as anecdotally I’d heard that there wasn’t much clock-time difference between accessing aggregates in a TimesTen database vs. just creating them in the same data warehouse database as the source data. To set this all up then I created the new database,. exported the BI_AIRLINES and BI_AIRLINES_AGG schemas out of the 12c database that comes with SampleApp v406, and then imported them into the same schema names in the new instance, on the new VM (rather than on the SampleApp VM), and then amended the connection pool details in the SampleApp RPD to point to this new, external Oracle database instance. The database at this point had the following memory allocation:

So I don’t end-up comparing the larger database instance on my VM with the smaller one the airlines data came from on the VM, I created my baseline response time measurements on this new VM and with none of the in-memory features enabled, and ran some of the dashboard pages and clicked-on some of the prompt values – and the response time wasn’t bad, with just the odd analysis that took an excessive time to return. Mostly though, the results for each dashboard came back straight-away, what you’d expect when summary tables have been deployed to speed-up reporting.


Looking at the SQL being generated by the BI Server, you can quickly see why this is the case; the BI Server is using aggregate navigation and actually querying the BI_AIRLINES_AGG schema to return the aggregated results the dashboard, as you can see from one of the SQL statements retrieved from the NQQuery.log file below:

OBIEE will always use these aggregate tables if they’re available in the repository, so to enable me to test the reports with these aggregates disabled I borrowed the technique Robin introduced in his blog on testing aggregate navigation, and added a request variable prompt to the dashboard page that allows me to pass an INACTIVE_SCHEMAS value to the logical SQL queries issued for the dashboard analyses, and therefore tell the BI Server to ignore the aggregate schema and just use the detail-level BI_AIRLINES schema instead.


I therefore test the dashboard again, this time with the BI_AIRLINES_AGG schema disabled; this time, every dashboard took a while to display properly, with every analysis showing the “spinning clock” for twenty, thirty seconds a time. Comparing the response times to the ones a moment ago when aggregates were enabled, and you can see the difference.


So let’s set-up the in-memory option and see how it affects these two scenarios. The in-memory option for Oracle Database 12c is installed by default as part of the core database, but doesn’t start working until you set a value for the INMEMORY_SIZE initialisation parameter – as you can see from the SQL*Plus output below, it’s currently set to zero, effectively disabling the feature:

The memory Oracle allocates to the in-memory option, via this initialisation parameter, comes out of the SGA and therefore before I set the INMEMORY_SIZE parameter, it’d be a good idea to increase the SGA_TARGET value to accommodate the 1G I’m then going to assign to INMEMORY_SIZE. I do both from the SQL*Plus command-prompt and then bounce the database to bring the new values into use:

Now we’re at the point where we can enable the tables for in-memory access. Tables to be copied into memory can either be created like that at the start, or you can ALTER TABLE them afterwards and specify that they go into memory (or you can configure the in-memory settings at the tablespace level, or even specify particular columns or partitions to go into memory). The underlying data still gets stored row-wise on disk, but enabling a table for in-memory access tells the Oracle database to create column-store in-memory representations of the table and direct suitable queries to those copies, all the time in the background keeping the copy in-sync with the row-store base data.

I’ll start by enabling all of the BI_AIRLINES schema for in-memory access, as the biggest pay-off would be if then meant we didn’t need to maintain the aggregate tables. After running a SELECT table_name FROM user_tables to list out the table names, I then run a script to enable all the tables for in-memory access, with the in-memory copies being populated immediately:

Looking at USER_TABLES from this schema, I can see all of the tables I selected now marked for in-memory access, for immediate loading:

And I can track the progress of the tables being copied into memory using the V$IM_SEGMENTS v$ view, like this:

Note that most of the tables went into memory immediately, but one (PERFORMANCE) is taking a while because it’s so big. Also note that not all tables are listed in the v$ view yet, as the database hasn’t got around to adding them into memory yet, or it might choose not to populate them if it feels the memory could be used better elsewhere. You can alter the priority of these in-memory copy processes if you want, and decide whether the copying happens immediately, or when the table is first scanned (accessed).

Running the dashboards again, with the request variable prompt set to disallow the aggregate schema, gave me the response times below – the in-memory queries were much faster than the row-based non in-memory ones, but they weren’t down to the response time of the dashboards right at the start, where all data is stored row-wise but we’ve got aggregate tables to speed up the queries (note I’ve reordered the bars so the non in-memory queries with no aggregate tables are on the left of each series, as the slowest of all approaches)


Taking a look at one of the physical SQL queries for a cross-tab (hierarchical columns) analysis, you can see that in-memory table access is happening:

Indeed, looking at the queries the BI Server is sending to the database it’s not too surprising the in-memory difference in this case wasn’t too dramatic. According to the docs (emphasis mine):

Storing a database object in the IM column store can improve performance significantly for the following types of operations performed on the database object 

  • A query that scans a large number of rows and applies filters that use operators such as the following: =, <, >, and IN
  • A query that selects a small number of columns from a table or materialized view with a large number of columns, such as a query that selects five columns from a table with 100 columns
  • A query that joins a small table to a large table
  • A query that aggregates data

The IM column store does not improve performance for the following types of operations:

  • Queries with complex predicates
  • Queries that select a large number of columns
  • Queries that return a large number of rows
  • Queries with multiple large table joins

and our query certainly has complex predicates, returns a fair few rows, has large and lots of joins etc. Taking a more simple query that you’d likely write yourself if querying a data warehouse, you can see the in-memory table access being used again but a much simpler, an cheaper explain plan:

If I then turn-off the in-memory feature and regenerate the execution plan, you can see without in-memory the plan is around 5x as expensive:

Running the actual queries in this case gives me a wall-time of around 4 seconds for the in-memory version, and 14 seconds when in-memory query is disabled … but the response time isn’t anywhere near the initial run where we had data stored row-wise but with aggregate tables, so let’s finish-off the testing by putting the aggregate table in-memory too, and see if that makes a difference.

And of course, it absolutely flew:


So to conclude from my look at the Oracle Database 12c In-Memory option with OBIEE11g v406 SampleApp, I’d say the following based on my initial tests:

  • For BI-type reporting where you’re typically summarising lots of data, the in-memory option doesn’t remove the need for aggregate tables – you’ll still benefit significantly from having them, in my observation
  • Where the in-memory option does benefit you is when you’re querying the detail-level data – it helps with aggregation but it’s main strength is fast filtering against subsets of columns
  • Some of the more complex SQL queries issued by OBIEE’s BI Server, for example when creating lots of subtotals and totals against a dataset, reduce the effectiveness of the in-memory option – you’ll get the biggest speed improvement, at least at the moment, with queries with simpler predicates and not so complex joins

There might be more to the aggregation story in the end, though. Looking at the Oracle Database 12c In-Memory Option Technical White Paper, the in-memory option should in-fact help with aggregation through a new optimiser transformation called “vector group by”, a transformation that’s likened to a star transformation that uses CPU-efficient algorithms and a multi-dimensional array created on-the-fly in the PGA called an “in-memory accumulator”.


In fact, what we’ve heard is that many of the old Oracle OLAP team have moved over to the in-memory option team and were responsible for this feature, so I’ll be taking a closer look at in-memory aggregation in this new feature over the next few months. In my examples though, I didn’t see any examples of vector group by in the query execution plans, so I’m assuming either conditions weren’t right for it, or like star transformations there’s some combination of setting and query factors that need to be in place before it’ll appear in the execution plan (and queries presumably run that much faster).

For now though – that’s my first run-through of the 12c In-Memory Option in the context of a typical BI workload. Check back later in the year for more in-memory option postings, including hopefully something more on the in-memory aggregation feature.