Bind Variables Explained

If you’ve been developing applications on Oracle for a while, you’ve no doubt come across the concept of ‘Bind Variables’. Bind variables are one of those Oracle concepts that experts such as Tom Kyte and Jonathan Lewis frequently cite as being key to application performance, but it’s often not all that easy to pin down exactly what they are and how you need to alter your programming style to use them. With this in mind, I’ve tried to pull together the key information about bind variables and why they are a ‘good thing’ when building Oracle applications.To understand bind variables, consider an application that generates thousands of SELECT statements against a table; for example,
SELECT first_name, last_name, postcode FROM customers WHERE id = 674; SELECT first_name, last_name, postcode FROM customers WHERE id = 234; SELECT first_name, last_name, postcode FROM customers WHERE id = 332;
Each time the query is submitted, Oracle first checks in the shared pool to see whether this statement has been submitted before. If it has, the execution plan that this statement previously used is retrieved, and the SQL is executed. If the statement cannot be found in the shared pool, Oracle has to go through the process of parsing the statement, working out the various execution paths and coming up with an optimal access plan before it can be executed. This process is know as a 'hard parse' and for OLTP applications can actually take longer to carry out that the DML instruction itself.

When looking for a matching statement in the shared pool, only statements that exactly match the text of the statements are considered; so, if every SQL statement you submit is unique (in that the predicate changes each time, from 'id = 674' to 'id=234' and so on) then you'll never get a match, and every statement you submit will need to be hard parsed. Hard parsing is very CPU intensive, and involves obtaining latches on key shared memory areas, which whilst it might not affect a single program running against a small set of data, can bring a multi-user system to it's knees if hundreds of copies of the program are trying to hard parse statements at the same time. The extra bonus with this problem is that contention caused by hard parsing is pretty much immune to measures such as increasing available memory, numbers of processors and so on, as hard parsing statements is one thing Oracle can't do concurrently with many other operations, and it's a problem that often only comes to light when trying to scale up a development system from a single user working on subset of records to many hundreds of users working on a full data set.

The way to get Oracle to reuse the execution plans for these statements is to use bind variables. Bind variables are 'substituion' variables that are used in place of literals (such as 674, 234, 332) and that have the effect of sending exactly the same SQL to Oracle every time the query is executed. For example, in our application, we would just submit

SELECT first_name, last_name, postcode FROM customers WHERE id = :cust_no; SELECT first_name, last_name, postcode FROM customers WHERE id = :cust_no;
and this time we would be able to reuse the execution plan every time, reducing the latch activity in the SGA, and therefore the total CPU activity, which has the effect of allowing our application to scale up to many users on a large dataset.

If you're like me, you probably new this bit already; you probably came across bind variables (a.k.a. 'substitution variables) when first learning SQLPlus. For example, using SQLPlus, you might have issued the first set of statements as thus:

SQL> SELECT first_name, last_name, postcode FROM customers WHERE id = 674;

FIRST_NAME LAST_NAME POSTCODE


RITTMAN MARK BN1 1NH

SQL> SELECT first_name, last_name, postcode FROM customers WHERE id = 674;

FIRST_NAME LAST_NAME POSTCODE


BROWN SONIA BN2 3WF


If you then wanted to change this to use bind variables, you could have done the following instead

SQL> variable cust_no number SQL> exec :cust_no := 674 PL/SQL procedure successfully completed.

SQL> SELECT first_name, last_name, postcode FROM customers WHERE id = :cust_no;

FIRST_NAME LAST_NAME POSTCODE


RITTMAN MARK BN1 1NH

SQL> exec :cust_no := 234;
PL/SQL procedure successfully completed.

SQL> SELECT first_name, last_name, postcode FROM customers WHERE id = :cust_no;

FIRST_NAME LAST_NAME POSTCODE


BROWN SONIA BN2 3WF


What we've done to the SELECT statement now is take the literal value out of it, and replace it with a placeholder (our bind variable), with SQLPlus passing the value of the bind variable to Oracle when the statement is processed. This bit is fairly straighforward (you declare a bind variable in SQLPlus, then reference the bind variable in the SELECT statement) but the confusing bit to me was, how does this affect my PL/SQL programs. In addition, putting PL/SQL aside for a moment, how does it affect programs written in Java, or VB, or .Net or whatever, that access Oracle databases?

Taking PL/SQL first of all, the good news is that PL/SQL itself takes care of most of the issues to do with bind variables, to the point where most code that you write already uses bind variables without you knowing. Take, for example, the following bit of PL/SQL:

SQL> ed Wrote file afiedt.buf

1 create or replace procedure double_salary(p_empno in number)
2 as
3 begin
4 update emp
5 set sal=sal2
6 where empno = p_empno;
7 commit;
8
end;
SQL> /


Now you might be thinking that you've got to replace the p_empno with a bind variable. However, the good news is that every reference to a PL/SQL variable is in fact a bind variable - this is just the way that PL/SQL works.

In addition, if your PL/SQL block uses a literal instead of a variable:

3 begin 4 update emp 5 set sal=sal*2 6 where empno = 500;
this doesn't need converting to a bind variable, as the SQL being submitted is actually static - once the PL/SQL is compiled, the value of the predicate will never change - meaning that the SQL will qualify for execution plan sharing. In fact, replacing the literal with a bind variable might actually degrade performance, as the execution plan won't have access to the predicate value when working out whether to use an index vs. perform a full table scan.

In fact, the only time you need to consciously decide to use bind variables when working with PL/SQL is when using the tactical nuclear option of PL/SQL, 'Dynamic SQL'.

Dynamic SQL, introduced in its current form with Oracle 8i, allows you to execute a string containing SQL using the 'EXECUTE IMMEDIATE' command. For example;

execute immediate 'update emp set sal = sal*2 where empno = '||p_empno ;
would always require a hard parse when it is submitted, as the actual SQL submitted by the application would look like:
update emp set sal = sal*2 where empno = 500; update emp set sal = sal*2 where empno = 533;
and so on. The way to use bind variables instead is to change the EXECUTE IMMEDIATE command as follows:
execute immediate 'update emp set sal = sal*2 where empno = :x1' using p_empno;
And that's all there is to it. One thing to bear in mind, though, is that you can't substitute actual object names (tables, views, columns etc) with bind variables - you can only subsitute literals. If the object name is generated at runtime, you'll still need to string concatenate these parts, and the SQL will only match with those already in the shared pool when the same object name comes up. However, whenever you're using dynamic SQL to build up the predicate part of a statement, use bind variables instead and you'll reduce dramatically the amount of latch contention going on.

So for PL/SQL, things are actually quite straightforward. Normal PL/SQL variables are infact bind variables, and literal values 'hard coded' into the package code are static anyway, and won't benefit from turning into bind variables. The only time you need to specifically use bind variables is when you're putting together dynamic SQL.

The next question I had, though, was what about VB, Java and other applications that fire SQL queries against an Oracle database. How do these use bind variables? Do you have to in fact split your SQL into two statements, one to set the bind variable, and one for the statement itself?

In fact, the answer to this is actually quite simple. When you put together an SQL statement using Java, or VB, or whatever, you usually use an API for accessing the database; ADO in the case of VB, JDBC in the case of Java. All of these APIs have built-in support for bind variables, and it's just a case of using this support rather than just concatenating a string yourself and submitting it to the database.

For example, Java has PreparedStatement, which allows the use of bind variables, and Statement, which uses the string concatenation approach. If you use the method that supports bind variables, the API itself passes the bind variable value to Oracle at runtime, and you just submit your SQL statement as normal. There's no need to separately pass the bind variable value to Oracle, and actually no additional work on your part. Support for bind variables isn't just limited to Oracle - it's common to other RDBMS platforms such as Microsoft SQL Server, so there's no excuse for not using them just because they might be an Oracle-only feature.

UPDATE 10/7/04: There's been a useful discussion on comp.databases.oracle.server around this article, and whether the above two paragraphs imply that bind variables are 'automatically taken care of' if you use languages such as VB or Java.

According to Jim Kennedy,

"I think the author means well, but he is wrong in that you don't have to doanything to use them. You do.  In Java's prepared statements you have to setthe parameters. Other languages are similar.  It is true that it is notdifficult.  He should have given a VB or Java example to clarify."
with Ed Stevens (who originally picked up on the article) adding
"I think if I were editing the article, I'd eliminate statements that say or imply that the programming environment (whatever it is) "takes care of it for you." It would be more accurate to say that the programming environment "supports the use . . . .". (And I'm not aware of any that don't). The Big "IF" (tm) is "IF you write your code properly" and don't assume they are writing it properly. Imagine a freshly-minted programmer who thinks the world revolves around Java and databases are an evil imposed on him by 'unenlightened' dinasaurs - "I can do it all in my java code, thank you very much . . . ".
Reading the two paragraphs through again, I'd still stand by my assertion that PL/SQL 'takes care of it for you', as normal PL/SQL variables are in fact bind variables, and you have to pretty much go out of your way to not use them, i.e. by using dynamic PL/SQL. However I take Jim's and Ed's points about the VB and Java statements being a bit ambiguous, and therefore have added this update to add their feedback. In a nutshell - VB and Java (and other languages) *support* bind variables, but you have to make sure you specifically use them, as they're not automatically used when accessing the database. Now, back to the article...
Lastly, it's worth bearing in mind that there are some instances where bind variables are probably not appropriate, usually where instead of your query being executed many times a second (as with OLTP systems) your query in fact actually takes several seconds, or minutes, or hours to execute - a situation you get in decision support and data warehousing. In this instance, the time taken to hard parse your query is only a small proportion of the total query execution time, and the benefit of avoiding a hard parse is probably outweighed by the reduction in important information you're making available to the query optimizer - by substituting the actual predicate with a bind variable, you're removing the ability for the optimiser to compare your value with the data distribution in the column, which might make it opt for a full table scan or an index when this isn't appropriate. Oracle 10i helps deal with this using a feature known as bind variable peeking, which allows Oracle to look at the value behind a bind variable to help choose the best execution plan.

Another potential drawback with bind variables and data warehousing queries is that the use of bind variables disallows the potential for star transformations, taking away this powerful option for efficiently joining fact and dimension tables in a star schema.

For most of this information, I've got Tom Kyte to thank - he's a constant advocate of using bind variables; indeed, he's on record as saying that they're the single most important factor in building scalable applications. If you want to read more about bind variables, a good place to look would be;

as well as Tom Kyte's books, "Effective Oracle By Design" (chapter 5) and "Expert One-On-One Oracle" (chapter 10).