An Explanation for Higher Than Expected I/O on Hot Tables: Concurrency and Multiversioning

Another situation where it is vital that you understand read consistency and multiversioning is when you are faced with a query that, in production under heavy load, uses many more I/Os than you observe in your test or development systems, and you have no way to account for it. You review the I/O performed by the query and note that it is much higher than you have ever seen, much higher than seems possible. You restore the production instance to a test system and discover that the I/O drops way down. But in production, it remains very high (and seems to vary: sometimes it is high, sometimes it is low, and sometimes it is in the middle). The reason, as we'll see, is that in your test system, running in isolation, you do not have to undo other transactions' changes. In production, however, when you read a given block, you might have to undo (roll back) the changes of many transactions, and each rollback could involve I/O to retrieve the undo and apply it.

This is probably a query against a table that has many concurrent modifications taking place; what you are seeing are the reads of the undo segments, the work Oracle performs to restore the block to the way it was when your query began. You can easily see the effects of this in a single session, just to understand what is happening. We'll start with a very small table:

SQL> create table t ( x int );
Table created.

SQL> insert into t values ( 1 );
1 row created.

SQL> commit;
Commit complete.

SQL> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.

SQL> select * from t;

         X
----------
         1

Now we’ll set our session to use the SERIALIZABLE isolation level, so that no matter how many times we run a query in our session, the results will be “as of” that transaction’s start time:
SQL> alter session set isolation_level=serializable;
Session altered.
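To measure the query's I/O, we can enable SQL*Plus autotrace statistics and run the query. The following listing is a representative sketch (most of the statistics output is trimmed away; the exact counts can vary slightly with block size and database version):

SQL> set autotrace on statistics
SQL> select * from t;

         X
----------
         1

Statistics
----------------------------------------------------------
...
          7  consistent gets
...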

So, that query took seven I/Os (consistent gets) in order to complete. In another session, we’ll modify this table repeatedly:

SQL> begin
         -- modify the single row (and hence the same block) 10,000 times,
         -- committing after each change
         for i in 1 .. 10000
         loop
             update t set x = x+1;
             commit;
         end loop;
     end;
     /

PL/SQL procedure successfully completed.
And returning to our SERIALIZABLE session, we’ll rerun the same query:
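Again, a representative sketch of the autotrace output, trimmed to the relevant lines. Note that the serializable session still sees X = 1, even though the other session has since changed the row 10,000 times:

SQL> select * from t;

         X
----------
         1

Statistics
----------------------------------------------------------
...
      10004  consistent gets
...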

It did 10,004 I/Os that time, a marked difference. So, where did all of that I/O come from? It was Oracle rolling back the changes made to that database block. When we ran the second query, Oracle knew that all of the blocks retrieved and processed by that query had to be "as of" the start time of the transaction. When we got to the buffer cache, we discovered that the block in the cache was simply "too new": the other session had modified it some 10,000 times. Our query could not see those changes, so it began walking the undo information and undid the most recent change. It discovered this rolled-back block was still too new and rolled it back again. It did this repeatedly until it finally found the version of the block that had been committed in the database when our transaction began. That was the block we could use, and did use.
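You can observe this undo-application work directly in your own session. Oracle exposes a statistic named "data blocks consistent reads - undo records applied"; querying it before and after the SELECT shows how many undo records were applied to produce the read-consistent result. A minimal sketch, assuming you have SELECT access to the V$MYSTAT and V$STATNAME views:

SQL> select s.name, m.value
       from v$mystat m, v$statname s
      where m.statistic# = s.statistic#
        and s.name in ( 'consistent gets',
                        'data blocks consistent reads - undo records applied' );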

Note Interestingly, if you were to rerun the SELECT * FROM T, you would likely see the I/O go back down to 7 or so again; it would not be 10,004. The reason? Oracle has the ability to store multiple versions of the same block in the buffer cache. When you undid the changes to this block for the query that performed 10,004 I/Os, you left that reconstructed version in the cache, and subsequent executions of your query are able to access it.
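A representative sketch of that third run, assuming autotrace statistics are still enabled (the exact count may differ slightly):

SQL> select * from t;

         X
----------
         1

Statistics
----------------------------------------------------------
...
          7  consistent gets
...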

So, do we only encounter this problem when using the SERIALIZABLE isolation level? No, not at all. Consider a query that runs for five minutes. During the five minutes the query is running, it is retrieving blocks from the buffer cache. Every time it retrieves a block from the buffer cache, it will perform this check: “Is the block too new? If so, roll it back.” And remember, the longer the query runs, the higher the chance that a block it needs has been modified over time.

Now, the database expects this check to happen (i.e., the check of whether a block is "too new," followed by the rolling back of changes), and for just such a reason the buffer cache may actually contain multiple versions of the same block in memory. That way, chances are that a version you require will be there, ready and waiting to go, instead of having to be materialized using the undo information. A query such as the following may be used to view these blocks:

SQL> select file#, block#, count(*)
       from v$bh
      group by file#, block#
     having count(*) > 3
      order by 3;

In general, you will find no more than about six versions of a block in the cache at any point in time, but these versions can be used by any query that needs them.

It is generally these small, hot tables that run into the issue of inflated I/Os due to read consistency. The other queries most often affected are long-running queries against volatile tables: the longer they run, the longer they take to run, because over time they may have to perform more and more work to retrieve a read-consistent copy of a block from the buffer cache.