Saturday, May 29, 2010

Read only tables in 11g is not fully read only.

When any large transaction is committed, it is quite possible that not all blocks it modified are cleaned out and these blocks have to be cleaned out before any data in them can be retrieved. But what happens if you open the database in read only mode, or make table/tablespace read only.

All the blocks that require a clean out will be done in shadow process PGA memory when parallel query option or direct path reads are used.

Blocks will be clean out but not written to disk when the database is opened in read only mode or tablespace in read only mode.

But after the table is put in read only mode, any further select queries against this table that require block cleanout (dirty blocks) will be written to the disk. In this sense, it is a surprise that read only table can have dirty buffers which will be written to disk. It is definitely the right thing to do.

Oracle can safely ignore any block clean out operations for read only tables and tablespaces as it is given that there can't be any other open DML transactions against them. I'll try to open an enhancement request for this.

Until this issue is fixed, make sure to manually clean out the blocks (by running full scans of the table) before you put any tablespace in read only mode to avoid repeated clean out of the same blocks later on which would incur more performance overhead.

Also, be aware that in 11g by default Oracle will use direct path reads for full table scans if the table size is large enough as dictated by _small_table_threshold/_very_large_object_threshold, you can use event 10949 at level 1 to disable direct path reads for serial full table scans (parallel query option will still use direct path reads).

Friday, May 28, 2010

All about ora_rowscn

SCN information has always been maintained at the block level as long as it have some data in it. Ora_rowscn basically piggyback on that fact to expose SCN (or timestamp) conservatively when a block has been updated.

When a transaction is committed Oracle will revisit either all or few blocks, it has modified as part of the transaction, to mark end of transaction as long as those block buffers are available in the buffer cache. Rest of the blocks will be cleaned out during subsequent access of those blocks as part of any future transaction. During block clean out operation, if the transaction information is not available in the undo segments, Oracle will mark transaction in that block with the SCN of first available transaction in the undo segments. It is this SCN that will be reported for ora_rowscn.

Some weirdness of selecting ora_rowscn from dual:

13:58:23 SQL> show rel
release 1102000100
13:58:28 SQL> select ora_rowscn from dual;


13:58:32 SQL> select ora_rowscn, sysdate from dual;

---------- ---------------

13:58:42 SQL> select ora_rowscn, dummy from dual;

---------- -
1329 X

13:58:48 SQL> select * from dual where ora_rowscn > 1;


13:58:58 SQL> select 1 from dual where ora_rowscn > 1;
select 1 from dual where ora_rowscn > 1
ERROR at line 1:
ORA-00600: internal error code, arguments: [qkeIsExprReferenced1], [], [], [], [], [], [], [], [], [], [], []

Opened bug# 9761357 for the above issue.

Problems with ora_rowscn:

1) bug# 9761357 for the above issue
2) bug# 5270479 for not doing ora_rowscn predicate checking in the current version of the block
3) returning null ora_rowscn when rowdependencies is enabled and querying data from the modified data blocks in an open transaction.
4) Different ora_rowscn values can be reported over time when the underlying block has not been modified.
5) Reporting same ora_rowscn for all the records in a given block when not all those records were changed as part of that commit SCN. This is only when rowdependencies is not used.

Problem #2:
This will basically prevent us from using ora_rowscn for optimistic locking unless following work around is used.

Either use one of the columns being modified in the predicate along with using ora_rowscn predicate or use something like (ora_rowscn+length(column_being_modified) = ora_rowscn_value+length(column_being_modified). But this will not work when other columns are updated by a competing transaction which blocked our transaction, in this case we will update the same record again but not the same columns. If this is not desirable, then don't use ora_rowscn for implementing optimistic locking until the current bug is fixed.

problem #3:
Oracle records SCN for each row in the block when ROWDEPENDENCIES is enabled. Since current SCN can only be recorded for each row when the transaction is not committed (lets call this DSCN), ITL transaction entries in the block is updated with the commit SCN later when the transaction is committed (read above for further explanation of block clean out). DSCN recorded for each row in the block will not be cleaned out as part of normal block clean out process.

Commit SCN (or the upper bound SCN) reported in the block ITL enrties is used for reporting ora_rowscn of all the rows corresponding to an ITL entry modified in the past. But if a given ITL entry needs to be reused for other transactions later on, commit SCN of ITL entry will be used to update DSCN of all the rows it modified in the past before that ITL entry can be reused.

Oracle can report current SCN recorded with each row, instead of returning null Ora_rowscn, when the data from the modified blocks is queried as part of an open transaction. I don't see any open bug created for this one yet.

Problem #4:
Different ora_rowscn values can be reported over time even though the underlying block contents have not been modified. There are 2 reasons for this:

1) If that block was modified with direct path operations, like CTAS and insert with append, and the transaction commit SCN information is not available in the undo segments. Direct path loads write above HMW (and hence very little undo unless indexes exist) and commit SCN is only updated in few blocks when the transaction is completed. When the remaining blocks are accessed again and the corresponding direct path transaction information is not available in any UNDO segments, Oracle will report conservative SCN (least SCN available from the undo transaction table) but it will not clean out the blocks (i.e. all these blocks loaded using direct path operations will remain clean with no commit (or upper bound) SCN available in them).

2) Using direct path operations (or parallel query option) to access the data blocks which needs to be cleaned out. This is very evident in 11g since direct path reads are used for serial full table scans when the table size is big enough. You can change this behavior by setting "alter system set events '10949 trace name context forever, level 1'". It will still do the block clean out but it does this in PGA memory of user process. Over time, as the transaction state information is aged out from the undo segments, different ora_rowscn's will be reported.

Problem #5:
It is not a problem per say, but rather the way it is implemented. Use ROWDEPENDENCIES if you want SCN information at row level more accurately.

Use cases:

You can use ora_rowscn for:
1) To implement optimistic locking
2) To report active time window of table data (i.e. how far back in time data in the table is modified). This information can be used for any archival or data purge analysis.
3) To figure out the last time table data is changed.