Wednesday, August 24, 2011

Is Oracle ACID compliant database?

I attended Jonathan Lewis's NOCOUG session on May 18th, 2011. Towards the end of his session, he demonstrated a test case suspending lgwr process using oradebug, issue commit for an open transaction in another session, and read the same data from third session successfully even though the second session's commit record is not yet preserved to disk.

You can read more about it at:
http://tonyhasler.wordpress.com/2011/08/22/why-is-acid-important/
http://jonathanlewis.wordpress.com/2011/08/19/redo-2/
http://www.freelists.org/post/oracle-l/ACID

This basically means Oracle is not fully ACID complaint.

Even though some people are categorizing it as an architectural limitation, it is a bug to the customer.

In reality it is very rare to run into this problem because:

1) Quite often we see that the session (or process) committing a transaction is the one moving on to finish other dependent transactions before returning acknowledgment back to the customer.
2) Session reading commit in flight data usually go on to initiate it's own transaction which will either get blocked by the same underlying redo problem or the whole transaction will fail if redo writes are still blocked.
3) Redo write times are typically under couple of milli seconds and hence the exposure window is very narrow.
4) Distributed databases are not that widely used
5) Event based scheduling with changes spanning more than 1 database is not very common, i.e. batch job waiting for an event to happen (can be simple change in status table) and then run the job.

Ironically, if an application is just reading data from one database and making decisions on transactions happening on other databases, then using active data guard standby (ADG) for the read part will solve the problem naturally, although the same problem can still happen in theory if both primary and ADG standby go down at the same time (It'd be very interesting to test this out).

Here is my take on solving this problem:

When a query is executed, Oracle computes current SCN and construct read consistent data as of that SCN. This is very fundamental to this problem, do they really have to run the query as of current SCN?

I think it won't add too much overhead, if lgwr can post the SCN, to SGA location, up to which it has written redo to the disk from log buffer. Every session executing sql statements can check this memory location for determining on disk SCN up to which redo was flushed and then execute the query as of that SCN. This will solve the problem.

It should not be done for DML's, including select for updates, all of which need to read the data as of current SCN.

By default Oracle should still maintain the current behavior but provide session/system/query level parameter to change the behavior of executing read only queries as of disk SCN.

Since Oracle can write redo out of scn order from log buffer to redo log files (recovery process does sort it by SCN before applying the redo), determining the correct on disk SCN may pose a challenge.

20 Comments:

At August 24, 2011 at 2:04:00 AM PDT, OpenID smanroy said...

Hi Sai,

Wouldn't this solution create problems... Say 1000 selects start at same time and each one waiting for its turn to read the address space in 'SGA' to get the 'SCN' value published by lgwr. In addition to that, the published SCN to SGA would change every 3 seconds.
Am I missing something here ?

 
At August 24, 2011 at 3:11:00 AM PDT, OpenID Jonathan Lewis said...

Sai,

Nice idea, but I think smanroy has picked up an important point. Every session starting a select would have to get a latch (shared read) to get the cached SCN, I think, competing with lgwr which would have to get the latch exclusive to update the cached SCN.

Logically, though, it's a better solution than the one I had sketched out on my blog.

 
At August 24, 2011 at 8:48:00 AM PDT, Anonymous Tony hasler said...

Just a comment about select for update. You would still need to wait for the queried data to be on disk if you propagated it to another system. Also, DML can include a RETURNING clause and that information could be propagated. So your proposed solution is not quite complete but otherwise sound.

I don't follow the thing about latches. At the start of a query (or transaction depending on isolation mode) the current SCN needs to be determined. The current SCN is updated more frequently than than LGWR writes (usually) so your idea would seem to reduce contention not increase it.

 
At August 24, 2011 at 3:28:00 PM PDT, Blogger Saibabu Devabhaktuni said...

Hi smanroy,

Thanks for your comment. Yes that would be a problem if every session has to read one address space in SGA for every single execution.

Actual implementation needs to resolve that contention, Oracle can probably do something like this:

1) Similar to what Jonathan Lewis mentioned on his blog, all sessions issuing commit can either maintain transaction state in SGA (i.e. one address space for each session, implemented either using latches or mutexes) or make it part of undo transaction table to indicate that redo for commit record is being written. Once the redo commit write is completed, transaction state can be changed to indicate that commit is preserved to disk.

2) All read only sql queries can continue to execute the statement as they are now, but for a given query execution when there is any ITL transaction encountered in any data block which happened within the last 3 seconds (assuming that redo writes never take more than 3 seconds and Oracle can easily determine SCN as of 3 seconds ago), Oracle can then scan transaction state table (or any other SGA structure which has this information) to find the minimum SCN that is still awaiting lgwr write to disk. Oracle can then consider this minimum SCN from transaction table as on disk SCN and then use it to construct read consistent version for rest of the query execution.

With this, I think there won't be any major contention introduced and lgwr doesn't have to do more work.

 
At August 24, 2011 at 3:39:00 PM PDT, Blogger Saibabu Devabhaktuni said...

Hi Jonathan,

Thanks for your comment. I've added my comment to smanroy's post.

Thanks,
Sai.

 
At August 24, 2011 at 3:45:00 PM PDT, Blogger Saibabu Devabhaktuni said...

hi Tony Hasler,

Thanks for your comment. For select for update (SFU) query, transaction has already began as SFU is being executed and that transaction will get rolled back if the redo is not yet written to disk for other committed transactions and the instance crashes. So, I think it is okay. Can you explain if there is any other situation that you think would create a problem for SFU's also?

Same think apply to DML statement also, when returning clause is used as part of DML statement, that transaction is not yet committed and the session will anyway wait for logfile sync to complete before passing the control to user.

Thanks.

 
At August 25, 2011 at 4:19:00 AM PDT, OpenID Jonathan Lewis said...

@Tony Hasler,

The thing about latches is really the question of who is going to notice. Every session already does "get snapshot SCN" at the start of a query, but that SCN is public memory that is updated by lots of different sessions when they do "get and advance SCN" - typically on commit - so there is a lot of risk of contention but no one session is likely to wait much time when it advances the SCN.

If we switch to "get on-disc SCN", lgwr is the only process that will be updating the desired SCN in memory, so the latch contention that lgwr suffers could become significant as a fraction of the lgwr time. And, of course, when lgwr starts losing time, everyone does.

 
At September 6, 2011 at 2:06:00 AM PDT, OpenID tonyhasler said...

Sorry for the delay in responding Sai,

Yes you are right about the rollback. I hadn't thought of that. The only thing that remains is that a regular SELECT following a SELECT...FOR UPDATE might generate different results. It would be difficult for Oracle to introduce this for compatibility reasons.

--Tony

 
At September 6, 2011 at 10:59:00 PM PDT, Blogger Saibabu Devabhaktuni said...

Hi Tony,

Since SFU (select for update) is a DML statement by itself, it'll have corresponding ITL spot in the block header for the current transaction, and any subsequent select statements will have to return the data as seen by SFU for the same records.

You can see this behavior by running SFU in one session and run just the select query in other session, you will see that the second session is constructing read consistent image by applying any applicable undo records even though no data was changed by SFU transaction yet.

Thanks,
Sai.

 
At July 26, 2013 at 2:05:00 AM PDT, Anonymous Oraclesolutions said...

the roll back point is right. samnroy has a valid point

 
At November 28, 2013 at 11:13:00 PM PST, Blogger vasudha dharani said...

Oracle Enterprise Manager Grid Control 12C
http://www.21cssindia.com/courses/oracle-enterprise-manager-grid-control-12c-228.html
Introduction
Course overview
Key Challenges for Administrators
What Is Enterprise Manager Cloud Control?
Built-in and Integrated Manageability
Configuration Management
Reviewing Enterprise Manager Core Concepts
Review Oracle Enterprise Manager architecture
Agent Installation and Target Discovery
Describing Different Target Types
Monitoring Cloud Control
Security (very high level)
Managing Securely with Credentials
21st Century providing Online training and support on All Technologies. If you are seeking training and support you can reach me on 91-9000444287. Online training by real time Experts. Call us 001-309-200-3848 for online training

 
At January 6, 2014 at 3:43:00 AM PST, Blogger Archana se said...

vhg

 
At January 26, 2014 at 8:06:00 PM PST, Anonymous hariharan said...

Its really helpful for me to understand where we i lost in my previous interview. Thanks.
If anyone wants to Learn Java in Chennai go to the Besant Technologies which is No.1 Training Institute in Chennai.

 
At February 11, 2014 at 4:44:00 AM PST, Blogger SAP Training In Chennai said...

Hi,
Excellent. Worth Sharing & Guiding. Hats Off.

Oracle Training In Chennai

 
At April 2, 2014 at 10:42:00 PM PDT, Anonymous Kumar said...

Hey,

Is there any other way to get answer like this? I tried without success. Anyway thanks for your help.

I learned a lot from Besant Technologies in my college days. They are the Best Oracle Training Institute in Chennai

http://www.oracletrainingchennai.in

 
At May 5, 2014 at 2:55:00 AM PDT, Blogger Path Infotech said...

Path Infotech is in the field of oracle training program from past several years.

For more info : Oracle Certification Courses

 
At August 18, 2014 at 9:45:00 PM PDT, Blogger karthick besant said...

thanks for sharing.....

Best Hadoop Training in Chennai

 
At September 29, 2014 at 12:02:00 AM PDT, Blogger ramya parvathaneni said...

Hi,
good appearence of content data it's use ful to us and best institute provides best training on
hadoop online training

 
At October 27, 2014 at 7:15:00 AM PDT, Blogger Joseph A. Wallace said...

Thanks for Information Oracle Online Training

 
At October 31, 2014 at 7:03:00 AM PDT, Blogger David Williams said...

Thanks for Information Our Online-Training-Informatica proven expert in all Hyperion Modules like Hyperion Financial Management, Hyperion Financial Data Quality, Hyperion Financial Reporting, Hyperion Essbase, Hyperion Planning, Smart view and Data Relationship management.Hyperion Essbase Online Training

 

Post a Comment

Links to this post:

Create a Link

<< Home