Tony’s Oracle Tips

July 24, 2011

‘Log file sync’ and ‘log file parallel write’ – part 1

Filed under: Uncategorized — tonyhasler @ 8:15 pm

Preamble
This is the first part of a three-part blog about LGWR. If you want a thorough understanding of LGWR, ‘log file sync’, and ‘log file parallel write’, read on. If you are an Oak Table member you might want to skip to the more advanced topics in parts 2 and 3 (unless you want to point out errors and omissions, of course :-)).

You are a programmer or a DBA and you are trying for the first time in your life to analyze an Oracle performance problem. What is the most important concept you need to understand to get started?

Well, there are a number of right answers to this question, but rightly or wrongly ‘the Oracle wait interface’ would be many people’s answer. A very basic approach would be to look at the highest wait event on the system and focus on that. So, for example, if you see a lot of ‘db file sequential read’ wait events you might conclude that faster disks are the answer. You may be a bit smarter and realise that these events are generally associated with indexed access, and try to work out which indexes are being accessed and whether they (or the queries that use them) can be redesigned.
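To make the ‘basic approach’ concrete, here is a minimal sketch of that naive ranking against V$SYSTEM_EVENT (10g and later, where the WAIT_CLASS column exists). It simply orders non-idle events by total time waited since instance startup, which is precisely the oversimplification discussed below:

-- Naive top-10 wait events by total time waited, system-wide.
-- TIME_WAITED_MICRO is in microseconds.
SELECT *
  FROM (SELECT event,
               total_waits,
               ROUND(time_waited_micro / 1e6)                              AS time_waited_s,
               ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 1) AS avg_wait_ms
          FROM v$system_event
         WHERE wait_class <> 'Idle'
         ORDER BY time_waited_micro DESC)
 WHERE ROWNUM <= 10;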

There are, of course, numerous books and web articles that explain the deficiencies of this type of approach, but in many cases it seems to work. The success rate of this basic approach will be quite high in simple systems where performance has been neglected for a long time and there are one or two ‘quick wins’ that can have a significant benefit. In complex systems, or ones that have been properly managed over the years, this sort of basic analysis is unlikely to be enough.

Cary Millsap wrote a book with Jeff Holt many years ago entitled ‘Optimizing Oracle Performance’, in which he explained the limitations of any system-wide tuning approach. Even though the book was written many years ago, I would say that it is still mandatory reading for any Oracle performance specialist. The next most important book, in my opinion, is a more modern work by Christian Antognini (http://antognini.ch/top/), but I digress.

Is ‘LOG FILE SYNC’ (LFS) relevant?

Arguably the single most misleading wait event in the Oracle database is ‘log file sync’ (LFS). Let me explain why. Suppose you have a single client process and it spends 1% of its time in LFS. Would you gain a lot by engaging in a tuning exercise aimed exclusively at reducing this wait event? Well, obviously the upper limit on the gain is 1% even if you eliminated these events in their entirety; the answer is, therefore, NO. You are unlikely to waste your time, though, because your Statspack or AWR report will probably not highlight the event in the first place.

On the other hand, suppose you have 10,000 sessions connecting to your database and 300 of these are active at any one time. Suppose that 1% of the active time of each session is spent in LFS. Would you gain much by tuning the event? The answer is the same as before: NO. However, you might be misled by Statspack or AWR, because now the total wait time for the database for LFS will be 3 hours in any one hour! This is because at any one time we have 300 active sessions and on average three of them will be in LFS; three sessions permanently waiting comes out at 3 hours of wait per elapsed hour. So before you conclude that LFS is a major performance issue, first check what the average wait time per active session is. Of course, what you should really do is find out the percentage of wait time for your business-critical processes, but I digress again. Let us get back to the technicalities of LFS.
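As a hedged sketch of that sanity check, the query below puts the total LFS time next to the total DB time from V$SYS_TIME_MODEL (both in microseconds), so the wait can be judged as a percentage of active time rather than as a raw total:

-- Total 'log file sync' time as a fraction of total DB time,
-- both accumulated since instance startup.
SELECT ROUND(e.time_waited_micro / 1e6)                         AS lfs_seconds,
       ROUND(t.value / 1e6)                                     AS db_time_seconds,
       ROUND(100 * e.time_waited_micro / NULLIF(t.value, 0), 2) AS lfs_pct_of_db_time
  FROM v$system_event   e,
       v$sys_time_model t
 WHERE e.event     = 'log file sync'
   AND t.stat_name = 'DB time';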

‘Log File Sync’ (LFS) versus ‘log file parallel write’ (LFPW)

When a transaction commits, it will usually wait for its redo data to be recorded on disk before returning control to the application. That wait is LFS. There are three main exceptions to this (the first and third are illustrated in the sketch after this list):

  • When the commit is issued within a PL/SQL block. In this case the wait will be deferred until the block returns. If the PL/SQL block contains multiple commit statements then only one wait will occur when the block returns.
  • When the commit statement is actually COMMIT WORK NOWAIT. In this case, there is no LFS wait. See part 3 in this series for an explanation of why this is dangerous.
  • When the session parameter COMMIT_WAIT (COMMIT_WRITE in earlier releases) is set to NOWAIT. This parameter can be set at the instance level, but that is a dangerous practice; it should be set at the session level only, and only with care. See part 3 in this series for an explanation of why this is dangerous.
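Here is a hedged illustration of the first and third exceptions; the table T and its values are made up for the example:

-- Exception 1: commits inside a PL/SQL block. The waits are deferred;
-- a single LFS wait occurs when the block returns to the client.
BEGIN
  INSERT INTO t VALUES (1);
  COMMIT;   -- no LFS wait here...
  INSERT INTO t VALUES (2);
  COMMIT;   -- ...nor here
END;
/

-- Exception 3: asynchronous commits at the session level
-- (11g COMMIT_WAIT; 10gR2 used COMMIT_WRITE instead):
ALTER SESSION SET commit_wait = 'NOWAIT';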

The waiting client process does not do the I/O itself; LGWR does. When LGWR does the I/O it waits on ‘log file parallel write’ (LFPW). So does this mean that the number of LFPW events should equal the number of LFS events? NO! This is because LGWR employs a tactic known as ‘group write’ and may write out the redo data for multiple transactions and multiple sessions in one go. Suppose you have five client processes, each of which is waiting for its transaction to complete. These sessions will be waiting in LFS and LGWR will be waiting in LFPW. If the I/O takes 10ms then:

  • The system will accumulate 50ms of time in LFS (10ms for each client process). Five wait events will be recorded.
  • LGWR will accumulate 10ms. One wait event will be recorded.

That is six wait events totalling 60ms clocked up for one 10ms I/O! Do you see the danger in oversimplifying?
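A hedged sketch of that comparison on a live system; because of group writes, the counts and total times of the two events can legitimately diverge:

-- Put the two events side by side, system-wide since startup.
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1e6)                              AS time_waited_s,
       ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 1) AS avg_wait_ms
  FROM v$system_event
 WHERE event IN ('log file sync', 'log file parallel write');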

In the simplified example above I have suggested that the average wait time for LFS should be about the same as the average wait time for LFPW, as they are both associated with waiting for the redo data to be written to disk. In fact, this is itself a misleading simplification. On to part 2.

6 Comments

  1. Very nice series indeed!

    Based on your explanation regarding LFS vs. LFPW, does this mean that, as long as the NOWAIT option is not used, LFS wait times should be larger than LFPW wait times for an interval? Thing is, I see systems with LFPW wait times much larger than LFS wait times and I wonder why.

    Comment by Marcus Mönnig — July 25, 2011 @ 9:25 am

  2. Marcus,

    I do not understand how it is possible to have LFS lower than LFPW as the I/O is part of the LFS time. Could you post the scripts you are using to measure the times?

    Thanks

    Comment by tonyhasler — July 25, 2011 @ 4:14 pm

  3. Basically, I am looking at Statspack data (with my own queries and with the spreport.sql script). I can post a sample tomorrow from work.

    But after thinking about it… as far as I know LGWR flushes when…

    1) user commits a transaction
    2) redo log buffer becomes 1/3 full
    3) redo buffer contains 1MB of changed records
    4) switch of the log files
    5) when a DBWn process writes modified buffers to disk, if necessary

    So, when there are very few commits (low LFS wait times), there might be higher LFPW wait times due to flushes for reasons 2) and 3).

    My sample is from a very active system, though, and the LFPW wait times are really much higher, so I don’t think that this is the case for me.

    Regarding reason 5: my understanding is that if you change a lot of data, all of it goes through the buffer cache and then has to be written to disk again to make space for the next blocks to be changed, so DBWn has to flush constantly and thus constantly asks LGWR to flush. That would result in much higher LFPW than LFS wait times. This might be the problem in my case, and I have an idea what’s happening (changes to lots of records in a very large index-organized table in one transaction).

    Comment by Marcus Mönnig — July 25, 2011 @ 5:25 pm | Reply

  4. Below are some relevant sections from a SP report. This is a 3rd party application, PL/SQL source wrapped with lots of batch processing through database jobs.

    Regarding the five reasons above for LGWR flushing to disk, I don’t know why it happens here. 5) doesn’t seem to be the reason, because there are no high DBWn wait event times.

    ****************************
    Host Name: *****   Num CPUs: 16   Phys Memory (MB): 36,854
    ~~~~

    Snapshot     Snap Id   Snap Time            Sessions  Curs/Sess  Comment
    ~~~~~~~~     --------  -------------------  --------  ---------  -------
    Begin Snap:  191997    19-Jul-11 00:01:01         59       15.1
      End Snap:  193527    25-Jul-11 16:01:01        101       67.6
       Elapsed:  9,600.00 (mins) = 576,000 seconds

    Top 5 Timed Events                                           Avg  %Total
    ~~~~~~~~~~~~~~~~~~                                          wait    Call
    Event                              Waits       Time (s)     (ms)    Time
    ---------------------------------- ----------- ----------- ------ ------
    CPU time                                         2,651,260          59.7
    db file sequential read            194,893,208     975,291      5   22.0
    PL/SQL lock timer                          934     226,764 ######    5.1
    log file parallel write            213,434,578     176,057      1    4.0
    db file scattered read              13,179,953     126,571     10    2.9
    -------------------------------------------------------------

    ....

    Wait Events  DB/Inst: *****/*****  Snaps: 191997-193527
    -> s - second, cs - centisecond, ms - millisecond, us - microsecond
    -> %Timeouts: value of 0 indicates value was < .5%
    -> Only events with Total Wait Time (s) >= .001 are shown
    -> ordered by Total Wait Time desc, Waits desc (idle events last)

                                                                 Avg
                                         %Time  Total Wait      wait    Waits
    Event                          Waits        -outs  Time (s)  (ms)     /txn
    ------------------------------ ------------ ------ --------- ------ ------
    db file sequential read        194,893,208       0   975,291      5    0.8
    PL/SQL lock timer                      934      94   226,764 ######    0.0
    log file parallel write        213,434,578       0   176,057      1    0.9
    db file scattered read          13,179,953       0   126,571     10    0.1
    enq: UL - contention                52,608      38    63,565   1208    0.0
    enq: TM - contention               120,150      12    53,202    443    0.0
    Log archive I/O                  2,402,811       0    49,144     20    0.0
    log file sequential read         2,452,672       0    48,039     20    0.0
    enq: TX - row lock contention       23,120      58    42,090   1821    0.0
    log buffer space                    48,104       2     5,999    125    0.0
    control file sequential read    13,879,761       0     3,981      0    0.1
    log file sync                      348,914       6     3,785     11    0.0
    direct path read                 1,567,464       0     3,018      2    0.0
    log file switch completion          50,431       0     1,504     30    0.0
    direct path read temp            3,706,026       0     1,284      0    0.0
    read by other session              154,679       0     1,173      8    0.0
    ****************************

    Comment by Marcus Mönnig — July 26, 2011 @ 8:40 am

    • Marcus,

      Your statspack report has made me realise that my earlier statement was unclear: a specific LFPW wait takes less time than the LFS event or events with which it is associated. If the vast majority of LFPW events are associated with user commits then comparing averages is reasonably legitimate. In your case, however, very few of the LFPW events are associated with user commits, so, as you said in your first comment, a comparison of averages is meaningless.

      Your statspack report covers a week of time. There is surely a mixture of quiet and busy times in here. I would suggest that a report based on about 15 minutes during a period of interest would be more meaningful.

      If I have read your report correctly, it seems that there are a total of 348,914 LFS events taking an average of 11ms and 213,434,578 LFPW events taking an average of 1ms. So the average LFS time is actually much higher than the average LFPW time anyway. A key point of this blog entry is to highlight the fact that total wait times for LFS are relatively meaningless.
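      For what it’s worth, a hedged sketch of that interval calculation against the Statspack base table STATS$SYSTEM_EVENT (the snap ids are taken from your report; the DBID/instance predicates and instance-restart checks are omitted for brevity):

      -- Interval counts and average waits for the two events
      -- between two Statspack snapshots.
      SELECT e.event,
             e.total_waits - b.total_waits AS waits,
             ROUND((e.time_waited_micro - b.time_waited_micro)
                   / NULLIF(e.total_waits - b.total_waits, 0) / 1000, 1) AS avg_ms
        FROM perfstat.stats$system_event b,
             perfstat.stats$system_event e
       WHERE b.snap_id = 191997
         AND e.snap_id = 193527
         AND b.event   = e.event
         AND e.event IN ('log file sync', 'log file parallel write');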

      By the way, there is an average of 370 LFPW events per second, 24 hours a day, almost all associated with reasons other than user commits! This is certainly an extreme situation that I would think is precipitated by something very specific to your environment.

      Comment by tonyhasler — July 26, 2011 @ 10:27 am

  5. Ok, I found my answer and it was right there in your initial blog post.

    “When the commit is issued within a PL/SQL block. In this case the wait will be deferred until the block returns. If the PL/SQL block contains multiple commit statements then only one wait will occur when the block returns.”

    The Statspack data confirms this:

    Statistic                         Total           per Second   per Trans
    --------------------------------- --------------- ------------ ---------
    user commits                          242,976,434        421.8       1.0
    redo writes                           213,434,650        370.6       0.9
    redo synch writes                         669,310          1.2       0.0

    So, by far most of the commits result in an asynchronous redo write, which won’t collect LFS wait time but of course will collect LFPW wait time.
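    For reference, a hedged sketch of pulling the same three statistics live from V$SYSSTAT instead of from the Statspack snapshots:

    -- Cumulative since instance startup; Statspack snapshots these
    -- same statistics to compute the per-second and per-transaction rates.
    SELECT name, value
      FROM v$sysstat
     WHERE name IN ('user commits', 'redo writes', 'redo synch writes');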

    So easy once you see it. 🙂

    Thanks for your replies, Tony!

    Comment by Marcus Mönnig — July 27, 2011 @ 12:47 pm

