Article

Basic guide to determining how many APWs are required for a database

Information

 
Title: Basic guide to determining how many APWs are required for a database
URL Name: 000045471
Article Number: 000134485
Environment:
Product: OpenEdge
Version: 10.x, 11.x
OS: All Supported Operating Systems
Question/Problem Description
How to determine how many Asynchronous Page Writers (APWs) are required for a database?
Do I have enough APWs running?
How to improve long Checkpoints?
 
 
Steps to Reproduce
Clarifying Information
Error Message
Defect/Enhancement Number
Cause
Resolution
Using APWs requires an Enterprise Database license. A minimum of one APW and one BIW should always be used, unless the database has no write activity.
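For example (a minimal sketch; "sports2000" is a placeholder database name), the BIW and APW are started as separate processes once the database broker is running:

$ proserve sports2000    # start the database broker
$ probiw sports2000      # start one Before-Image Writer (BIW)
$ proapw sports2000      # start one Asynchronous Page Writer (APW)

Each additional PROAPW invocation starts one more APW against the running database.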
 
At the time of a BI Checkpoint, the database buffer pool is scanned and modified buffers are placed onto a queue. Refer to Article:
These buffers must be written to the database before the next Checkpoint begins. Any remaining from the last checkpoint (i.e. dirty buffers) must be written all at once at the end of the checkpoint, and write activity stalls until they have all been written. These end-of-checkpoint writes go by various names, but they are all the same thing:
"Buffers Flushed", "Flushes" (DB writes/BI writes), and "Database Writes Flushed" at checkpoint.
 
Amongst other things, the APWs' job is to scan the queue that was created at checkpoint and write out those modified buffers. When Buffers Flushed are occurring, the APWs are not keeping up. The details of the APW tasks are explained in the Documentation:
          
OpenEdge Data Management: Database Administration > Maintaining and Monitoring Your Database > Managing Performance > Server performance factors > Database I/O > Using APWs to improve performance
 

BI Cluster Size

The problem may not be caused by a lack of APWs specifically; Checkpoints may simply occur too frequently because:
  • The BI Cluster Size is set too small to allow APWs to do their work between Checkpoints
  • There are too few (or too many) -bibufs, adding to BI write overhead
1.  First check whether the BI Cluster Size is set appropriately and adjust it if required (see the command sketch below). Ideally, checkpoints should occur (on average) every 2 to 5 minutes or longer, and no more often than every 60 seconds during normal daily activity. Refer to the following Article on how to check and set the BI Cluster Size:
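As a hedged illustration (the database name and the 16 MB value are examples only), the BI Cluster Size is changed offline with PROUTIL TRUNCATE BI, where -bi is given in kilobytes:

$ proshut sports2000 -by                        # shut the database down first
$ proutil sports2000 -C truncate bi -bi 16384   # set a 16 MB BI Cluster Size
$ proserve sports2000 -bibufs 25                # restart; the -bibufs value is also an example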

Buffers Flushed

2.  Once the BI Cluster Size has been set appropriately, check Buffers Flushed to determine whether more APWs are needed. Only add one at a time and re-monitor.
 
There are various screens within PROMON that show the "Buffers Flushed" value:
 
PROMON -> 5. Activity                                      "Buffs Flushed" & "Writes by APW (%)"
PROMON -> R&D -> 2. Activity Displays -> 1. Summary        "Flushed at chkpt"
PROMON -> R&D -> 2. Activity Displays -> 4. Page Writers   "Flushed at checkpoint"
PROMON -> R&D -> 3. Other Displays -> 4. Checkpoints       "Flushes" (DB writes and BI writes)
PROMON -> R&D -> 1. Status Displays -> Buffer Cache        "Modified buffers"

A value of zero is best, although a few unavoidable flushes may occur shortly after a database starts up. The goal is a value that is not increasing.
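As a rough monitoring sketch (the exact keystrokes can vary slightly between releases; "sports2000" is a placeholder), the Page Writers screen can be captured non-interactively by piping the menu choices into PROMON:

$ promon sports2000 << EOF > apw_stats.txt
R&D
2
4
X
EOF

Running this periodically (e.g. from cron) and comparing the "Flushed at checkpoint" values shows whether the number is still increasing.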

The PROMON metrics presented above are described under the relevant section in the Documentation:
OpenEdge Data Management: Database Administration > Reference > PROMON Utility 
https://docs.progress.com/bundle/manage-database/page/PROMON-Utility.html  
 

Disk Bandwidth

Another reason the APWs may not be able to keep up is that the disk bandwidth of the devices holding the data extents is insufficient for the read/write workload. For example, a database with all extents on one disk will have very low write performance, and the APWs may routinely fall behind. Under these conditions, starting one more APW process suffers from the same problem (it cannot flush the blocks any faster) and simply adds yet another I/O contender.

Pre-formatting the BI cluster ring before starting the database can alleviate a small part of the problem, since it avoids having to add new clusters to the BI chain at Checkpoints. For example, with a 16 MB BI Cluster Size, "bigrow 12" adds 12 clusters to the default ring of 4, giving a (4 + 12) x 16 MB = 256 MB cluster ring at startup instead of 64 MB. The database must be offline:
$   proutil dbname -C bigrow 12

This problem needs to be solved by investigating the I/O or CPU bottlenecks that are impeding the ability to write changes to disk in a timely fashion.
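As a starting point (Linux commands shown; flags vary by operating system), OS-level tools can confirm whether the volumes holding the database and BI extents are saturated:

$ iostat -dxk 5    # per-device utilization, waits and throughput every 5 seconds
$ vmstat 5         # overall CPU usage, run queue and I/O wait

Sustained high utilization or long waits on the devices holding the extents point to a bandwidth problem rather than an APW shortage.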

While not a be-all and end-all diagnostic, running the BIGROW utility as an independent test provides a reliable indicator of storage implementation and/or configuration issues that lead back to database/application performance problems. A benchmark widely accepted over the years and across many different storage systems is that it should take no longer than 10 seconds to write a 100 MB BI file.

Compare this with a second BIGROW run using the non-reliable, buffered I/O (-r) parameter, which does not sync the BI files when the grow/format completes. The difference between the two timings indicates approximately how much time is spent syncing the BI files on completion (see the sketch below). For further information review the "Furgal Test" in Article
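A hedged sketch of the comparison (the cluster size follows the 16 MB example above; "bigrow 2" formats (4 + 2) x 16 MB = 96 MB, close to the 100 MB benchmark):

$ proutil sports2000 -C truncate bi -bi 16384   # 16 MB clusters
$ time proutil sports2000 -C bigrow 2           # timed, fully synced grow (~96 MB)
$ proutil sports2000 -C truncate bi -bi 16384   # reset the ring before the second run
$ time proutil sports2000 -C bigrow 2 -r        # same grow with buffered, non-reliable I/O

If the first run takes far longer than 10 seconds while the -r run is fast, the extra time is being spent syncing the BI file to disk.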
Workaround
Notes
Last Modified Date: 1/27/2020 4:08 PM
Attachment Files
Disclaimer The origins of the information on this site may be internal or external to Progress Software Corporation (“Progress”). Progress Software Corporation makes all reasonable efforts to verify this information. However, the information provided is for your information only. Progress Software Corporation makes no explicit or implied claims to the validity of this information.

Any sample code provided on this site is not supported under any Progress support program or service. The sample code is provided on an "AS IS" basis. Progress makes no warranties, express or implied, and disclaims all implied warranties including, without limitation, the implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample code is borne by the user. In no event shall Progress, its employees, or anyone else involved in the creation, production, or delivery of the code be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample code, even if Progress has been advised of the possibility of such damages.