Using APWs requires an Enterprise Database license. A minimum of one APW and one BIW should always be used, unless the database has no write activity.
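On Unix, for example, a BIW and an APW can be started with the probiw and proapw scripts after the broker is running (a sketch only; exact scripts and options vary by platform and OpenEdge version, and "sports2020" is a placeholder database name):

```shell
# Start one before-image writer for the database (placeholder name sports2020)
probiw sports2020

# Start one asynchronous page writer; repeat the command to add
# more APWs, one at a time, re-monitoring between each
proapw sports2020
```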
At the time of a BI Checkpoint, the database buffer pool is scanned and any modified buffers are placed onto a queue. Refer to Article:
These buffers must be written to the database before the next Checkpoint begins. They are effectively the dirty buffers remaining from the last checkpoint: blocks that were not written during the checkpoint and that must be written all at once at its end. If they have not all been written before the next checkpoint begins, write activity must stall until they have been. These end-of-checkpoint writes have various names but are all the same thing:
"Buffers Flushed", "Flushes" (or DB writes/BI writes), and "Database Writes Flushed" at checkpoint.
Amongst other things, an APW's job is to scan the queue created at the checkpoint and write out those modified buffers. When Buffers Flushed are occurring, the APWs are not keeping up. The details of APW tasks are explained in the Documentation:
OpenEdge Data Management: Database Administration > Maintaining and Monitoring Your Database > Managing Performance > Server performance factors > Database I/O > Using APWs to improve performance
BI Cluster Size
The problem may not be caused by a lack of APWs specifically; instead, Checkpoints may occur too frequently because:
- The BI Cluster Size is set too small to allow the APWs to do their work between Checkpoints
- Too few or too many -bibufs, adding to BI write overhead
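The -bibufs broker startup parameter is set at database startup; as a sketch (placeholder database name, illustrative value):

```shell
# Start the broker with 25 BI buffers (illustrative value only;
# tune up or down based on observed BI buffer waits)
proserve sports2020 -bibufs 25
```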
1. First check whether the BI Cluster Size is set appropriately and adjust it if required. Ideally, checkpoints should occur on average every 2 to 5 minutes (or longer), and no more frequently than every 60 seconds during normal daily activity. Refer to the following Article on how to check and set the BI Cluster Size:
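As a sketch (placeholder database name, illustrative size), the BI Cluster Size is changed offline with the proutil truncate bi qualifier, where -bi takes the new cluster size in kilobytes:

```shell
# With the database shut down, truncate the BI file and set a new
# cluster size in KB (16384 KB = 16 MB; illustrative value)
proutil sports2020 -C truncate bi -bi 16384
```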
2. Once the BI Cluster Size has been set appropriately, check Buffers Flushed to determine whether more APWs are needed. Add only one at a time and re-monitor.
There are various screens within PROMON that show the "Buffers Flushed" value:
PROMON -> 5. Activity: "Buffs Flushed" and "Writes by APW (%)"
PROMON -> R&D -> 2. Activity Displays -> 1. Summary: "Flushed at chkpt"
PROMON -> R&D -> 2. Activity Displays -> 4. Page Writers: "Flushed at checkpoint"
PROMON -> R&D -> 3. Other Displays -> 4. Checkpoints: "Flushes" (DB writes and BI writes)
PROMON -> R&D -> Status Displays -> Buffer Cache: "Modified buffers"
A value of zero is best, but a few flushes when a database starts up may be unavoidable. The goal is a value that is not increasing.
The PROMON metrics presented above are described under the relevant sections in the Documentation:
Another reason the APWs may not be able to keep up is that the disk bandwidth of the devices holding the data extents is insufficient for the read/write workload. For example, a database with all extents on one disk will have very low write performance, and APWs may routinely fall behind. Under these conditions, starting one more APW process suffers from the same problem (being unable to flush the blocks fast enough), and the extra APW is yet another I/O contender.
Pre-formatting the BI cluster chain before starting the database can alleviate a small part of the problem, because it saves the time otherwise spent adding new clusters to the BI chain at Checkpoints:
For example: with a 16 MB BI Cluster size, "bigrow 12" adds 12 clusters to the default 4, giving a 256 MB cluster ring at startup instead of 64 MB:
$ proutil dbname -C bigrow 12
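The cluster count for a target BI size can be sketched with shell arithmetic, assuming bigrow adds clusters on top of the default 4 (values are illustrative):

```shell
# Target a 256 MB BI cluster ring with a 16 MB cluster size;
# bigrow adds clusters on top of the 4 default clusters
cluster_mb=16
target_mb=256
default_clusters=4
bigrow_n=$(( target_mb / cluster_mb - default_clusters ))
echo "proutil dbname -C bigrow $bigrow_n"   # -> proutil dbname -C bigrow 12
```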
This problem needs to be solved by investigating the I/O or CPU bottlenecks that prevent changes from being written to disk in a timely fashion.
While not a be-all and end-all diagnostic, running the BIGROW utility as an independent test provides a reliable indicator of storage implementation and/or configuration issues that lead back to database/application performance problems. A benchmark widely accepted over the years and across many different storage systems is that writing a 100 MB BI file should take no longer than 10 seconds.
Comparing this with a second BIGROW run using the non-raw (-r) parameter, which does not need to sync the BI files when the grow/format is complete, gives an approximate indication of how much of the elapsed time is spent syncing the BI files on completion. For further information, review the "Furgal Test" in Article
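As a rough sketch of the benchmark arithmetic from the 100 MB / 10 second rule of thumb above (the measured time is illustrative):

```shell
# Rule of thumb: writing 100 MB of BI should take <= 10 seconds,
# i.e. a sustained write rate of at least 10 MB/s
bi_mb=100
elapsed_s=8          # illustrative measured time for the bigrow
rate=$(( bi_mb / elapsed_s ))
if [ "$rate" -ge 10 ]; then
  echo "PASS: ${rate} MB/s meets the ~10 MB/s threshold"
else
  echo "FAIL: ${rate} MB/s is below the ~10 MB/s threshold"
fi
```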