Article

How Does -directio Improve Performance?

Information

 
Title: How Does -directio Improve Performance?
URL Name: 20688
Article Number: 000156411
Environment:
Product: Progress
Version: 8.x, 9.x
Product: OpenEdge
Version: 10.x, 11.x, 12.x
OS: All supported platforms
Question/Problem Description
How Does -directio Improve Performance?
What is -directio?
How is -directio implemented?
How to use the -directio Database Startup Parameter
Direct io explained.
Resolution
Using the -directio Database Startup Parameter

The Progress RDBMS for UNIX has had two different implementations of the -directio parameter. The original implementation was used in Version 6.3 and Version 7.X and is now only of historical interest. Version 8 and later use a new implementation, which is described below.

Using the -directio option may improve database performance through more effective regulation of the disk write workload of a UNIX system hosting a Progress Database Server.

Background: Database I/O

In Progress 8 through 9.1B, the Progress database storage manager by default performs random-access database reads and writes on UNIX systems with buffered I/O, using the lseek()/read() and lseek()/write() system call pairs.

In Progress 9.1C and later, the default method is pread64() and pwrite64() system calls, combined with a sync() system call at the end of each checkpoint to ensure that data will eventually be forced to disk. These system calls use the operating system's file buffer cache when possible.
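
The difference between the two default methods can be sketched in a few lines. This is only an illustration using Python's os module, not the storage manager's code; the 8192-byte block size, the file name and the helper names are assumptions made for the example.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # hypothetical database block size, for illustration only


def read_block_seek(fd, block_no):
    """Older style: move the file offset, then read (two system calls)."""
    os.lseek(fd, block_no * BLOCK_SIZE, os.SEEK_SET)
    return os.read(fd, BLOCK_SIZE)


def read_block_pread(fd, block_no):
    """Newer style: one positional read; the file offset is not changed."""
    return os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE)


def demo_reads():
    """Write four blocks to a scratch file and read block 2 both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.d1")  # hypothetical extent name
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            for n in range(4):
                os.pwrite(fd, bytes([n]) * BLOCK_SIZE, n * BLOCK_SIZE)
            return read_block_seek(fd, 2), read_block_pread(fd, 2)
        finally:
            os.close(fd)
```

Both helpers go through the operating system's file buffer cache; the positional form simply avoids the separate seek call.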

The overall disk write workload from database writes will sometimes become quite uneven or "bursty", especially when the database update workload is heavy. Database blocks written by the Asynchronous Page Writers are not actually written to disk when the Page Writer issues a write() or pwrite64() system call. Instead, database blocks are copied by the operating system into the filesystem cache in memory. They are usually written to disk some time later, when the filesystem decides to do so. The filesystem may delay actually writing the database blocks to disk until it needs to make room for reading in a new disk block or until after the next sync() call. So all the careful work that the Page Writers do to plan their activities to smooth out database writes is wasted.
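
A rough sketch of this buffered pattern follows. It is an assumption-laden illustration rather than Progress code: the block size and function names are invented for the example, and a per-file fdatasync() (falling back to fsync() where fdatasync() is unavailable) stands in for the end-of-checkpoint sync().

```python
import os
import tempfile

BLOCK_SIZE = 8192  # hypothetical block size, for illustration only

# fdatasync() is not available on every platform; fall back to fsync().
flush_file = getattr(os, "fdatasync", os.fsync)


def buffered_checkpoint(path, blocks):
    """Write blocks through the filesystem cache, then flush once at the end."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for block_no, data in blocks.items():
            # Returns once the kernel has copied the data into its cache;
            # the data may reach the disk much later.
            os.pwrite(fd, data, block_no * BLOCK_SIZE)
        # Stand-in for the end-of-checkpoint flush: everything still cached
        # for this file is pushed to disk in one burst.
        flush_file(fd)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


def demo_checkpoint():
    """Write two blocks to a scratch file and return the resulting file size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.d1")  # hypothetical extent name
        blocks = {0: b"a" * BLOCK_SIZE, 3: b"b" * BLOCK_SIZE}
        return buffered_checkpoint(path, blocks)
```

The point of the sketch is the timing: the individual writes are cheap, and the real disk work is deferred to the single flush at the end.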

Database I/O with -directio - Theory

When the -directio server startup parameter is specified, the database storage manager uses a different method for writing database blocks. This method is called "synchronous write" and is activated by specifying the O_SYNC (or O_DSYNC if available) option when the database files are opened with the open() system call. Reading and writing the database is performed with the pread64() (or read()) and pwrite64() (or write()) system calls.

In this mode, all database I/O operations will still use the filesystem buffers, but writes are handled in a different manner than without the -directio option in effect. A pwrite64() (or write()) system call does not complete until after the data have been transferred to the disk by the operating system's device drivers. As a result, writes take longer and the Page Writers take longer to do their job, but the overall disk write workload should be more evenly distributed and have fewer spikes.
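
As a minimal sketch of synchronous writes, assuming a POSIX system and using Python's os module: the file is opened with O_DSYNC where the platform defines it and O_SYNC otherwise. The block size, file name and helper names are hypothetical and chosen only for the example.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # hypothetical block size, for illustration only

# Prefer O_DSYNC where the platform defines it, otherwise use O_SYNC.
SYNC_FLAG = getattr(os, "O_DSYNC", os.O_SYNC)


def open_synchronous(path):
    """Open a file so that every write is a synchronous write."""
    return os.open(path, os.O_RDWR | os.O_CREAT | SYNC_FLAG, 0o600)


def write_block_sync(fd, block_no, data):
    """Positional write; with the sync flag set, the call does not return
    until the operating system reports the data as transferred to the device."""
    return os.pwrite(fd, data, block_no * BLOCK_SIZE)


def demo_sync_write():
    """Write one block synchronously and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.d1")  # hypothetical extent name
        fd = open_synchronous(path)
        try:
            data = b"x" * BLOCK_SIZE
            written = write_block_sync(fd, 1, data)
            return written, os.pread(fd, BLOCK_SIZE, BLOCK_SIZE) == data
        finally:
            os.close(fd)
```

No separate flush call is needed after the write, which is the property the storage manager relies on in this mode.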

When using synchronous writes, the storage manager does not need to use the sync() system calls at the end of each checkpoint. In modern UNIX systems, this can be quite important as sync() calls can be expensive.

Modern systems often have large amounts of memory, multiple disk drives and very large filesystem buffer caches. When the database storage manager makes a sync() call, the operating system writes all modified file pages to disk. This includes all database pages present in the filesystem cache as well as other pages. Flushing the filesystem cache can be time consuming and can cause noticeable delays in system and database activity.

By using the -directio option, one gains the following beneficial effects:
  • Expensive sync() calls are eliminated, along with unnecessary i/o caused by flushing data that has nothing to do with the database.
  • Overall disk write scheduling is more even because writes occur when the Page Writers need them to and they try to organize their activities to provide as even a write rate as they can.
To get these benefits, more Page Writers are needed than when the -directio option is not in effect. If -directio is used without increasing the number of Page Writers, or with none at all, you will probably see a decrease in overall performance. A "rule of thumb" is to use one Page Writer per disk that contains database data files, plus one extra. More or fewer may be needed depending on the operating system, number of users, and application. A system with a light update workload (one in which the application does not update the database very much) will need fewer Page Writers because fewer database writes need to be done. With a Workgroup or Personal database license, APWs are not available, so the duration of checkpoints, for example, would increase: the first process that needed to move to a new BI cluster would first have to write all database buffers that were referenced in that BI cluster. Consider upgrading to an Enterprise Database License if performance is a priority.
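
The rule of thumb above is simple arithmetic and can be written down as a small helper. The function name and the extent-to-disk mapping are hypothetical; the result is only a starting point to be adjusted after testing.

```python
def suggested_page_writers(extent_disks):
    """Starting number of Page Writers: one per distinct disk that holds
    database data files, plus one extra (rule of thumb only)."""
    disks = set(extent_disks.values())
    if not disks:
        raise ValueError("at least one data extent is required")
    return len(disks) + 1


# Hypothetical layout: four data extents spread over three disks.
layout = {
    "sports_7.d1": "disk1",
    "sports_7.d2": "disk2",
    "sports_8.d1": "disk2",
    "sports_9.d1": "disk3",
}
starting_apws = suggested_page_writers(layout)
```

For the layout shown, three distinct disks give a starting point of four Page Writers.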

Database I/O With -directio - Practice

The previous section describes how -directio works in theory. In theory, there is no difference between theory and practice, but in practice, there is. Along with the advantages, there are a few disadvantages as well.

When -directio is not being used, the filesystem schedules write operations at times of its choosing and also tries to coalesce writes to adjacent filesystem pages when possible. This coalescing can reduce the number of disk seeks and disk writes. When -directio is used, the filesystem's write coalescing is largely eliminated for database writes and this may result in lower performance.

The -directio option gives different results with different versions of the UNIX operating system, different filesystems and occasionally with different releases of the same operating system. The -directio option is not suitable in all cases. In some cases there is no benefit, and in others there is severe performance degradation. In particular:
  • On AIX systems (release 4.3 and later), -directio has been beneficial in many situations and has not been known to cause problems.
  • On Red Hat Linux systems up to Red Hat 9, -directio often provides no benefit.
  • The use of -directio on HP-UX is not recommended. On HP-UX systems release 11.0 and later, customers have experienced a variety of problems caused by defects in the implementation of the pread64() and pwrite64() system calls. There are several patches available from HP to correct these problems. The effect of these defects is a severe degradation in write performance, even when -directio is not being used.
While the -directio option can be very beneficial, it is not always so. In all cases, -directio should only be implemented in production after performing tests to determine whether it is helpful in a specific environment. For example, if a large write-heavy update procedure consistently runs for some duration X without -directio, and it consistently runs with -directio in a duration that is less than X by a statistically significant amount, then -directio may be helpful.
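
One way to judge whether a difference in run times is statistically significant is an exact permutation test on the measured durations. The sketch below is one possible approach, not a method prescribed by Progress; the function name and the timings (in seconds) are invented for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(without_directio, with_directio):
    """One-sided exact permutation test.

    Returns the fraction of ways to relabel the pooled runs for which the
    'without' group is slower than the 'with' group by at least the
    observed difference in mean duration.
    """
    observed = mean(without_directio) - mean(with_directio)
    pooled = list(without_directio) + list(with_directio)
    n = len(without_directio)
    m = len(pooled) - n
    total = sum(pooled)
    relabelings = 0
    as_extreme = 0
    for idx in combinations(range(len(pooled)), n):
        group_sum = sum(pooled[i] for i in idx)
        diff = group_sum / n - (total - group_sum) / m
        relabelings += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            as_extreme += 1
    return as_extreme / relabelings


# Hypothetical durations of the same update procedure, five runs each.
runs_without = [612, 598, 605, 620, 609]
runs_with = [571, 580, 566, 575, 583]
p_value = permutation_p_value(runs_without, runs_with)
```

A small p-value (conventionally below 0.05) suggests the improvement is unlikely to be run-to-run noise. The exact enumeration is only practical for a handful of runs per configuration; with many runs, a sampled permutation test or a standard two-sample test would be the usual substitute.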

In the above discussion, fdatasync() is used on UNIX; FlushFileBuffers() is used on Windows.
Last Modified Date: 11/20/2020 7:30 AM