Suggestions for investigating and identifying the cause of unexpected bi growth

Information

 
Title: Suggestions for investigating and identifying the cause of unexpected bi growth
URL Name: P29202
Article Number: 000137287
Environment:
Product: OpenEdge
Version: 10.x, 11.x, 12.x
Product: Progress
Version: 8.x, 9.x
OS: All supported platforms
Question/Problem Description
Suggestions for investigating unexpected bi growth
Investigating why bi files reach the 2 GB limit
Isolating the cause of bi file growth above 2GB when largefiles are enabled
How to narrow down the root cause of why a bi file is growing exponentially large?
How do I find out which client process is causing the BI file to grow unexpectedly?
How to stop the bi file from growing?
How to use Client Database Statement Cache to identify the cause of long running transactions?
 
Steps to Reproduce
Clarifying Information
Error Message:
SYSTEM ERROR: Attempted to exceed maximum size on file dbname.b1. (9452)
** Insufficient disk space to extend the before-image file. (854)
** Insufficient disk space to extend the before-image file. (2292)
Insufficient disk space. (4163)
Defect/Enhancement Number
Cause
Resolution
The following actions are preventative and corrective, to be taken after recovering the database from an unexpected bi growth situation. Transaction backouts are achieved by reading the before-image (BI) file, a file that records transaction notes at run time. The causes of exponential .bi growth can be manifold, but from the database perspective it is always due to long-running transaction(s) that lock their bi cluster and prevent it from being re-used at checkpoint time.

Corrective actions are therefore directed towards:
  1. Restricting bi growth and monitoring the database until the root cause has been addressed
  2. Identifying which transaction(s) contribute to the .bi growth and addressing the application code responsible.

A. Force the BI file to not grow beyond a specific size

Use the database startup parameter -bithold to restrict how large the bi file can grow, together with -bistall so that the database stalls at the threshold and likely candidates can be investigated immediately.
  • When the bi file approaches or eventually reaches the restricted size, the database engine prevents further growth by stalling the database until there is manual intervention. The threshold can then be raised with the proquiet bithreshold command (see the example after this list) to allow forward OLTP processing to continue while runtime investigation proceeds. Refer to Article: How can -bithold -bistall and the proquiet bithreshold be implemented?
  • It is inadvisable to use -bithold without -bistall while trying to find what is causing the bi growth: without -bistall, the database engine prevents further growth by shutting the database down instead. That removes the opportunity to isolate the open transactions, and eventually the application code in use, at runtime. If the database is enabled for after-imaging, clues can be extracted from the ai file content, but this is nowhere near as efficient as collating information while the situation is happening with the database online. Refer to Article: How to use an After Image file to find the user running a long transaction
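
For example, a minimal sketch of the startup parameters and of raising the threshold online afterwards with proquiet (the 1000 and 1500 values, expressed in MB, are illustrative assumptions only and should be sized for the environment and the bi cluster size):

$   proserve dbname -bithold 1000 -bistall
$   proquiet dbname bithreshold 1500
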
B. Add more bi file space

When largefiles cannot be enabled on databases running with Workgroup, Personal or Development licenses, consider the following:
  1. Since Progress 9.1C, the 2GB limit applies to individual bi extents
  2. For Progress 8 and earlier, the 2GB limit applies to the sum of the bi extents. While more bi extents can be added, for example to spread the bi across discs, this will not help in overcoming the limit.
  3. Even when largefiles have been enabled for the database, consider adding more fixed bi extents while the cause is being isolated (see the sketch below). In this way there is a quick and dirty alert: when the variable bi extent starts to grow, the fixed extents have already filled.
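
For illustration, a minimal sketch of adding two fixed bi extents with prostrct add; the directory path, the add.st file name and the 1 GB extent size (specified in KB) are assumptions to be adapted to the environment:

Contents of add.st:

b /dbpath/bi f 1048576
b /dbpath/bi f 1048576

$   prostrct add dbname add.st
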
C. Isolate long running transactions at runtime

Since long-running transactions trigger excessive BI growth, they can be found through a PROMON investigation against the running database:

$   promon dbname > R&D > 1. Status display > 4. Processes/Clients > 3. Active Transactions
$   promon dbname > R&D > 4. Administrative Functions > 1. Check Active Transaction Status


The two screens present slightly different information; however, both provide the details required to begin isolating potential candidates:
 
11/22/19        Status: Active Transactions by user number for all tenants      
16:36:53                                                                        
Usr:Ten   Name   .. Login time     Tx start time  Trans id .. Trans State       
                 ..                                        ..                   
  6       Ubatch .. 11/22/19 13:12 11/22/19 13:30    34673 .. Active  FWD   
  • A list of users with open, active transactions will be displayed; pay special attention to the transaction start times.
  • Start with the users whose active transactions have the oldest start times. It is these active transactions that are locking the bi clusters in which their transaction start notes were recorded, preventing those clusters from being re-used at checkpoint time. The database engine then has to add more bi clusters to the bi chain so that forward transaction processing can continue to be recorded, thereby extending the bi file.
  • As a general rule, a transaction time of over 10 minutes is considered unusual and should be investigated.
  • Once the application code has been identified (see below), it may be revealed that some piece of code is poorly scoped for transactions or records, or is running in a recursive loop, for example. On the other hand, transaction or record scope could be locking records needed by other client sessions, leading to transaction times in this area of the application code being longer overall than they need to be. Tuning -lkwtmo can alleviate this situation while the source of the problem is still being isolated (see the example after this list).
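
The -lkwtmo mentioned above is the Lock Wait Timeout database startup parameter. As a sketch only, assuming the timeout is reduced from its default of 1800 seconds to 300 seconds so that sessions queued behind held locks back out sooner:

$   proserve dbname -lkwtmo 300
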
Transaction information can also be found by interrogating the _Trans VST table:

Example 1: The longest-running transaction will be the one at the top of the results list:

FOR EACH _Trans NO-LOCK WHERE _Trans._Trans-State <> ? BY _Trans._Trans-Duration DESCENDING:

    /* The _Connect record may no longer exist if the user has disconnected */
    FIND FIRST _Connect NO-LOCK WHERE _Connect._Connect-Usr = _Trans._Trans-Usrnum NO-ERROR.
    DISPLAY STRING(TIME,"HH:MM:SS") LABEL "TIME NOW"
            _Connect._Connect-Name WHEN AVAILABLE _Connect
            _Connect._Connect-PID FORMAT "99999" WHEN AVAILABLE _Connect
            _Trans._Trans-Usrnum
            _Trans._Trans-Num
            _Trans._Trans-Duration
            _Trans._Trans-Txtime FORMAT "X(40)".
END.

Example 2: Report on all transactions that have been running for longer than a given time (600 seconds in this example):

DEFINE VARIABLE iTimeInSeconds AS INTEGER NO-UNDO INITIAL 600.

FOR EACH _Trans NO-LOCK WHERE _Trans._Trans-State EQ "Active"
                          AND _Trans._Trans-Duration GT iTimeInSeconds:

    FIND FIRST _Connect NO-LOCK WHERE _Connect._Connect-Usr = _Trans._Trans-UsrNum NO-ERROR.
    DISPLAY _Trans._Trans-Num _Trans._Trans-UsrNum _Trans._Trans-Duration
            _Connect._Connect-Name WHEN AVAILABLE _Connect.

END.

Once the long-running transaction number has been determined, the easiest way to match that transaction to the application code is to find out from the user what they were running at the time; otherwise, further diagnostics are listed below.

D. Isolate which sessions are making heavy writes to the bi file

When unexpected BI growth is currently being observed against a running database, the I/O Operations by Process screen in PROMON can be interrogated:

$   promon dbname -> R&D -> 3. Other Displays -> 2. I/O Operations by Process  
  • This screen lists the user sessions currently connected to the database and, for each user, the number of reads and writes to the database, BI and AI files.
  • Entering 'R' at the "Enter <return>, R, U, P, T, or X (? for help): " prompt refreshes the values.
  • By repeatedly refreshing the screen, users with a larger than expected growth in the number of writes to the BI file(s) should become apparent.
In OpenEdge 11.7, "BI RReads" and "BI RWrites" columns were added to the Active Transactions screen to help identify long-running transactions:

$   promon dbname > R&D > 1. Status display > 4. Processes/Clients > 3. Active Transactions

This information can also be obtained by querying the _UserIO VST table:
FOR EACH _UserIO NO-LOCK WHERE _UserIO-usr >= 1:
    DISPLAY _UserIO-Usr _UserIO-Name _UserIO-BiWrite.
END.
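
The cumulative counters alone do not show who is writing at this moment. A minimal sketch of sampling _UserIO-BiWrite twice and reporting the per-user difference (the temp-table and the 60-second interval are illustrative assumptions; INT64 requires OpenEdge 10.1B or later):

/* Snapshot the BI write counters, wait, then report who wrote in between. */
DEFINE TEMP-TABLE ttBiWrite NO-UNDO
    FIELD iUsr     AS INTEGER
    FIELD cName    AS CHARACTER
    FIELD iBiWrite AS INT64
    INDEX idxUsr IS PRIMARY UNIQUE iUsr.

FOR EACH _UserIO NO-LOCK WHERE _UserIO._UserIO-Usr >= 1:
    CREATE ttBiWrite.
    ASSIGN ttBiWrite.iUsr     = _UserIO._UserIO-Usr
           ttBiWrite.cName    = _UserIO._UserIO-Name
           ttBiWrite.iBiWrite = _UserIO._UserIO-BiWrite.
END.

PAUSE 60 NO-MESSAGE.   /* sampling interval - adjust as needed */

FOR EACH _UserIO NO-LOCK WHERE _UserIO._UserIO-Usr >= 1,
    FIRST ttBiWrite WHERE ttBiWrite.iUsr = _UserIO._UserIO-Usr:
    IF _UserIO._UserIO-BiWrite - ttBiWrite.iBiWrite > 0 THEN
        DISPLAY _UserIO._UserIO-Usr
                _UserIO._UserIO-Name
                (_UserIO._UserIO-BiWrite - ttBiWrite.iBiWrite)
                    LABEL "BI writes in interval".
END.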

E. Isolate the application code responsible

Once an ABL client process that is causing more BI writes than expected has been isolated:

Dump a protrace:

Since OpenEdge 10.1C, it is possible to determine which ABL program(s) that client process is currently running by executing the proGetStack command on the machine where the client process is running:

$   proGetStack <pid>
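
proGetStack needs the operating-system process id (PID) of the client and causes it to dump a protrace.<pid> file. If only the promon user number is known, a minimal sketch such as the following (user number 6 is an illustrative assumption) can map it to the PID via the _Connect VST. Note that for remote clients the PID belongs to a process on the client machine, so proGetStack must be run there.

/* Map a promon user number to the client's OS process id so that
   proGetStack can be run against it. Substitute the user number
   reported by promon or by the _Trans queries above.              */
DEFINE VARIABLE iUserNum AS INTEGER NO-UNDO INITIAL 6.

FIND FIRST _Connect NO-LOCK
     WHERE _Connect._Connect-Usr = iUserNum NO-ERROR.
IF AVAILABLE _Connect THEN
    DISPLAY _Connect._Connect-Usr
            _Connect._Connect-Name
            _Connect._Connect-PID
            _Connect._Connect-Device
            _Connect._Connect-Type.
ELSE
    MESSAGE "No connection found for user number" iUserNum
        VIEW-AS ALERT-BOX INFORMATION.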

Client Database-Request Statement Cache:

The statement cache makes it possible to see what the application code is doing from the database side. Use the user number of the user holding the oldest transaction to look up its cached statements:

$   promon dbname > R&D > 1. Status Displays > 18. Client Database-Request Statement Cache

F. Isolate application code transaction scope

Again, it is not necessarily a single procedure that needs identifying; it is the one or more transactions that are causing the .bi file to grow. When the code responsible for the bi growth has been isolated, the transaction scope in that code needs to be addressed, as the illustration below shows.
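
A hypothetical illustration of the difference transaction scope makes, using the Customer table from the sports2000 demo database (the update itself is meaningless and purely for illustration):

/* Scoped too widely: one transaction spans the entire FOR EACH, so every
   update stays part of a single open transaction until the loop ends and
   none of the bi clusters written along the way can be re-used.          */
DO TRANSACTION:
    FOR EACH Customer EXCLUSIVE-LOCK:
        Customer.Comments = "bulk update".
    END.
END.

/* Scoped per iteration: each update is its own small transaction, so the
   bi clusters holding earlier iterations can be re-used at checkpoint.   */
FOR EACH Customer EXCLUSIVE-LOCK TRANSACTION:
    Customer.Comments = "bulk update".
END.

The COMPILE listing described below makes exactly this difference visible, because it marks where each transaction block begins and ends.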

1. Compiling with a listing file helps to identify where the transaction blocks start and end. Compile the application code in the Application Compiler with a listing file:

Select OPTIONS -> Compiler -> "Listing File" = (for example) "listing.txt" and check "Append".

Or instead of using the Application Compiler, simply execute the COMPILE statement with the LISTING option:
 
COMPILE <program-name> LISTING <program-name>.lst.

The output of this action is a map of the code. By analysing this map, it becomes apparent where the blocks start and end, and from that detail the transaction scoping of the application can be investigated. Any large transactions that are identified are likely candidates.

2. At runtime (in versions since 10.1B), use the 4GLTrans log entry type to log the beginning and end of transactions that may have started or spread dynamically.

For example:
-clientlog <logfilename> -logentrytype 4GLTrans -logginglevel 3

G. Gather Application Runtime Metrics

Consider implementing the "use statistics with cross-reference" (-yx) startup parameter for a period of time; this gives a fair idea of where the busiest code is, in terms of the following (see the example after this list):
  • Long execution times and large numbers of calls, which suggest the code (including internal procedures and include files) needs revisiting.
  • Re-reads, where increasing the client's -mmax memory could help but will not resolve the transaction times.
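
For illustration, a minimal sketch of starting a character client with -yx; mainmenu.p is a placeholder for the application's actual entry-point procedure, and the statistics are written to a proc.mon file in the session's working directory:

$   mpro dbname -p mainmenu.p -yx
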
Workaround
Notes
Progress Articles:

For further information on how bi files grow:
Why is my .bi file growing so large ?   
Why the .bi file grows larger than original bi with roll forward?   

For further information on controlling bi file growth:
How can the -bithold -bistall and the proquiet bithreshold be implemented?   

To test the -bistall, -bithold and proquiet bithreshold implementation.
How to simulate a .bi grow to 2GB for testing purposes  
How to monitor before-image growth with Virtual System Tables   

Consider enabling largefile support if your OS accommodates large files and your Progress version supports this on your OS:
How to allow database extents larger than 2GB? (embedded video)   

For further information on the investigative tools mentioned:
How to identify users with long running transactions.   
How to find if a transaction is still open and who owns it with VST's ?   
The Code Use Analysis -yx -> proc.mon, Explained   
What is proGetStack?   
How to produce a stack trace for a running OpenEdge process without killing it   
What is Client Database-Request Statement Caching?   
Is there a VST for client statement cache?   

For further information on how to disconnect a user running a long-running transaction:
How to use PROSHUT to disconnect a user from command line?

To assist with determining how long bi recovery will take before production can be accessed:
How to estimate how long BI Recovery will take to complete   
 
Last Modified Date: 8/28/2020 4:08 PM
Attachment 
Files
Disclaimer The origins of the information on this site may be internal or external to Progress Software Corporation (“Progress”). Progress Software Corporation makes all reasonable efforts to verify this information. However, the information provided is for your information only. Progress Software Corporation makes no explicit or implied claims to the validity of this information.

Any sample code provided on this site is not supported under any Progress support program or service. The sample code is provided on an "AS IS" basis. Progress makes no warranties, express or implied, and disclaims all implied warranties including, without limitation, the implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample code is borne by the user. In no event shall Progress, its employees, or anyone else involved in the creation, production, or delivery of the code be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample code, even if Progress has been advised of the possibility of such damages.