
Why does deleting records in a table lead to index fragmentation?

Information

 
Article Number: 000010229
Environment:
Product: Progress
Version: 9.x
Product: OpenEdge
Version: 10.x, 11.x
OS: All supported platforms
Question/Problem Description
Why does deleting records in a table lead to index fragmentation?
Why does IDXANALYS show heavy fragmentation after a large record delete?
When to run IDXCOMPACT?
Why are index entries occupying up to 10 times more physical space after mass record deletion?
Is it possible to view the contents of the index tree?
Resolution
When records are deleted, index trees are not rebuilt; the entire B-tree is left intact. This behavior was introduced in Progress 9.x, where index entry locks are not removed at the end of the transaction:
  • During a transaction, when a record is deleted, if there is a unique index, a delete placeholder is put in the place of the RECID for the deleted record until the transaction commits. This is done so that another transaction running at the same time cannot add the same unique entry, which would otherwise prevent the deleting transaction from rolling back.
  • At the end of the transaction, the placeholder is converted to a negative number and left in place of the RECID in the index entry. In other words, Progress treats these index entry locks as index entries during subsequent read operations.
  • The blocks these placeholders point to are never read, because the RECID is higher than any possible High Water Mark (HWM) for that area, but each placeholder visited is still a logical request.
  • Deleting a large number of records from a table in a short time span therefore leaves the index highly fragmented and can cause query performance issues. When read queries traverse the entries in an index bracket, the placeholders are skipped over; when a bracket contains a large number of placeholders, possibly spread across multiple index blocks (as often happens after a mass delete), this causes noticeable delays and extra index block reads.
  • These placeholders are cleaned out the next time the index block that contains them is modified and written to disk. However, if there are no further modifications to that index bracket, the placeholders remain.
  • As a result, corrective actions should be considered after large delete operations to reduce the number of index blocks, increasing index utilization and therefore performance:
    1. Ideally, an IDXBUILD operation on the affected table(s) corrects this immediately, but the utility must be run offline.
    2. If the database cannot be taken offline, after the delete operation:
    • The "proutil -C idxcompact" utility can be run online or offline to reduce the number of blocks in the B-tree and possibly the number of B-tree levels, or
    • Use "proutil -C idxmove" to move the indexes to another storage area (and optionally back to the current area). This rebuilds the indexes of the remaining records in the table.
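The read-side cost described above can be illustrated with a toy model (this is not Progress internals, just a sketch): index blocks hold a fixed number of entry slots, deleted entries become negative-RECID placeholders that keep their slots, and a reader still has to walk every block in the bracket to find the few live entries.

```python
# Toy model of an index bracket split across fixed-capacity blocks.
# Placeholders (negated RECIDs) still occupy slots, so a reader must
# visit every block even when most entries are dead.

BLOCK_CAPACITY = 4  # entries per index block, kept tiny for the example

def blocks_scanned(entries, capacity=BLOCK_CAPACITY):
    """Return (blocks visited, live entries found) for one bracket scan."""
    blocks = [entries[i:i + capacity] for i in range(0, len(entries), capacity)]
    live = [e for block in blocks for e in block if e > 0]
    return len(blocks), live

# A bracket of 8 RECIDs before any deletes: 2 blocks, 8 live entries.
before = [101, 102, 103, 104, 105, 106, 107, 108]
print(blocks_scanned(before))

# After a mass delete, 6 entries are placeholders, yet the reader still
# walks both blocks to find only 2 live entries.
after = [101, -102, -103, -104, -105, -106, -107, 108]
print(blocks_scanned(after))
```

The block and slot sizes here are arbitrary; the point is that the number of blocks read is driven by total slots occupied, not by live entries.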
How idxcompact helps is as follows:
  1. If the index is a unique index, the delete chain is scanned and the index blocks are cleaned up by removing deleted entries.
  2. All non-leaf levels of the B-tree are compacted starting at the root working toward the leaf level.
  3. The leaf level is compacted.
In other words, besides compacting an index, the idxcompact utility also clears the dead entries left behind after entries have been deleted from unique indexes.
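The leaf-level step can be sketched with the same toy model as above (again, an illustration, not the actual utility): drop the placeholders and repack the live entries into blocks filled to the requested compaction degree.

```python
# Toy sketch of leaf-level compaction: remove placeholders (negative
# RECIDs) and repack live entries to a target fill factor, analogous to
# the degree argument "n" of "proutil -C idxcompact".

BLOCK_CAPACITY = 100  # slots per leaf block in this model

def compact(blocks, degree=80, capacity=BLOCK_CAPACITY):
    """Repack live entries into blocks filled to `degree` percent."""
    live = [e for block in blocks for e in block if e > 0]
    per_block = max(1, capacity * degree // 100)
    return [live[i:i + per_block] for i in range(0, len(live), per_block)]

# 10 fragmented blocks, each holding 10 live entries and 90 placeholders.
fragmented = [
    list(range(b * 100 + 1, b * 100 + 11)) + [-1] * 90
    for b in range(10)
]

compacted = compact(fragmented, degree=80)
print(len(fragmented), len(compacted))  # 10 blocks shrink to 2
```

A degree below 100 deliberately leaves free space in each block so that subsequent inserts do not immediately cause block splits, which is why 80 is a sensible default.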

It is strongly recommended to be on OpenEdge 10.1C04 or 10.2A01 or later before running IDXCOMPACT online. Earlier releases had a number of issues with running the IDXCOMPACT utility online, in particular excessive bi growth caused by the scope of the idxcompact operation being handled as a single transaction.

The command line argument for IDXCOMPACT is:

$  proutil dbname -C idxcompact tablename.indexname n

Where:
n specifies the degree of index compaction, an integer in the range 50 to 100 inclusive.
If not specified, the default value of 80 is used.

The command lines can be generated with an ABL procedure and then edited into a shell script, for example:
OUTPUT TO idxcomp.out.
FOR EACH _file WHERE _file-number > 0 AND
    NOT _file-name BEGINS "SYS" NO-LOCK:
    FOR EACH _index OF _file NO-LOCK:
        PUT UNFORMATTED "proutil dbname -C idxcompact " + _file-name + "." + _index-name + " 95" SKIP.
    END.
END.
OUTPUT CLOSE.
DISPLAY "file is ready for editing".

** As with any database update utility, it is suggested that a valid database backup be taken beforehand. **

Further Considerations:

If this delete operation is a regular feature of the application and results in full table content deletion (for example, the entire contents of a table are deleted at the end of every business day):
  • Consider moving the table and its associated indexes to a separate Storage Area, then replace the mass delete with the offline "proutil dbname -C truncate area" utility. Indexes are disabled as part of this operation and must be rebuilt afterwards, which is fast because no records remain.
Last Modified Date: 1/11/2019 12:21 PM

