Article

DBDOWN when PHYSICAL index block corruption is reported.

Information

 
Title
DBDOWN when PHYSICAL index block corruption is reported.
URL Name
000040449
Article Number
000163063
Environment
Product: OpenEdge
Version: 10.1B02, 10.2x, 11.x
OS: All Supported Operating Systems
Other: -DBCheck
Question/Problem Description
DBDOWN when PHYSICAL index block corruption is reported.
INDEX corruption 4423 2816 14037 14031 14036
Index corruption causes the compression check to fail with error 4423
Database started with -DBCheck.
The database contains data whose recids are larger than the 32-bit limit (2147483647) and indexes that cross the 32-bit boundary.
IDXCHECK with the physical consistency check only (option 1 in Validation (o/O)) reports no errors. This option was used because the errors have always reported PHYSICAL index block corruption, not content corruption.
After IDXBUILD the same corruption re-appears as soon as application processing is resumed.
When database failure occurs, the first transaction and sub-transaction are still open.
Steps to Reproduce
Clarifying Information
PROTRACES from _pro* read:

cxRemoveEntry
cxRemoveBlock
cxRemove
cxDeleteNL
cxDeleteEntry
dsmKeyDelete
proixdel

cxRemoveEntry
cxRemoveBlock
cxRemove
cxDelLocks
cxRemoveFromDeleteChain
cxPut
cxAddNL
cxAddEntry
Error Message
SYSTEM ERROR: Index <index-num>, block <dbkey>, element no. <element-num>: bad compression size. (4423)
SYSTEM ERROR: Index 2102, block 2151312032, element no. 1: bad compression size. (4423)

prev size = , cs = , ks = , is = , key count = . (2816)
prev size = 1, cs = 128, ks = 58, is = 63, key count = 1 (2816)

Index <Arg1> block validation error data: nment is <Arg2>, nlength is <Arg3>, level is <Arg4>, current key is <Arg5>, offset is <Arg6>, func is <Arg7> (14037)
Index 2086 block validation error data: nment is 107, nlength is 1460, level is 0, current key is 1, offset is 8, func is cxDoDelete (14037)

Invalid Index Block Detected (14031)
SYSTEM ERROR: Invalid Index Block FATAL (14036)

Database block consistency check: Enabled (-DBCheck) (14016)
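
When triaging a database log, it can help to pull the index number and block dbkey out of each 4423 message. The following is a minimal sketch, assuming the log lines match the 4423 format shown above; the function name `parse_4423` and the `block_over_32bit` field are hypothetical, introduced only for illustration. Note that a block dbkey above 2147483647 is only a hint: as described under Workaround, the false alert depends on the entries inside the block, not on the dbkey of the block itself.

```python
import re

# 32-bit boundary quoted in the problem description.
MAX_32BIT_DBKEY = 2147483647

# Matches message 4423 in the format shown in the Error Message field.
ERR_4423 = re.compile(
    r"Index (?P<index>\d+), block (?P<block>\d+), "
    r"element no\. (?P<element>\d+): bad compression size\. \(4423\)"
)


def parse_4423(lines):
    """Return one dict per 4423 message found in the given log lines."""
    hits = []
    for line in lines:
        match = ERR_4423.search(line)
        if match is None:
            continue
        block = int(match.group("block"))
        hits.append({
            "index": int(match.group("index")),
            "block": block,
            "element": int(match.group("element")),
            # Hint only: True when the reported block dbkey is past 32 bits.
            "block_over_32bit": block > MAX_32BIT_DBKEY,
        })
    return hits


sample = [
    "SYSTEM ERROR: Index 2102, block 2151312032, element no. 1: "
    "bad compression size. (4423)",
    "Invalid Index Block Detected (14031)",
]
print(parse_4423(sample))
```

For the sample message from this article, the block dbkey 2151312032 is above the 32-bit boundary, so `block_over_32bit` is true.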
Defect Number
Defect PSC00254283 / OE00232267
Enhancement Number
Cause
When -DBCheck is enabled at database startup, it incorrectly validates an index structure that contains both 32-bit and 64-bit key components while a delete is being processed and the structure is still in transition, which is an inappropriate time for the check to run.

If an index block contains both 32-bit and 64-bit dbkeys, the DBCheck routine miscalculates the compression by 2 bytes, triggering a false alert.
The problem is due to DBCheck performing its validation while the block is still in a partially updated state.
The validation runs while entries are being swapped between keys that store a 32-bit value and a 64-bit value and one of them is being deleted. Because the validation runs in the middle of the delete, it uses the wrong key size, which causes the error. The validation result itself is therefore invalid.

Apart from raising the fatal message, the validation does not corrupt the block, and no block corruption exists despite what the error says. The purpose of the -DBCheck database startup parameter is to catch corruption before it is committed. In this case it reacts to a false alert and triggers an emergency shutdown as a consequence. It does not create corruption; it falsely reports a fatal error when, during a delete, it encounters an index block whose keys hold both 32-bit and 64-bit values.

64-bit dbkey support was introduced in OpenEdge 10.1B, before the -DBCheck feature existed; -DBCheck was added in a 10.1B service pack. This problem has therefore been in the product since the -DBCheck database startup parameter was introduced.
Resolution
If using the -DBCheck database startup parameter, upgrade to OpenEdge 10.2B08, 11.2.1, or 11.3.0, where this validation check has been fixed.
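
The fix levels listed above can be expressed as a small check, for example in a deployment script that decides whether -DBCheck is safe to enable. This is only a sketch: the function `has_dbcheck_fix` and its input format (a release string such as "10.2B" or "11.2" plus an integer service pack) are hypothetical, and it assumes that releases later than 11.3.0 also carry the fix, which this article does not state explicitly.

```python
def has_dbcheck_fix(release, service_pack=0):
    """Return True if the release is at or above a fix level from this article.

    release      -- hypothetical format, e.g. "10.2B", "11.2", "11.3"
    service_pack -- integer service pack / update level, e.g. 8 for 10.2B08
    """
    if release == "10.2B":
        return service_pack >= 8      # fixed in 10.2B08
    if release == "11.2":
        return service_pack >= 1      # fixed in 11.2.1
    major, _, minor = release.partition(".")
    if major.isdigit() and minor.isdigit():
        # Assumption: 11.3.0 and every later numeric release has the fix.
        return (int(major), int(minor)) >= (11, 3)
    # Lettered releases other than 10.2B (10.1B, 10.1C, 10.2A, ...) are
    # treated as not fixed.
    return False


for release, sp in [("10.2B", 7), ("10.2B", 8), ("11.2", 0), ("11.2", 1), ("11.3", 0)]:
    print(release, sp, has_dbcheck_fix(release, sp))
```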
Workaround
It is not easy to identify with certainty which databases will be affected by this problem. Because the consequence is a production-down situation, do not use the -DBCheck database startup parameter until the fix is in place if the database contains recids over the 2 billion boundary. It is not necessary to run IDXBUILD unless other corruption is detected.

One method would be to run dbrpr to display the cluster links (by block dbkeys) of index objects in Type II storage areas. An index that contains blocks with 64-bit dbkeys could hit the problem, but this is not 100% deterministic, as the report does not show which blocks have first and second entries of different sizes.

Data over the 32-bit boundary does not imply that indexes are as well. For example, if the indexes are located in different areas, they could still be under the 32-bit boundary. It is also not about the dbkey of the block itself. The affected blocks are those whose first and second key entries have different dbkey sizes, that is, one is 32-bit and the other is 64-bit. Other entries in the block can be either 32-bit or 64-bit. Such blocks can suffer from the false validation check whether their own dbkeys are 64-bit or not.

This problem only affects the 32<->64 dbkey boundary in index blocks, and it only affects non-leaf blocks. Rowid values stored in entries in leaf blocks point to records, so record blocks at the 32->64 dbkey boundary do not suffer from the problem. When the problem happens, the offset is incorrectly shifted, causing validation to fail on all entries, which results in the error.
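
The condition described above can be stated as a short predicate. This is a conceptual model only, not a tool that reads real database blocks: the `IndexBlock` class, its fields, and the function `may_trigger_false_alert` are hypothetical names, and the per-entry dbkey sizes would have to come from some other source, since the dbrpr report does not show them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexBlock:
    """Hypothetical model of an index block for illustration."""
    is_leaf: bool
    entry_dbkey_bits: List[int]  # 32 or 64 for each entry, in block order


def may_trigger_false_alert(block: IndexBlock) -> bool:
    """True if the block matches the condition described in this article."""
    # Leaf blocks are not affected.
    if block.is_leaf:
        return False
    # The condition needs a first and a second entry to compare.
    if len(block.entry_dbkey_bits) < 2:
        return False
    first, second = block.entry_dbkey_bits[:2]
    # Only the sizes of the first two entries matter; later entries may be
    # any mix of 32-bit and 64-bit.
    return first != second


print(may_trigger_false_alert(IndexBlock(False, [32, 64, 64])))
print(may_trigger_false_alert(IndexBlock(False, [64, 64, 32])))
print(may_trigger_false_alert(IndexBlock(True, [32, 64])))
```

Only the first call matches the condition: the block is non-leaf and its first two entries differ in dbkey size.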
Notes
Last Modified Date
3/10/2014 12:15 PM