Feedback
 
Did this article resolve your question/issue?

   

Your feedback is appreciated.

Please tell us how we can make this article more useful. Please provide us a way to contact you, should we need clarification on the feedback provided or if you need further assistance.

Characters Remaining: 1025

 


Article

Exceeding index key size limitations when migrating database to UTF-8

Information

 
Article Number000042387
EnvironmentProduct: OpenEdge
Version: Progress 9.x
OpenEdge 10.x, 11.x
OS: All Supported Operating Systems
Other: N/A
Question/Problem Description
Index key values whose size is currently pushing the limit may cause errors to be returned when running idxbuild on the database after converting it to UTF-8, or when saving a value to the index in a UTF-8 database.
 
Steps to Reproduce
Clarifying Information
Error MessageThe total length of the fields in an index exceeds max key size. (129)
ERROR: The length of key exceeds the maximum size. (11494)
SYSTEM ERROR: Failed to build key for recid = <RECID> index number <index-num> table number <table-num>. (4431)
Defect/Enhancement Number
Cause

This is expected behavior.  The size of the key is measured in bytes.  When converting data from one codepage to another, the size of the data may increase due to the codepage and collation used.

For a single-byte codepage, one character is represented in an index key with one byte.  For UTF-8 with BASIC collation, which contains some non-ASCII data, each character could be longer than one byte.

When using ICU collations then there is a much larger impact on key size, to store multi-level comparison data in the key.

When an index key is created, the limit is 2000 bytes (this used to be 200 bytes, but was raised within the OE10 timeframe).  How many characters this is depends on the codepage and collation, and how many components (fields) there are in the index.

 
Resolution

Depending upon how the data are searched in the index and the number of fields, the indexing strategy should be modified to compensate.  Below is a list of possibilities for how this could be handled.  This list does not account for all possible index usage strategies however.

If the index is comprised of a single character field, consider:
- Using a word index
- Base64 encoding the index value and changing the index to use the base64 encoded version of the string for comparison
- Use some other derived key value to identify the record

If the index is comprised of multiple fields, consider breaking them out into multiple indices.

 
Workaround
Notes
References to Other Documentation:
Progress Article(s):
000010885, What are the index limits in OpenEdge?
000067998, What are the size limits for indexing and sorting?
 
Attachment 
Last Modified Date11/11/2019 6:52 PM