Row Identifiers, RECID and ROWIDTo understand how Progress retrieves rows, it is important to understand what
row identifiers are and how they are used as part of an index.
A "rowid" is a row identifier that uniquely identifies that row (or data record) in a database.
- As soon as a row is created, a row identifier is assigned to it by allocating a suitable storage location (dependent on its size as well as other factors) for the record in its home storage area.
- The record is written to that data block at that time.
- Subsequent changes to the record do not change its location or ROWID. In other words this record is identified by this assigned identifier and stays the same for the row's entire life.
- Even when the primary index key values are changed, the ROWID for the record remains the same.
- Once a row is deleted, a new row will re-use the rowid identifier. Likewise, deleting and re-creating a row can result in the row getting a new ROWID as will a dump and load.
- ROWIDs should never be relied on for referential integrity, using an INT64 data type instead allows the field value stored to be preserved
A RECID is a 32-bit/64-bit positive integer which contains an encoding of the storage address of a record within a Progress database storage area.
- A RECID is essentially a pointer to the first record fragment of the record, the rowid described above. When this record spans more than one database block, the first record fragment reference holds a reference to the next fragment, the next fragment holds a reference to the following fragment and so forth.
- RECID's can be converted to integers and vice versa. How to change a RECID Data Type to INT64
- OpenEdge 10.1B introduced 64-bit keys to allow for more records per storage area. For further information see Article When does the 2 billion rowid limit per Storage Area apply?
A ROWID is a variable-length identifier whose semantics and length vary depending on the database type.
- For OpenEdge databases, a ROWID is a hexadecimal representation of the RECID and was introduced in Progress 8.
- For foreign databases accessed through a DataServer, the ROWID is a representation of the foreign database's rowid implementation. RECID's are too small to hold Oracle ROWID values and similar constructs in other data managers.
- ABL Temp-tables follow the same rules as the OpenEdge database, as for the most part they use the same storage mechanisms.
- The distinction between RECID and ROWID only applies in ABL. In Progress/OpenEdge SQL, the value returned by the ROWID function is the one referred to here as RECID. When using the getValue() function in a stored procedure or trigger, it should be retrieved as an INTEGER into a Java Integer object. The value referred to here as ROWID is not used in Progress/OpenEdge SQL.
- ROWID's are not integers. They are variable length binary strings which cannot be converted to integers, except by chicanery.
- ROWID's and RECID's are not interchangeable in ABL code even when they are converted to integers. For further information refer to Article 4GL. How to convert a ROWID to a RECID and Vice Versa
How Progress retrieves rows ?In order for Progress to retrieve any row efficiently, it needs its row identifier. These are located via the table's indexes, which will store the index key(s) and for every key a list of row identifiers for the records that contain that key field.
When no index is added to a table, Progress automatically creates a default primary index in the Schema Area with the row identifier as its component. Essentially the default index is a unique index on the row identifier. As soon as an index is added to the table’s schema, the default index is deleted.
If the row identifier is already known reading the index can be skipped, which can often be used to access a record faster. This is done by using the RECID/ROWID function in the query's WHERE clause, example:
FIND FIRST CUSTOMER WHERE ROWID(CUSTOMER) = cust-rowid-variable.
For Progress databases, while a record's ROWID never changes during the lifetime of a record, the ROWID only lasts as long as the record does. If the record gets deleted the ROWID becomes available for re-use for another new record. And if that record is then re-created there is no guarantee it will get the same ROWID.
For DataServer environments the rules depend on the foreign database implementation. The safe assumption is that over longer periods of time the ROWID can change as well. Therefore as best practice the ROWID values can be used within a single session, but should not be stored beyond that session, The first read of a record should always be by it's key fields in the data to fetch the current ROWID for the session.
With the implementation of table partitioning in OpenEdge 11.4:
- RECID can no longer be used to uniquely reference data in a partitioned table as they do not contain a partition-Id component and always pertain to partition 0.
- The existing ROWID information plus the table number is no longer sufficient to uniquely identify a specific record in the database.
- Uniqueness is obtained by also including the partition id associated with the record.
- The internal format of a ROWID has been extended to include the partition information.
- The information associated with a RECID only contains the encoded block and record within block information.
- An example is provided in Article: How to recover records in a damaged partitioned table
The implementation of the ROWID representation may be different in future releases, also depending on database type. The changes would not affect functionality but would be aimed at improving performance and scalability