Restart the entire server, then confirm whether the corruption existed only in RAM or cache by running DBSCAN or by performing a full offline PROBKUP to the null device, which verifies the blocks without writing a backup volume. The advantage of:
- PROBKUP over DBTOOL is that all blocks will be read by the backup, whereas DBTOOL Option 5 only scans data blocks.
- PROBKUP over DBSCAN is that after-imaging is not disabled.
- DBSCAN over PROBKUP is that DBSCAN reports every corrupt block, not only the first one encountered, and also reports record corruption.
$ probkup dbname NUL -verbose (Windows)
$ probkup dbname /dev/null -verbose (UNIX)
$ proutil dbname -C dbscan > dbscan1.out
1. If no corruption is reported, the issue was caused by the underlying memory or cache on the system. The IT administrators of the system must investigate this further before the database is restarted.
2. If corruption is reported, the block corruption was propagated to disk. Check the disk subsystems before proceeding with restore operations on this system; otherwise the instructions below may not be able to make the corrections persist on disk.
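The verification pass above can be wrapped in a small script so that the same two commands are used on either platform. This is a minimal sketch, not part of the original procedure: the function names null_device and print_verify_cmds are hypothetical, the platform check through uname assumes a POSIX-style shell (on Windows, for example, a Cygwin or MSYS shell rather than PROENV), and the commands are only printed so that they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: choose the null device for the current platform.
# Assumption: a Windows POSIX shell reports CYGWIN*, MINGW* or MSYS*.
null_device() {
    case "$(uname -s)" in
        CYGWIN*|MINGW*|MSYS*) echo "NUL" ;;
        *)                    echo "/dev/null" ;;
    esac
}

# Hypothetical helper: print (do not run) the two verification commands
# from this article for the database name given as the first argument.
print_verify_cmds() {
    db="$1"
    echo "probkup $db $(null_device) -verbose"
    echo "proutil $db -C dbscan > dbscan1.out"
}

print_verify_cmds dbname
```

Printing instead of executing keeps the sketch safe to try on a machine where the database is not yet offline.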
To resolve database block corruption
The first recommendation is to restore the last full backup of the database and then, if running after-imaging (AI), roll forward to the point just before the corruption occurred. Another option is to fail over to the replication target database, because block corruption cannot be propagated by AI note forward processing.
If this is not possible, then try to repair the corruption. This action is not recommended: the repair options selected will cause data loss, and they repair only physical integrity, not logical integrity (for example, parent-child relationships such as order and orderline, or business logic such as an invoice without a delivery note). Additionally, if the corruption is in the database "Schema Area", deleting meta-schema blocks or records is likely to leave the database unrecoverable.
This method is however sometimes necessary, for example when going back to a good backup is not possible.
0. Make an OS backup of the damaged database to another location.
1. On Windows, set the screen buffer to the maximum of 9,999:
Open a PROENV shell: START > PROGRAM FILES > PROGRESS > PROENV
To change the value of the screen buffer, click the upper-left corner of the PROENV window, go to PROPERTIES > LAYOUT section > SCREEN BUFFER SIZE box, and set HEIGHT to 9,999.
This way all of the DBRPR output can appear on the screen.
Alternatively, on UNIX or Windows the output can be redirected to a file; refer to Note B below. What matters is being able to review all of the output, which is not written to the .lg file.
2. Try to truncate the BI file:
$ proutil <db name> -C truncate bi
If this step was successful, move to Step 4.
3. When BI recovery is not possible, skip crash recovery:
Reconsider reverting to backup or, before proceeding, understand what skipping crash recovery means.
Open the DBRPR repair menu by forcing access with -F:
$ proutil <db name> -C DBRPR -F
If this step was used, skip Step 4 and go to Step 5.
4. Run the DBRPR repair utility:
$ proutil <db name> -C DBRPR
5. Select Option 1: Database Scan Menu
Then select the following options to scan all areas. To restrict the scan to a single area instead, refer to Note B.
ON 1. Report Bad Blocks
ON 3. Reformat Bad Blocks
ON 4. Report Bad Records
ON 5. Delete Bad Records
ON 7. Rebuild Free Chain
ON 8. Rebuild RM Chain
ON A: Apply scan to all areas <---- V9 and later, see Note B below
Scan Backward (Yes/No)? n
6. Run another DBSCAN
This verifies that there is no further block corruption in other areas and no record fragmentation caused by deleting corrupt blocks. This step is imperative if crash recovery had to be skipped in Step 3.
$ proutil dbname -C dbscan > dbscan2.out
a. If dbscan is clean, rebuild all indexes:
$ proutil <dbname> -C idxbuild all -B 32000 -TB 64 -TM 32 -TMB 128 -TF 60 -SG 64 -thread 1 -threadnum 4 -mergethreads 4 -datascanthreads 4 -pfactor 90 -rusage
Beginning in OpenEdge 10.2B06, the additional parameters listed above improve IDXBUILD speed.
b. If dbscan reports further errors:
Option 1: An ASCII dump and load will be required to recover the data that remains accessible. Tables that fail to ASCII dump will need to be dumped by skipping over the corrupted records. Further steps are needed to retrieve the skipped records, if they can be found in prior backup copies.
Option 2: Eliminate corruption by reformatting bad blocks in the current database.
Refer to the instructions provided in Article 000016827, "Possible method to fix an overlapped record: Bad record size Records overlap".
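The branch between Steps 2, 3 and 4 above can be sketched as a shell function. This is an illustration under stated assumptions, not the documented behavior of the utility: the function name choose_dbrpr_cmd and the log file name truncate_bi.out are hypothetical, and the sketch assumes that proutil exits with status 0 when the BI truncate succeeds and with a non-zero status when it fails. The DBRPR command itself is only printed, so the decision to force access with -F stays with the operator.

```shell
#!/bin/sh
# Hypothetical helper: try Step 2 and print which DBRPR command applies.
# Assumption: proutil returns 0 on a successful "truncate bi" and
# non-zero otherwise; verify this on the installed OpenEdge version.
choose_dbrpr_cmd() {
    db="$1"
    # Keep the utility's own output in a log so it can be reviewed later.
    if proutil "$db" -C truncate bi > truncate_bi.out 2>&1; then
        # Step 2 succeeded: DBRPR can be opened normally (Step 4).
        echo "proutil $db -C dbrpr"
    else
        # Step 2 failed: DBRPR opens only with -F (Step 3), which
        # skips crash recovery. Printed only, never run from here.
        echo "proutil $db -C dbrpr -F"
    fi
}
```

Calling `choose_dbrpr_cmd <db name>` after the OS backup in Step 0 prints one of the two commands; truncate_bi.out then holds the messages that explain why the truncate failed, if it did.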
A. It is recommended to try to use the last backup and roll forward. The repair procedure is not supported and should be used only as a last resort for restoring production data. On the other hand, this method can be useful for reinstating data in the restored database by running back-end data exports.
B. To automate the DBRPR process, the menu options can be fed from an input file, which is attached to this Article: dbrpr.in
$ proutil <dbname> -C dbrpr < dbrpr.in > dbrpr.out
It is recommended to test the input file against a test database running the same OpenEdge version, as menu option numbers may vary between versions.
Additionally, the instructions above include "A" for all database areas. If block corruption is known to be restricted to a particular area, time can be saved by modifying the input file so that it does not select "A" for all areas and instead first goes to OPTION >> 10. Change Current Working Area, and selects the related area number presented. For example, if corruption is only in Area 30:
>> DBRPR: 1 /
>> 10, 30
C. If the database completes BI crash recovery successfully and block corruption is limited to an index-only storage area:
- This can be confirmed by running a scan-only report that shows block corruption only in this area: proutil <dbname> -C dbscan > scandbrpr.out
- Truncate the index area with: proutil dbname -C truncate area "<area name>"
- Run index build on the affected area.
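For Note B, an input file can also be generated rather than edited by hand. The following is a minimal sketch under stated assumptions: the function name make_dbrpr_input and the file name dbrpr.test.in are hypothetical, the file is assumed to hold one menu keystroke per line, and the result is not the dbrpr.in attached to this Article. Only the keystrokes listed in Step 5 are used. The keys that start the scan and leave the menus are not listed in this Article and are deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper: write one DBRPR menu keystroke per line.
# Usage: make_dbrpr_input <outfile> <key> [<key> ...]
make_dbrpr_input() {
    out="$1"
    shift
    # printf repeats the format for each remaining argument.
    printf '%s\n' "$@" > "$out"
}

# Keystrokes from Step 5: enter the Database Scan Menu (1), switch on
# options 1, 3, 4, 5, 7 and 8, then apply the scan to all areas (A).
# Assumption: append the keys that run the scan and exit the menus
# only after checking them on a test database of the same version.
make_dbrpr_input dbrpr.test.in 1 1 3 4 5 7 8 A
```

Once the remaining keys have been confirmed and appended, the generated file can be passed to DBRPR with the same input redirection shown in Note B, preferably against a test copy first.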