Article

Database Server may crash when a protrace is generated

Information

 
Article Number: 000097446
Environment
Product: OpenEdge
Version: 10.x, 11.0, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6
OS: Linux
Question/Problem Description
The database can potentially crash when a protrace is generated with SIGUSR1 on some Linux installations.
The protrace files end before any C stack is shown.
Rocket engine asynchronous signal handling calls ctime and malloc when DBTOOL, _mprosrv, or _sqlsrv2 receive SIGSEGV.

The protrace on _mprosrv reads:
drSigFatal+0x8d  from /dlc/bin/_mprosrv
__restore_rt+0x0  from /lib64/libpthread.so.0
Steps to Reproduce
Clarifying Information
Error Message
Defect/Enhancement Number
Defect PSC00346064 / OCTA-13179
Cause
The rocket engine's signal handler calls functions that are not async-signal-safe. While generating a protrace (stack), the handler calls localtime and malloc. If the signal arrives while the receiving thread is formatting a timestamp, the process is already inside the critical section of localtime and its related functions, so the handler hangs, or it crashes in malloc.
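The following is a minimal sketch of the general technique, not the OpenEdge code: a SIGUSR1 handler that reports a timestamp using only calls POSIX lists as async-signal-safe (time, write, sigaction). The names fmt_u64, on_sigusr1 and install_handler are hypothetical, and the raw epoch-seconds output format is an assumption chosen to avoid localtime, ctime and malloc.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: write v as decimal digits into buf without using
 * stdio, malloc, or locale code. Returns the number of bytes written;
 * output is truncated if cap is too small. No NUL terminator is added. */
static size_t fmt_u64(char *buf, size_t cap, unsigned long long v)
{
    char tmp[20];               /* 2^64 - 1 has 20 decimal digits */
    size_t n = 0;
    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    size_t len = n < cap ? n : cap;
    for (size_t i = 0; i < len; i++)
        buf[i] = tmp[n - 1 - i]; /* digits were produced in reverse */
    return len;
}

/* Handler restricted to async-signal-safe calls: time() and write(). */
static void on_sigusr1(int signo)
{
    (void)signo;
    int saved_errno = errno;    /* write() may overwrite errno */

    static const char prefix[] = "signal at epoch ";
    char line[64];
    size_t pos = 0;
    for (; pos < sizeof prefix - 1; pos++)
        line[pos] = prefix[pos];

    pos += fmt_u64(line + pos, sizeof line - pos - 1,
                   (unsigned long long)time(NULL));
    line[pos++] = '\n';

    ssize_t rc = write(STDERR_FILENO, line, pos);
    (void)rc;

    errno = saved_errno;
}

/* Install the handler for SIGUSR1. Returns 0 on success, -1 on error. */
static int install_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGUSR1, &sa, NULL);
}
```

A process would call install_handler() once at startup. Because the handler never takes the locks used by localtime or the heap allocator, it cannot deadlock or corrupt the heap when the signal interrupts a thread that is already inside those functions.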
Resolution
Upgrade to OpenEdge 11.6.4.018, 11.7.2, 12.0

The rocket engine's use of non-async-signal-safe functions when a signal arrives while the receiving thread is formatting a timestamp has been addressed as follows:
  1. uttime() now knows whether it is running in a signal handler and is async-signal-safe. If the process was releasing memory (restore_rt function) when it received a signal, it no longer causes a memory violation during a backtrace.
  2. The ctime and malloc calls in the rocket engine's asynchronous signal handling have been removed from dbut_uttrace() in utstack.c and utcore.c. This affects Replication, SQL, and DBTOOL.
The rocket engine (dbtool, _sqlsrv2, _rpagent/_rpserv executables) uses a different signal handler from the AVM engine (clients). All other executables use the core's signal handler for protraces and cores, which was fixed in previous versions. Refer to Article 000064858, OpenEdge processes sometimes hang or crash after receiving HANGUP or SIGUSR1.

Additionally, SQL, dbtool, and Replication executables no longer leak file handles as part of SIGUSR1:
  • A file handle leak in the rocket utstack.c code was discovered by sending SIGUSR1 to dbtool, rpagent/rpserver, and _sqlsrv2 and checking the open files with lsof -p <pid>. The leak affected processes not using the C stack print option or explicitly using -cstackPrintopt 0. With the fix, the list of open files does not change after sending SIGUSR1.
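The same before-and-after check can be sketched inside a single Linux process. This is an illustration only, not the utstack.c code: count_open_fds, trace_handler and fds_leaked_by_signal are hypothetical names, /dev/null stands in for a protrace file, and the count is read from /proc/self/fd rather than from lsof.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Count this process's open descriptors by listing /proc/self/fd
 * (Linux only). Returns -1 if the directory cannot be opened. */
static int count_open_fds(void)
{
    DIR *d = opendir("/proc/self/fd");
    if (d == NULL)
        return -1;

    int n = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] != '.')    /* skip "." and ".." */
            n++;
    }
    closedir(d);
    return n - 1;   /* exclude the descriptor opendir() itself held */
}

/* Stand-in for a trace writer: open, write, close. The open(), write()
 * and close() calls are all async-signal-safe. */
static void trace_handler(int signo)
{
    (void)signo;
    int saved_errno = errno;

    int fd = open("/dev/null", O_WRONLY);   /* placeholder for a trace file */
    if (fd >= 0) {
        static const char msg[] = "stack trace placeholder\n";
        ssize_t rc = write(fd, msg, sizeof msg - 1);
        (void)rc;
        close(fd);  /* omitting this leaks one descriptor per signal */
    }

    errno = saved_errno;
}

/* Returns the change in open descriptor count caused by one SIGUSR1,
 * or -1 if setup or counting fails. */
static int fds_leaked_by_signal(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = trace_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    int before = count_open_fds();
    if (before < 0 || raise(SIGUSR1) != 0)
        return -1;

    int after = count_open_fds();
    if (after < 0)
        return -1;
    return after - before;
}
```

A result of 0 from fds_leaked_by_signal() corresponds to an unchanged lsof -p <pid> listing after the signal. Removing the close(fd) call in this sketch makes the count grow by one for each signal delivered.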

Upgrade to OpenEdge 11.6.4.018, 11.7.5.0, 12.1.0.0
  1. -cstackPrintopt was additionally made a database startup parameter on Linux.
  2. -cstackPrintopt was added to _DbParams VST. It is also modifiable through the VST and PROMON.
  3. SQL and Replication executables honor the database's -cstackPrintopt.
  4. When the database is started using the C stack print option, all self-service processes have access to the cstackPrintopt value. This covers replication executables (_rpserver, _rpagent), _dbutil, _mprshut, _sqlsrv2, _mprosrv, self-service _progres, and dbtool in multi-user/self-service connect mode, all of which honor the database's -cstackPrintopt. It is not available for any Java-related process (AdminServer, ubrokers).
  5. ABL clients can override the value using -cstackPrintopt, for example when they connect to multiple databases.
  6. ABL clients and ABL App Server clients have supported C Stack Print Options for SIGUSR1 since 11.6.3. They can connect to zero databases or to more than one database, and they can connect remotely or self-service. If they are connected self-service, they get the parameter value from shared memory or can override the value in their connection parameters. If they are connected remotely, the spawned server gets the parameter from shared memory, but the value must be specified on the remote client for it to apply when SIGUSR1 is sent to the remote client PID. Refer to Article 000069810, Clients may crash when trying to generate protrace file via progetstack / kill -SIGUSR1.
  7. When connecting to the database in single-user mode, the -cstackPrintopt parameter must be specified explicitly, for example for dbtool.
  8. -cstackPrintopt interacts with the Diagnostics feature introduced in 11.7.4 (-diagEvent, -diagEvtLevel), where protrace generation can be configured for all running processes.
Workaround
Notes
Last Modified Date: 11/11/2019 3:02 PM