Article

Clients may crash when trying to generate protrace file via progetstack / kill -SIGUSR1

Information

 
Article Number: 000069810
Environment
Product: OpenEdge
Version: 10.2B0858, 11.6.3, 11.6.4, 11.7.0
OS: Linux
Other: cstackPrintopt
Question/Problem Description
Clients may crash when trying to generate protrace file via progetstack / kill -SIGUSR1.

Expected protrace file is either incomplete or is not generated at all.

If an incomplete protrace is generated, the timestamp in the protrace file matches the timestamp of the SE49 error logged when the client crashed.
The client process crashed with SE49 at the same instant it received the USR1 request.

If a core file is generated and its stack trace can be unwound, the trace shows the backtrace() OS API as the point of failure.
 
Steps to Reproduce
Clarifying Information
Any OpenEdge client may crash when SIGUSR1 is sent to request a protrace file from a Progress process. This includes interactive clients, batch clients, WebSpeed agents and AppServer agents.

This is mostly seen on releases that have the fix for PSC00347434 in place. Releases that do not have the fix will usually deadlock before the crash can happen. For further information refer to Article 000064858, OpenEdge processes sometimes hang or crash after receiving HANGUP or SIGUSR1  
Error Message
Defect/Enhancement Number
Cause
While testing the initial fix for PSC00347434, after the deadlock in the localtime function was resolved, it was found that the glibc backtrace() API used to unwind the C-level stack trace is also not async-signal-safe on a number of Linux builds.

 
Resolution
Upgrade to OpenEdge 11.6.3, 11.7.0 or later.

As part of the definitive fix for PSC00347434, a workaround for Linux bug 1293594 has been implemented. The C stack print option parameter (-cstackPrintopt) applies only to Linux systems. If it is supplied on other platforms, it is silently ignored.

The -cstackPrintopt startup parameter controls which API is used to print the C-level stack trace when handling the SIGUSR1 signal on Linux:

If -cstackPrintopt 0 is specified, the AVM uses the existing Red Hat “backtrace” API to get the C-stack.
This is the default setting and is the same behavior as when -cstackPrintopt is not specified at all.

If -cstackPrintopt 1 is specified, the AVM uses the “gstack” command to generate the C-stack.
This avoids the crashes, but generating the C-stack is a bit slower.
More importantly, this mode has additional requirements to get the C-stack reliably:
  • The gdb debugger packages must be installed on the system (not every Linux distribution has them by default)
  • Permission issues can interfere with generating the C-stack if the OpenEdge clients are not run as root.

If -cstackPrintopt 2 is specified, the AVM does not generate the C-stack.
Only ABL-related details are included in the protrace file (command line arguments, startup parameters, ABL stack), thereby avoiding both the Linux bug and the chance of hitting an unknown USR1 async-signal issue.
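
As a rough illustration of the three settings above, the following shell sketch checks whether a chosen value looks usable before a client is started with it. The function name check_cstack_printopt is hypothetical and is not part of OpenEdge. It only assumes that gstack, when needed, can be found on the PATH, and it does not test the permission requirement.

```shell
# Hypothetical pre-flight check for a chosen -cstackPrintopt value.
# Returns 0 if the value looks usable,
#         1 if value 1 was chosen but gstack is not on the PATH,
#         2 if the value is not 0, 1 or 2.
check_cstack_printopt() {
    case $1 in
        0|2)
            # backtrace (0) and "no C-stack" (2) need no extra tools
            return 0 ;;
        1)
            # gstack is normally delivered by the gdb packages
            if command -v gstack >/dev/null 2>&1; then
                return 0
            fi
            echo "gstack not found; install the gdb packages first" >&2
            return 1 ;;
        *)
            echo "unsupported value: $1 (expected 0, 1 or 2)" >&2
            return 2 ;;
    esac
}
```

A start script could then run something like "check_cstack_printopt 1 && mpro mydb -cstackPrintopt 1", where mydb is a placeholder database name.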

The gstack command can be used by a user running as root to get the C stack before sending the SIGUSR1 signal to get the ABL stack. Because the ABL and C stacks will not reflect the same point in time, this workaround is helpful mainly when the client is stuck in a very tight loop or is waiting. In any case, multiple stacks should always be dumped (rule of 3's) to show whether the stack is changing.
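
The manual sequence described above (gstack first, then SIGUSR1, repeated three times) could be scripted roughly as follows. This is a sketch only: the function name dump_stacks, the DRY_RUN switch and the cstack.<pid>.<n> output file names are assumptions made for illustration, and the script expects to be run by a user who is allowed to attach to and signal the target process.

```shell
# Hypothetical helper: capture the C stack with gstack, then request a
# protrace with SIGUSR1, three times in a row ("rule of 3's").
# Usage: dump_stacks PID [INTERVAL_SECONDS]
# Set DRY_RUN=1 to print the commands instead of running them.
dump_stacks() {
    pid=$1
    interval=${2:-5}
    i=1
    while [ "$i" -le 3 ]; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "gstack $pid > cstack.$pid.$i"
            echo "kill -USR1 $pid"
        else
            # C stack first, saved to a numbered file in the current directory
            gstack "$pid" > "cstack.$pid.$i" 2>&1
            # then ask the client to write its protrace (ABL stack)
            kill -USR1 "$pid"
            # pause between samples so a changing stack becomes visible
            if [ "$i" -lt 3 ]; then
                sleep "$interval"
            fi
        fi
        i=$((i + 1))
    done
}
```

Running it first with DRY_RUN=1 shows the commands that would be issued without touching the client. Comparing the three C stacks and the resulting protrace output then shows whether the process is moving or stuck.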

Enabling core files may also help provide evidence if the crash happens again. Keep in mind that disk space is needed and that the core must be unwound on the machine where it was generated. For recommendations on core file generation refer to Article 000039632, Why is a core or protrace file not created when a Progress process abnormally terminates?
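
On Linux, core files are commonly enabled through the shell's core size limit. The sketch below raises the soft limit to whatever the hard limit allows before a client is started from the same shell. The function name enable_core_files is hypothetical, and the recommendations in Article 000039632 take precedence over this example.

```shell
# Raise the soft core-file size limit to the hard limit for this shell
# and for any process started from it. If the hard limit is 0, core
# files stay disabled until an administrator raises it.
# Note: "ulimit -c" is supported by bash and dash, not by every shell.
enable_core_files() {
    hard=$(ulimit -H -c)
    ulimit -S -c "$hard"
}

enable_core_files
echo "core file size limit is now: $(ulimit -S -c)"
```

The call belongs in the script that launches the client, because the limit is inherited by child processes and does not affect sessions that are already running.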

Where to set cstackPrintopt:

 -cstackPrintopt is an undocumented, case-sensitive Client Session (CS) workaround parameter for Linux only:
  • It is available for the following ABL process types: _progres (including WebSpeed), _proapsv
  • It is only available on Linux; in other words, it cannot be used on prowin (which would be client-server to the Linux database).
  • For AppServer or WebSpeed agents, specify -cstackPrintopt in the srvrStartupParam property in ubroker.properties, or in the .pf file if srvrStartupParam specifies one
  • For _progres (pro, mpro, mbpro) client session process types, specify -cstackPrintopt on the command line, scripts or in the client's .pf file
  • It is recommended to specify relevant CS parameters in a .pf for the executable that is starting the session. While cstackPrintopt can be set in default "dlc/startup.pf" and should not cause issues with argument parsing, it is not recommended.  
  • To confirm that -cstackPrintopt has been set, display the session's startup parameters, for example: MESSAGE SESSION:startup-parameters.
  • C stack print option (-cstackPrintopt) will interact with the Diagnostics feature introduced in 11.7.4 (where protrace generation can be configured). The feature also sends SIGUSR1 to the other processes to generate their protraces.
  • The -cstackPrintopt parameter only applies to protrace generation from SIGUSR1 and nothing else. For example it does not apply to protrace files generated during crashing due to a fatal signal (SIGSEGV, SIGBUS).
C stack print option:
  • It is not currently available for spawned ABL database servers (_mprosrv)
  • It is not available for rocket engine utilities, _mprosrv, _sqlsrv2, _rfutil, _dbutil, _rpserver, _rpagent or for any Java-related process (AdminServer, ubrokers)
  • When similar deadlocking in the localtime function was addressed in the rocket engine, -cstackPrintopt was additionally enhanced as a database startup parameter on Linux in OpenEdge 11.7.5.0 and 12.1.0.0. This means it no longer has to be specified on shared-memory clients. Refer to Article 000097446, Database Server may crash when a protrace is generated
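
For clients that take their parameters from a .pf file, the setting can be added with a small helper along these lines. This is a sketch under assumptions: the function name set_cstack_printopt and the file name client.pf are hypothetical, and the helper deliberately refuses to touch a file that already mentions the parameter. The match is case-sensitive, as is the parameter itself.

```shell
# Hypothetical helper: append "-cstackPrintopt <value>" to a .pf file.
# Returns 0 on success,
#         1 if the file already contains -cstackPrintopt,
#         2 if the value is not 0, 1 or 2.
set_cstack_printopt() {
    pf=$1
    value=$2
    case $value in
        0|1|2) ;;
        *) echo "value must be 0, 1 or 2" >&2
           return 2 ;;
    esac
    # grep is case-sensitive by default, matching the parameter's spelling
    if grep -q -- '-cstackPrintopt' "$pf" 2>/dev/null; then
        echo "$pf already sets -cstackPrintopt; edit it by hand" >&2
        return 1
    fi
    printf '%s\n' "-cstackPrintopt $value" >> "$pf"
}

# Example (placeholder file name):
#   set_cstack_printopt client.pf 1
```

After the client is restarted with that .pf file, the new value should appear in the output of MESSAGE SESSION:startup-parameters.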


 
Workaround
Notes
Attachment 
Last Modified Date: 11/11/2019 4:05 PM