Article

SE 49 while writing "HANGUP signal received" to the lg file

Information

 
Article Number: 000086767
Environment
Product: OpenEdge
Version: 10.2B, 11.0, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6
OS: Unix, Linux
Question/Problem Description
Memory violation (49) while writing "HANGUP signal received (562)" messages to database log files

Session received HANGUP signal but got a segmentation fault while writing HANGUP message 562 to the database logfiles
When the session received the HANGUP signal it was waiting for a resource (buffer lock) in shared memory. 
Watchdog shut database down: “User died during microtransaction (2256)”.

PROTRACE indicates the memory violation happened while the session was dealing with db log files rather than with transaction undo

Stack trace from _progres:
ncalog  # stub for child log msg
prolgm  # write a message to the log file
fdlgm  # Write a message to all database .lg files
logmsgw  # Redirect msgn %L messages to the database log file
Steps to Reproduce
Clarifying Information
User connected to 4 databases, the first self-service, the remainder client/server
User interrupted a transaction with F4 as the program started a new phase deleting records (transaction state: BEGIN).
User then closed their terminal session.

The session wrote the 562 message twice to the first, self-service connected database (both messages in the same millisecond). No transaction backout is written under these conditions because we are already in dbdown:
16:02:22.472 P-25990  ABL  2841: (562)   HANGUP signal received.
16:02:22.472 P-25990  ABL  2841: (562)   HANGUP signal received.
16:02:22.472 P-25990  ABL  2841: (49)    SYSTEM ERROR: Memory violation.
16:02:22.472 P-25990  ABL  2841: (439)   ** Save file named core for analysis by Progress Software Corporation.
16:02:59.252 P-22783  WDOG 1105: (2256)  SYSTEM ERROR: User 2841 died during microtransaction.

One message (as it should be) was written to the second database, which is remote:
16:02:22.472 P-18386  SRV   331: (562)   HANGUP signal received.
16:02:23.917 P-18386  SRV   331: (794)   Usernum 4422 terminated abnormally.
16:02:23.917 P-18386  SRV   331: (739)   Logout usernum 4422, userid <>, on <>

No 562 messages were written to the last two remote connected databases:
    
16:02:23.917 P-18589  SRV   328: (794)   Usernum 4443 terminated abnormally.
16:02:23.917 P-18589  SRV   328: (739)   Logout usernum 4443, userid <>, on <>
    
16:02:23.917 P-2094   SRV   455: (794)   Usernum 3256 terminated abnormally.
16:02:23.917 P-2094   SRV   455: (2252)  Begin transaction backout.
16:02:23.917 P-2094   SRV   455: (2253)  Transaction backout completed.
16:02:23.917 P-2094   SRV   455: (739)   Logout usernum 3256, , userid <>, on <>.
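
The pattern in the log excerpts above (message 562 followed by message 49 from the same process and user number) can be searched for mechanically. The following is a minimal Python sketch, not a Progress utility. It assumes log lines in the trimmed layout shown in this article; production .lg files may carry a fuller prefix, in which case the pattern would need adapting. The names LINE_RE, find_hangup_crashes and SAMPLE are hypothetical.

```python
import re

# Assumption: lines follow the trimmed layout used in this article, e.g.
# "16:02:22.472 P-25990  ABL  2841: (562)   HANGUP signal received."
LINE_RE = re.compile(
    r"^(?P<time>\S+)\s+P-(?P<pid>\d+)\s+(?P<kind>\w+)\s+(?P<user>\d+):"
    r"\s+\((?P<msg>\d+)\)\s+(?P<text>.*)$"
)


def find_hangup_crashes(lines):
    """Return (pid, usernum) pairs that logged 562 and later 49."""
    saw_hangup = set()   # (pid, usernum) pairs that have logged a 562
    crashed = []         # pairs that logged a 49 after a 562, in log order
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue     # skip blank lines or lines in another layout
        key = (m.group("pid"), m.group("user"))
        if m.group("msg") == "562":
            saw_hangup.add(key)
        elif m.group("msg") == "49" and key in saw_hangup and key not in crashed:
            crashed.append(key)
    return crashed


# Lines copied from the excerpts in this article.
SAMPLE = [
    "16:02:22.472 P-25990  ABL  2841: (562)   HANGUP signal received.",
    "16:02:22.472 P-25990  ABL  2841: (562)   HANGUP signal received.",
    "16:02:22.472 P-25990  ABL  2841: (49)    SYSTEM ERROR: Memory violation.",
    "16:02:59.252 P-22783  WDOG 1105: (2256)  SYSTEM ERROR: User 2841 died during microtransaction.",
    "16:02:22.472 P-18386  SRV   331: (562)   HANGUP signal received.",
    "16:02:23.917 P-18386  SRV   331: (794)   Usernum 4422 terminated abnormally.",
]

if __name__ == "__main__":
    for pid, user in find_hangup_crashes(SAMPLE):
        print(f"P-{pid} user {user}: memory violation after HANGUP")
```

Run against the sample lines, only process 25990 (user 2841) is reported. The server process 18386 logged a 562 but no 49, so it is not flagged.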
Error Message
HANGUP signal received (562)
Defect/Enhancement Number
Cause
The user died holding a latch. The latch was taken before the (first) hangup, on successful return from pthread_cond_wait(), which leaves the thread holding this mutex.

(18) 0xe0000001901cb480 ---- Signal 1 (SIGHUP) delivered ----
___ksleep [/usr/lib/hpux64/libc.so.1] .. thread waiting on pthread_cond_wait 
__mxn_sleep +0x1190 at /ux/core/libs/threadslibs/src/common/pthreads/sleep.c:1260 [/usr/lib/hpux64/libpthread.so.1]
__pthread_cond_wait +0x3050 at /ux/core/libs/threadslibs/src/common/pthreads/cond.c:3127 [/usr/lib/hpux64/libpthread.so.1]
__pthread_cond_wait +0xd0 at /ux/core/libs/threadslibs/src/common/pthreads/cond.c:2249 [/usr/lib/hpux64/libpthread.so.1] 

The second signal delivered is Signal 11 (SIGSEGV), raised while the HANGUP was being handled.
The key stack trace entries are prolgm and ncalog.
These relate to writing a message to the lg file and then finding that the shared memory user is no longer there:
  • When a client is waiting (for various reasons) and goes into sleep mode,
  • the system signal (HANGUP) disconnects the client,
  • but the client is still trying to send the promsg to the server it is no longer connected to.
Resolution
1.  Upgrade to OpenEdge 11.6.4, 11.7 or later, which also include a number of signal handling fixes, mostly for Linux but also IBM. For further detail refer to the following Article and related links:
2.  Ensure OS patches are up to date.

A number of incidents reported during research came down to OS network layers corrupting the requests between client and server, specifically when a thread waiting in pthread_cond_wait receives a hangup. Later OS patches appear to have resolved these incidents (no specific details have been provided, only noted).

 
 
Workaround
Notes
Complete stack trace from _progres on HP-UX reads:

(0) uttraceback + 0x60 at /vobs_prgs/src/ut/utstack.c:205 [$dlc/_progres] # system dependent traceback for Unix 
(1) utcore + 0x360 at /vobs_prgs/src/ut/utbuf.c:544 [$dlc/_progres] # Core Dump
(2) drexit + 0x400 at /vobs_prgs/src/drsys/drsetup.c:879 [$dlc/_progres] # called to perform critical shutdown operations prior to abnormal termination of PROGRESS
(3) drSigFatal + 0x150 at /vobs_rkt/src/glue/drsig.c:1290 [$dlc/_progres]
(4) ---- Signal 11 (SIGSEGV) delivered ----
(5) .PLT0 + 0x3fffffffffefe460 [/usr/lib/hpux64/dld.so]
(6) ncssenddll + 0xa0 at /vobs_prgs/src/ncssys/ncs.c:689 [$dlc/_progres]
(7) ncssend + 0x40 at /vobs_prgs/src/ncssys/ncstrn.c:74 [$dlc/_progres]
(8) ncasend + 0x120 at /vobs_rkt/src/glue/nca.c:236 [$dlc/_progres]
(9) ncalog + 0xa0 at /vobs_rkt/src/glue/nca.c:528 [$dlc/_progres] # stub for child log msg
(10) prolgm + 0x60 at /vobs_prgs/src/prsys/procomm.c:664 [$dlc/_progres] # write a message to the log file
(11) fdlgm + 0x4f0 at /vobs_prgs/src/fdsys/fd.c:2296 [$dlc/_progres] # Write a message to all database .lg files
(12) logmsgw + 0x1e0 at /vobs_prgs/src/drsys/drmsgw.c:146 [$dlc/_progres] # Redirect msgn %L messages to the database log file
(13) msgnCB + 0xa60 at /vobs_prgs/src/drsys/drmsg.c:1158 [$dlc/_progres] # like msgn but sets globals from dmsContext
(14) drSigMessage + 0xa0 at /vobs_rkt/src/glue/drsig.c:3550 [$dlc/_progres] # Signal logging activated
(15) drSigClient + 0x280 at /vobs_rkt/src/glue/drsig.c:818 [$dlc/_progres] # Signal handler for UNIX CLIENT
(16) drSigDo1 + 0x1a0 at /vobs_rkt/src/glue/drsig.c:1408 [$dlc/_progres] # Process a pending signal Called from drHdlSignal and drSigDoPending
(17) drSigDispatch + 0x250 at /vobs_rkt/src/glue/drsig.c:1599 [$dlc/_progres] # centralized signal dispatcher to call appropriate handler for a particular signal
(18) ---- Signal 1 (SIGHUP) delivered ----
(19) ___ksleep + 0x30 [/usr/lib/hpux64/libc.so.1]
(20) __mxn_sleep + 0x1190 at /ux/core/libs/threadslibs/src/common/pthreads/sleep.c:1260 [/usr/lib/hpux64/libpthread.so.1]
(21) __pthread_cond_wait + 0x3050 at /ux/core/libs/threadslibs/src/common/pthreads/cond.c:3127 [/usr/lib/hpux64/libpthread.so.1]
(22) __pthread_cond_wait + 0xd0 at /ux/core/libs/threadslibs/src/common/pthreads/cond.c:2249 [/usr/lib/hpux64/libpthread.so.1]
(23) ThreadMain + 0x1cf0 at /build/p701_P/src/lib/cs/unix/hp700_ux90/amqxprmx.c:2709 [/opt/mqm/lib64/libmqmcs_r.so]
(24) __pthread_bound_body + 0x1c0 at /ux/core/libs/threadslibs/src/common/pthreads/pthread.c:4929 [/usr/lib/hpux64/libpthread.so.1]
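
When reading a trace like the one above, the frame directly below each "Signal ... delivered" marker is the code that was running when that signal arrived. The following is a minimal Python sketch of that reading, assuming the frame layout shown in this article ("(N) function + offset ..."). The names SIGNAL_RE, FRAME_RE, interrupted_frames and TRACE are hypothetical and not part of any Progress tooling.

```python
import re

# Assumption: markers look like "(4) ---- Signal 11 (SIGSEGV) delivered ----"
SIGNAL_RE = re.compile(r"^\((\d+)\)\s+-+ Signal (\d+) \((\w+)\) delivered -+")
# Assumption: frames look like "(19) ___ksleep + 0x30 [...]"
FRAME_RE = re.compile(r"^\((\d+)\)\s+(\S+)\s+\+")


def interrupted_frames(trace_lines):
    """Pair each delivered signal with the function it interrupted."""
    result = []
    pending = None  # signal name waiting for the next frame line
    for line in trace_lines:
        line = line.strip()
        sig = SIGNAL_RE.match(line)
        if sig:
            pending = sig.group(3)
            continue
        frame = FRAME_RE.match(line)
        if frame and pending is not None:
            # First frame after the marker is the interrupted function.
            result.append((pending, frame.group(2)))
            pending = None
    return result


# A subset of the frames from the trace in this article.
TRACE = [
    "(3) drSigFatal + 0x150 at /vobs_rkt/src/glue/drsig.c:1290 [$dlc/_progres]",
    "(4) ---- Signal 11 (SIGSEGV) delivered ----",
    "(5) .PLT0 + 0x3fffffffffefe460 [/usr/lib/hpux64/dld.so]",
    "(6) ncssenddll + 0xa0 at /vobs_prgs/src/ncssys/ncs.c:689 [$dlc/_progres]",
    "(17) drSigDispatch + 0x250 at /vobs_rkt/src/glue/drsig.c:1599 [$dlc/_progres]",
    "(18) ---- Signal 1 (SIGHUP) delivered ----",
    "(19) ___ksleep + 0x30 [/usr/lib/hpux64/libc.so.1]",
]

if __name__ == "__main__":
    for signame, func in interrupted_frames(TRACE):
        print(f"{signame} interrupted {func}")
```

For the frames above this pairs SIGSEGV with .PLT0 (reached from ncssenddll) and SIGHUP with ___ksleep, which matches the description in the Cause section: the hangup arrived during the pthread_cond_wait sleep, and the memory violation occurred while the hangup handler was sending the log message.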
Attachment 
Last Modified Date: 11/11/2019 11:31 AM