[FreeNX-kNX] Endless loop in _XWaitForReadable when user's x server dies
Mario Becroft
mb at gem.win.co.nz
Tue Jan 6 12:20:51 UTC 2009
I reported this bug a while ago on this list, but I now have more
information. I am running the latest nx 3.3.0 libraries, retrieved from
nomachine's web site today.
The problem is very hard to reproduce. If you resume a session while the
Linux host that nxclient is running on is very short of memory, the X
server may get killed by the OOM killer when nxagent starts recreating
the windows etc.
Often this does not cause a problem (in the sense that if you try
logging in again, the session resumes as expected), but sometimes it
results in nxagent getting hung up in such a way that you can never
resume the session.
When this happens, nxagent is stuck inside _XWaitForReadable() while
doing an X protocol call. It is inside the for loop where it calls
Select() with a timeout. Select() periodically returns but it just keeps
looping forever, retrying the Select().
I haven't quite figured out what this code is meant to do. Clearly it is
meant to loop repeatedly under some conditions.
In this case, when Select() returns, result is 0, and
_NXDisplayErrorFunction is not null. The call to _NXDisplayErrorFunction
is returning false, which is why it continues rather than returning -1.
--8<---------------cut here---------------start------------->8---
    if (result <= 0) {
        if ((result == -1 && !ECHECK(EINTR)) ||
            (_NXDisplayErrorFunction != NULL &&
             (*_NXDisplayErrorFunction)(dpy, _XGetIOError(dpy)))) {
            _XIOError(dpy);
            return -1;
        }
        continue;
    }
--8<---------------cut here---------------end--------------->8---
It can get to this point through various calling paths. I have seen it
happen inside XCreateWindow() and inside XQueryExtension(), for example,
originally from within nxagentReconnectAllWindows().
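
To make the behaviour concrete, the condition quoted earlier can be
modelled in isolation. This is only a simplified sketch: decide(),
predicate_fn and the two stub predicates are hypothetical names, and the
real predicate takes the display and an error flag rather than no
arguments. It shows why a Select() timeout (result 0) combined with a
predicate that returns false always leads to another retry.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the predicate type (an assumption: the real
 * one receives the Display pointer and the I/O error flag). */
typedef int (*predicate_fn)(void);

enum action { ACTION_RETRY, ACTION_ABORT };

/* Mirrors the "result <= 0" branch: abort on a real select() error or
 * when the predicate reports a problem, otherwise go round again. */
enum action decide(int result, int interrupted, predicate_fn predicate)
{
    assert(result <= 0);   /* only this branch is modelled */

    if ((result == -1 && !interrupted) ||
        (predicate != NULL && predicate())) {
        return ACTION_ABORT;
    }
    return ACTION_RETRY;
}

/* Stub predicates used purely for illustration. */
int never_fails(void)  { return 0; }
int always_fails(void) { return 1; }
```

With result == 0 and never_fails, decide() returns ACTION_RETRY every
time, which matches the endless loop described above.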
_NXDisplayErrorFunction was introduced in nxagent-2.0.0-14, according to
the CHANGELOG. It is set by calling NXSetDisplayErrorPredicate(), which
is done in nxagent/Display.c. There is a rather dense comment explaining
this:
/*
* Let Xlib become aware of our interrupts. In theory
* we don't need to have the error handler installed
* during the normal operations and could simply let
* the dispatcher handle the interrupts. In practice
* it's better to have Xlib invalidating the display
* as soon as possible rather than incurring in the
* risk of entering a loop that doesn't care checking
* the display errors explicitly.
*/
Unfortunately I am none the wiser after reading that, and looking at the
nxagentDisplayErrorPredicate() function itself.
It seems to me that if the X server goes away, it would be much better
if, rather than hanging, nxagent either noticed immediately and
resumed accepting new connections, or at least timed out after a
reasonable length of time.
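
As a rough illustration of the "time out after a reasonable length of
time" idea, here is a minimal sketch of a wait loop with an overall
deadline. It is not the Xlib or NX code: wait_readable_bounded(),
now_ms() and the interval_ms/max_wait_ms parameters are hypothetical,
and it uses plain select() rather than the NX Select() wrapper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds from a monotonic clock, so wall-clock jumps do not matter. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical helper. Returns 1 when fd is readable, 0 when
 * max_wait_ms has elapsed without data, -1 on a real select() error. */
int wait_readable_bounded(int fd, int interval_ms, int max_wait_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE && interval_ms > 0);

    long long deadline = now_ms() + max_wait_ms;

    for (;;) {
        long long left = deadline - now_ms();
        if (left <= 0)
            return 0;               /* give up instead of looping forever */
        if (left > interval_ms)
            left = interval_ms;     /* wake up periodically, as before */

        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        struct timeval tv;
        tv.tv_sec = left / 1000;
        tv.tv_usec = (left % 1000) * 1000;

        int result = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (result > 0)
            return 1;
        if (result == -1 && errno != EINTR)
            return -1;
        /* Timeout or EINTR: retry, but only until the deadline. */
    }
}
```

A caller could treat a return value of 0 the same way as an I/O error
and invalidate the display, instead of retrying indefinitely.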
Surely someone at nomachine must have some idea of what is going on. I
really could do with some help here, please!
Stack trace showing an example of how it came to be inside
_XWaitForReadable():
--8<---------------cut here---------------start------------->8---
(gdb) bt
#0 0x00007f45168052e3 in __select_nocancel () from /lib64/libc.so.6
#1 0x00007f4517b05fb8 in NXTransSelect () from /usr/NX/lib/libXcomp.so.3
#2 0x00007f451849d81e in _XSelect (maxfds=<value optimized out>,
readfds=0x7fff20dfae10, writefds=0x7fff20dfac90,
exceptfds=<value optimized out>, timeout=0x7f451874ab30) at XlibInt.c:333
#3 0x00007f451849dbd1 in _XWaitForReadable (dpy=0x3d4a8f0) at XlibInt.c:791
#4 0x00007f451849e0fc in _XRead (dpy=0x3d4a8f0, data=0x7fff20dfaf60 "\226",
size=32) at XlibInt.c:1510
#5 0x00007f451849eeff in _XReply (dpy=0x3d4a8f0, rep=0x7fff20dfaf60, extra=0,
discard=1) at XlibInt.c:2276
#6 0x00007f4518493092 in XQueryExtension (dpy=0x3d4a8f0,
name=0x7f451875bfff "SHAPE", major_opcode=0x7fff20dfafc4,
first_event=0x7fff20dfafc8, first_error=0x7fff20dfafcc) at QuExt.c:51
#7 0x00007f4518488774 in XInitExtension (dpy=0x26, name=0x7fff20dfae10 "")
at InitExt.c:49
#8 0x00007f451874f65e in XextAddDisplay (extinfo=0x7f451895ea10,
dpy=0x3d4a8f0, ext_name=0x7f451875bfff "SHAPE", hooks=0x7f451895e500,
nevents=1, data=0x0) at extutil.c:108
#9 0x00007f4518752be2 in XShapeCombineRegion (dpy=0x3d4a8f0, dest=14681853,
destKind=0, xOff=0, yOff=0, r=0x5fc5e90, op=0) at XShape.c:74
#10 0x00000000004936fc in nxagentShapeWindow (pWin=0xe1e560) at Window.c:2236
#11 0x0000000000494290 in nxagentReconfigureWindow (
param0=<value optimized out>, param1=<value optimized out>,
data_buffer=<value optimized out>) at Window.c:3029
#12 0x0000000000494c3c in nxagentTraverseWindow (pWin=0xe1e560,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2709
#13 0x0000000000494c90 in nxagentTraverseWindow (pWin=0x176b8e0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2718
#14 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1a9e570,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#15 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1ba2090,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#16 0x0000000000494d8c in nxagentTraverseWindow (pWin=0x1bbd7c0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#17 0x0000000000494c90 in nxagentTraverseWindow (pWin=0x15422e0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2718
#18 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x390f760,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#19 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1faaf50,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#20 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x21a4ba0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#21 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x39dffa0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#22 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x447d040,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#23 0x0000000000494c7c in nxagentTraverseWindow (pWin=0xfd0f50,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#24 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1704290,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#25 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x17ecea0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#26 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x18123c0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#27 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1871330,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#28 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1856c10,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#29 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x18fa8e0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#30 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x18f6a70,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#31 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x36c1f90,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#32 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1596a50,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#33 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x158ee90,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#34 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x19a8610,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#35 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x142d1f0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#36 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1c653a0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#37 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1f4d770,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#38 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x19335d0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#39 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1933f30,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#40 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x36a4c30,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#41 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1f03ec0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#42 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x38c4890,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#43 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x35c4750,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#44 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x38dde40,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#45 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1fb69c0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#46 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x36611e0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#47 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1468e70,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#48 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x194a920,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#49 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1583b30,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#50 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x150e650,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#51 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x2626070,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#52 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x26388e0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#53 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x2638ad0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#54 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x2623e00,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#55 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x243a560,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#56 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1a4d240,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#57 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x377d030,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#58 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1295460,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#59 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x2207680,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#60 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1a4d060,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#61 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x38cbb60,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#62 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x243a2e0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#63 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x2599dd0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#64 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x2237c20,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#65 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x243a060,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#66 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x228cb20,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#67 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x243b120,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#68 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x1a70ec0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#69 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x2117c20,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#70 0x0000000000494c7c in nxagentTraverseWindow (pWin=0xedcf00,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#71 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x3906cb0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#72 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x19d50c0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#73 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x23c48d0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#74 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x3847790,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#75 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x39cf6c0,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#76 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x18ed340,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#77 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x3d2ab60,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#78 0x0000000000494c7c in nxagentTraverseWindow (pWin=0x3191b60,
pF=0x4941f0 <nxagentReconfigureWindow>, p=0x7fff20dfbde4) at Window.c:2713
#79 0x0000000000494fd5 in nxagentReconnectAllWindows (p0=<value optimized out>)
at Window.c:2718
#80 0x00000000004a0239 in nxagentReconnectSession () at Reconnect.c:505
#81 0x00000000004a04f0 in nxagentHandleConnectionChanges () at Reconnect.c:782
#82 0x00000000004a067f in nxagentHandleConnectionStates () at Reconnect.c:189
#83 0x0000000000483055 in nxagentWakeupHandler (data=0x26, count=0,
mask=0xaf2140) at Handlers.c:565
#84 0x000000000044b47e in WakeupHandler (result=0, pReadmask=0xaf2140)
at dixutils.c:472
#85 0x0000000000456f95 in WaitForSomething (pClientsReady=0x7fff20dfc140)
at WaitFor.c:389
#86 0x0000000000427071 in Dispatch () at X/NXdispatch.c:610
#87 0x000000000045043c in main (argc=13, argv=0x7fff20dfc7a8,
envp=<value optimized out>) at main.c:450
--8<---------------cut here---------------end--------------->8---
--
Mario Becroft <mb at gem.win.co.nz>