[FreeNX-kNX] nxagent session gets lost, user gets new session even though one already exists

Mario Becroft mb at gem.win.co.nz
Sun Jan 25 03:32:22 UTC 2009


I still don't fully understand this problem, but I have a solution.

I am not sure about Marcelo's patch: as far as I can see,
NODE_SUSPEND_STATUS is never set to "Suspending". What exactly is this
patch meant to do?

I found that with slave mode disabled, everything is much easier to
understand, and it does not appear to be any slower. It did not fix the
problem, though; it only changed the symptoms.

The key problem is that when the client nxssh is killed, nxserver hangs
in the echo inside server_nxnode_echo(). nxserver tries to handle this
situation by installing a SIGPIPE handler that sets
SERVER_CHANNEL=0. Unfortunately, SIGPIPE is never received in this
situation; instead the echo hangs forever. This is why nxserver never
processes any more commands from nxnode.
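
For reference, the intended mechanism can be sketched roughly as
follows. This is a standalone illustration, not the actual nxserver
code: the trap line is my assumption about how such a handler would be
written, and the FIFO is only a stand-in for the connection to a client
that has gone away. It shows the case where SIGPIPE does arrive, which
is exactly what does not happen in the failure described above.

```shell
#!/bin/sh
# Sketch only: a SIGPIPE handler that marks the channel as dead.
SERVER_CHANNEL=1
trap 'SERVER_CHANNEL=0' PIPE

# Hypothetical stand-in for a vanished client: a FIFO whose reader
# opens it and exits immediately.
fifo_dir=$(mktemp -d)
mkfifo "$fifo_dir/chan"
: <"$fifo_dir/chan" &        # reader: open, then close at once
exec 3>"$fifo_dir/chan"      # writer end, kept open as fd 3
wait                         # make sure the reader is gone

# The builtin echo now writes to a pipe with no reader; the shell
# receives SIGPIPE and the trap sets SERVER_CHANNEL=0.
echo "hello" >&3 2>/dev/null || :

exec 3>&-
rm -rf "$fifo_dir"
```

After this runs, SERVER_CHANNEL is 0, so later calls would skip the
echo entirely.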

It is not entirely clear why it happens in this way.

Anyway, the workaround is to change echo to /bin/echo. /bin/echo returns
immediately if the client is disconnected. It should probably also check
the exit status and set SERVER_CHANNEL=0 if /bin/echo failed, but I have
not bothered to do this. It does not seem to matter a great deal.
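
A minimal, untested-in-FreeNX sketch of that exit status check might
look like this. The log function here is a dummy stand-in so the
fragment runs on its own; in nxserver the real one would be used. Note
that the assignment only sticks if server_nxnode_echo is called in the
main shell, not in a subshell or pipeline.

```shell
#!/bin/sh
# Dummy stand-in for nxserver's log helper (assumption for this sketch).
log() { :; }

SERVER_CHANNEL=1

server_nxnode_echo()
{
        log 6 "server_nxnode_echo: $*"
        if [ "$SERVER_CHANNEL" = "1" ]; then
                # External echo; on failure, stop using the channel.
                /bin/echo "$@" || SERVER_CHANNEL=0
        elif [ "$SERVER_CHANNEL" = "2" ]; then
                /bin/echo "$@" >&2 || SERVER_CHANNEL=0
        fi
}
```

For example, calling it with stdout closed (server_nxnode_echo hello
>&- 2>/dev/null) makes /bin/echo fail, after which SERVER_CHANNEL is 0
and further calls produce no output.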

This solves the problem both with and without slave mode. I think there
may still be some sort of timing-related problem here, but I am not
sure; it is all rather complicated.

I have also noticed another problem that I thought might be related, but
is probably different. If you unplug the network from the currently
logged-in client, it takes about 30 seconds before nxagent notices that
the client is gone and suspends the session. If, in this 30-second
window, you log in from another client, everything works, but the
session status incorrectly remains "suspended". I guess this is because
when the second client logs in, nxserver must suspend the session before
restoring it on the new client, and somehow the suspended state is
recorded after the resumed state. I am out of time and this problem is
not so serious, so I am ignoring it for now. Maybe someone else has time
to look into this one.

Anyway, anyone else who has the original problem, please try the
following patch and report back.

See the patch below (the line numbers might be a bit off since my file
has lots of extra instrumentation):

--8<---------------cut here---------------start------------->8---
--- nxserver.foo        2009-01-25 16:07:46.590977440 +1300
+++ nxserver    2009-01-25 16:07:54.498952944 +1300
@@ -967,8 +967,8 @@
 server_nxnode_echo()
 {
        log 6 "server_nxnode_echo: $@"
-       [ "$SERVER_CHANNEL" = "1" ] && echo "$@"
-       [ "$SERVER_CHANNEL" = "2" ] && echo "$@" >&2
+       [ "$SERVER_CHANNEL" = "1" ] && /bin/echo "$@"
+       [ "$SERVER_CHANNEL" = "2" ] && /bin/echo "$@" >&2
 }

 server_nxnode_exit_func()
--8<---------------cut here---------------end--------------->8---

-- 
Mario Becroft <mb at gem.win.co.nz>


