[FreeNX-kNX] NX load balancing fails on certain nodes

Simon Gao gao at schrodinger.com
Tue Apr 22 22:26:26 UTC 2008


Simon Gao wrote:
> Simon Gao wrote:
>   
>> Hi,
>>
>> I am trying to set up NX load balancing on a FreeNX gateway machine. It 
>> seems to work fine, except that routing to some nodes fails with errors 
>> like:
>>
>>
>> NX> 1000 NXNODE - Version 2.1.0-72-SVN OS (GPL)
>> NX> 700 Session id: 
>> nxserver1.example.com-1000-13C3FD265620B8D80CA7FFF1EC6F3856
>> NX> 705 Session display: 1000
>> NX> 703 Session type: unix-kde
>> NX> 701 Proxy cookie: 58a7d86a5845fa299585d1bf219f9154
>> NX> 702 Proxy IP: 192.168.0.1
>> NX> 706 Agent cookie: 58a7d86a5845fa299585d1bf219f9154
>> NX> 704 Session cache: unix-kde
>> NX> 707 SSL tunneling: 1
>> /usr/bin/nxserver: line 1368: 30561 Terminated              sleep 
>> $AGENT_STARTUP_TIMEOUT
>> NX> 105 NX> 596 Session startup failed.
>> NX> 1004 Error: NX Agent exited with exit status 1. To troubleshoot set 
>> SESSION_LOG_CLEAN=0 in node.conf and investigate 
>> "/home/gao/.nx/F-C-nxserver1.example.com-1000-13C3FD265620B8D80CA7FFF1EC6F3856/session". 
>> You might also want to try: ssh -X myserver; /usr/bin/nxnode --agent to 
>> test the basic functionality. Session log follows:
>> Can't open 
>> /var/lib/nxserver/db/running/sessionId{13C3FD265620B8D80CA7FFF1EC6F3856}: 
>> No such file or directory.
>> mv: cannot stat 
>> `/var/lib/nxserver/db/running/sessionId{13C3FD265620B8D80CA7FFF1EC6F3856}': 
>> No such file or directory
>> NX> 1006 Session status: closed
>> NX> 280 Exiting on signal: 15
>>
>> The gateway machine runs nx-3.1.0-r1, nxserver-freenx-0.7.2-r2 with 
>> Gentoo. On the destination server, freenx-0.7.1.svn416-3.el5.centos, 
>> nx-3.0.0-4.el5.centos.
>>
>> Also there are errors about /tmp/.X11-unix/X1000 permission. Is 
>> /tmp/.X11-unix locally on the gateway or the destination NX server machine?
>>
>>   
>>     
> Just found another interesting problem. Even though LOAD_BALANCE_SERVERS 
> does not list the gateway machine for load balancing, the NX server 
> somehow tried to use it anyway. When would the NX server try to load 
> balance on 127.0.0.1?
>
> Info: Load-Balancing (if possible) to 127.0.0.1 ...
> &link=lan&backingstore=1&encryption=1&cache=16M&images=64M&shmem=1&shpix=1&strict=0&composite=1&media=0&session=nxgate&type=unix-kde&geometry=1400x995&client=linux&keyboard=pc101/us&screeninfo=1400x995x24+render&clientproto=2.1.0&user=gao&userip=192.168.0.12&uniqueid=FCD3B4533849372E13E16024A3439C92&display=1001&host=127.0.0.1
> Password:
> NX> 1000 NXNODE - Version 2.1.0-72 OS (GPL, using backend: 3.1.0)
> server_nxnode_echo: NX> 1000 NXNODE - Version 2.1.0-72 OS (GPL, using 
> backend: 3.1.0)
> NX> 700 Session id: nxgate-1001-FCD3B4533849372E13E16024A3439C92
> NX> 705 Session display: 1001
> NX> 703 Session type: unix-kde
> NX> 701 Proxy cookie: b873381ef415b0898d6cb2e48f329058
> NX> 702 Proxy IP: 127.0.0.1
> NX> 706 Agent cookie: b873381ef415b0898d6cb2e48f329058
> NX> 704 Session cache: unix-kde
> NX> 707 SSL tunneling: 1
> server_nxnode_echo: NX> 700 Session id: 
> nxgate-1001-FCD3B4533849372E13E16024A3439C92
> server_nxnode_echo: NX> 705 Session display: 1001
> server_nxnode_echo: NX> 703 Session type: unix-kde
> server_nxnode_echo: NX> 701 Proxy cookie: b873381ef415b0898d6cb2e48f329058
> server_nxnode_echo: NX> 702 Proxy IP: 127.0.0.1
> server_nxnode_echo: NX> 706 Agent cookie: b873381ef415b0898d6cb2e48f329058
> server_nxnode_echo: NX> 704 Session cache: unix-kde
> server_nxnode_echo: NX> 707 SSL tunneling: 1
> NX> 1004 Error: NX Agent exited with exit status 1. To troubleshoot set 
> SESSION_LOG_CLEAN=0 in node.conf and investigate 
> "/home/gao/.nx/F-C-nxgate-1001-FCD3B4533849372E13E16024A3439C92/session". 
> You might also want to try: ssh -X myserver; /usr/bin/nxnode --agent to 
> test the basic functionality. Session log follows:
> NX> 105 server_nxnode_echo: NX> 596 Session startup failed.
> NX> 596 Session startup failed.
> server_nxnode_echo: NX> 1004 Error: NX Agent exited with exit status 1. 
> To troubleshoot set SESSION_LOG_CLEAN=0 in node.conf and investigate 
> "/home/gao/.nx/F-C-nxgate-1001-FCD3B4533849372E13E16024A3439C92/session". 
> You might also want to try: ssh -X myserver; /usr/bin/nxnode --agent to 
> test the basic functionality. Session log follows:
> Error: Aborting session with 'Cannot establish any listening sockets - 
> Make sure an X server isn't already running'.
> Session: Aborting session at 'Tue Apr 22 14:27:17 2008'.
> Session: Session aborted at 'Tue Apr 22 14:27:17 2008'.
> NX> 1006 Session status: closed
> session_close FCD3B4533849372E13E16024A3439C92
> server_nxnode_echo: NX> 1006 Session status: closed
> rm: cannot remove 
> `/home/gao/.nx/C-nxgate-1001-FCD3B4533849372E13E16024A3439C92/.nfs000000000030755000000002': 
> Device or resource busy
>
>   
On the destination node, I saw permission errors like:

Error: Aborting session with 'Could not create server lock file: 
/tmp/.X1002-lock'.
Session: Aborting session at 'Tue Apr 22 14:57:33 2008'.
Session: Session aborted at 'Tue Apr 22 14:57:33 2008'.
NX> 1006 Session status: closed
NX> 105 session_close D346CF5D7233D722D969B5B88C309651
server_nxnode_echo: NX> 1006 Session status: closed
rm: cannot remove `/tmp/.X1002-lock': Operation not permitted
rm: cannot remove `/tmp/.X11-unix/X1002': Operation not permitted
rm: cannot remove 
`/home/gao/.nx/C-nxserver2.example.com-1002-D346CF5D7233D722D969B5B88C309651//.nfs00b25a43000000cc': 
Device or resource busy
rm: cannot remove `/tmp/.X1002-lock': Operation not permitted
rm: cannot remove `/tmp/.X11-unix/X1002': Operation not permitted
NX> 1001 Bye.
server_nxnode_echo: NX> 1001 Bye.

On a machine used by multiple users, these lock and temp files may not 
always be cleaned up. In this case, /tmp/.X1002-lock is owned by 
another user:

-r--r--r-- 1 user10 nxuser 11 Apr 18 02:53 /tmp/.X1002-lock
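
One way to tell whether such a leftover lock is stale is to read the PID 
stored in it and probe whether that process still exists. The following is 
a minimal sketch, not something FreeNX itself does. It assumes the lock 
file uses the usual X server format (the PID as a 10-character decimal 
field plus a newline, which would match the 11-byte size in the listing 
above). The function names read_lock_pid, pid_alive and lock_is_stale are 
made up for illustration.

```python
import errno
import os


def read_lock_pid(lock_path):
    """Return the PID stored in an X lock file, or None if it cannot be read."""
    try:
        with open(lock_path, "r") as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def pid_alive(pid):
    """Return True if a process with this PID exists.

    Signal 0 performs the existence and permission checks without
    actually delivering a signal.
    """
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except OSError as exc:
        # EPERM means the process exists but belongs to another user.
        return exc.errno == errno.EPERM
    return True


def lock_is_stale(lock_path):
    """True only when the lock names a PID and that process is gone."""
    pid = read_lock_pid(lock_path)
    return pid is not None and not pid_alive(pid)


if __name__ == "__main__":
    # Path taken from the listing above.
    print(lock_is_stale("/tmp/.X1002-lock"))
```

Even when the lock turns out to be stale, an ordinary user normally cannot 
delete another user's file in /tmp because of the sticky bit on that 
directory, which would be consistent with the "Operation not permitted" 
errors above.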

The same is true of the sockets in /tmp/.X11-unix:
srwxrwxrwx 1 root   root   0 Apr 17 13:50 X0
srwxrwxrwx 1 user7  nxuser 0 Apr 21 00:20 X1000
srwxrwxrwx 1 user11 nxuser 0 Mar 17 10:37 X1001
srwxrwxrwx 1 user10 nxuser 0 Apr 18 02:53 X1002
srwxrwxrwx 1 user4  nxuser 0 Mar 25 21:36 X1003

Why would these lock and temp files be left behind? Did the 
sessions fail or get disconnected without closing properly? Why doesn't 
the NX server check for this and use a higher display number to avoid the 
permission problem?
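
To illustrate the kind of check I mean, here is a minimal sketch of picking 
the first display number that has neither a lock file nor a socket. This is 
not how nxserver actually chooses displays. The helper names 
display_in_use and first_free_display, the search limit, and the tmp_dir 
parameter are assumptions made for the example. The starting value of 1000 
is simply the lowest NX display number seen in the logs above.

```python
import os


def display_in_use(n, tmp_dir="/tmp"):
    """True if display n has a leftover lock file or socket under tmp_dir."""
    lock = os.path.join(tmp_dir, ".X%d-lock" % n)
    sock = os.path.join(tmp_dir, ".X11-unix", "X%d" % n)
    # lexists also reports dangling symlinks, which would still block reuse.
    return os.path.lexists(lock) or os.path.lexists(sock)


def first_free_display(start=1000, limit=200, tmp_dir="/tmp"):
    """Return the first unused display number in [start, start + limit), or None."""
    for n in range(start, start + limit):
        if not display_in_use(n, tmp_dir):
            return n
    return None


if __name__ == "__main__":
    print(first_free_display())
```

With the files shown above (X1000 through X1003 present), a check like this 
would skip ahead to display 1004 or higher instead of failing on 1002.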

Simon









