[FreeNX-kNX] load balancing question(s)

Matthew Nicholson nicholson at eps.harvard.edu
Tue Feb 3 16:53:43 UTC 2009


Hello again. I'm trying to get load balancing working on some cluster access
nodes we have, and am hitting a wall right now.

I've got 3 systems: access01, access02, and access03.

access01 will be my load balancer, so its node.conf has:

LOAD_BALANCE_SERVERS="access01 access02 access03"

# The following load_balance_algorithms are available at the moment:
#
# "load", "round-robin", "random"
#
# For "load" you need a script called nxcheckload in PATH_BIN.
#
# A sample script, which you can change to your needs it shipped with
# FreeNX under the name nxcheckload.sample.

LOAD_BALANCE_ALGORITHM="random"
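
For readers unfamiliar with the algorithm names in the comment above, here is a minimal sketch of what "random" and "round-robin" selection usually mean. This is not FreeNX's code. The function names are made up for illustration, and the "load" variant is left out because it depends on whatever nxcheckload reports.

```python
import itertools
import random

# Host names taken from LOAD_BALANCE_SERVERS above.
SERVERS = ["access01", "access02", "access03"]


def pick_random(servers, rng=random):
    """Pick any server with equal probability (hypothetical helper)."""
    return rng.choice(servers)


def make_round_robin(servers):
    """Return a function that hands out servers in order, wrapping around."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)


next_server = make_round_robin(SERVERS)
first_four = [next_server() for _ in range(4)]
# first_four is ["access01", "access02", "access03", "access01"]
```

Note that with "random" the balancer can pick itself, so roughly one attempt in three lands on access01 in a setup like this one.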

access02 and 03 have:

ENABLE_SERVER_FORWARD="1"
SERVER_FORWARD_HOST="access01"
SERVER_FORWARD_PORT=22
SERVER_FORWARD_KEY="/usr/NX/share/client.id_dsa.key"


All 3 systems have the key in place.

Now, if I connect to access01, it correctly tries to load balance randomly.
When it picks itself, my connection works just fine. However, if it tries to
load balance to either of the other 2, the connection times out. I see
nothing in nxserver.log on the other 2, but in secure.log I can see my users
getting authenticated. In the load balancer's nxserver.log, I get:

HELLO NXSERVER - Version 3.2.0-73 OS (GPL, using backend: 3.3.0)
NX> 105 hello NXCLIENT - Version 3.2.0
NX> 134 Accepted protocol: 3.2.0
NX> 105 SET SHELL_MODE SHELL
NX> 105 SET AUTH_MODE PASSWORD
NX> 105 login
NX> 101 User: nichols2
NX> 102 Password:
Info: Auth method: ssh
NX> 103 Welcome to: access01 user: nichols2
NX> 105 listsession --user="nichols2" --status="suspended,running"
--geometry="1280x1024x24+render" --type="unix-gnome"
NX> 127 Sessions list of user 'nichols2' for reconnect:

Display Type             Session ID                       Options  Depth
Screen         Status      Session Name
------- ---------------- -------------------------------- -------- -----
-------------- ----------- ------------------------------


NX> 148 Server capacity: not reached for user: nichols2
NX> 105 startsession  --link="adsl" --backingstore="1" --encryption="1"
--cache="16M" --images="64M" --shmem="1" --shpix="1" --strict="0"
--composite="1" --media="0" --session="access01" --type="unix-gnome"
--geometry="1280x936" --client="linux" --keyboard="pc102/us"
--screeninfo="1280x936x24+render"

Info: Load-Balancing (if possible) to access03 ...
&link=adsl&backingstore=1&encryption=1&cache=16M&images=64M&shmem=1&shpix=1&strict=0&composite=1&media=0&session=access01&type=unix-gnome&geometry=1280x936&client=linux&keyboard=pc102/us&screeninfo=1280x936x24+render&clientproto=3.2.0&user=nichols2&userip=140.247.105.174&uniqueid=399FA62FA862AC18B9D2FAE9A4813840&display=1001&host=access03

nichols2 at access03's password:
NX> 1000 NXNODE - Version 3.2.0-73 OS (GPL, using backend: 3.3.0)
server_nxnode_echo: NX> 1000 NXNODE - Version 3.2.0-73 OS (GPL, using
backend: 3.3.0)
NX> 700 Session id: iliadaccess03-1001-399FA62FA862AC18B9D2FAE9A4813840
NX> 705 Session display: 1001
NX> 703 Session type: unix-gnome
NX> 701 Proxy cookie: ba8df78ec91e86f436f5bd17382d3155
NX> 702 Proxy IP: 10.242.67.13
NX> 706 Agent cookie: ba8df78ec91e86f436f5bd17382d3155
NX> 704 Session cache: unix-gnome
NX> 707 SSL tunneling: 1
server_nxnode_echo: NX> 700 Session id:
access03-1001-399FA62FA862AC18B9D2FAE9A4813840
server_nxnode_echo: NX> 705 Session display: 1001
server_nxnode_echo: NX> 703 Session type: unix-gnome
server_nxnode_echo: NX> 701 Proxy cookie: ba8df78ec91e86f436f5bd17382d3155
server_nxnode_echo: NX> 702 Proxy IP: 10.242.67.13
server_nxnode_echo: NX> 706 Agent cookie: ba8df78ec91e86f436f5bd17382d3155
server_nxnode_echo: NX> 704 Session cache: unix-gnome
server_nxnode_echo: NX> 707 SSL tunneling: 1
NX> 1009 Session status: starting
server_nxnode_echo: NX> 1009 Session status: starting
NX> 710 Session status: running
NX> 1002 Commit
NX> 1006 Session status: running
server_nxnode_echo: NX> 710 Session status: running
server_nxnode_echo: NX> 1002 Commit
session_status 399FA62FA862AC18B9D2FAE9A4813840 Running
NX> 105 server_nxnode_echo: NX> 1006 Session status: running
bye
Bye
NX> 999 Bye
NX> 1004 Error: NX Agent exited with exit status 1. To troubleshoot set
SESSION_LOG_CLEAN=0 in node.conf and investigate
"/n/home/nichols2/.nx/F-C-access03-1001-399FA62FA862AC18B9D2FAE9A4813840/session".
You might also want to try: ssh -X myserver; /usr/NX/bin/nxnode --agent to
test the basic functionality. Session log follows:
server_nxnode_echo: NX> 596 Session startup failed.
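
The long "&link=adsl&..." line in the log above is easier to read once split into key/value pairs. A small sketch, assuming the line is a plain "&key=value" list with no escaping (the helper name is made up, and the sample string is an abbreviated copy of the logged line):

```python
def parse_nx_options(line):
    """Split '&key=value&key=value' into a dict, keeping values verbatim."""
    options = {}
    for pair in line.strip().lstrip("&").split("&"):
        if not pair:
            continue
        # partition keeps everything after the first '=' as the value
        key, _, value = pair.partition("=")
        options[key] = value
    return options


# Abbreviated copy of the forwarded option string from the log above.
sample = ("&session=access01&type=unix-gnome&geometry=1280x936"
          "&screeninfo=1280x936x24+render&user=nichols2"
          "&display=1001&host=access03")

opts = parse_nx_options(sample)
print(opts["host"], opts["display"])
```

The split is done by hand rather than with a URL query parser so that values such as "1280x936x24+render" keep their "+" sign.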

I checked the session file, and it has:


NXAGENT - Version 3.3.0

Copyright (C) 2001, 2007 NoMachine.
See http://www.nomachine.com/ for more information.

Info: Agent running with pid '30440'.
Session: Starting session at 'Tue Feb  3 11:44:28 2009'.
Info: Proxy running in server mode with pid '30440'.
Info: Waiting for connection from '10.242.67.13' on port '5001'.
Info: Aborting the procedure due to signal '1'.
Error: Aborting session with 'Unable to open display
'nx/nx,options=/n/home/nichols2/.nx/C-iliadaccess03-1001-399FA62FA862AC18B9D2FAE9A4813840/options:1001''.
Session: Aborting session at 'Tue Feb  3 11:45:29 2009'.
Session: Session aborted at 'Tue Feb  3 11:45:29 2009'.
XIO:  fatal IO error 104 (Connection reset by peer) on X server ":1001.0"
      after 0 requests (0 known processed) with 0 events remaining.

(gnome-session:30860): Gtk-WARNING **: cannot open display:
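
The session file shows the agent waiting on port 5001 for about a minute and then giving up, so one thing that could be checked is whether that port is reachable over TCP while a session is starting. A rough sketch: the log shows display 1001 and port 5001, which is consistent with port = 4000 + display, but that relation and the helper names are assumptions here, and which machine has to make the connection is not something the logs settle.

```python
import socket

NX_PROXY_PORT_BASE = 4000  # assumption: proxy port = 4000 + display number


def proxy_port(display):
    """Port the agent is expected to wait on for a given display (assumed)."""
    return NX_PROXY_PORT_BASE + display


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


# Example use, run on the balancer while a session is starting:
#   can_connect("access03", proxy_port(1001))
```

A False result would only say that the port could not be reached from where the script ran; it would not by itself explain why.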


I can't figure out what the problem might be. All systems are identical
(hardware, software, NX versions, etc.), and all of them work if I connect
directly (after load balancing/forwarding has been disabled, of course).

Also (and this is secondary), as I understood the forwarding part, I should
be able to connect to one of these systems (02 or 03), and it will
forward me back to 01, where I get load balanced and then sent along
to one of the three for a "real" session. Is this right, or am I mistaken?

Any ideas would be a big help!

Thanks!





Matthew Nicholson
nicholson at eps.harvard.edu
Harvard University
FAS IT Research Computing
Dept. Of Earth and Planetary Science