[FreeNX-kNX] Gradual performance decrease of nxagent related to Create/FreePixmap

Mario Becroft mb at gem.win.co.nz
Sun Feb 8 13:44:58 UTC 2009


I have diagnosed this problem and got a workaround. There is probably a
pixmap resource leak bug in the nxagent server. To the nomachine people:
you might like to look at this--full details follow below.

*** Summary ***

Resource leak happens when client's libXcursor creates an RGBA cursor
using RenderCreateCursor. Client creates a pixmap for storing the cursor
image (or many pixmaps for animated cursors). Even though client calls
FreePixmap, the pixmap is never freed, even at client exit.

This bug happens on a system with client libXcursor from xorg 6.8.2, but
not on a different system with libXcursor 1.1.7. The two systems I am
testing on also have different cursor themes, and other X library
version differences, so it is not 100% clear that the libXcursor version
is key. Protocol traces showing the difference are below.

Although the problem appears fixed in newer libXcursor, the fact that
pixmaps are never freed, even on client exit, seems like a server bug,
unless there is a good reason why it cannot free them. Comments please.

The result is that if the linux distro has a RGBA cursor theme, the
number of pixmap resources in the server increases each time a cursor is
allocated, by 1, or many in the case of animated cursors. Once this gets
to around 10000, programs that create a lot of pixmaps become noticably
slow.

The workaround is to update the libXcursor library, or if that is not
possible, force the use of core cursors by setting either the
XCURSOR_CORE environment variable or the Xcursor.core resource to true.

*** Full details ***

The problem can be detected by running xrestop on the nxagent and on the
display that nxagent is a client of. If the number of pixmaps allocated
by nxagent is significantly larger than the total number of pixmaps
allocated by clients of nxagent, the leak has occured. The leak can be
triggered with the following command (assuming there is a RGBA cursor
theme):

wish <<HERE
. configure -cursor arrow
HERE

Each time this command is run, the number of pixmaps allocated by the nx
client increases by the number of frames in the specified cursor.

Forcing the use of core cursors does not trigger the leak:

XCURSOR_CORE=true wish <<HERE
. configure -cursor arrow
HERE

Typical xrestop output for the nx client after leaking has occurred for
some time:

--8<---------------cut here---------------start------------->8---
res-base Wins  GCs Fnts Pxms Misc   Pxm mem  Other   Total   PID Identifier
4e00000  1582  551   21 15444 9446    27024K    292K  27316K   ?   NX - mb at nxhos
--8<---------------cut here---------------end--------------->8---

The actual number of pixmaps allocated by clients of nxagent is only 138
in this case, but nxagent has allocated 15444 pixmaps.

Oprofile output showing the CPU usage when scrolling a window in firefox
when there are around 12000 leaked pixmaps:

--8<---------------cut here---------------start------------->8---
CPU: P4 / Xeon with 2 hyper-threads, speed 2992.5 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        app name                 symbol name
101081   44.0352  nxagent                  nxagentFindClientResource
73534    32.0345  nxagent                  AddResource
--8<---------------cut here---------------end--------------->8---

(The time in AddResource is nearly all spent in
nxagentSwitchResourceType(), but this is not seen in the oprofile output
because it is inlined.) See the stack traces below for the full call
graph.

The leaked resources are all of type RT_NX_PIXMAP (not RT_PIXMAP) and
have refcnt = 1. See below for complete details on the leaked resources.

Note that on the good system, even if I kill -9 the client, there is no
leak. So it cannot be that the client is cleaning up better. It must be
allocating the cursors differently, so that the leak is not triggered in
the server.

The xtrace output for the good and bad libXcursor versions (included
below) is almost the same. The trace is only from when the cursor is
configured. I can provide the full trace (very long) on request.

I assume this will be enough information for someone familiar with this
area to identify the root cause. I can provide access to a system that
triggers the leak, or provide further xtrace output or other
diagnostics, if this will help anyone to fix the problem.

I don't fully grasp how the resource switching works
(nxagentSwitchResourceType(), and in ProcFreePixmap(),
ProcCloseFont()). Obviously when the resources are allocated it switches
them from RT_NX_* to RT_*, and back again when they are freed, but I
don't understand exactly what this mechanism is for. Comments
appreciated.

Even if the resource leak can be fixed, a key area for performance
optimisation is nxagentSwitchResourceType() and
nxagentFindClientResource, which currently perform linearly with respect
to the number of resources. Assuming there is not a better way of
designing this mechanism, the searched-for attributes need to be
hashed. This will noticably increase performance and reduce system load
for any nxagent with a large number of pixmap-intensive clients. A good
day's project for someone. Any takers?

Detailed information and traces follow.


********** Annotated details of nxagent resource data structures **********

--8<---------------cut here---------------start------------->8---
#### Stack traces from inside the two slow areas
-------------------------------------------------------------------
#0  0x00000000004388d2 in nxagentFindClientResource (
    client=<value optimized out>, type=12, value=0x12151600)
    at X/NXresource.c:475
#1  0x00000000004248c0 in ProcFreePixmap (client=0xbee6f0)
    at X/NXdispatch.c:2036
#2  0x00000000004272b6 in Dispatch () at X/NXdispatch.c:739
#3  0x000000000045043c in main (argc=8, argv=0x7fffa522cc48,
    envp=<value optimized out>) at main.c:450
-------------------------------------------------------------------
#0  0x0000000000439e6c in AddResource (id=24024137, type=3221225474,
    value=0x95807d0) at X/NXresource.c:533
#1  0x0000000000420cb6 in ProcCreatePixmap (client=0xbee6f0)
    at X/NXdispatch.c:2006
#2  0x00000000004272b6 in Dispatch () at X/NXdispatch.c:739
#3  0x000000000045043c in main (argc=8, argv=0x7fffa522cc48,
    envp=<value optimized out>) at main.c:450
-------------------------------------------------------------------

#### Debugging core1, Stopped in nxagentFindClientResource

#0  0x00000000004388d2 in nxagentFindClientResource (
    client=<value optimized out>, type=12, value=0xd7ef00)
    at X/NXresource.c:475
475           if (pResource -> type == type && pResource -> value == value)

#### The server clientTable

(gdb) p clientTable[0]
$1 = {resources = 0x43869d0, elements = 12920, buckets = 2048, hashsize = 11,
  fakeID = 508759, endFakeID = 512520, expectID = 2097152}

#### Look at an arbitrary selection of the huge number of server resources
#### type 12 is RT_NX_PIXMAP

(gdb) p *(clientTable[0].resources[45])
$13 = {next = 0x6434760, id = 569659, type = 12, value = 0x5694280}
(gdb) p *(clientTable[0].resources[46])
$14 = {next = 0x387a980, id = 569656, type = 12, value = 0x3264d10}
(gdb) p *(clientTable[0].resources[47])
$15 = {next = 0x488f440, id = 674150, type = 12, value = 0x57ae0e0}
(gdb) p *(clientTable[0].resources[48])
$16 = {next = 0x9bbd470, id = 569638, type = 12, value = 0xeb1b270}
(gdb) p *(clientTable[0].resources[49])
$17 = {next = 0x96a2d60, id = 674168, type = 12, value = 0xf66d990}
(gdb) p *(clientTable[0].resources[50])
$18 = {next = 0x4061110, id = 524594, type = 12, value = 0x39e7730}
(gdb) p *(clientTable[0].resources[51])
$19 = {next = 0xad9db90, id = 452846, type = 12, value = 0x5fc8b60}
(gdb) p *(clientTable[0].resources[100])
$20 = {next = 0x4da2570, id = 569714, type = 12, value = 0xe19d260}
(gdb) p *(clientTable[0].resources[200])
$21 = {next = 0xd912800, id = 674177, type = 12, value = 0x7e18ce0}
(gdb) p *(clientTable[0].resources[300])
$22 = {next = 0x115f61e0, id = 532520, type = 12, value = 0x469eaa0}

#### Look into these pixmaps

(gdb) p *(PixmapPtr)(clientTable[0].resources[45].value)
$32 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 117442796, x = 0, y = 0, width = 23,
    height = 23, pScreen = 0xf96510, serialNumber = 76963112}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0x56942c8, screen_x = 200, screen_y = 40}
(gdb) p *(PixmapPtr)(clientTable[0].resources[46].value)
$33 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 117442792, x = 0, y = 0, width = 23,
    height = 23, pScreen = 0xf96510, serialNumber = 76963109}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0x3264d58, screen_x = 743, screen_y = 5}
(gdb) p *(PixmapPtr)(clientTable[0].resources[47].value)
$34 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 117455093, x = 0, y = 0, width = 15,
    height = 19, pScreen = 0xf96510, serialNumber = 81183506}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0x57ae128, screen_x = 0, screen_y = 0}
(gdb) p *(PixmapPtr)(clientTable[0].resources[48].value)
$35 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 117442768, x = 0, y = 0, width = 23,
    height = 20, pScreen = 0xf96510, serialNumber = 76963091}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0xeb1b2b8, screen_x = 494, screen_y = 7}
(gdb) p *(PixmapPtr)(clientTable[0].resources[49].value)
$36 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 117455117, x = 0, y = 0, width = 15,
    height = 19, pScreen = 0xf96510, serialNumber = 81183524}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0xf66d9d8, screen_x = 494, screen_y = 7}
(gdb) p *(PixmapPtr)(clientTable[0].resources[50].value)
$37 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 16785129, x = 0, y = 0, width = 12,
    height = 24, pScreen = 0xf96510, serialNumber = 82066748}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0x39e7778, screen_x = 0, screen_y = 0}
(gdb) p *(PixmapPtr)(clientTable[0].resources[51].value)
$38 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 117453977, x = 0, y = 0, width = 15,
    height = 19, pScreen = 0xf96510, serialNumber = 80583768}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0x5fc8ba8, screen_x = 0, screen_y = 0}
(gdb) p *(PixmapPtr)(clientTable[0].resources[100].value)
$39 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 117442869, x = 0, y = 0, width = 15,
    height = 19, pScreen = 0xf96510, serialNumber = 76963167}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0xe19d2a8, screen_x = 65, screen_y = 10}
(gdb) p *(PixmapPtr)(clientTable[0].resources[200].value)
$40 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 117455129, x = 0, y = 0, width = 15,
    height = 19, pScreen = 0xf96510, serialNumber = 81183533}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0x7e18d28, screen_x = 317, screen_y = 10}
(gdb) p *(PixmapPtr)(clientTable[0].resources[300].value)
$41 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 18990145, x = 0, y = 0, width = 19,
    height = 28, pScreen = 0xf96510, serialNumber = 81577132}, refcnt = 1,
  devKind = 0, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0},
  devPrivates = 0x469eae8, screen_x = 0, screen_y = 0}

#### Look into some of the nxagent private structures

(gdb) p *(nxagentPrivPixmapPtr)((PixmapPtr)(clientTable[0].resources[45].value))->devPrivates[0].ptr
$5 = {id = 15067725, mid = 569659, isVirtual = 0, isShared = 0,
  pVirtualPixmap = 0x12688490, pRealPixmap = 0x5694280, pPicture = 0xce7b540,
  corruptedRegion = 0x88e81e0, corruptedBackground = 0, containGlyphs = 0,
  containTrapezoids = 0, usageCounter = 0, corruptedBackgroundId = 0,
  corruptedId = 0, synchronizationBitmap = 0x0, corruptedTimestamp = 0,
  splitResource = 0x0, isBackingPixmap = 0}
(gdb) p *(nxagentPrivPixmapPtr)((PixmapPtr)(clientTable[0].resources[46].value))->devPrivates[0].ptr
$6 = {id = 15067721, mid = 569656, isVirtual = 0, isShared = 0,
  pVirtualPixmap = 0x12687320, pRealPixmap = 0x3264d10, pPicture = 0xa210f20,
  corruptedRegion = 0xc221860, corruptedBackground = 0, containGlyphs = 0,
  containTrapezoids = 0, usageCounter = 0, corruptedBackgroundId = 0,
  corruptedId = 0, synchronizationBitmap = 0x0, corruptedTimestamp = 0,
  splitResource = 0x0, isBackingPixmap = 0}
(gdb) p *(nxagentPrivPixmapPtr)((PixmapPtr)(clientTable[0].resources[47].value))->devPrivates[0].ptr
$7 = {id = 15473936, mid = 674150, isVirtual = 0, isShared = 0,
  pVirtualPixmap = 0x128a5a60, pRealPixmap = 0x57ae0e0, pPicture = 0xb501370,
  corruptedRegion = 0xb1348c0, corruptedBackground = 0, containGlyphs = 0,
  containTrapezoids = 0, usageCounter = 0, corruptedBackgroundId = 0,
  corruptedId = 0, synchronizationBitmap = 0x0, corruptedTimestamp = 0,
  splitResource = 0x0, isBackingPixmap = 0}

#### The virtual pixmap for one of the above

(gdb) p *(PixmapPtr)((nxagentPrivPixmapPtr)((PixmapPtr)(clientTable[0].resources[45].value))->devPrivates[0].ptr)->pVirtualPixmap
$11 = {drawable = {type = 1 '\001', class = 0 '\0', depth = 32 ' ',
    bitsPerPixel = 32 ' ', id = 0, x = 0, y = 0, width = 23, height = 23,
    pScreen = 0xf96510, serialNumber = 76963111}, refcnt = 1, devKind = 92,
  devPrivate = {ptr = 0x12688558, val = 308839768, uval = 308839768,
    fptr = 0x12688558}, devPrivates = 0x126884d8, screen_x = 0, screen_y = 0}

#### Look at the pixmap data

# 23 * 23 * 4 = 2116 bytes
x/2116xb ((char*)((PixmapPtr)((nxagentPrivPixmapPtr)((PixmapPtr)(clientTable[0].resources[45].value))\
->devPrivates[0].ptr)->pVirtualPixmap)->devPrivate.ptr)
# Raw data omitted

p $a = ((char*)((PixmapPtr)((nxagentPrivPixmapPtr)((PixmapPtr)(clientTable[0].resources[45].value))\
->devPrivates[0].ptr)->pVirtualPixmap)->devPrivate.ptr)

dump memory pixmap1 $a $a+23*23*4-1
# Viewing the data reveals an animated hourglass cursor.
# Same for all the other pixmaps examined.
--8<---------------cut here---------------end--------------->8---


********** xtrace output for configuring cursor with bad libXcursor **********

Note: this is with a single-frame RGBA cursor. Animated cursors result
in a pixmap being created as below for each frame.

--8<---------------cut here---------------start------------->8---
000:<:004f: 16: Request(16): InternAtom only-if-exists=false(0x00)  name='cursor'
000:>:0x004f:32: Reply to InternAtom: atom=0xdc("cursor")
000:<:0050: 16: Request(53): CreatePixmap depth=0x20 pid=0x03000006 drawable=0x000000a7 width=17 height=24
000:<:0051: 16: Request(55): CreateGC cid=0x03000007 drawable=0x03000006  values={}
000:<:0052:1656: Request(72): PutImage format=ZPixmap(0x02) drawable=0x03000006 gc=0x03000007 width=17 height=24 dst-x=0 dst-y=0 left-pad=0x00 depth=0x20
000:<:0053:  8: Request(60): FreeGC gc=0x03000007
000:<:0054: 20: RENDER-Request(149,4): RenderCreatePicture pid=0x03000008 drawable=0x03000006 format=0x00000091  values={}
000:<:0055:  8: Request(54): FreePixmap drawable=0x03000006
000:<:0056: 16: RENDER-Request(149,27): RenderCreateCursor cid=0x03000009 src=0x03000008 x=12 y=2
000:<:0057:  8: RENDER-Request(149,7): RenderFreePicture picture=0x03000008
000:<:0058: 16: Request(2): ChangeWindowAttributes window=0x03000003  value-list={cursor=0x03000009}
000:<:0059: 16: Request(2): ChangeWindowAttributes window=0x03000003  value-list={background-pixel=0x00d9d9d9}
--8<---------------cut here---------------end--------------->8---


********** xtrace output for configuring cursor with good libXcursor **********

--8<---------------cut here---------------start------------->8---
003:<:003d: 16: Request(16): InternAtom only-if-exists=false(0x00)  name='cursor'
003:>:0x003d:32: Reply to InternAtom: atom=0xdc("cursor")
003:<:003e: 16: Request(53): CreatePixmap depth=0x20 pid=0x03400006 drawable=0x000000a7 width=17 height=24
003:<:003f: 16: Request(55): CreateGC cid=0x03400007 drawable=0x03400006  values={}
003:<:0040:1656: Request(72): PutImage format=ZPixmap(0x02) drawable=0x03400006 gc=0x03400007 width=17 height=24 dst-x=0 dst-y=0 left-pad=0x00 depth=0x20
003:<:0041:  8: Request(60): FreeGC gc=0x03400007
003:<:0042: 20: RENDER-Request(149,4): RenderCreatePicture pid=0x03400008 drawable=0x03400006 format=0x00000091  values={}
003:<:0043:  8: Request(54): FreePixmap drawable=0x03400006
003:<:0044: 16: RENDER-Request(149,27): RenderCreateCursor cid=0x03400009 src=0x03400008 x=12 y=2
003:<:0045:  8: RENDER-Request(149,7): RenderFreePicture picture=0x03400008
003:<:0046: 16: Request(98): QueryExtension  name='XFIXES'
003:>:0x0046:32: Reply to QueryExtension: present=true(0x01) major-opcode=148 first-event=105 first-error=166
003:<:0047: 12: XFIXES-Request(148,0): QueryVersion major version=4 minor version=0
003:>:0x0047:32: Reply to QueryVersion: major version=3 minor version=0
003:<:0048: 20: XFIXES-Request(148,23): SetCursorName cursor=0x03400009  name='arrow'
003:<:0049: 16: Request(2): ChangeWindowAttributes window=0x03400003  value-list={cursor=0x03400009}
003:<:004a: 16: Request(2): ChangeWindowAttributes window=0x03400003  value-list={background-pixel=0x00d9d9d9}
--8<---------------cut here---------------end--------------->8---

-- 
Mario Becroft <mb at gem.win.co.nz>



More information about the FreeNX-kNX mailing list