knotify4 going crazy & breeding like rabbits (+ linguistic discussion of the role of "greetings")

Duncan 1i5t5.duncan at cox.net
Wed Feb 22 04:44:09 GMT 2012


gene heskett posted on Mon, 20 Feb 2012 20:29:31 -0500 as excerpted:

> On Monday, February 20, 2012 08:08:28 PM Chuck Burns did opine:
> 
>> On 2/20/2012 5:00 PM, gene heskett wrote:
>> > Greetings;

... and salutations! =:^)

(Completely OT I know, but while I understand the role of greetings IRL, 
at least in part, from the newcomer to signal the event and focus 
attention on his arrival (among other things, serving notice that 
intended private conversations may need to stop temporarily), I never 
quite understood the role on lists, newsgroups, forums and the like, 
where one presumably /knows/ when one starts a new message or thread, and 
that doing so signals the same functional type of "context switch" that 
"greetings" does IRL.  As such, for lists, newsgroups and the like, I'm 
accustomed to simply starting my question/answer/whatever, no greeting or 
similar redundancies.  I know a lot of others do likewise, while others 
include it as they would IRL or in formal non-electronic written 
correspondence.  But at times I've simultaneously wondered bemusedly at a 
"Hi", "Greetings", etc, opening, and whether my omission thereof 
inadvertently causes mild offense.  This is obviously one of those 
times...

The Wiktionary entry for "greeting" notes that it's less common in email, 
etc, as well.  So... why /do/ you (plural "you", addressed to anyone who 
wishes to respond) include such an opening in electronic messages such as 
lists, email and news messages?  Have you even ever thought about it 
before?  Do you get offended if others don't as well?  Are these 
questions just really strange and off the wall, making me look crazy?  
"Inquiring minds want to know!" =:^)


>> > I have so far today, killed around 75 copies of /usr/bin/knotify4
>> > which is pegging out all 4 cores of my phenom, and running it up to
>> > 70C+.
>> > 
>> > Killing all copies (which is puzzling because killall can't find them
>> > but htop can) cleans the system up & brings back normal operation.

Which killall form did you use?  Quoting the killall (1) manpage:

>>>>>
killall  sends  a signal to all processes running any of the specified 
commands[.]  If the command name is not regular expression (option -r) 
and contains a slash (/), processes executing that particular file will 
be selected for killing, independent of their name.
<<<<<

Note that kde uses a special launcher, kdeinit4, to launch many of its 
"core" programs.  The commandline for these will be kdeinit4 <appname> 
<app-parameters>.  The reasoning is that this allows more efficient 
shared-objects loading, so faster launching and more efficient memory 
usage.

I'm not sure that's exactly what's going on here (my single knotify4 
instance appears to be a direct child of init, pid 1, and it doesn't 
appear with the kdeinit4 prefixed on its command line), but it is indeed 
quite possible for applications to be launched such that the name and the 
command-line don't match, such that a killall without -r or / won't see 
it.

As mentioned in the manpage quote above, the absolute executable file 
path (detected by the presence of a / in the name) or a regular 
expression (using -r) can be used instead.  It's possible these would get 
the ones a standard killall misses.
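
To check whether name and command line really differ on your system, a sketch like the following may help.  The name_vs_cmdline helper is purely my own illustration (it's not part of killall, psmisc or kde), and it assumes the three-column output of procps' "ps -eo pid=,comm=,args=" (PID, short name, full command line).

```shell
#!/bin/bash
# name_vs_cmdline PATTERN
# Reads "PID NAME CMDLINE..." lines on stdin (the format printed by
# "ps -eo pid=,comm=,args=") and shows every process whose full command
# line matches PATTERN, next to the short name that a plain
# "killall NAME" compares against.  Hypothetical helper, illustration only.
name_vs_cmdline() {
    awk -v pat="$1" '{
        args = $3
        for (i = 4; i <= NF; i++) args = args " " $i
        if (args ~ pat) print $1, "name=" $2, "cmdline=" args
    }'
}

# Live use:
#   ps -eo pid=,comm=,args= | name_vs_cmdline knotify4
#
# The three killall forms discussed above:
#   killall knotify4            # match on the process name
#   killall /usr/bin/knotify4   # match on the executable file (contains a /)
#   killall -r 'knotify'        # treat the name as a regular expression
```

If the name= column doesn't say knotify4 for a process whose cmdline= column does, that's a process the plain form would miss.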

Of course, the other possibility is that killall sends the signal, but 
the process ignores it, especially if the process is hung.  The default 
SIGTERM (-15) would allow this.  SIGHUP/-1, SIGINT/-2, SIGSEGV/-11, and 
finally SIGKILL/-9, in roughly increasing order of severity, can be used 
instead.  Of those, only SIGKILL can't be caught or ignored by the app.  
It's "kill with prejudice", that is, don't give the app a chance to clean 
up or to say no, just kill it.  Of course, this last one can leave 
half-written files and the like around.  The kernel will close them and 
return memory resources to the system, but if it was a config file or the 
like, it could cause problems at the next start, so SIGKILL/-9 should 
always be used as a last resort.
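
The "polite first, SIGKILL last" idea can be sketched as a small shell function.  The function name and the five second default grace period are my own assumptions for illustration, nothing kde or killall provides.

```shell
#!/bin/bash
# terminate_then_kill PID [GRACE_SECONDS]
# Send SIGTERM, give the process up to GRACE_SECONDS (default 5) to exit,
# and only fall back to SIGKILL if it's still there afterwards.
# Hypothetical helper, illustration only.
terminate_then_kill() {
    local pid=$1
    local grace=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0    # no such process: done
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # it exited on its own
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$pid" 2>/dev/null || true        # last resort
    return 0
}

# Live use, one PID at a time:
#   terminate_then_kill 12345
```

A process that honors SIGTERM never sees the SIGKILL, so it still gets its chance to clean up.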

Of course signal types are pretty basic Unix, so you probably knew that 
bit, but others reading might not.

But if killall found at least one process to deliver the signal to, it 
returns success whether or not the process actually responded, while if 
it didn't find a process at all, it returns failure status and complains 
to STDERR, and it's likely that complaint that you were indicating with 
the "killall can't find them" bit.  Just covering the bases, in case...

>> > But in half an hour I am back to 4 to 6 copies and a pegged cpu.
>> > 
>> > This seems to go along with an uptime of 10 days or more, currently
>> > at 18 days.

It's likely that few people run kde for that long at a time, so such 
bugs aren't as likely to be reported.  I happen to run git kernels here, 
and even the kernel merge window (during which I don't normally rebuild 
and test new kernels, just in case there's a crazy "eats filesystems" or 
"eats md/raids" bug during that time, that presumably I'd know about by a 
few days after the merge window closes if I decide to bisect something) 
is only two weeks, so 18 days is likely at the long end for no-reboots, 
here, at least on my main machine.  (My netbook can go much longer, but 
it can spend a month or six weeks in suspend-to-disk hibernation, too; 
it's not like it's actually /running/ more than perhaps 24 hours of that.)

On Linux, that's not an excuse.  On Linux, if it can't handle at least 
six months uptime, it's considered seriously bugged, and rightfully so.  
However, it /is/ a fact.  Long uptime bugs simply won't get as much 
reporting, as fewer people see them, simple as that.

>> > Is there a permanent fix for this other than switching to (I'd rather
>> > just have somebody shoot me) gnome or even (quite a bit better IMO)
>> > xfce?

>> I hate to sound like a smartass..
> 
> Not at all.
> 
>> but have you tried logging out of kde,
>> and back in? Your uptime won't suffer, and KDE will be able to
>> completely refresh.. There may some sort of leak somewhere..
> 
> I suspect there is, but running it down seems nearly impossible when it
> doesn't show up for 2 weeks.
>  
>> AFAIK, no one has reported a bug about this.. perhaps if you have time,
>> you can try to narrow it down to exactly what.

See what I said above about long-uptime-bugs.  If it's possible to do 
something with it... the whole kde ecosystem should thank you, because 
good bug reporting is rare enough, and good bug reporting of long uptime 
bugs even rarer, but getting them fixed helps stability even for shorter 
uptime people, so it's a good thing... where it's at all possible, of 
course.

>> You can also try disabling all notifications.. ymmv
> 
> I do use inotifywait

Entirely different kind of notification! =;^)

Inotify and similar "file event" notifications are often what kernel and 
sometimes app developers mean when they talk about notifications.  They 
notify an APP (not the user, tho the app could potentially notify the 
user if appropriate) about file accesses, changes, etc.  At a similar 
level, udev notifies apps such as kde's device notifier (which in turn 
notifies the user if appropriate) when devices appear or disappear, etc.

knotify4, OTOH, deals entirely with user-level-app-to-user event 
notifications for things like mail delivery, media changes in the media 
player, keyboard shift-level changes, etc.  These "user notifications" 
are most often either sound events or popup dialogs (now often handled by 
the notifier icon in the systray, tho I'm not sure all such dialogs are 
handled that way yet), but can also involve marking/flashing a taskbar 
entry, logging the event to a file, or running some other command.

It's generally that first one, playing a sound, that causes problems, on 
systems where either the sound system isn't properly configured or where 
it otherwise isn't entirely reliable.  This is almost CERTAINLY what's 
happening in your case.

> I note that I found another copy of it the 2nd time I went on
> a killing rampage today, about 75 processes down from the top, killed it
> too, and the problem has not come back, but something has called up 2
> copies of it since I nuked them all.
> 
> If that had a visible link to whatever restarts it, that would help
> considerably in tracking this down, but apparently no one knows what
> (re)starts it.

knotify4 is part of kde's internals.  Any time an event occurs that's 
configured to play a sound, popup a notice window, etc, knotify should 
respond with the appropriate action.  But as I said, sound in particular 
can be somewhat problematic.  If the sound doesn't play as knotify is 
configured to play it, that instance of knotify will hang, waiting for 
the event to finish, and at the next such event, kde (kded, maybe? I'm 
not sure exactly which component; maybe it's invoked by the triggering app 
directly, using a library that's part of kdelibs and thus available to 
any kde app?) will find no responding knotify4 and thus will spawn 
another one.

But if the one is hung waiting on a resource lock it can't get (typically 
it can't open the sound device), and the next one needs it too, guess 
what, the next one gets in line behind the first.

Lather, rinse, repeat.

When you notice and start killing all of them, once you kill the one that 
was originally hung (probably one of the oldest, or as you mention, one 
without a lot of CPU time, as it was hung, while the others were CPU-poll 
spinning, waiting on the resource to become available), the kernel 
releases that resource with the killing of the hung process, pulling the 
plug on the waiting queue of all the others, thus draining it.
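
To find the likely culprit without scrolling thru htop, the instances can be sorted by age.  This is only a sketch: the two function names are made up, and it assumes procps-ng's ps, which knows the etimes (elapsed seconds) output field.

```shell
#!/bin/bash
# oldest_first
# Reads "PID ELAPSED_SECONDS CPUTIME" lines on stdin (the format printed
# by "ps -C knotify4 -o pid=,etimes=,cputime=") and sorts them so the
# longest-running instance comes first.  Hypothetical helper.
oldest_first() {
    sort -k2,2nr
}

# oldest_pid
# Same input, but prints only the PID of the longest-running instance.
oldest_pid() {
    oldest_first | awk 'NR == 1 { print $1 }'
}

# Live use:
#   ps -C knotify4 -o pid=,etimes=,cputime= | oldest_first
```

An old instance with almost no CPU time at the top of that list would fit the "originally hung" description, while the spinning ones should show a lot of CPU time for their age.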

The problem with the sound device that hung the original knotify4 is 
often intermittent.  Typically either the device suspended after an idle 
timeout, or something grabbed the sound device exclusively.  (Some 
hardware can cope with multiple streams, some not, thus the use of sound 
servers or alsa's software stream mixer plugin, dmix.  But in some cases 
an app will apparently still try to take an exclusive lock on an 
otherwise sharable device.)  By the time the original locked-up knotify4 
is killed, the intermittent problem has generally gone away, so pulling 
the plug allows all those spawned knotify4s to do their thing one right 
after the other, without the problem reoccurring immediately.

But then later on, when the sound device suspends or something else grabs 
exclusive access again, the whole thing is setup for another go-round.

> But:
> [root at coyote eagle]# lsof |grep knotify4|wc -l
> 1198
> 
> How the heck can you separate the wheat from the chaff in a list that
> long..  :(

FWIW, 1317 here, and to my knowledge, everything's working fine, here, 
just one pid listed for all those, etc.  So 1000+ open files for knotify 
would seem to be normal.

> Half of that is vlc linked:
> [root at coyote eagle]# lsof |grep knotify4|grep vlc|wc -l
> 604
> 
> And I haven't specifically used vlc that I know of in months, so I
> assume a news site I have visited must have called it up.

Taking a look thru the 1317 listed files, it seems that most of them are 
*.so shared-objects aka libraries with FD=mem, TYPE=REG.

That many of those shared-objects are vlc related is almost certainly due 
to your use of the phonon-vlc backend -- phonon is how kde handles sound, 
and if it's configured to use the phonon-vlc backend, with all the plugins 
that vlc has, and the fact that knotify4 is responsible for kde's sound 
effects...

FWIW, 699 appear to be vlc related, here.

Then there's the other usual X and kde libraries in the list...

Try this, the grep -v excludes any line with "lib":

lsof | grep knotify4 | grep -v lib | wc -l

FWIW, 87, here.  That look a bit more reasonable? =:^)

There's the current working directory (FD=cwd, TYPE=DIR), the root 
directory (FD=rtd, TYPE=DIR, NAME=/), the executable itself (FD=txt, 
TYPE=REG), and several memory-mapped font resources (it's an X app, after 
all).

Then there's the various filedescriptors (FD=0r 1w 2w... etc, 
filedescriptor, read/write/u=both, TYPE=CHR/REG/0000/FIFO/unix/netlink, 
character-device, regular file, unknown/(anon-inode), first-in-first-out, 
unix socket, netlink socket, respectively).  It's interesting to note 
that STDIN is /dev/null and STDOUT and STDERR are mapped to 
$HOME/.xsession-errors, as 
might be expected for an X app.  15 (numbered 0-14) filedescriptors are 
open in this way.  Other than the first three STD*, the rest are various 
fifos, pipes, anon-inodes, unix sockets, etc.  Of interest are FD=8r, the 
kde system config cache (ksycoca4) regular file, FD=9u, netlink 
KOBJECT_UEVENT, and FD=13r, the /dev/urandom character-device.

Here, the same set (same pid for all) is listed three times, once without 
a "task ID" following the PID, and once each for two different task-IDs.
I don't have much of a clue what task-id is. (??)

But it's worth noting that it's the same PID and the same set of open 
files, three times.  87/3=29.  29 actual non-library files... listed 
three times each.

If the same applies to the 1317 (I didn't rigorously check, but it looked 
that way), then it's 439 files, listed three times each: 410 shared 
objects (libraries and plugins) and 29 other files.  And the vlc files 
are all shared objects, 699/3=233 of them, so 233 of the 410 shared 
objects.
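
For anyone wanting to redo that arithmetic with their own counts, here it is as a few lines of shell.  The numbers are the ones from my system quoted above.  The variable and function names are just for illustration, and the factor of three is my observation here, not a rule.

```shell
#!/bin/bash
# Counts quoted above, from my system.
total=1317    # lsof lines mentioning knotify4
nonlib=87     # of those, lines without "lib" in them
vlc=699       # of those, lines mentioning vlc
copies=3      # each file appeared once per PID/task-ID listing here

# per_copy COUNT: integer-divide a line count by the number of listings.
per_copy() {
    echo $(( $1 / copies ))
}

files=$(per_copy "$total")
other=$(per_copy "$nonlib")
echo "distinct files:  $files"
echo "non-library:     $other"
echo "shared objects:  $(( files - other ))"
echo "vlc-related:     $(per_copy "$vlc")"
```

With the values above that prints 439, 29, 410 and 233 respectively.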

> ATM, I have an eagle session on a pcb going in another window, pending
> info that I screwed the moose, so I would rather get that fixed before I
> reboot kde.

That's semi-gobbledegook, here, but given that in previous mails you've 
mentioned some sort of CAD/CAM setup, I'll assume that's what you're 
referring to.  Yeah, letting it finish doing whatever it's monitoring/
controlling before a reboot might indeed be useful.

Meanwhile...

Some kde settings experimentation may be useful here, but one or more of 
the following should help.  Some are short term workarounds, some longer 
term potential fixes:

Short term: Under common appearance and behavior, application and system 
notifications, manage notifications, on the player settings tab, try 
setting no audio output.  This may or may not kill the existing locked up 
knotify4s, I'm not sure, but it should prevent the problem from 
reoccurring, assuming I'm correct and it is an audio issue of some sort, 
at least, because it simply no-ops the problematic calls.

Medium/long term:  In the same place, you can try setting an external 
player instead of kde's normal (phonon-based) sound system.  Back in 
kde3, I did this for awhile when arts was hopelessly screwed up, but I've 
not had to resort to it in kde4.

The trick is finding an appropriate player, probably setting it to no-gui 
if it's a gui player, etc.  I tried a couple things before I found a 
solution that "just worked" for me.  It involved the playsound binary 
from the sdl-sound package (installed here for something unrelated), but 
played at full volume, the sound effects overpowered whatever else I 
happened to be playing, so I ended up setting up a script that played it 
at reduced volume.  

Here's $HOME/bin/playsound.sh (vol can be set up or down if necessary, 
but .5 was a good balance for me):

#!/bin/bash
# To play something at a bit lower volume (1=100%, normal volume)
vol=.5
playsound --volume "$vol" "$@"

Then I just set playsound.sh as the player, and it worked.


Short term, could be longer term if you like it, or QUICKLY shut off if 
you don't!:  Use a "visual bell" instead of sound effects.  This involves 
two configuration changes:

Under common appearance and behavior, application and system 
notifications, system bell, check use system bell instead of system 
notifications.

Under workspace appearance and behavior, workspace appearance, 
accessibility, on the bell tab:  Check use visible bell, and experiment 
with invert and flash screen, with timing, as desired.  You can set an 
audible bell as well, but you may not avoid the sound lockups, that way.

FWIW, I've used this before.  The effect can be disconcerting at first, 
especially if it happens when you're concentrating on something and have 
forgotten all about setting this up.  But it DOES tend to get your 
attention, as long as you're looking at the screen, of course.  The 
feature is designed for deaf folks (thus accessibility) or for use in 
meetings, etc, where a sound would be disruptive.  But it's a nice option 
to have.  Just don't have a heart attack the first time you're 
concentrating on something and the screen inverts/flashes!  As I said, it 
CAN be disconcerting, but forewarned is forearmed.


The proper (user/admin-level) fix:  Depending on the exact nature of the 
problem and your hardware, this could take several forms.

As mentioned above, one trigger of the problem can be sound device power 
saving mode.

On a laptop that's battery powered much of the time, you probably want to 
keep that on to save power when you're not playing anything.  In that 
case, setting visual bell for notifications as suggested above is a good 
idea, since (a) that way you don't have to wake up the sound device just 
to play a notification ding, and (b), laptop/netbook use is far more 
likely to include use in meetings, etc, where the sound isn't desirable 
anyway.

If you do want to keep sound notifications on a laptop/netbook/etc, but 
still don't want to use too much battery running the sound device when 
there's nothing playing, playing around with its power-saving settings 
may be useful.  Setting too short an idle-timeout (say 1 second) is known 
to be highly problematic on some hardware.  I happened upon this just 
yesterday, I think, in a general alsa and kernel device driver context 
rather than a kde one, and it makes sense: 10 seconds is the recommended 
MINIMUM.  I'd actually suggest something like 30 seconds to perhaps even 
five minutes (or even 15; consider how much power those 100% CPU cycle 
apps will use!).  I believe part of the problem is a race condition, 
where the device gets a wakeup just as it's powering down.  That can 
leave the app trying to play the sound thinking the device is responding 
when it has just powered down instead, so the app ends up waiting 
forever, especially if the device doesn't signal the app correctly when 
it does wake back up.

On a desktop/workstation that's on A/C power all the time, just disable 
audio device power saving entirely.

Unfortunately, directions for setting/disabling audio device power saving 
aren't something I can deal with here.  If kde deals with it at all, I 
don't have that bit of it installed, and individual device driver 
settings are likely to be just that, individual.  Check the docs or try 
posting to your distro's lists/forums.
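
That said, here's one concrete example, for one driver only.  On cards driven by the kernel's snd_hda_intel driver, the idle timeout is exposed as the power_save module parameter, in seconds, with 0 meaning disabled.  Whether your card uses that driver is an assumption you'd have to check, and other drivers have other knobs.  The describe_power_save function is a made-up helper that just applies the 10 second minimum mentioned above.

```shell
#!/bin/bash
# describe_power_save [FILE]
# FILE defaults to the snd_hda_intel power_save parameter.  That default
# is an ASSUMPTION: it only exists if your card uses that driver.
# Hypothetical helper, illustration only.
describe_power_save() {
    local file=${1:-/sys/module/snd_hda_intel/parameters/power_save}
    local secs
    if [ ! -r "$file" ]; then
        echo "unknown: $file not readable (different driver?)"
        return 1
    fi
    secs=$(cat "$file")
    if [ "$secs" -eq 0 ]; then
        echo "disabled"
    elif [ "$secs" -lt 10 ]; then
        echo "too short: ${secs}s (under the 10 second minimum)"
    else
        echo "ok: ${secs}s"
    fi
}

# Live use:
#   describe_power_save
# To disable power saving until the next reboot (as root):
#   echo 0 > /sys/module/snd_hda_intel/parameters/power_save
```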


Another possible fix is device preference order.  This is in kde 
settings, hardware, multimedia, phonon.  When I first switched to the 
phonon-vlc backend, everything seemed to work great (far better than the 
phonon-xine backend, now not even available on some distros as upstream 
kde dropped support for it).  But somewhere in the 4.7 or 4.8 timeframe 
(I ran the 4.8 betas and rcs and don't remember exactly when it showed 
up), the previous config quit working so well.  Sound continued to work, 
but I'd get popups saying it was falling back to a different device as 
the preferred device wasn't working.

The fix was to select every possible device and hit the test button.  
(Unless you have multiple physical devices and want some routed 
differently, do this for audio playback itself, not for the individual 
purposes, notifications, music, etc.)  If a device works well, move it 
up.  If it doesn't, move it down.  Do this while you're having problems.  
(Sometimes, like right now, all devices test as working here; they didn't 
when I did the testing and reordered them while I had the problem.)

My list of devices has four entries for one physical device: Default, 
which will play thru the alsa default device, normally the first one 
detected if there's multiple physical devices, and three different 
listings for the hardware (AMD AMD8111, in my case): one saying /just/ 
that, one with the name twice and "(Default Audio Device)" in 
parentheses, and one with the name twice and "(hw:0,0)" in parentheses.

I ended up with Default (no hardware name) at the top of the list, the 
twice-listing with (Default Audio Device) next, the single-listing third, 
and the twice-listing with (hw:0,0) last.
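
The same sort of test can be run outside kde with the alsa-utils tools, which may help tell a phonon problem from an alsa one.  This is a sketch only.  The hw_names function is my own invention, and it assumes the English "card N: ..., device M: ..." lines that aplay -l prints, hence the LC_ALL=C in the usage line.

```shell
#!/bin/bash
# hw_names
# Reads "aplay -l" output on stdin and prints one hw:CARD,DEVICE name per
# playback device found.  Hypothetical helper, illustration only.
hw_names() {
    sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/hw:\1,\2/p'
}

# Live use, playing one round of the test sound on each device:
#   LC_ALL=C aplay -l | hw_names | while read -r dev; do
#       speaker-test -D "$dev" -c 2 -t wav -l 1
#   done
# And the alsa default device, for comparison:
#   speaker-test -D default -c 2 -t wav -l 1
```

Note that the hw: names bypass any software mixing, so a "device busy" failure there while something else is playing is expected, and is the same exclusive-access situation described earlier.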

Since then, I've had no more phonon-fallback notification popups. =:^)  
But I'm not entirely sure if I really fixed it, or if that was a beta/rc 
problem that was fixed with kde 4.8.0, or if I just haven't hit the 
conditions that triggered it again.  Whatever, I'm just happy to not be 
seeing those popups and thus worrying about sound (tho as I said, it did 
continue to work, I just saw the popups sometimes, and got worried).

Note that I didn't have the "breeding like rabbits" knotify4 problem 
here, at least that I noted (as I said I don't tend to stay up for more 
than a few days at a time, testing kernels, etc), only the irritating 
popup problem.  However, it could still be a device order issue, just 
with a slightly different manifestation than I had.


Finally, it's also possible to switch phonon backends.  You apparently 
have phonon-vlc configured, as do I.  There's also the phonon-gstreamer 
backend, if you wish to try it.  The phonon-gstreamer backend /does/ 
happen to be the kde (and gentoo) default, so it might be worth trying.

But I haven't tried it, mostly because I have bad memories of trying to 
get gstreamer to work a long time ago.  The problems I hit then are 
almost certainly long gone, so it'd probably work just fine now, but 
there have always been other alternatives that worked, so I've never 
actually needed to install gstreamer.  (Of course, the fact that 
I'm on gentoo and would thus have to build all those extra components 
just to try gstreamer out... when it would at least at first be for just 
one thing since I have alternatives to gstreamer installed for everything 
else, is part of the picture as well.  That's a big barrier to cross just 
to try it out.  If I were on a binary distro and all trying out gstreamer 
involved was downloading and installing pre-built binaries, the barrier 
would be lower, and there's a fair chance I'd have tried it again by now.)

Anyway, I can't say what the phonon-gstreamer backend might do different, 
but it could be worth trying, if you're having problems with phonon-vlc, 
especially if you're on a binary distro so don't have the high barrier to 
trying it out that I do, and/or if you already have gstreamer itself 
installed, as many binary distro users as well as gnome users do, for 
other reasons.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

___________________________________________________
This message is from the kde mailing list.
Account management:  https://mail.kde.org/mailman/listinfo/kde.
Archives: http://lists.kde.org/.
More info: http://www.kde.org/faq.html.



