Preventing App Crashes before Bug Fixes

Wed Mar 24 08:40:20 GMT 2004

Hi Amir,

[I see the pyQt course is chugging along nicely, though I'm afraid I
haven't done any of the homework and found the discussion on generators a
little overly -- Haskellish, I dunno.]

On Wed, 24 Mar 2004, Amir Michail wrote:
> We are thinking of working on a method for preventing KDE applications
> from crashing before the source code is fixed.
>
> Users would then be warned whenever they are about to enter a sequence
> of commands that is likely going to lead to bad behavior/crash. In
> fact, the warning could include the bug report descriptions.

A tricky usability question indeed. Flag every code path that might cause
a problem? You'll be _swimming_ in "This section of code has caused $n$
bugs in the past. Continue?" dialogs in no time. Which means adding a
"don't remind me" checkbox (or, as others have missed, make sure there's
a different set of packages with this bug tracking feature enabled, like
Netscape used to do).

> In this way, you would avoid bugs even before they are fixed.

I suspect, as Thiago and others point out, that this is going to be a
real bitch to implement. But that's what research is for (and, ahem, if
you need a formal methods guy for it, um, you know).

I think a big problem will be rollback - and finding the right level of
abstraction for identifying chunks of code. Consider enabling this
technology at a C-statement level. So you get a message box:

	The statement "assert(!attached())" has caused crashes
	63 times in the past with a SIGABRT. Do you want to
	execute this statement?

Per-function might make more sense, but you can invent cases where the
function level is far too broad to avoid bugs in a meaningful way. Block
scope it is, then.

Right, so now halfway through a function it turns out the function has
caused problems in the past and the user wants to bail out of it - what do
you need to roll back? Allocations? What about displaying the dialog
immediately prior to the crash?
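
To make the rollback question a bit more concrete, here is a rough sketch
in Python rather than C. Everything in it is made up for illustration:
guarded_block, UserBailout, the CRASH_HISTORY table, the confirm callback
and the undo log are all hypothetical names, not anything KDE has. The
block asks before running code with a bad history, and if the block dies
halfway it undoes whatever side effects were explicitly recorded. Anything
that was not recorded (allocations, say) stays un-rolled-back, which is
exactly the hard part.

```python
from contextlib import contextmanager


class UserBailout(Exception):
    """Raised when the user declines to run a block with a bad history."""


# Hypothetical crash history: block label -> number of past crash reports.
CRASH_HISTORY = {"attach-part": 63}


@contextmanager
def guarded_block(label, confirm, undo_log):
    """Ask before entering a block with a crash history; undo on failure."""
    crashes = CRASH_HISTORY.get(label, 0)
    if crashes and not confirm(label, crashes):
        raise UserBailout(label)
    mark = len(undo_log)  # this block's undo entries start here
    try:
        yield
    except Exception:
        # Replay this block's recorded undo actions, newest first.
        while len(undo_log) > mark:
            undo_log.pop()()
        raise


def run_demo():
    """Append to a document, fail halfway, and end up where we started."""
    document = ["title"]
    undo_log = []
    try:
        with guarded_block("attach-part", lambda label, n: True, undo_log):
            document.append("part")
            undo_log.append(document.pop)  # how to undo the append
            raise RuntimeError("simulated crash halfway through")
    except RuntimeError:
        pass
    return document
```

The confirm callback stands in for the message box; a real one would show
the crash count and the bug report text.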

Two things arise in my mind:

1) You could probably do statement-level checking (ignoring the rollback
problem for the moment) if you had some kind of suitably instrumented
interpreter which already comes with some more gentle run time checking
than plain SEGVs. If it was an OO-ish scripting language with a powerful
exception mechanism, then it might be much less effort than implementing
this as a hook into the guts of the C language itself. Thinking up which
language satisfies this description is left as an exercise to the reader.

2) Exceptions could be used as a rollback mechanism.

3) It would be useful, with (2), to have a _methodology_ of instrumenting
code for bug discovery and reporting. If a fully automatic approach is too
much, and application-level reporting not specific enough, then how about
getting app authors to instrument apps by hand for improved discovery?

4) I can't count.

5) David Harel (not that one) recently posted about usability tracking and
usage pattern analysis (his company is ergosoft.il, I think). This is a
very similar area: tracking what a user is doing, logging it, and doing
analysis on that usage. It may be useful to talk to him as well about
instrumenting code for tracking purposes.

6) It would be useful already to have some kind of bug density numbers.
_Supposing_ people actually had KDE packages installed that produce viable
backtraces and not just crud, _and_ there was a not-too-inaccurate mapping
of function offsets to the actual C code, we could produce a colored
listing showing "code snippets of doom", basically highlighting where
(approximately) things in the code are crashing.
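
For (6), a minimal sketch of the counting side, assuming the hard part is
already done and each backtrace has been resolved to (file, line) pairs
with the innermost frame first. The function names and the input format
are my own invention here, and a plain-text marker stands in for the
coloring.

```python
from collections import Counter


def crash_density(backtraces):
    """Count how often each (file, line) is the innermost frame of a crash."""
    return Counter(bt[0] for bt in backtraces if bt)


def annotate(filename, source_lines, density):
    """Prefix each source line with its crash count, blank if none."""
    listing = []
    for number, text in enumerate(source_lines, start=1):
        hits = density.get((filename, number), 0)
        marker = f"{hits:>3}!" if hits else "    "
        listing.append(f"{marker} {number:>3}: {text}")
    return listing


if __name__ == "__main__":
    reports = [
        [("part.cpp", 2), ("main.cpp", 10)],
        [("part.cpp", 2)],
        [],  # a report with no usable backtrace is simply skipped
    ]
    source = ["void attach() {", "    assert(!attached());", "}"]
    for line in annotate("part.cpp", source, crash_density(reports)):
        print(line)
```

Counting only the innermost frame is a simplification; you might well
want to weight the callers too.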

Now, another problem is the infinite variability of the environment that
KDE apps run in. I have a really cruddy SiS-chipset based machine (my
experience is that "really cruddy" is redundant in that sentence, but
YMMV), where konqueror left unattended will SEGFAULT in a fresh Knoppix
boot. So, supposing that sends in gobs of reports - does that make the bug
any more pressing than, say, kgpg's failure to save files (which doesn't
crash the system, so that's not a fair comparison)?

> which one.  Of course, we would need to distinguish typing into a text
> area say vs issuing a command since the latter obviously requires the
> particular character typed to be sent to the server.

Well, it requires the command at least. Is it thinkable that different
KStdAction paths would do different things? I mean, would a toolbar action
Save ever do anything other than the same KStdAction in a menu?

> * allow discussions, chat, blogs, etc. within the context of an application
>   feature

For teaching purposes (in the style of the "Internet Passport" here in the
Netherlands, where high-school kids learn to click buttons in Excel) that
might be useful, for instance to allow people to look at one instance of
an app and perhaps do a little whiteboard drawing on it. Except that there
are already tools for doing just that, only not totally integrated with
the application. Run ircii in konsole beside kword and you've got your
chat possibility, throw in some VNC and voila!

All in all an interesting idea, but it needs some work to make it work,
and of more immediate use (though possibly still researchable) is
backtrace matching. Get bug reports to send in real bt's and automatically
match the bt with existing ones. That's a nice (approximate) text search
issue, since different versions of an app might have slightly different
names, offsets, etc. in the bt.
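
A first stab at that matching, as a hedged Python sketch: throw away the
addresses, arguments and library paths, keep the function names, and
compare the resulting frame lists approximately. It assumes gdb-style
"#N  0xADDR in function ()" lines; normalize, similarity and best_match
are names I made up, and difflib's SequenceMatcher is just one convenient
similarity measure, not necessarily the right one.

```python
import re
from difflib import SequenceMatcher

# Frame number, optional "0xADDR in", then the function name.
FRAME = re.compile(r"#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)")


def normalize(backtrace):
    """Reduce a gdb-style backtrace to a list of function names."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME.match(line.strip())
        if match and match.group(1) != "??":  # skip unresolved frames
            frames.append(match.group(1))
    return frames


def similarity(bt_a, bt_b):
    """Score two backtraces between 0.0 and 1.0 by their frame sequences."""
    return SequenceMatcher(None, normalize(bt_a), normalize(bt_b)).ratio()


def best_match(new_bt, known):
    """Return (bug id, score) of the closest known backtrace, if any."""
    if not known:
        return None, 0.0
    score, bug_id = max((similarity(new_bt, bt), bug_id)
                        for bug_id, bt in known.items())
    return bug_id, score
```

Two reports of the same crash from different builds then score high even
though every address differs. Renamed functions would still need fuzzier
matching on the names themselves.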