[Kst] memory checks...

Thu May 27 20:38:42 CEST 2004

This is a very real possibility; I have done it myself, and bad things happened.

omega% kst -f 0 /data/rawdir/B2K2/ -y bolo1 -y bolo2 -y bolo3 -y bolo4 -y 
bolo5 -y bolo6 -y bolo7 -m 2

I just asked it to read 7x54million sample vectors (+ INDEX).  I meant to add 
-n 1000000, but forgot.  Each vector is 216 MB, so my 512 MB machine is going 
to run out of memory and start swapping, then run out of swap and die.
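
To put a number on it, here is a rough sketch of the arithmetic.  The function 
name is made up for illustration, and the 4 bytes per sample is only inferred 
from the 216 MB / 54 million figures above; I am also assuming INDEX is the 
same length as the other vectors.

```cpp
#include <cstdint>

// Total bytes needed to hold `vectors` vectors of `samples` samples each.
// bytesPerSample is an assumption; 4 matches the 216 MB figure quoted above.
std::uint64_t requestBytes(std::uint64_t vectors, std::uint64_t samples,
                           std::uint64_t bytesPerSample) {
    return vectors * samples * bytesPerSample;
}
```

Under those assumptions, requestBytes(8, 54000000, 4) comes to 1,728,000,000 
bytes, roughly 1.7 GB requested on a machine with 512 MB of RAM.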

(similarly, with the datawizard, reading 0 to end, all bolometers...It is 
*going* to happen!!!)

Swapping then dying is bad.  Saying 'Hey buddy, you asked for too much 
memory, so we're only going to allocate as much as we can and not do the 
rest....' would be vastly better.

*** I consider it a non-negotiable fact that being able to test memory on 
vector allocation is a very good thing.  ***

The *only* questions are:
	i) under Linux, is it possible?
	ii) if so, how?

For (i) the answer is clearly "yes".  I can run 'top' and see how much RAM is 
free, how much RAM is being used for cache, and how much swap is free.  Top 
gets this information, so we can get this information.  Getting it before the 
malloc would allow us to make some intelligent decisions about what to do 
(i.e., I probably don't want to allocate more than free+cache; if I do, my 
machine will likely be pretty much useless or will just die).
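
One way to get at the same numbers, as a minimal sketch: on Linux these values 
are exposed in /proc/meminfo as lines such as "MemFree:  123456 kB".  The 
struct and function names below are hypothetical, not anything in Kst, and 
the parser simply skips any line it does not recognise.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper type: sizes in kB, as reported by /proc/meminfo.
struct MemInfo {
    unsigned long long memFreeKB = 0;
    unsigned long long cachedKB = 0;
    unsigned long long swapFreeKB = 0;
    bool valid = false;  // true once a MemFree line has been seen
};

// Parse text in /proc/meminfo format ("Key:   value kB" per line).
MemInfo parseMemInfo(std::istream& in) {
    MemInfo info;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        unsigned long long value = 0;
        if (!(fields >> key >> value)) continue;  // not a "Key: number" line
        if (key == "MemFree:") {
            info.memFreeKB = value;
            info.valid = true;
        } else if (key == "Cached:") {
            info.cachedKB = value;
        } else if (key == "SwapFree:") {
            info.swapFreeKB = value;
        }
    }
    return info;
}

// Read the live values; info.valid stays false if the file is unavailable.
MemInfo readMemInfo() {
    std::ifstream in("/proc/meminfo");
    return parseMemInfo(in);
}
```

Keeping the parser separate from the file access means it can be exercised 
with canned text, and free+cache is then just memFreeKB + cachedKB.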

Of course this will neither be portable nor 100% reliable... but it will help 
in the common case!

One idea is this:
i) Implement a function 'bool KST::checkMemory(size_t size)' which is 
config/#ifdef wrapped for various platforms.  The default for otherwise 
unsupported platforms could just return true.  kst would test before 
allocation.  If it comes back false, it should refuse to allocate in some 
pleasantly informative manner.
ii) if it comes back true, allocate the memory, and then check it (useful on 
systems where an allocation error can actually happen and allocation doesn't 
overcommit).
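
The two steps above could be sketched roughly as follows.  Only the name 
KST::checkMemory(size_t) comes from the idea itself; fitsWithin and 
allocateVector are hypothetical helpers, the free+cache policy is the one 
suggested earlier, and the use of double samples is an assumption made just 
for the example.

```cpp
#include <cstddef>
#include <fstream>
#include <new>
#include <sstream>
#include <string>

namespace KST {

// Policy sketch: refuse anything larger than the memory considered available.
inline bool fitsWithin(unsigned long long requestBytes,
                       unsigned long long availableBytes) {
    return requestBytes <= availableBytes;
}

#ifdef __linux__
// Linux: sum MemFree and Cached from /proc/meminfo (values are in kB).
bool checkMemory(std::size_t size) {
    std::ifstream in("/proc/meminfo");
    if (!in) return true;  // cannot tell, so do not block the allocation
    unsigned long long availableKB = 0;
    bool found = false;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        unsigned long long value = 0;
        if (!(fields >> key >> value)) continue;
        if (key == "MemFree:" || key == "Cached:") {
            availableKB += value;
            found = true;
        }
    }
    if (!found) return true;  // unexpected format, so do not block
    return fitsWithin(size, availableKB * 1024ULL);
}
#else
// Unsupported platform: no information, so always allow the allocation.
bool checkMemory(std::size_t) { return true; }
#endif

// Step (i) then step (ii): test first, then allocate and check the result.
// Returns nullptr when the request is refused or the allocation fails.
double* allocateVector(std::size_t samples) {
    // Guard against overflow in samples * sizeof(double).
    if (samples > static_cast<std::size_t>(-1) / sizeof(double)) return nullptr;
    if (!checkMemory(samples * sizeof(double))) return nullptr;
    return new (std::nothrow) double[samples];
}

} // namespace KST
```

A caller would treat a null return as the cue to tell the user the request 
was too large, and would release a successful allocation with delete[].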

This of course is not bulletproof (e.g., memory gets allocated or freed by 
some other program/thread in between checking and allocating), but this will 
be relatively rare, and I bet it would catch the vast majority of problems.  
Enough problems would be caught to be easily worth the effort.

Barth

On May 26, 2004 05:29 pm, George Staikos wrote:
> On May 26, 2004 16:03, Andrew Walker wrote:
> > To catch every case of a failed memory allocation would involve
> > using try catch around every function that performs a memory allocation,
> > including (presumably) all Qt and KDE code. In my opinion this would
> > be the ideal situation - programs should not crash under low memory
> > conditions. I recognize this is a lot of work but it is something that
> > could be done incrementally.
>
>    The problem is that the code internal to Qt and KDE doesn't do this
> anyway. Actually I don't think too many toolkits do handle this completely.
>  If an exception is thrown in the middle of a call to, for instance, KMDI,
> I don't think it will be able to recover even if we do.  The best we could
> hope for is to do like KMail, write our files to a hardcoded location on
> disk and exit.   That's not much of an issue in Kst since Kst is generally
> used for reading from disk, not writing to it.  I don't think "recovery" is
> really a possibility, only notification of major problems.  This can be
> dealt with using a signal handler I think, though it could possibly crash
> too.
>
> > Failing that, we should at least do a confirmation of the memory
> > allocation for major allocations; such as new vectors.
>
>    How do you intend to do this?  If you want to implement it and make it
> compile-time enabled, that's fine with me, but I don't know of a reliable
> way to do this without going into kernel space on Linux.
>
> > Requesting a large memory allocation will fail with a std::bad_alloc
> > exception thrown. You can test this in code. Just request a buffer
> > larger than your memory space (including swap).
>
>    I don't think requesting buffers greater than the total memory plus swap
> is a realistic event.  More likely there will be 5-10 smaller buffers that
> may add up to more than the amount of ram+swap.  It would probably be more
> effective to just determine the amount of memory+swap in the system, remove
> 200MB or so, and keep track of the vectors we allocate.  Then we can just
> tell the user if he tries to load too much data.  I think that would be far
> easier to implement.
>
> > If portability prevents us from using particular features then we
> > just ifdef them out in those cases.
>
>    This makes things very messy for very little gain I think.
>
> > People most certainly will use Kst under low memory conditions.
> > Perhaps not intentionally. Clearly they will not expect Kst to crash,
> > and will be less than happy if it does.
>
>   I don't understand why.  It's used for scientific work.  It is run on
> machines designed to do the work that they are doing.  I don't see why they
> would be using machines that are incapable of doing the work necessary
> except on rare occasions.  Anyway, it's still possible that they could be,
> but that doesn't circumvent the overcommit problem.
>
>    So, the reasons why it won't really work or is too much trouble:
> 1) Linux, primary target platform, overcommits
> 2) Not all techniques for this are supported by the various compilers, or
> they need special switches -> will require hacks to the build
> 3) We will almost certainly get a crash inside Qt or KDE as a result of it
> 4) The userbase generally has equipment designed to handle the amount of
> data they are processing in Kst
> 5) Only protecting the big mallocs is a bit insufficient since we do many
> more mallocs by number inside the libraries.  We could get close to the
> crash limit but not quite, then try to open a dialog, and *poof*, there
> goes Kst
> 6) It's a big maintenance problem
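
For comparison, the bookkeeping approach suggested in the quoted message 
(take memory+swap, remove a reserve, and keep track of the vectors allocated) 
might look something like this.  It is only a sketch: the class and method 
names are hypothetical, and how the memory+swap total is obtained is left to 
the caller.

```cpp
// Hypothetical tracker for large vector allocations; names are illustrative.
class VectorBudget {
public:
    // totalBytes: RAM plus swap; reserveBytes: headroom left for everything else.
    VectorBudget(unsigned long long totalBytes, unsigned long long reserveBytes)
        : limit_(totalBytes > reserveBytes ? totalBytes - reserveBytes : 0),
          used_(0) {}

    // Record a planned allocation; returns false if it would exceed the limit.
    bool reserve(unsigned long long bytes) {
        if (bytes > limit_ - used_) return false;
        used_ += bytes;
        return true;
    }

    // Give the bytes back when a vector is freed.
    void release(unsigned long long bytes) {
        used_ = bytes > used_ ? 0 : used_ - bytes;
    }

    // Bytes still available for vectors under this budget.
    unsigned long long remaining() const { return limit_ - used_; }

private:
    unsigned long long limit_;
    unsigned long long used_;
};
```

Each vector load would call reserve() first and tell the user when it returns 
false.  Like the check above, this only accounts for the big allocations Kst 
makes itself, not for the many small ones inside the libraries.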