[Kst] getData flexibility
Barth Netterfield
netterfield at physics.utoronto.ca
Thu May 4 02:38:44 CEST 2006
Your changes are fine - iff they don't give us a hit in performance - but I'm
not entirely sure they are critical....
I was thinking less ambitiously: in the comments below, I indicate what I
thought would happen...
On Wednesday 03 May 2006 14:50, Eli Fidler wrote:
> George and I have discussed this issue some more. I think that we need to
> change the internals of Kst, particularly RVector. I think that instead of
> containing a range of samples from frame 0 to n, we should be able to store
> a set of frame ranges.
>
> Some situations where this is useful:
> 1. The cache contains frames 10-20 and 100-500 of a field. The server is
> unreachable. Kst can use the stored frames from the cache and add the other
> frames once the server is reachable. Without frame ranges, I would have to
> either say the datasource is empty, allow the user to use the first range
> only or allow the user to retrieve all the frames but 0-9 and 21-99 will be
> filled with NaNs. None of these solutions are very good.
The users can ask for whatever they want. If the requested samples are:
  in the cache:
    return the data
  not in the cache, but the remote source is available:
    download to the cache, then return the data
  not in the cache, and the system is offline:
    silently return NaN for all non-cached samples
The cache daemon would have to figure out whether the remote source is available....
To get frame 10, you have to ask for Frame 10, not frame 0, even if frame 10
was the first sample you had in the cache.
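A minimal sketch of that lookup order, just to make it concrete. Everything here is an assumption for illustration: SampleCache, RemoteFetch, readSamples and countMissing are hypothetical names, not existing Kst or NAD interfaces, and the cache is simplified to a map from absolute sample index to value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <limits>
#include <map>
#include <vector>

// Hypothetical cache for one field: absolute sample index -> value.
using SampleCache = std::map<long, double>;

// Hypothetical fetch callback: writes the sample at `index` into `out` and
// returns true, or returns false when the remote source is unreachable.
using RemoteFetch = std::function<bool(long index, double &out)>;

// Read `count` samples starting at absolute index `start`.
// Indices are absolute: asking for 10 means sample 10, even if 10 is the
// first sample held in the cache.
std::vector<double> readSamples(SampleCache &cache, const RemoteFetch &fetch,
                                long start, std::size_t count) {
    assert(start >= 0);
    const double nan = std::numeric_limits<double>::quiet_NaN();
    std::vector<double> result(count, nan);
    for (std::size_t i = 0; i < count; ++i) {
        const long index = start + static_cast<long>(i);
        const auto hit = cache.find(index);
        if (hit != cache.end()) {
            result[i] = hit->second;         // in the cache: return the data
            continue;
        }
        double value = 0.0;
        if (fetch && fetch(index, value)) {  // remote available: download to cache
            cache[index] = value;
            result[i] = value;
        }
        // Otherwise offline: the NaN stays in place and no error is raised.
    }
    return result;
}

// Number of samples that came back as NaN, so a caller can tell a partial
// read from a complete one. Genuine NaNs in the data are counted as well.
std::size_t countMissing(const std::vector<double> &samples) {
    std::size_t missing = 0;
    for (double s : samples) {
        if (std::isnan(s)) ++missing;
    }
    return missing;
}
```

With this shape, a reload after the server comes back is just a second call to readSamples with a fetch callback that now succeeds.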
I had not imagined that you could load data that later became available,
except for data at the end (where NF increases). If the remote data source
becomes available again, then the user would just have to hit reload to grab
data that was previously returned as NaN.
> 2. Progressive loading. The user could retrieve frames 0-end skip 10, then
> 1-end skip 10, then 2-end skip 10... The RVector would know which frames
> are missing and still need to be retrieved.
For disk-bound data access, this would be *way* slower than a straight read.
I don't consider progressive load to be an important feature (though nice),
but if we do implement it, then I am happy with:
  do a skip read of 100 samples per vector (max)
  display the curves
  do a full read
  redisplay everything
There would have to be some heuristic to decide whether the loading was going
to be slow enough that the 2 step progressive read would be worthwhile.
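A rough sketch of the arithmetic for the 2 step read, under stated assumptions: kPreviewSamples, previewSkip and worthPreview are hypothetical names, and the "estimated read time against a threshold" rule is only one possible heuristic, not something that exists in Kst.

```cpp
#include <cassert>
#include <cstddef>

// Cap on samples per vector for the quick first pass (the 100 from above).
constexpr std::size_t kPreviewSamples = 100;

// Skip factor such that a skip read of `totalSamples` returns at most
// kPreviewSamples samples. A factor of 1 means "read everything".
std::size_t previewSkip(std::size_t totalSamples) {
    if (totalSamples <= kPreviewSamples) return 1;
    // Ceiling division, so the preview never exceeds the cap.
    return (totalSamples + kPreviewSamples - 1) / kPreviewSamples;
}

// Hypothetical heuristic: take the two-step path only when the full read is
// estimated to take longer than `thresholdSeconds`.
bool worthPreview(std::size_t totalSamples, double samplesPerSecond,
                  double thresholdSeconds) {
    assert(samplesPerSecond > 0.0);
    const double estimate = static_cast<double>(totalSamples) / samplesPerSecond;
    return estimate > thresholdSeconds;
}
```

The read rate would have to be measured or guessed per datasource, which is where the real difficulty of the heuristic lies.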
> 3. I think frame ranges is a better representation of holes in the data
> than NaNs. It could be useful to visualize missing frame ranges by shading
> in the gui. Frame ranges would work for masking as well.
No promise that INDEX is on the X axis, so there is no general way of shading.
As to using frame ranges rather than masks... In the common case this could
actually be faster. (The NaN-checks could come out of the plot code, and go
into the vector update or something like that).
BUT: the big problem is that NaNs are sample by sample, not frame by frame,
so... the lists would have to be by sample not by frame. Remember that
frames are a datasource/RVectors idea. Classes that use vectors (including
RVectors) in kst don't know a thing about frames - they only know about
samples. So the vectors would want to keep sample lists, not frame lists.
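To illustrate what a per-sample list could look like, here is a small sketch that derives the ranges of valid samples from a NaN-marked vector. SampleRange, validRanges and countValid are hypothetical names; this is not the NAD range class and not existing Kst code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open range [first, second) of sample indices, not frame numbers.
using SampleRange = std::pair<std::size_t, std::size_t>;

// Scan once and collect every maximal run of non-NaN samples.
std::vector<SampleRange> validRanges(const std::vector<double> &samples) {
    std::vector<SampleRange> ranges;
    std::size_t i = 0;
    while (i < samples.size()) {
        if (std::isnan(samples[i])) {
            ++i;                       // still inside a hole
            continue;
        }
        const std::size_t begin = i;   // start of a valid run
        while (i < samples.size() && !std::isnan(samples[i])) ++i;
        ranges.emplace_back(begin, i); // run ends just before index i
    }
    return ranges;
}

// Total number of valid samples covered by a list of ranges.
std::size_t countValid(const std::vector<SampleRange> &ranges) {
    std::size_t total = 0;
    for (const SampleRange &r : ranges) {
        assert(r.first <= r.second);
        total += r.second - r.first;
    }
    return total;
}
```

If the ranges were built once at vector update time, the plot code could walk them instead of testing every sample for NaN.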
> 4. We could remove the hack in the piolib datasource which moves all the
> vectors to frame 0. I don't know exactly what the dirfile datasource does
> with data that doesn't begin at frame 0 as I don't have such a dirfile, but
> a similar hack likely exists.
getdata takes care of the frame 0 offset internally - a line in the format file
says that the first sample of data in the file is at frame XXXX. Asking for
frame XXXX then returns the first frame in the file.
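The mapping amounts to a subtraction. A sketch, assuming a hypothetical FieldLayout struct and frameToByteOffset function; these names are illustrative only and are not taken from getdata.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of how one field is stored on disk.
struct FieldLayout {
    long frameOffset;             // absolute number of the first stored frame
    std::size_t samplesPerFrame;  // samples of this field in each frame
    std::size_t sampleSize;       // bytes per sample
};

// Translate an absolute frame number into a byte offset in the raw file.
// Returns false when the frame lies before the first stored frame.
bool frameToByteOffset(const FieldLayout &layout, long frame,
                       std::size_t &byteOffset) {
    assert(layout.samplesPerFrame > 0 && layout.sampleSize > 0);
    if (frame < layout.frameOffset) return false;
    const std::size_t storedFrame =
        static_cast<std::size_t>(frame - layout.frameOffset);
    byteOffset = storedFrame * layout.samplesPerFrame * layout.sampleSize;
    return true;
}
```

So the caller always works in absolute frame numbers, and only the datasource knows where the file actually starts.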
> I have a class in NAD which stores sets of frame ranges (QPair<uint32_t,
> uint32_t>'s actually). It allows you to add ranges, intersect ranges, skip,
> etc. I think it could be used in Kst with some modifications. The header
> file is attached.
>
> I'm interested in people's thoughts on this idea.
>
> Eli
>
> On Tuesday 02 May 2006 18:08, Eli Fidler wrote:
> > I'm in the process of writing the new NAD KstDataSource which uses the
> > local cache. I've run into a case of getData() where the current
> > interface seems insufficient.
> >
> > When using the NAD cache, I may have frames 1 to 50 of a particular field
> > in the cache. If the NAD server connection is unavailable, what should I
> > do if the user requests frames 1 to 100? What about 1 to 50?
> >
> > I assume I should just give them the data in the second case and pretend
> > everything is fine.
Everything is fine, so yes.
> > In the first case, should I return partial data? no
> > data? something else?
Return it all, with NaNs where there is no data.
> > What if I have part of the data, but it's not the first part?
If the data is there, return it; if not, get it. If you can't, give NaN.
> > The problem stems from the fact that I can only return -1 for any error,
> > so I can't signal a partial success very easily.
It's not really an error (?).
> > Eli
> > _______________________________________________
> > Kst mailing list
> > Kst at kde.org
> > https://mail.kde.org/mailman/listinfo/kst